Principal Performance Engineer - Cortex Cloud

Company	Palo Alto Networks
Location	Santa Clara, CA, USA
Salary	Not Provided
Type	Full-Time
Degrees	Bachelor’s, Master’s
Experience Level	Expert or higher

Requirements

10+ years of experience in software engineering or performance engineering, with a strong focus on testing and optimizing distributed cloud-native systems.
Proven track record in building and executing performance testing strategies in complex, large-scale environments.
Strong programming and scripting skills in Python (preferred), along with experience using performance tools like JMeter, Locust, or similar.
Expertise in performance profiling, diagnostics, and tuning across microservices architectures.
Deep understanding of cloud platforms (AWS, GCP, Azure), Kubernetes orchestration, and cloud-native service architectures.
Hands-on experience with observability and monitoring tools such as Prometheus, Grafana, and OpenTelemetry.
Strong knowledge of CI/CD systems and infrastructure automation, including GitLab, Jenkins, or similar.
Excellent analytical, debugging, and troubleshooting skills at the system and application levels.
Exceptional communication skills with the ability to influence architecture and drive performance culture across engineering.
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
Experience in chaos engineering, fault injection, and resilience testing.
Familiarity with capacity planning, system tuning, and infrastructure sizing for cloud-native applications.
Strong knowledge of database performance, caching strategies, and distributed systems principles.
Prior experience leading performance initiatives in multi-cloud or hybrid-cloud environments.

Responsibilities

Design and implement end-to-end performance testing strategies for distributed cloud-native systems, ensuring scalability, reliability, and responsiveness.
Build and maintain robust, reusable performance test frameworks and pipelines using tools like JMeter, Locust, or custom Python-based solutions.
Develop and execute load, stress, soak, and failover tests to simulate real-world usage patterns, edge cases, and peak load scenarios.
Identify system bottlenecks, resource contention, and inefficiencies across services, infrastructure, and code; work cross-functionally to drive resolution.
Collaborate with Product and Customer Success teams to understand key customer workflows and usage patterns, translating them into performance test scenarios.
Integrate performance tests into CI/CD pipelines and staging environments to enable continuous performance validation and pre-release gatekeeping.
Define and track key performance metrics (e.g., latency, throughput, system resource usage) and build dashboards using Prometheus, Grafana, or other observability platforms.
Perform deep-dive analysis of performance test results, system telemetry, and application profiling data (e.g., flame graphs, heap dumps).
Advocate for performance-first design principles across engineering teams; influence architectural decisions to improve system efficiency and testability.
Contribute to chaos engineering and fault injection efforts to validate system resilience under adverse conditions.
Lead incident retrospectives related to performance degradation and provide guidance for proactive tuning and improvements.
Thrive in a fast-paced environment, owning performance initiatives from inception through implementation and continuous iteration.

Preferred Qualifications

Experience with JVM tuning or Go profiling is a plus.