Principal Performance Engineer - Cortex Cloud

Company	Palo Alto Networks
Location	Santa Clara, CA, USA
Salary	Not Provided
Type	Full-Time
Degrees	Bachelor’s, Master’s
Experience Level	Expert or higher

Requirements

10+ years of experience in software engineering or performance engineering, with a strong focus on testing and optimizing distributed cloud-native systems
Proven track record in building and executing performance testing strategies in complex, large-scale environments
Strong programming and scripting skills in Python (preferred), along with experience using performance tools like JMeter, Locust, or similar
Expertise in performance profiling, diagnostics, and tuning across microservices architectures
Deep understanding of cloud platforms (AWS, GCP, Azure), Kubernetes orchestration, and cloud-native service architectures
Hands-on experience with observability and monitoring tools such as Prometheus, Grafana, and OpenTelemetry
Strong knowledge of CI/CD systems and infrastructure automation, including GitLab, Jenkins, or similar
Excellent analytical, debugging, and troubleshooting skills at the system and application levels
Exceptional communication skills with the ability to influence architecture and drive performance culture across engineering
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field

Responsibilities

Design and implement end-to-end performance testing strategies for distributed cloud-native systems, ensuring scalability, reliability, and responsiveness
Build and maintain robust, reusable performance test frameworks and pipelines using tools like JMeter, Locust, or custom Python-based solutions
Develop and execute load, stress, soak, and failover tests to simulate real-world usage patterns, edge cases, and peak load scenarios
Identify system bottlenecks, resource contention, and inefficiencies across services, infrastructure, and code; work cross-functionally to drive resolution
Collaborate with Product and Customer Success teams to understand key customer workflows and usage patterns, translating them into performance test scenarios
Integrate performance tests into CI/CD pipelines and staging environments to enable continuous performance validation and pre-release gatekeeping
Define and track key performance metrics (e.g., latency, throughput, system resource usage) and build dashboards using Prometheus, Grafana, or other observability platforms
Perform deep-dive analysis of performance test results, system telemetry, and application profiling data (e.g., flame graphs, heap dumps)
Advocate for performance-first design principles across engineering teams; influence architectural decisions to improve system efficiency and testability
Contribute to chaos engineering and fault injection efforts to validate system resilience under adverse conditions
Lead incident retrospectives related to performance degradation and provide guidance for proactive tuning and improvements
Thrive in a fast-paced environment, owning performance initiatives from inception through implementation and continuous iteration

Preferred Qualifications

Experience with JVM tuning or Go profiling
Experience in chaos engineering, fault injection, and resilience testing
Familiarity with capacity planning, system tuning, and infrastructure sizing for cloud-native applications
Strong knowledge of database performance, caching strategies, and distributed systems principles
Prior experience leading performance initiatives in multi-cloud or hybrid-cloud environments