Principal Performance Engineer – Cortex Cloud

Company: Palo Alto Networks
Location: Santa Clara, CA, USA
Salary: Not provided
Type: Full-Time
Degrees: Bachelor’s, Master’s
Experience Level: Expert or higher

Requirements

  • 10+ years of experience in software engineering or performance engineering, with a strong focus on testing and optimizing distributed cloud-native systems.
  • Proven track record in building and executing performance testing strategies in complex, large-scale environments.
  • Strong programming and scripting skills in Python (preferred), along with experience using performance tools like JMeter, Locust, or similar.
  • Expertise in performance profiling, diagnostics, and tuning across microservices architectures.
  • Deep understanding of cloud platforms (AWS, GCP, Azure), Kubernetes orchestration, and cloud-native service architectures.
  • Hands-on experience with observability and monitoring tools such as Prometheus, Grafana, and OpenTelemetry.
  • Strong knowledge of CI/CD systems and infrastructure automation, including GitLab, Jenkins, or similar.
  • Excellent analytical, debugging, and troubleshooting skills at the system and application levels.
  • Exceptional communication skills with the ability to influence architecture and drive performance culture across engineering.
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
  • Experience in chaos engineering, fault injection, and resilience testing.
  • Familiarity with capacity planning, system tuning, and infrastructure sizing for cloud-native applications.
  • Strong knowledge of database performance, caching strategies, and distributed systems principles.
  • Prior experience leading performance initiatives in multi-cloud or hybrid-cloud environments.

Responsibilities

  • Design and implement end-to-end performance testing strategies for distributed cloud-native systems, ensuring scalability, reliability, and responsiveness.
  • Build and maintain robust, reusable performance test frameworks and pipelines using tools like JMeter, Locust, or custom Python-based solutions.
  • Develop and execute load, stress, soak, and failover tests to simulate real-world usage patterns, edge cases, and peak load scenarios.
  • Identify system bottlenecks, resource contention, and inefficiencies across services, infrastructure, and code; work cross-functionally to drive resolution.
  • Collaborate with Product and Customer Success teams to understand key customer workflows and usage patterns, translating them into performance test scenarios.
  • Integrate performance tests into CI/CD pipelines and staging environments to enable continuous performance validation and pre-release gatekeeping.
  • Define and track key performance metrics (e.g., latency, throughput, system resource usage) and build dashboards using Prometheus, Grafana, or other observability platforms.
  • Perform deep-dive analysis of performance test results, system telemetry, and application profiling data (e.g., flame graphs, heap dumps).
  • Advocate for performance-first design principles across engineering teams; influence architectural decisions to improve system efficiency and testability.
  • Contribute to chaos engineering and fault injection efforts to validate system resilience under adverse conditions.
  • Lead incident retrospectives related to performance degradation and provide guidance for proactive tuning and improvements.
  • Thrive in a fast-paced environment, owning performance initiatives from inception through implementation and continuous iteration.

Preferred Qualifications

  • Experience with JVM tuning or Go profiling is a plus.