Principal Site Reliability Engineer – Prisma Access
Company | Palo Alto Networks |
---|---|
Location | Plano, TX, USA |
Salary | Not provided |
Type | Full-Time |
Degrees | Bachelor’s, Master’s |
Experience Level | Senior, Expert or higher |
Requirements
- Must be a US Citizen to be considered
- 7+ years of experience in Infrastructure, SRE, or DevOps roles required
- BS or MS in Computer Science or a related field, or equivalent professional or military experience, required
- 4+ years of experience with AWS and GCP, with expertise in their architecture, services, advanced cloud networking, and PKI concepts
- Expertise in troubleshooting and resolving cloud infrastructure and service issues, identifying root causes, and devising effective solutions for high-volume transactions
- Proficiency with Python and shell scripting for automation; Golang is a plus
- Proficiency in Infrastructure as Code (IaC) with Terraform and Helm, leveraging AI tools for development
- Solid experience with Kubernetes, container networking, and container workloads
- Strong Linux administration skills
- Proficiency with CI/CD pipelines, GitOps principles, GitLab, and Jenkins
- Excellent written and verbal communication skills, with the ability to collaborate effectively and rally support across teams
- Self-disciplined, self-managed, and highly driven with a strong sense of ownership and urgency
- Ability to adapt quickly to evolving cloud technologies, security threats, and advancements through continuous learning
- Ability to understand and address customer needs effectively and to provide root cause analysis (RCA) to customers
- Understanding of how technical decisions impact the business, and ability to align cloud operations with business goals
Responsibilities
- Design, build, and operate reliable, secure cloud infrastructure across multi-cloud environments
- Ensure applications are production-ready, scalable, and resilient, collaborating closely with developers, researchers, data scientists, and security experts
- Develop expertise in new technologies and rapidly integrate them into our existing infrastructure, embracing continuous learning and the adoption of AI tools
- Develop tools and automation frameworks, championing Infrastructure as Code (IaC) and Monitoring as Code (MaC) principles
- Automate robust deployments and orchestrate end-to-end monitoring and alerting solutions
- Participate in on-call rotations with SRE and Dev teams to support critical business and production systems
- Lead root cause analysis of critical business and production issues, driving improvements and preventing recurrence
- Contribute to the success of SRE and DevOps initiatives, understanding the business impact of technical decisions and aligning them with business goals
Preferred Qualifications
No preferred qualifications provided.