Director of Site Reliability Engineering
Company | Stellar Development Foundation |
---|---|
Location | San Francisco, CA, USA |
Salary | $210,000 – $310,000 |
Type | Full-Time |
Degrees | |
Experience Level | Senior |
Requirements
- 3+ years of experience working as a Site Reliability Engineer
- 3+ years of experience managing an SRE team
- Strong track record of collaborating with dev teams at all stages of product development (design, development/CI, beta testing, production)
- Strong track record of collaborating on defining, measuring, and driving improvements in KPIs
- Strong track record of assisting teams during root cause analysis and postmortems
- Designing and building out the infrastructure for large distributed systems
- Maintaining highly-available infrastructure
- Troubleshooting and understanding complex technical problems
- Using configuration management or IaC tooling such as Terraform, Ansible, or Puppet
- Building and maintaining infrastructure using Kubernetes
- Highly autonomous; able to find clarity in ambiguous circumstances
- Excellent communicator; comfortable working with remote team members
Responsibilities
- Establish a clear vision and mandate for the Site Reliability Engineering team
- Define the SRE team’s quarterly OKRs to best align with the company’s goals
- Define processes of collaboration between SREs and development teams throughout the software development lifecycle
- Define a career growth path for the SRE team, as well as coach and mentor individual contributors on the team
- Define and track metrics across engineering and help hold engineering teams accountable for their KPIs
- Coordinate priorities with other teams and areas of the organization
- Participate in sprint planning and execution, track progress and oversee day-to-day tactical decisions
- Design and build reliable systems and infrastructure that software engineers find easy to use
- Monitor and troubleshoot systems in production
- Define and participate in 24/7 on-call rotations alongside the team
- Mediate technical discussions and review PRs
- Jump in as needed with code fixes, troubleshooting and hands-on contributions
- Collaborate across the Stellar ecosystem, engaging with key partners and advising on their integration to set them up for success
Preferred Qualifications
- 3+ years of experience writing code in a major programming language
- You have worked on an open source project
- You have managed a distributed team
- You build things for fun in your spare time