Senior Staff Software Engineer – Reliability Engineering
Company | Airbnb |
---|---|
Location | United States |
Salary | $244,000 – $304,000 |
Type | Full-Time |
Degrees | Bachelor’s, Master’s, PhD |
Experience Level | Senior, Expert or higher |
Requirements
- BS, MS, or PhD in computer science, related field, or equivalent work experience.
- 12+ years of software engineering experience, with a significant portion dedicated to system architecture and design in consumer-facing technology companies.
- Strong leadership skills, with 5+ years of experience as a senior-level technical lead or architect, driving the technical direction and strategy across multiple teams or projects.
- Excellent communication and collaboration skills, with a proven track record of working effectively across teams and organizations.
- Demonstrated expertise in building and scaling high-availability systems and platforms, with a deep understanding of multi-cloud environments.
Responsibilities
- Develop a roadmap with a longer-term vision for Reliability and serve as a strategic thought partner within the organization.
- Design, implement and influence company-wide SRE architecture, innovation, engineering, and standards.
- Create incident management processes that can scale with the organization as it continues its rapid growth.
- Foster an SRE/Reliability model that accounts for the nuances of an engineering culture with a strong sense of ownership over its services.
- Bring a strong customer focus to the Reliability function, centered on optimizing the infrastructure and platform, and ensuring systems are highly available and performant.
- Develop Production Readiness standards to ensure service reliability.
- Automate as much as possible and always manage configuration as code.
- Predict future failures and work proactively to mitigate them.
- Advocate for and implement reliability design patterns (circuit breakers, graceful degradation, etc.).
- Create a culture where Reliability is a state of mind, instilling a proactive approach to seeing patterns and opportunities to increase leverage and tooling.
- Build deep partnerships with engineering leaders.
- Work closely with product engineering teams on design and implementation choices of large-scale distributed systems.
- Partner with the broader organization to learn from incidents through a blameless postmortem process.
- Mentor and lead other Site Reliability Engineers. Uplevel and support others with servant leadership, mentorship, advocacy, and allyship.
Preferred Qualifications
No preferred qualifications provided.