Senior Site Reliability Engineer

| Field | Details |
|---|---|
| Company | MongoDB |
| Location | New York, NY, USA |
| Salary | $127,000 – $249,000 |
| Type | Full-Time |
| Degrees | Bachelor’s |
| Experience Level | Senior |

Requirements
- Experience running mission-critical services at scale
- Experience with observability of large-scale distributed systems
- An understanding of information security issues
- Firm grasp of at least one modern programming language, beyond basic scripting
- Solid understanding of web and network protocols and standards (HTTP, TLS, DNS, etc.)
- Bachelor’s degree in Computer Science or equivalent experience
Responsibilities
- Define standards and vision for the mission-critical observability platform leveraged by all parts of the engineering organization
- Design, architect, build and deliver core pieces of our observability services in collaboration with other vested parties
- Design, implement, and troubleshoot monitoring for services that span the globe, including several cloud providers
- Build for reliability, making services and infrastructure available, resilient, fault tolerant and self-healing
- Identify and configure key metrics to detect incidents and quantify service health, availability and performance
- Participate in a week-long on-call rotation and blameless post-mortem process
- Improve our observability capabilities, optimizing for cost, ease of use, and maintainability
Preferred Qualifications
- Experience with at least one of the major cloud providers (Amazon Web Services, Google Compute, Microsoft Azure)
- Experience working in a Kubernetes-based environment