Senior Site Reliability Engineer

Company: MongoDB
Location: New York, NY, USA
Salary: $127,000 – $249,000
Type: Full-Time
Degrees: Bachelor’s
Experience Level: Senior

Requirements

  • Experience running mission-critical services at scale
  • Experience with observability of large-scale distributed systems
  • An understanding of information security issues
  • Firm grasp of at least one modern programming language, beyond basic scripting
  • Solid understanding of web and network protocols and standards (HTTP, TLS, DNS, etc.)
  • Bachelor’s degree in Computer Science or equivalent experience

Responsibilities

  • Define standards and vision for the mission-critical observability platform leveraged by all parts of the engineering organization
  • Design, architect, build, and deliver core pieces of our observability services in collaboration with other stakeholders
  • Design, implement, and troubleshoot service monitoring that seamlessly spans the globe, including several cloud providers
  • Build for reliability, making services and infrastructure available, resilient, fault tolerant and self-healing
  • Identify and configure key metrics to detect incidents and quantify service health, availability, and performance
  • Participate in a week-long on-call rotation and blameless post-mortem process
  • Improve our observability capabilities, optimizing for cost, ease of use, and maintainability

Preferred Qualifications

  • Experience with at least one of the major cloud providers (Amazon Web Services, Google Cloud Platform, Microsoft Azure)
  • Experience working in a Kubernetes-based environment