Lead Site Reliability Engineer

Company: Bumble
Location: Austin, TX, USA
Salary: $198,000 – $250,000
Type: Full-Time
Degrees
Experience Level: Senior, Expert or higher

Requirements

  • Excellent problem-solving and analytical skills
  • Strong communication and collaboration skills are a must
  • Proficiency in at least one of the Python or Go (Golang) programming languages
  • Experience with CI/CD pipelines
  • Strong proficiency with Kubernetes architecture
  • Prior experience in SRE, system administration or DevOps roles
  • Strong proficiency with Linux/Unix operating systems, including hands-on experience in configuration and troubleshooting
  • Proficiency in using Puppet for configuration management, automation and system provisioning
  • Hands-on experience with monitoring and observability platforms such as Grafana, Prometheus, Elasticsearch and Jaeger
  • Experience with cloud architectures on platforms such as GCP or AWS
  • Familiarity with SQL databases and message broker systems such as Kafka

Responsibilities

  • Design and build new tools and services from the ground up to solve complex problems
  • Build automation frameworks to streamline repetitive tasks
  • Design and maintain scalable, highly available and fault-tolerant systems
  • Build and maintain observability tooling, including logging, monitoring, tracing and alerting systems
  • Develop and maintain automation tooling to reduce manual intervention
  • Implement infrastructure as code (IaC) for infrastructure provisioning
  • Monitor system health and performance, identifying and fixing issues
  • Respond to system outages, troubleshooting root causes and implementing preventative measures
  • Collaborate with engineering teams and security engineers to improve system reliability, security and performance
  • Participate in on-call rotations
  • Create and maintain documentation to improve knowledge sharing across teams

Preferred Qualifications

  • You are a solution-orientated professional with a passion for problem-solving
  • You take pride in ensuring systems are performant, stable and efficient
  • You thrive in a collaborative environment
  • Continuous learning is important to you and you actively explore new tools and techniques
  • You are curiosity-driven and constantly seek new ways to improve processes and implement modern solutions
  • You are committed to ensuring quality is at the heart of every project