Senior Site Reliability Engineer – SRE

Company: Uniswap
Location: New York, NY, USA
Salary: $198,000 – $220,000
Type: Full-Time
Degrees: Bachelor’s, Master’s
Experience Level: Senior

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
  • 5+ years of experience in site reliability engineering, DevOps, or a related field.
  • Strong understanding of reliability engineering principles, practices, and tools.
  • Proficiency in monitoring and alerting tools (e.g., Prometheus, Grafana, Nagios).
  • Experience with cloud platforms (AWS, Azure, GCP) and with containerization and orchestration tools (Docker, Kubernetes).
  • Proficiency in scripting and automation tools, such as Python, Bash, Ansible, or Terraform.
  • Excellent problem-solving skills and the ability to work under pressure in a fast-paced environment.
  • Strong communication and interpersonal skills, with the ability to influence and lead teams.

Responsibilities

  • Design, implement, and maintain systems and processes that enhance the reliability, availability, and performance of our services.
  • Design, implement, and maintain CI/CD tools and processes to increase reliability.
  • Design, implement, and maintain cloud constructs to increase reliability.
  • Develop and manage monitoring, alerting, and incident response strategies to minimize downtime and ensure rapid recovery from incidents.
  • Conduct root cause analysis of system failures and implement preventative measures.
  • Optimize system performance and automate repetitive tasks to improve operational efficiency.
  • Work closely with software engineering, infrastructure, and product teams to integrate reliability practices into the development lifecycle.
  • Advocate for SRE best practices and foster a culture of reliability and operational excellence across the organization.
  • Communicate effectively with stakeholders, providing regular updates on reliability metrics, incidents, and improvement initiatives.
  • Stay abreast of the latest industry trends and technologies in SRE, reliability, and performance.
  • Continuously evaluate and improve existing systems and processes to enhance reliability and efficiency.
  • Drive the adoption of new tools and technologies that can improve operational capabilities.

Preferred Qualifications

  • Experience with continuous integration and continuous deployment (CI/CD) practices and tools.
  • Knowledge of configuration management tools (e.g., Puppet, Chef).
  • Experience with database management and optimization.
  • Familiarity with compliance frameworks and security best practices.
  • Relevant certifications such as AWS Certified DevOps Engineer, Google Professional SRE, or equivalent.