Site Reliability Engineering Lead

Company: Bank of Montreal
Location: Toronto, ON, Canada
Salary: $74,800 – $138,600
Type: Full-Time
Degrees: Bachelor’s
Experience Level: Senior

Requirements

  • Typically 5 – 7 years of relevant experience and a post-secondary degree in a related field of study, or an equivalent combination of education and experience
  • Strong technical background in DevOps, site reliability, observability, and coding
  • Strong skills in Node.js, Python, or JavaScript
  • Experience with CI/CD, Dynatrace, configuration management (Ansible), AWS CDK, Lambda, Kubernetes, ECS, and OpenShift

Responsibilities

  • Deploys, configures, and monitors code, and is responsible for the availability, latency, change management, emergency response, and capacity management of services in production.
  • Helps the development and operations teams establish service level indicators (SLIs), service level objectives (SLOs), and error budgets.
  • Automates tasks such as log analysis, performance tuning, patch application, testing of production settings, incident response, and post-mortem analysis to increase efficiency and decrease risk.
  • Supports system design consulting, platform management, and capacity planning.
  • Debugs production issues across services and levels of the technology stack.
  • Improves service health visibility by recording metrics, logs, and traces across all services in order to pinpoint the causes of an incident.
  • Computes the cost of SLA breaches and assists management in calculating the impact of system reliability. Helps development and operations teams understand the cost of downtime.
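
For candidates less familiar with the terms above, the relationship between an SLI, an SLO, an error budget, and the cost of downtime can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the function names, the request-based SLI, and the flat cost-per-minute model are hypothetical and do not describe the team's actual tooling or formulas.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int):
    """Return (sli, allowed_failures, budget_remaining) for one SLO window.

    Assumes a request-based availability SLI; other SLI types
    (latency, freshness) would need a different "good event" definition.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")

    # SLI: fraction of requests that succeeded in the window.
    sli = (total_requests - failed_requests) / total_requests
    # Error budget: number of failures the SLO tolerates in the window.
    allowed_failures = (1.0 - slo_target) * total_requests
    # Fraction of the budget still unspent (negative means the SLO is breached).
    budget_remaining = 1.0 - failed_requests / allowed_failures
    return sli, allowed_failures, budget_remaining


def downtime_cost(minutes_down: float, cost_per_minute: float) -> float:
    """Hypothetical flat model: cost grows linearly with minutes of downtime."""
    return minutes_down * cost_per_minute


if __name__ == "__main__":
    sli, allowed, remaining = error_budget(0.999, 1_000_000, 250)
    print(f"SLI={sli:.5f} allowed_failures={allowed:.0f} remaining={remaining:.0%}")
    print(f"Cost of a 30-minute outage: ${downtime_cost(30, 500.0):,.2f}")
```

With a 99.9% SLO over one million requests, roughly 1,000 failures are tolerated, so 250 failures leave about three quarters of the budget unspent.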
