Manager – Site Reliability Engineering
Company | Centene |
---|---|
Location | New Mexico, USA, Pennsylvania, USA, Iowa, USA, Texas, USA, Florida, USA, South Carolina, USA, Georgia, USA, Tennessee, USA, Arkansas, USA, Minnesota, USA, Utah, USA, Wisconsin, USA, North Carolina, USA, Oklahoma, USA, Missouri, USA, Ohio, USA, Michigan, USA, Illinois, USA, Alabama, USA, United States |
Salary | $100,900 – $186,800 |
Type | Full-Time |
Degrees | Bachelor’s |
Experience Level | Senior |
Requirements
- Requires a Bachelor’s degree and 5+ years of related experience.
- Or equivalent experience acquired through accomplishments of applicable knowledge, duties, scope and skill reflective of the level of this position.
- Site reliability engineering (SRE) experience is desired.
Responsibilities
- Manages the team that designs, integrates, and implements platform infrastructure for optimum performance, reliability, and security using continuous integration and continuous delivery (CI/CD) tools, processes, and designs.
- Designs services to automate monitoring activities and oversees the deployment of standardized and scalable software tools to ensure that systems operate without interruption at optimum performance.
- Reviews service disruptions to determine the root cause of issues and design solutions for improved reliability.
- Designs strategies that increase system reliability and performance through process optimization.
- Builds innovation in the areas of distributed system flow and resilience, continuous feedback, and delivery.
- Designs and implements service mesh systems while leveraging strategies to package platforms and services.
- Works collaboratively with development and other team leads to investigate issues and create new solutions to mitigate them.
- Proactively partners and communicates with IT and Product stakeholders to achieve SRE goals and facilitate stakeholder business goals.
- Reviews the stress, security, and performance testing output and performs periodic system validation.
- Oversees and ensures regular deployment of new versions of the systems and their subcomponents.
- Performs post-incident reviews and documents findings to inform future decision making.
- Develops a roadmap and plan for enterprise-wide cloud reliability and scalability initiatives.
- Builds platforms that teams can leverage to accelerate innovation in the areas of reliability, scalability, and velocity.
- Provides technical leadership for software platform development and software operation transformation initiatives, and mentors team members.
- Manages the hiring and training of new and existing staff, conducts performance and salary reviews, and provides leadership, technical guidance, and coaching.
- Shares knowledge and develops staff capabilities to strengthen understanding of business issues and best practices.
- Develops and communicates departmental objectives; inspires and motivates team members to achieve results.
- Performs other duties as assigned.
- Complies with all policies and standards.
Preferred Qualifications
- SRE experience.