Lead Technical Program Manager – Site Reliability Engineering
| Company | Thousand Eyes |
|---|---|
| Location | San Francisco, CA, USA |
| Salary | $223,800 – $266,000 |
| Type | Full-Time |
| Degrees | Bachelor’s, Master’s |
| Experience Level | Senior, Expert or higher |
Requirements
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 8+ years of experience in Site Reliability Engineering, with a focus on managing SaaS applications.
- Proven experience in technical program management, preferably within a cloud infrastructure context.
- Demonstrated ability to manage global cloud infrastructure with significant monthly cost of goods sold (COGS).
- Strong analytical and problem-solving skills, with a focus on data-driven decision-making.
- Excellent communication and collaboration skills, with the ability to influence and drive change across teams.
- Proficiency in cloud platforms (especially AWS) and related technologies.
- Experience with infrastructure as code, automation, and modern DevOps practices.
- Experience migrating and maintaining both commercial and federal environments.
Responsibilities
- Lead and manage complex SRE projects and programs that align with organizational goals and priorities.
- Develop and execute strategies for operational efficiency, reliability, and scalability of our cloud infrastructure.
- Analyze and optimize infrastructure costs, aiming for a reduction in COGS while maintaining or improving service quality.
- Collaborate with finance and engineering teams to develop cost management dashboards and reporting tools.
- Oversee the development and implementation of best practices for incident management, change management, and capacity planning.
- Drive initiatives to improve system uptime, performance, and reliability, ensuring adherence to SLAs.
- Enhance operational visibility through the development of comprehensive monitoring and alerting systems.
- Lead operational reviews and post-mortems, ensuring actionable insights and continuous improvement.
- Work closely with software engineering teams to ensure infrastructure changes and dependencies are effectively communicated and executed.
- Facilitate cross-team coordination to support large-scale infrastructure projects and initiatives.
- Mentor and guide SRE team members, fostering a culture of technical excellence and innovation.
- Stay abreast of the latest industry trends and technologies to drive continuous improvement.
Preferred Qualifications
No preferred qualifications provided.