Program Manager – Site Reliability Engineering

Company: Veeam Software
Location: California, USA
Salary: $136,500 – $195,000
Type: Full-Time
Degrees: Bachelor’s
Experience Level: Senior

Requirements

  • 4-5 years managing operations-focused projects in a SaaS environment
  • 5+ years of experience in a program or project management role managing complex, technical projects
  • Experience implementing and coordinating Incident Management processes
  • B.S. degree in Business Management, Information Systems, Engineering, Computer Science or a related field (or equivalent experience) is highly desired
  • Comfortable working with loosely defined, custom agile/scrum/kanban processes and driving process improvements
  • Strong sense of ownership and desire to be successful
  • Ability to drive programs working with highly effective people on cutting-edge technology
  • Ability to lead cross-functional, globally dispersed teams and influence stakeholders without direct authority
  • Strong verbal and written communication skills, with demonstrated ability to influence teams and to communicate succinctly

Responsibilities

  • Collaborate with leaders and teams across SRE, Engineering, Support, and Product to organize, plan, and execute projects from inception to post-launch analysis
  • Utilize just enough process to create visibility into progress, risks, and milestones for all stakeholders without overwhelming the team
  • Proactively identify potential roadblocks and keep projects on track
  • Act as the key point of contact between SRE and other teams, to ensure alignment, resolve dependencies, and drive products to completion
  • Drive improvements in the incident management process, identifying areas for automation, process refinement, and faster resolution times
  • Act as an Incident Leader for large or long-running incidents
  • Provide training and mentoring to empower the team in incident management best practices
  • Oversee post-incident reviews, ensuring root cause analysis is conducted and corrective actions are tracked to completion
  • Define and track key incident metrics (MTTR, incident recurrence, etc.) to drive accountability and operational improvements
  • Bring a data-driven mindset to all aspects of status reporting, and clearly and proactively communicate status, risks, and mitigation plans across all levels of the organization

Preferred Qualifications

  • Experience managing work and reporting with JIRA
  • Software Development or Product Management experience
  • Experience working with SaaS or data protection products
  • Experience using an incident management tool like Incident.io or Rootly.com