Senior Software Engineer – Site Reliability

Company: Workday
Location: McLean, VA, USA
Salary: $145,900 – $259,200
Type: Full-Time
Degrees: Bachelor’s
Experience Level: Senior, Expert or higher

Requirements

  • 8+ years of SRE/DevOps experience in a distributed systems environment.
  • 8+ years of proven experience managing and fixing distributed systems (AWS, GCP, Kubernetes, Docker).
  • 5+ years of experience with Linux.
  • 5+ years of demonstrated ability with at least one of Go, Python, or Ruby (Go preferred).
  • 5+ years of experience with Bash or shell scripting, plus an understanding of standard software development methodologies such as code management and CI/CD.

Responsibilities

  • Updating the platform continuously in line with the major and minor release cycles of open source projects such as Kubernetes, Istio, and Calico.
  • Providing level support for the buildout of new Customer environments.
  • Collaborating with multi-functional teams to come up with automation solutions.
  • Leading new Service team onboarding engagements, in partnership with Platform Engineering Architects and Product Managers.
  • Owning overall system health and holding engineering teams accountable for meeting agreed SLOs, such as latency and error rates.
  • Preparing weekly platform releases, including evolving the automation towards zero touch, and following through on improvements identified post-patch.
  • Managing vulnerabilities and holding teams accountable for meeting customer-facing Service Level Agreements (SLAs).

Preferred Qualifications

  • BS in Computer Science or related job experience
  • Ability to work independently
  • Skills and passion to operate, maintain, support and sustain the platform.
  • Excited by working in a fast-paced environment. Experience collaborating with multi-functional global and remote teams with a diverse set of backgrounds.
  • Excellent documentation skills, including experience developing detailed runbooks and processes.