Senior DevOps Infrastructure Engineer – Open-Source CI and CD

Company: NVIDIA
Location: Newark, NJ, USA
Salary: $168,000 – $333,500
Type: Full-Time
Degrees: Bachelor’s, Master’s
Experience Level: Senior

Requirements

  • B.S. or M.S. in Computer Science, Computer Engineering, or a related field (or equivalent experience)
  • 7+ years of proven experience in infrastructure, DevOps, or platform engineering
  • Strong Kubernetes expertise (running, debugging, and scaling workloads)
  • Experience with GitOps tools (ArgoCD or similar)
  • Proficiency in Linux administration and troubleshooting
  • Experience with Infrastructure as Code using Terraform/Terragrunt
  • Proficiency in Golang, Python, and TypeScript
  • Hands-on experience with monitoring, logging, and tracing (Prometheus, Grafana, OpenTelemetry, etc.)
  • Solid understanding of CI/CD pipelines, particularly GitHub Actions
  • Ability to work and collaborate effectively with a fully remote, distributed team

Responsibilities

  • Manage and scale self-hosted GitHub Actions runners using Kubernetes
  • Help expand runner support for various hardware and operating system combinations, including Linux, Windows, single-GPU, multi-GPU, NVLink, and more
  • Use Infrastructure as Code (Terraform and ArgoCD) to deploy and maintain infrastructure both on-premises and in AWS
  • Build and maintain runner VM images using HashiCorp Packer
  • Connect distributed services securely using mTLS, PKI, and HashiCorp Vault
  • Develop, package, and deploy custom Golang tools to support platform observability, stability, and efficiency
  • Configure alerting and monitoring to identify and address issues quickly, using tools like Prometheus and Grafana
  • Contribute upstream to open-source tools and libraries that our team depends on
  • Periodically update platform dependencies and address CVEs

Preferred Qualifications

  • Experience instrumenting telemetry for distributed systems
  • Strong background in GPU workloads on Kubernetes, including experience writing custom Kubernetes controllers
  • Deep understanding of KubeVirt and/or virtualization
  • Experience with self-hosted GitHub Actions runners
  • Contributions to open-source Kubernetes-related projects