Director of Engineering – Compute

Company: Groq
Location: Palo Alto, CA, USA
Salary: $300,000 – $375,000
Type: Full-Time
Experience Level: Expert or higher

Requirements

  • 10+ years in large-scale infrastructure engineering, including 3+ years leading teams that run business-critical, globally distributed fleets.
  • Proven leadership experience in highly technical engineering environments, with a track record of delivering innovative platform solutions and effectively leading, motivating, and developing high-performing engineering teams.
  • Demonstrated excellence in communication, planning, negotiation, and interpersonal interactions across executives, cross-functional stakeholders, and team members, with a strong ability to influence and drive organizational change.
  • Cloud and hybrid experience: a history of building, deploying, and operating compute in data centers, augmented with cloud-based workloads (ideally GCP).
  • Hands-on lineage: you once built or operated clusters yourself (writing Golang operators, CRDs, and CLI tools), graduated into org leadership, and still dive deep when needed.
  • Experience with on-prem storage technologies and their Kubernetes integrations.
  • CI/CD leadership: experience designing and running pipelines (GitHub Actions, Buildkite, or similar) that build, test, sign, and promote container images at hyperscale velocity.

Responsibilities

  • Architect the development of Groq’s hyperscale compute platform, ensuring scalability, reliability, and security.
  • Plan: define and execute technical roadmaps that advance Groq’s capability to manage large-scale general and specialized compute infrastructure efficiently.
  • Lead highly technical engineering teams focused on container orchestration, hardware provisioning, and platform automation.
  • Build and grow the organization: attract, hire, mentor, and retain top-tier engineers; shape a culture of automation, simplicity, rapid learning and operational excellence.
  • Operate the fleet: own production Kubernetes clusters and storage solutions distributed across several geographic regions, driving SLOs, incident response, and continual improvement.
  • Ship continuously: enforce robust CI/CD (container image scanning, automated integration tests, and progressive rollouts) to keep the platform secure and rapidly evolving.
  • Collaborate globally with data-center and hardware teams to ensure seamless capacity expansions, hardware refreshes, and energy-efficiency initiatives.
  • Advanced low-latency networking: partner closely with our networking team to champion modern data-plane technologies (Cilium/eBPF, BGP-based service routing, advanced load balancing) for low-latency throughput and high security.

Preferred Qualifications

  • Humility – Egos are checked at the door
  • Collaborative & Team Savvy – We make up the smartest person in the room, together
  • Growth & Giver Mindset – Learn-it-all versus know-it-all; we share knowledge generously
  • Curious & Innovative – Take a creative approach to projects, problems, and design
  • Passion, Grit, & Boldness – No-limit thinking that fuels informed risk-taking