Director of Engineering – Compute
| Field | Details |
|---|---|
| Company | Groq |
| Location | Palo Alto, CA, USA |
| Salary | $300,000 – $375,000 |
| Type | Full-Time |
| Degrees | |
| Experience Level | Expert or higher |

Requirements
- 10+ years in large-scale infrastructure engineering, including 3+ years leading teams that run business-critical, globally distributed fleets.
- Proven leadership experience in highly technical engineering environments, with a track record of delivering innovative platform solutions and effectively leading, motivating, and developing high-performing engineering teams.
- Demonstrated excellence in communication, planning, negotiation, and interpersonal interactions across executives, cross-functional stakeholders, and team members, with a strong ability to influence and drive organizational change.
- Cloud & hybrid experience: a history of building, deploying, and operating compute in data centers while augmenting it with cloud-based workloads, ideally on GCP.
- Hands-on lineage: you once built or operated clusters yourself (writing Golang operators, CRDs, and CLI tools), graduated into org leadership, and still dive deep when needed.
- Experience with on-prem storage technologies and their Kubernetes integrations.
- CI/CD leadership: design and run pipelines (GitHub Actions, Buildkite, or similar) that build, test, sign, and promote container images at hyperscale velocity.
Responsibilities
- Architect the development of Groq’s hyperscale compute platform, ensuring scalability, reliability, and security.
- Plan: define and execute technical roadmaps that advance Groq’s capability to manage large-scale general and specialized compute infrastructure efficiently.
- Lead highly technical engineering teams focused on container orchestration, hardware provisioning, and platform automation.
- Build and grow the organization: attract, hire, mentor, and retain top-tier engineers; shape a culture of automation, simplicity, rapid learning, and operational excellence.
- Operate the fleet: own production Kubernetes clusters and storage solutions distributed across several geographic regions, driving SLOs, incident response, and continual improvement.
- Ship continuously: enforce robust CI/CD, including container image scanning, automated integration tests, and progressive rollouts, to keep the platform secure and rapidly evolving.
- Collaborate globally with data-center and hardware teams to ensure seamless capacity expansions, hardware refreshes, and energy-efficiency initiatives.
- Advanced low-latency networking: partner closely with our networking team to champion modern data-plane technologies (Cilium/eBPF, BGP-based service routing, advanced load balancing) for low-latency throughput and high security.
Preferred Qualifications
- Humility – Egos are checked at the door
- Collaborative & Team Savvy – We make up the smartest person in the room, together
- Growth & Giver Mindset – Learn it all versus know it all, we share knowledge generously
- Curious & Innovative – Take a creative approach to projects, problems, and design
- Passion, Grit, & Boldness – No-limit thinking, fueling informed risk-taking