Software Engineer – Frontier Systems
| Company | OpenAI |
|---|---|
| Location | San Francisco, CA, USA |
| Salary | $295,000 – $440,000 |
| Type | Full-Time |
| Degrees | |
| Experience Level | Senior, Expert or higher |
Requirements
- 7+ years of industry experience in software engineering
- Proficiency with Python and shell scripting
- A high degree of comfort digging into noisy data with SQL, PromQL, pandas, or any other tool necessary
- Experience developing reproducible analyses
- A balance of strengths in building systems and operationalizing them
Responsibilities
- Own and improve the system health checks that keep our hyperscale supercomputers stable during model training.
- Lead deep dives into hardware failures and system-level bugs to understand how things break at scale.
- Build automation that monitors and fixes issues across thousands of machines, so researchers can keep moving without interruption.
Preferred Qualifications
- Experience with low-level details of hardware components, protocols, and associated Linux tooling (e.g., PCIe, InfiniBand, networking, power management, kernel perf tuning)
- Experience with visualization of large data centers and networks
- Expertise with network operations and tooling
- Expertise with power management and stabilization