Research Infrastructure Engineer – Post-Training

Company: OpenAI
Location: San Francisco, CA, USA
Salary: $310,000 – $460,000
Type: Full-Time
Experience Level: Mid Level, Senior

Requirements

  • Strong technical background in data technologies
  • Deep expertise in ML system optimization, distributed systems, or full-stack application development for internal tools
  • Ability to analyze and troubleshoot complex system issues
  • Experience collaborating with ML researchers in an applied setting is highly valued

Responsibilities

  • Ensure that the systems that power ChatGPT training and development run smoothly
  • Dive into large ML codebases to understand and debug systems issues
  • Work with researchers to build tools for data management, model configuration, evaluation, and more
  • Create reusable Python libraries with great abstractions usable across ML projects

Preferred Qualifications

  • Experience working in complex technical environments
  • Experience debugging ML systems
  • Experience with reinforcement learning and/or transformers
  • Experience with Python
  • Experience with Kubernetes / distributed infrastructure
  • Experience with GPUs
  • Experience with one or more large-scale data systems such as Beam or Spark