Research Engineer - Post-Training Evals

Have a deep understanding of RL or LLMs.
Have a working knowledge of relevant models and of building evaluations that drive model capability improvement.
Are comfortable diving into a large ML codebase to debug.

Investigate the relationship between evaluation metrics and model behavior: Design and execute studies to understand how various metrics correlate with user satisfaction, task success, and broader model capabilities.
Develop novel evaluation methodologies: Create experimental frameworks and evaluation methods to measure complex model attributes, including alignment, robustness, and generalization.
Collaborate cross-functionally: Work with data scientists to analyze user interactions, and partner with research teams to implement evaluations and inform model development.

Thrive in a dynamic and technically complex environment.
Are excited to explore the capabilities and limitations of cutting-edge AI models.