Senior High-Performance LLM Training Engineer

Company: NVIDIA
Location: Santa Clara, CA, USA
Salary: $184,000 – $356,500
Type: Full-Time
Degrees: Master’s, PhD
Experience Level: Senior, Expert or higher

Requirements

  • PhD in Computer Science, Electrical Engineering, or Computer Engineering and 5+ years of meaningful work experience; or MS (or equivalent experience) and 8+ years of meaningful work experience.
  • Strong background in deep learning and neural networks, in particular training.
  • A deep background in computer architecture and familiarity with the fundamentals of GPU architecture.
  • Proven experience analyzing and tuning application performance, and in processor- and system-level performance modeling.
  • Programming skills in C++, Python, and CUDA.

Responsibilities

  • Understand, analyze, profile, and optimize AI training workloads on innovative hardware and software platforms.
  • Understand the big picture of training performance on GPUs, prioritizing and then solving problems across all state-of-the-art neural networks.
  • Implement production-quality software in multiple layers of NVIDIA’s deep learning platform stack, from drivers to DL frameworks.
  • Build and support NVIDIA submissions to the MLPerf Training benchmark suite.
  • Implement key DL training workloads in NVIDIA’s proprietary processor and system simulators to enable future architecture studies.
  • Build tools to automate workload analysis, workload optimization, and other critical workflows.
