Senior Deep Learning Software Engineer – LLM Performance

Company: NVIDIA
Location: Santa Clara, CA, USA
Salary: $184,000 – $356,500
Type: Full-Time
Degrees: Bachelor’s, Master’s, PhD
Experience Level: Senior, Expert or higher

Requirements

  • Bachelor’s, Master’s, or PhD degree, or equivalent experience, in a relevant field (Computer Engineering, Computer Science, EECS, AI)
  • At least 8 years of relevant software development experience
  • Excellent Python/C/C++ programming, software design, and software engineering skills
  • Experience with a DL framework such as PyTorch, JAX, or TensorFlow

Responsibilities

  • Optimize, analyze, and tune the performance of LLM, VLM, and GenAI models for DL inference, serving, and deployment in NVIDIA/OSS LLM frameworks.
  • Scale LLM performance across different architectures and types of NVIDIA accelerators.
  • Tune performance for maximum throughput, minimum latency, and throughput under latency constraints.
  • Contribute features and code to NVIDIA/OSS LLM frameworks, inference benchmarking frameworks, TensorRT, and Triton.
  • Work with cross-functional teams across generative AI, automotive, image understanding, and speech understanding to develop innovative solutions.

Preferred Qualifications

  • Prior experience with an LLM framework or a DL compiler in inference, deployment, algorithms, or implementation
  • Prior experience with performance modeling, profiling, debugging, and code optimization of a DL/HPC/high-performance application
  • Architectural knowledge of CPU and GPU
  • GPU programming experience (CUDA or OpenCL)