Senior Performance Software Engineer – Deep Learning Libraries
Company | NVIDIA |
---|---|
Location | Austin, TX, USA; Redmond, WA, USA; Santa Clara, CA, USA; Durham, NC, USA; Hillsboro, OR, USA |
Salary | $184,000 – $425,500 |
Type | Full-Time |
Degrees | Master’s, PhD |
Experience Level | Senior |
Requirements
- Master’s or PhD degree, or equivalent experience, in Computer Science, Computer Engineering, Applied Math, or a related field
- 6+ years of relevant industry experience
- Demonstrated strong C++ programming and software design skills, including debugging, performance analysis, and test design
- Experience with performance-oriented parallel programming, even if it’s not on GPUs (e.g. with OpenMP or pthreads)
- Solid understanding of computer architecture and some experience with assembly programming
Responsibilities
- Writing highly tuned compute kernels, mostly in C++ CUDA, to perform core deep learning operations (e.g. matrix multiplies, convolutions, normalizations)
- Following general software engineering best practices including support for regression testing and CI/CD flows
- Collaborating with teams across NVIDIA:
  - CUDA compiler team on generating optimal assembly code
  - Deep learning training and inference performance teams on which layers require optimization
  - Hardware and architecture teams on the programming model for new deep learning hardware features
Preferred Qualifications
- Tuning BLAS or deep learning library kernel code
- CUDA/OpenCL GPU programming
- Numerical methods and linear algebra
- LLVM, TVM tensor expressions, or TensorFlow MLIR