Solutions Architect – Generative AI Inference and Deployment
Company | NVIDIA |
---|---|
Location | Washington, USA; Texas, USA; Santa Clara, CA, USA; Tennessee, USA; Colorado, USA |
Salary | $148,000 – $235,750 |
Type | Full-Time |
Degrees | Bachelor’s, Master’s, PhD |
Experience Level | Senior |
Requirements
- BS, MS, or PhD in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, another engineering discipline, or a related field (or equivalent experience)
- 5+ years of hands-on experience with Deep Learning frameworks such as PyTorch and TensorFlow
- Strong fundamentals in programming, optimizations, and software design, especially in Python
- Strong problem-solving and debugging skills in GPU orchestration and Multi-Instance GPU (MIG) management within Kubernetes environments
- Experience with containerization and orchestration technologies, monitoring, and observability solutions for AI deployments
- Strong knowledge of the theory and practice of LLM and DL inference
- Excellent presentation, communication, and collaboration skills
Responsibilities
- Partnering with other solution architects and with engineering, product, and business teams to understand their strategies and technical needs and to help define high-value solutions
- Dynamically engaging with developers, scientific researchers, and data scientists, gaining experience across a range of technical areas
- Strategically partnering with lighthouse customers and industry-specific solution partners targeting our computing platform
- Working closely with customers to help them adopt and build creative solutions using NVIDIA technology and MLOps solutions
- Analyzing performance and power efficiency of AI inference workloads on Kubernetes
- Some travel to conferences and customer sites may be required
Preferred Qualifications
- Prior experience with DL training at scale, or with deploying or optimizing DL inference in production
- Experience with NVIDIA GPUs and software libraries such as NVIDIA NIM, Dynamo, TensorRT, and TensorRT-LLM
- Excellent C/C++ programming skills, including debugging, profiling, code optimization, performance analysis, and test design
- Familiarity with parallel programming and distributed computing platforms