Inference Runtime Systems – Software Engineer – Staff

| Company | d-Matrix |
|---|---|
| Location | Santa Clara, CA, USA |
| Salary | $142,500 – $230,000 |
| Type | Full-Time |
| Degrees | Bachelor’s, Master’s |
| Experience Level | Senior, Expert or higher |

Requirements
- Bachelor’s degree with 6+ years of professional experience in software development with a focus on C++
- Master’s degree (preferred) in Computer Science, Engineering, or a related field with 3+ years of professional experience in software development with a focus on C++
- Experience in architecting and building complex software systems
- Experience with distributed systems or high-performance computing (HPC) applications
- Familiarity with PyTorch internals or similar machine learning frameworks
- Strong proficiency in modern C++ (C++11 and above) and Python
- Solid understanding of software design patterns and best practices
- Experience with parallel and concurrent programming
- Proficient in CMake, Pytest, and other development tools
- Proficient in using development tools and frameworks for building and deploying large-scale applications
- Excellent problem-solving and analytical skills
- Strong communication and interpersonal abilities
Responsibilities
- Lead the design and implementation of a high-performance inference runtime that leverages d-Matrix’s advanced hardware capabilities
- Integrate the inference runtime with PyTorch to enable upstream software capabilities such as inference and fine-tuning
- Work closely with cross-functional teams including hardware engineers, data scientists, and product managers to define requirements and deliver integrated solutions
- Develop and implement optimization techniques to ensure low latency and high throughput in distributed and HPC environments
- Ensure code quality and performance through rigorous testing and code reviews
- Create technical documentation to support development, deployment, and maintenance activities
Preferred Qualifications
- Master’s degree in Computer Science, Engineering, or a related field
- Knowledge of GPU programming and acceleration techniques is a plus