Inference Runtime Systems – Software Engineer – Staff

Company: d-Matrix
Location: Santa Clara, CA, USA
Salary: $142,500 – $230,000
Type: Full-Time
Degrees: Bachelor’s, Master’s
Experience Level: Senior, Expert or higher

Requirements

  • Bachelor’s degree with 6+ years of professional experience in software development with a focus on C++
  • Master’s degree (preferred) in computer science, engineering, or a related field, with 3+ years of professional experience in software development with a focus on C++
  • Experience in architecting and building complex software systems
  • Experience with distributed systems or high-performance computing (HPC) applications
  • Familiarity with PyTorch internals or similar machine learning frameworks
  • Strong proficiency in modern C++ (C++11 and above) and Python
  • Solid understanding of software design patterns and best practices
  • Experience with parallel and concurrent programming
  • Proficient in CMake, Pytest, and other development tools
  • Proficient in using development tools and frameworks for building and deploying large-scale applications
  • Excellent problem-solving and analytical skills
  • Strong communication and interpersonal abilities

Responsibilities

  • Lead the design and implementation of a high-performance inference runtime that leverages d-Matrix’s advanced hardware capabilities
  • Integrate the inference runtime with PyTorch to enable upstream software capabilities such as inference and fine-tuning
  • Work closely with cross-functional teams including hardware engineers, data scientists, and product managers to define requirements and deliver integrated solutions
  • Develop and implement optimization techniques to ensure low latency and high throughput in distributed and HPC environments
  • Ensure code quality and performance through rigorous testing and code reviews
  • Create technical documentation to support development, deployment, and maintenance activities

Preferred Qualifications

  • Master’s degree in computer science, engineering, or a related field
  • Knowledge of GPU programming and acceleration techniques is a plus