ML Acceleration / Framework Engineer – Distributed Training & Inference – AWS Neuron – Annapurna Labs

Company: Amazon
Location: Seattle, WA, USA; Cupertino, CA, USA
Salary: $99,500 – $200,000
Type: Full-Time
Degrees: Bachelor’s, Master’s
Experience Level: Entry Level/New Grad

Requirements

  • Bachelor’s or Master’s degree earned between December 2022 and September 2025
  • Working knowledge of C++ and Python
  • Experience with ML frameworks, particularly PyTorch, JAX, and/or vLLM
  • Understanding of parallel computing concepts and CUDA programming

Responsibilities

  • Improve PyTorch and JAX for distributed training on Trainium chips
  • Optimize ML models for efficient inference on Inferentia processors
  • Collaborate with compiler and runtime teams to maximize hardware performance
  • Develop and integrate new features in ML frameworks to support AWS AI services

Preferred Qualifications

  • Open source contributions to ML frameworks or tools
  • Experience optimizing ML workloads for performance
  • Direct experience with PyTorch internals or CUDA optimization
  • Hands-on experience with LLM infrastructure tools (e.g., vLLM, TensorRT)