ML Acceleration / Framework Engineer – Distributed Training & Inference – AWS Neuron – Annapurna Labs
Company | Amazon |
---|---|
Location | Seattle, WA, USA; Cupertino, CA, USA |
Salary | $99,500 – $200,000 |
Type | Full-Time |
Degrees | Bachelor’s, Master’s |
Experience Level | Entry Level/New Grad |
Requirements
- Bachelor's or Master's degree earned between December 2022 and September 2025
- Working knowledge of C++ and Python
- Experience with ML frameworks, particularly PyTorch, JAX, and/or vLLM
- Understanding of parallel computing concepts and CUDA programming
Responsibilities
- Improve PyTorch and JAX for distributed training on Trainium chips
- Optimize ML models for efficient inference on Inferentia processors
- Collaborate with compiler and runtime teams to maximize hardware performance
- Develop and integrate new features in ML frameworks to support AWS AI services
Preferred Qualifications
- Open source contributions to ML frameworks or tools
- Experience optimizing ML workloads for performance
- Direct experience with PyTorch internals or CUDA optimization
- Hands-on experience with LLM infrastructure tools (e.g., vLLM, TensorRT)