Principal Machine Learning Engineer – ML Inference Platform
Company | Snap |
---|---|
Location | Palo Alto, CA, USA; Seattle, WA, USA; Los Angeles, CA, USA; Bellevue, WA, USA |
Salary | $235,000 – $414,000 |
Type | Full-Time |
Degrees | Bachelor’s, Master’s, PhD |
Experience Level | Expert or higher |
Requirements
- Strong understanding of machine learning approaches and algorithms
- Excellent programming and software design skills, including debugging, performance analysis, and test design
- Proven track record of operating highly available systems at scale
- Ability to proactively learn new concepts and technology and apply them at work
- Skilled at solving ambiguous problems
- Strong collaboration and mentorship skills
- BS in a technical field such as computer science, mathematics, or statistics, or equivalent years of experience
- 9+ years of post-Bachelor’s machine learning experience; or a Master’s degree in a technical field + 8+ years of post-grad ML experience; or a PhD in a related technical field + 5+ years of post-grad ML experience
- 2+ years of experience as a technical lead
- Experience with GPU/TPU inference and optimizations
Responsibilities
- Design, implement, and scale critical machine learning components and services to support Snap’s most strategic initiatives
- Design and build a next-generation inference framework and services that support large-scale, high-throughput model serving, enabling us to push the limits of what’s possible with machine learning
- Perform model and inference optimization with various GPUs to improve model inference speed and efficiency
- Work across teams to understand product requirements, evaluate trade-offs, and deliver the solutions needed to build innovative products or services
- Advocate for and apply best practices when it comes to availability, scalability, operational excellence, and cost management
- Provide technical direction that influences the entire company
Preferred Qualifications
- Master’s/PhD in a technical field such as computer science
- Experience leading teams and driving technical roadmaps
- Experience working with machine learning, recommendation and ranking systems, or vector similarity search
- Experience with TensorFlow, PyTorch, or related deep learning frameworks
- Experience with Docker, Kubernetes, Ray, NoSQL solutions, Memcache/Redis, Google/AWS services
- Experience with MLOps and managing the production machine learning lifecycle