Machine Learning Ops – MLOps – AI Foundation Models for Design
Company | Autodesk |
---|---|
Location | Boston, MA, USA; Toronto, ON, Canada |
Salary | Not provided |
Type | Full-Time |
Degrees | Bachelor’s, Master’s |
Experience Level | Mid Level, Senior |
Requirements
- BSc or MSc in Computer Science or related field, or equivalent industry experience
- Experience with distributed systems for machine learning and deep learning at scale
- Strong knowledge of ML infrastructure and model parallelism techniques, including frameworks and tools such as PyTorch, Lightning, Megatron, DeepSpeed, and FSDP
- Proficiency in Python and strong software engineering practices
- Experience with cloud services and architectures (AWS, Azure, etc.)
- Familiarity with version control, CI/CD, and deployment pipelines
- Excellent writing skills for documenting code, architectures, and experiments
Responsibilities
- Support AI researchers by building scalable ML training pipelines and infrastructure for foundation model development
- Design efficient data processing workflows for large-scale design datasets and industry-specific file formats
- Optimize distributed training systems and develop solutions for model parallelism, checkpointing, and efficient resource management
- Analyze performance bottlenecks and provide solutions to scaling problems
- Implement and maintain robust, testable code that is well documented and easy to understand
- Collaborate on projects at the intersection of research and product with a diverse, global team of researchers and engineers
- Present results to collaborators and leadership
Preferred Qualifications
- Experience with AEC data formats (e.g., BIM models, IFC files, CAD files, drawing sets)
- Knowledge of the AEC industry and its specific data processing challenges
- Experience scaling ML training and data pipelines for large datasets
- Experience with distributed data processing and ML infrastructure (e.g., Apache Spark, Ray, Docker, Kubernetes)
- Experience with performance optimization, monitoring, and efficiency in large-scale ML systems
- Experience with Autodesk or similar products (Revit, SketchUp, Forma)