Machine Learning Ops – MLOps – AI Foundation Models for Design

Company: Autodesk
Location: Boston, MA, USA; Toronto, ON, Canada
Salary: Not provided
Type: Full-Time
Degrees: Bachelor’s, Master’s
Experience Level: Mid Level, Senior

Requirements

  • BSc or MSc in Computer Science or related field, or equivalent industry experience
  • Experience with distributed systems for machine learning and deep learning at scale
  • Strong knowledge of ML infrastructure and model parallelism techniques, including frameworks and tools such as PyTorch, Lightning, Megatron, DeepSpeed, and FSDP
  • Proficiency in Python and strong software engineering practices
  • Experience with cloud services and architectures (AWS, Azure, etc.)
  • Familiarity with version control, CI/CD, and deployment pipelines
  • Excellent writing skills for documenting code, architectures, and experiments

Responsibilities

  • Support AI researchers by building scalable ML training pipelines and infrastructure for foundation model development
  • Design efficient data processing workflows for large-scale design datasets and industry-specific file formats
  • Optimize distributed training systems and develop solutions for model parallelism, checkpointing, and efficient resource management
  • Analyze performance bottlenecks and provide solutions to scaling problems
  • Implement and maintain robust, testable code that is well documented and easy to understand
  • Collaborate on projects at the intersection of research and product with a diverse, global team of researchers and engineers
  • Present results to collaborators and leadership

Preferred Qualifications

  • Experience with AEC data formats (e.g., BIM models, IFC files, CAD files, Drawing Sets)
  • Knowledge of the AEC industry and its specific data processing challenges
  • Experience scaling ML training and data pipelines for large datasets
  • Experience with distributed data processing and ML infrastructure (e.g., Apache Spark, Ray, Docker, Kubernetes)
  • Experience with performance optimization, monitoring, and efficiency in large-scale ML systems
  • Experience with Autodesk or similar products (Revit, SketchUp, Forma)