Senior Machine Learning Ops Engineer

Company: Prenuvo
Location: Vancouver, BC, Canada
Salary: $131,000 – $197,000
Type: Full-Time
Degrees: Bachelor’s, Master’s
Experience Level: Senior

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
  • At least 5 years of experience in software engineering, MLOps, or ML infrastructure roles.
  • Strong proficiency in Python and relevant ML engineering tooling for dependency management, packaging, testing, and deployment (e.g., Poetry, Pytest, Pylint).
  • Hands-on experience with ML workflow orchestration and experiment-tracking tools (e.g., MLflow, Kubeflow, Airflow, SageMaker, Weights & Biases).
  • Expertise in designing and managing CI/CD pipelines for ML applications using GitHub Actions, Jenkins, or similar tools.
  • Experience with cloud-based ML infrastructure (e.g., AWS, GCP, Azure) and containerized deployments using Docker and Kubernetes.
  • A strong sense of ownership, quality, and engineering best practices in ML production environments.

Responsibilities

  • Design, implement, and optimize scalable MLOps infrastructure to support data ingestion, model training, evaluation, and inference at scale.
  • Develop and maintain CI/CD pipelines for automating ML workflows, including training, validation, and deployment of ML models.
  • Build robust containerization and orchestration strategies for ML artifacts and services using Docker and Kubernetes.
  • Automate monitoring, logging, and alerting for ML models in production to ensure reliability and performance.
  • Establish and enforce best practices for ML model versioning, governance, and reproducibility using tools such as MLflow or Kubeflow.
  • Collaborate with data scientists, ML engineers, and DevOps teams to streamline the transition of ML models from research to production.
  • Contribute to regulatory documentation and compliance processes (e.g., FDA, IDE) to support ML model deployment in regulated environments.

Preferred Qualifications

  • Familiarity with deep learning frameworks like TensorFlow and PyTorch, particularly in the context of deployment and optimization.
  • Experience with medical imaging applications and regulatory compliance requirements.
  • Knowledge of microservices and API development with technologies such as FastAPI, REST, and gRPC.
  • Understanding of distributed computing frameworks such as Ray or Spark for ML scaling.