Machine Learning Software Engineer - Promo Media & Marketing

5+ years of proven experience in software engineering for ML projects and products, with strong proficiency in Python
BSc in Computer Science, Electrical Engineering, or a related technical field
Familiarity with ML libraries such as PyTorch, experience with ML/CV/GenAI pipelines, and knowledge of distributed data processing systems
Experience with ML workflow orchestration tools like Metaflow, Airflow, or similar pipeline management systems
Proficiency with cloud infrastructure, including S3 and Docker containers
Basic understanding of the CUDA runtime and of (multi-)GPU training and inference, sufficient for debugging and performance assessment
Demonstrated ability to effectively communicate technical requirements and influence infrastructure decisions across team boundaries
Track record of mentoring engineers on ML engineering best practices while balancing technical debt reduction with feature delivery
Excellent communication and interpersonal skills, with a strong ability to navigate ambiguity
Collaborative and able to thrive in fast-paced, dynamic environments, contributing positively to the team and company culture
The Netflix culture resonates with you.

Architect and implement robust ML pipelines and reusable frameworks across the full ML lifecycle in the multimedia domain, including data processing, efficient distributed model training on GPUs, and model deployment into creator workflows and production systems
Personally develop critical components while providing technical leadership and mentorship to team members implementing other parts of the system
Collaborate cross-functionally with ML scientists, product managers, and engineers to define and prioritize system requirements
Partner with our ML platform team to translate technical needs into infrastructure requirements, advocating for platform capabilities that enhance reliability, boost team productivity, and support our long-term vision for innovation at scale
Optimize systems and workflows for latency, throughput, and reliability across both batch processing jobs and real-time services to meet SLA requirements
Strategically identify and address technical debt to improve development velocity, while implementing tools and processes that enhance team productivity and code quality

Experience with GPU training and inference optimization, performance tuning, and debugging
Experience building end-to-end multimedia systems and computer vision algorithms
Familiarity with generative models and tools, such as diffusion-based models