Staff Software Engineer
Company | Liftoff |
---|---|
Location | Los Angeles, CA, USA; San Carlos, CA, USA; New York, NY, USA; Remote in USA; Remote in Canada |
Salary | $150,000 – $230,000 |
Type | Full-Time |
Degrees | Bachelor’s, Master’s, PhD |
Experience Level | Senior, Expert or higher |
Requirements
- BS in Computer Science with 8+ years of professional experience; or
- MS in Computer Science with 6+ years of professional experience; or
- PhD with 3+ years of professional experience in software engineering or reliability engineering, with a focus on production systems.
- Proven ability to drive large technical initiatives and lead projects spanning multiple teams.
- Solid core CS fundamentals (data structures, algorithms, system architecture).
- Deep expertise in Python and/or Go; fluency with ML libraries (e.g., TensorFlow, PyTorch) and cloud infrastructure (e.g., AWS).
- Experience with monitoring tools for ML systems (e.g., Prometheus, Grafana).
- Experience with big data engines such as Trino and Spark is a strong plus.
- Strong problem-solving skills and the ability to work collaboratively across teams.
- Ability to lead across team and role boundaries to effect large-scale change in culture and systems.
Responsibilities
- Lead the design and evolution of large-scale ML infrastructure, driving improvements in availability, reliability, and operational excellence for our production ML systems.
- Define and implement end-to-end monitoring, alerting, and performance tracking for ML models and data pipelines, ensuring model health and data integrity at scale.
- Partner with data scientists and platform teams to standardize and scale model deployment, versioning, and A/B experimentation frameworks.
- Lead and participate in incident response efforts, conducting root cause analysis and implementing corrective actions to prevent recurrence.
- Identify systemic inefficiencies and opportunities for automation or simplification, and drive cross-functional efforts to improve system performance and developer productivity.
- Drive adoption of best practices in software and ML engineering, including code quality, risk-driven testing, and explainable, maintainable systems.
- Act as a mentor and multiplier, helping other engineers level up in ML systems, reliability, and architectural thinking.
- Contribute to strategic planning and partner with product and platform leads to align engineering efforts with business outcomes.
Preferred Qualifications
- Experience with ML systems for training Transformer models and/or click-through rate (CTR) prediction models.
- Prior experience in AdTech, mobile growth, or performance marketing domains.
- Contributions to open-source ML infrastructure or tools.