Software Engineer - GPU Inference

Deep expertise in model performance optimization, particularly at the inference layer
Strong background in kernel-level systems, data movement, and low-level performance tuning
Ability to navigate ambiguity, set technical direction, and drive complex initiatives to completion

Carry out engineering work to improve model serving, inference performance, and system efficiency
Drive kernel-level and data-movement optimizations to improve system throughput and reliability
Partner closely with research and product teams to ensure our models perform effectively at scale
Design, build, and improve critical serving infrastructure to support Sora’s growth and reliability needs