Software Engineer – Systems ML – Frameworks / Compilers / Kernels
Company | Meta |
---|---|
Location | Toronto, ON, Canada |
Salary | $104,000 – $148,000 |
Type | Full-Time |
Degrees | Bachelor’s, Master’s, PhD |
Experience Level | Junior, Mid Level |
Requirements
- Proven C/C++ programming skills
- Currently has, or is in the process of obtaining, a Bachelor’s degree in Computer Science, Computer Engineering, or a relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta.
- Experience in AI framework development or accelerating deep learning models on hardware architectures.
Responsibilities
- Develop the software stack with one of the following core focus areas: AI frameworks, the compiler stack, or high-performance kernel development and acceleration on next-generation hardware architectures.
- Contribute to the development of the industry-leading PyTorch AI framework’s core compilers to support new state-of-the-art inference and training AI hardware accelerators and optimize their performance (see the sketch after this list).
- Analyze deep learning networks, and design and implement compiler optimization algorithms.
- Collaborate with AI research scientists to accelerate the next generation of deep learning models, such as recommendation systems, generative AI, computer vision, and NLP.
- Tune and optimize the performance of deep learning framework and software components.
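
To give a concrete sense of the framework and compiler work described above, here is a minimal PyTorch 2.x sketch, assuming an arbitrary two-layer model, input shape, and iteration count, that lowers a module through torch.compile with the Inductor backend and compares eager vs. compiled latency:

```python
# Minimal sketch (assumed model, shapes, and iteration count): compile a
# module with the Inductor backend and compare eager vs. compiled latency.
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 1024))
x = torch.randn(64, 1024)

# torch.compile lowers the module through the PyTorch compiler stack
# (TorchDynamo + Inductor), the kind of component this role contributes to.
compiled = torch.compile(model, backend="inductor")

def bench(fn, iters=50):
    fn(x)  # warm-up; for the compiled module this also triggers compilation
    start = time.perf_counter()
    for _ in range(iters):
        fn(x)
    return (time.perf_counter() - start) / iters

print(f"eager:    {bench(model) * 1e3:.2f} ms/iter")
print(f"compiled: {bench(compiled) * 1e3:.2f} ms/iter")
```

The measured gap depends on the model, backend, and hardware; the sketch is only meant to show where the compiler stack sits in the workflow.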
Preferred Qualifications
- A Bachelor’s degree in Computer Science, Computer Engineering, or a relevant technical field and 4+ years of experience in AI framework development or accelerating deep learning models on hardware architectures; OR a Master’s degree in Computer Science, Computer Engineering, or a relevant technical field and 2+ years of such experience; OR a PhD in Computer Science, Computer Engineering, or a relevant technical field.
- Knowledge of GPU, CPU, or AI hardware accelerator architectures.
- Experience working with frameworks such as PyTorch, Caffe2, TensorFlow, ONNX, or TensorRT.
- Experience with CUDA, OpenMP/OpenCL, or AI hardware accelerator kernel programming. Experience accelerating libraries on AI hardware, similar to cuBLAS, cuDNN, CUTLASS, HIP, ROCm, etc.
- Experience with compiler optimizations such as loop optimizations, vectorization, parallelization, and hardware-specific optimizations such as SIMD. Experience with MLIR, LLVM, IREE, XLA, TVM, or Halide is a plus.
- Experience in developing training and inference framework components. Experience in system performance optimization, such as runtime analysis of latency, memory bandwidth, I/O access, and compute utilization, and development of the associated tooling (see the sketch after this list).
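
As an illustration of the kind of latency and compute-utilization analysis mentioned above, here is a minimal sketch, assuming a CUDA-capable GPU and arbitrary FP16 matrix sizes, that times a matmul with CUDA events and derives a rough throughput figure:

```python
# Minimal sketch (assumes a CUDA GPU and arbitrary FP16 matrix sizes):
# time a matmul with CUDA events and estimate achieved throughput.
import torch

assert torch.cuda.is_available()
n = 4096
a = torch.randn(n, n, device="cuda", dtype=torch.float16)
b = torch.randn(n, n, device="cuda", dtype=torch.float16)

torch.matmul(a, b)            # warm-up
torch.cuda.synchronize()

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

iters = 20
start.record()
for _ in range(iters):
    torch.matmul(a, b)
end.record()
torch.cuda.synchronize()

ms = start.elapsed_time(end) / iters          # average latency in milliseconds
tflops = (2 * n ** 3) / (ms * 1e-3) / 1e12    # 2*n^3 FLOPs for a square matmul
print(f"{ms:.3f} ms/iter, ~{tflops:.1f} TFLOP/s")
```

Comparing the achieved figure against the accelerator’s peak throughput gives a first-order estimate of compute utilization.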