Research Engineer Intern – Doubao – Seed – Machine Learning System – 2025 Summer – MS
Company | ByteDance |
---|---|
Location | Seattle, WA, USA |
Salary | Not provided |
Type | Internship |
Degrees | Master’s |
Experience Level | Internship |
Requirements
- Currently pursuing an MS in Software Development, Computer Science, Computer Engineering, or a related technical discipline.
- Familiar with machine learning algorithms and platforms.
- Published papers at top conferences.
- Familiar with C/C++ and Python development in Linux environments.
- Familiar with at least one deep learning framework (TensorFlow, PyTorch, MXNet, or other).
- Ability to work independently and complete projects from beginning to end in a timely manner.
- Good communication and teamwork skills, including the ability to explain technical concepts clearly to teammates.
- Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment.
Responsibilities
- Research and develop efficient machine learning systems, including efficient optimizers and parameter- and gradient-efficient training with rank reduction and communication compression.
- Develop a state-of-the-art asynchronous training framework that ensures convergence.
- Implement both general-purpose training framework features and model-specific optimizations (e.g., for LLMs and diffusion models).
- Improve efficiency and stability for extremely large-scale distributed training jobs.
Preferred Qualifications
- GPU-based high-performance computing and RDMA high-performance networking (MPI, NCCL, ibverbs).
- Optimization of distributed training frameworks such as DeepSpeed, FSDP, Megatron, and GSPMD.
- AI compiler stacks such as torch.fx, XLA and MLIR.
- Large-scale data processing and parallel computing.
- Experience designing and operating large-scale systems in cloud computing or machine learning.
- In-depth experience with CUDA programming and performance tuning (CUTLASS, Triton).