Software Engineer – ML System Architecture
Company | ByteDance |
---|---|
Location | Seattle, WA, USA |
Salary | Not provided |
Type | Full-Time |
Degrees | |
Experience Level | Mid Level, Senior |
Requirements
- Be proficient in one or two programming languages, such as C++, Go, Python, or Shell, in a Linux environment
- Understand the principles of distributed systems and have experience in the design, development, and maintenance of large-scale machine learning systems
- Be familiar with Kubernetes architecture and have extensive experience in system-level development and tuning
- Have excellent analytical skills, with the ability to abstract and decompose business logic sensibly
- Have a strong sense of responsibility, the ability to learn quickly, good communication skills, and self-motivation
Responsibilities
- Design and develop machine learning infrastructure for LLM, AIGC, and related workloads
- Build a very large-scale machine learning system integrating GPUs, RDMA networking, and high-performance storage
- Solve technical problems related to system stability and high availability
- Organize and coordinate multiple teams, including the data center, network, computing, storage, and resource teams, to build out the system
Preferred Qualifications
- Familiarity with ML infrastructure for large-model training and inference
- Experience in at least one of the following fields: AI Infrastructure, HW/SW Co-Design, High Performance Computing, ML Hardware Architecture (GPU, Accelerators, Networking)