Software Engineer – Inference
Company | ByteDance |
---|---|
Location | San Jose, CA, USA |
Salary | Not provided |
Type | Full-Time |
Degrees | Bachelor’s |
Experience Level | Junior, Mid Level |
Requirements
- Bachelor’s degree or above in computer science, electronics, automation, software engineering, or a related field.
- Proficient in C/C++, algorithms, and data structures; familiar with Python.
- Understanding of the basic principles of deep learning algorithms, familiarity with basic neural network architectures, and an understanding of deep learning training frameworks such as PyTorch.
Responsibilities
- Develop and optimize an LLM inference framework.
- Optimize GPU and CUDA performance to build an industry-leading, high-performance LLM inference engine.
Preferred Qualifications
- Proficient in high-performance GPU computing optimization with CUDA; in-depth understanding of computer architecture; familiar with parallel computing optimization, memory access optimization, low-bit computation, etc.
- Familiar with TensorRT-LLM, Orca, vLLM, etc.
- Knowledge of LLMs; experience optimizing and accelerating LLMs is preferred.