Software Engineer – Inference

Company: ByteDance
Location: San Jose, CA, USA
Salary: Not provided
Type: Full-Time
Degrees: Bachelor’s
Experience Level: Junior, Mid Level

Requirements

  • Bachelor’s degree or above in computer science, electronics, automation, software engineering, or a related field.
  • Proficient in C/C++, algorithms, and data structures; familiar with Python.
  • Understanding of the basic principles of deep learning algorithms and the basic architecture of neural networks, and an understanding of deep learning training frameworks such as PyTorch.

Responsibilities

  • Develop and optimize the LLM inference framework.
  • Optimize GPU and CUDA performance to build an industry-leading, high-performance LLM inference engine.

Preferred Qualifications

  • Proficient in GPU high-performance computing optimization with CUDA; in-depth understanding of computer architecture; familiar with parallel computing optimization, memory access optimization, low-bit computing, etc.
  • Familiar with TensorRT-LLM, Orca, vLLM, etc.
  • Knowledge of LLMs; experience accelerating and optimizing LLMs is preferred.