Research Scientist – Foundation Model – Speech & Audio

Company: ByteDance
Location: San Jose, CA, USA
Salary: Not provided
Type: Full-Time
Degrees: Master’s, PhD
Experience Level: Mid Level, Senior

Requirements

  • M.S. or Ph.D. in computer science, machine learning, or similar fields.
  • At least three years of relevant industry or research experience
  • Good grounding in both theoretical and empirical approaches to research problems
  • Solid knowledge and experience with at least one popular deep learning framework (e.g., PyTorch, TensorFlow) and familiarity with deep neural network architectures
  • Good presentation and communication skills
  • Experience with both neural and classical (non-neural) machine learning models and algorithms

Responsibilities

  • Contribute cutting-edge research to ByteDance product evolution (e.g., TikTok, CapCut) to impact billions of users worldwide.
  • Lead research to advance science and technology in audio processing and generation (e.g., Speech Synthesis, Voice Conversion, Audio Codec Learning, Audio Language Modeling)
  • Research, model, design, develop and evaluate novel machine learning models and algorithms.
  • Collaborate with globally based researchers and engineering teams in developing machine learning models and algorithms.

Preferred Qualifications

  • Expertise in one or more of the following fields: speech synthesis or recognition, natural language processing, computer vision, generative models
  • Strong first-author publication record in top AI conferences or journals (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP, NAACL, ICASSP)
  • Proficient in C/C++, Python, and shell scripting, with a deep understanding of data structures and algorithm design.
  • Work or internship experience in an AI research organization