Research Scientist – Foundation Model – Speech & Audio
Company | ByteDance |
---|---|
Location | San Jose, CA, USA |
Salary | Not Provided |
Type | Full-Time |
Degrees | Master’s, PhD |
Experience Level | Mid Level, Senior |
Requirements
- M.S. or Ph.D. in computer science, machine learning, or similar fields.
- At least three years of relevant industry or research experience
- Good knowledge of theoretical and empirical research methods for addressing research problems
- Solid knowledge and experience with at least one popular deep learning framework (e.g., PyTorch, TensorFlow) and familiarity with deep neural network architectures
- Good presentation and communication skills
- Experience with both neural and classical (non-neural) machine learning models and algorithms
Responsibilities
- Contribute cutting-edge research to ByteDance product evolution (e.g., TikTok, CapCut) to impact billions of users worldwide.
- Lead research to advance science and technology in audio processing and generation (e.g., Speech Synthesis, Voice Conversion, Audio Codec Learning, Audio Language Modeling).
- Research, model, design, develop and evaluate novel machine learning models and algorithms.
- Collaborate with globally based researchers and engineering teams in developing machine learning models and algorithms.
Preferred Qualifications
- Expertise in one or more of the following fields: speech synthesis or recognition, natural language processing, computer vision, generative models
- Strong first-author publication record in top AI conferences or journals (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP, NAACL, ICASSP)
- Proficiency in C/C++, Python, and shell scripting, with a deep understanding of data structures and algorithm design.
- Work or internship experience in an AI research organization