Fundamental AI Research Scientist – Multimodal Audio – Speech, Sound and Music – FAIR

Company: Meta
Location: Boston, MA, USA; Seattle, WA, USA; Menlo Park, CA, USA; New York, NY, USA
Salary: $147,000 – $208,000
Type: Full-Time
Degrees: Bachelor’s, PhD
Experience Level: Senior, Expert or higher

Requirements

  • Bachelor’s degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
  • PhD degree in AI, computer science, data science, or related technical fields, or equivalent practical experience.
  • 2+ years of experience holding an industry, faculty, academic, or government researcher position.
  • Research publications reflecting experience in related research fields: audio (speech, sound, or music) generation, text-to-speech (TTS) synthesis, text-to-music generation, text-to-sound generation, speech recognition, speech/audio representation learning, vision perception, image/video generation, video-to-audio generation, audio-visual learning, audio language models, lip sync, lip movement generation/correction, lip reading, etc.
  • Familiarity with one or more deep learning frameworks (e.g., PyTorch, TensorFlow).
  • Experience with the Python programming language.
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment.

Responsibilities

  • Develop algorithms based on state-of-the-art machine learning and neural network methodologies.
  • Perform research to advance the science and technology of intelligent machines.
  • Conduct research that enables learning the semantics of data across multiple modalities (audio, speech, images, video, text, and other modalities).
  • Work towards long-term ambitious research goals, while identifying intermediate milestones.
  • Design and implement models and algorithms.
  • Work with large datasets; train, tune, and scale models; create benchmarks to evaluate performance; and open-source and publish the results.

Preferred Qualifications

  • First-author publications at peer-reviewed conferences such as ICML, NeurIPS, ICLR, ICASSP, Interspeech, ACL, EMNLP, CVPR, and similar venues.
  • Research and engineering experience demonstrated via publications, grants, fellowships, patents, internships, work experience, open-source code, and/or coding competitions.
  • Experience solving complex problems and comparing alternative solutions, trade-offs, and diverse points of view.
  • Experience working and communicating cross-functionally in a team environment.
  • Experience presenting research findings to public audiences of peers.