Fundamental AI Research Scientist – Multimodal Audio – Speech, Sound and Music – FAIR

Company: Meta
Location: Boston, MA, USA; Seattle, WA, USA; Menlo Park, CA, USA; New York, NY, USA
Salary: $147,000 – $208,000
Type: Full-Time
Degrees: Bachelor’s, PhD
Experience Level: Senior, Expert or higher

Requirements

  • Bachelor’s degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
  • PhD degree in AI, computer science, data science, or related technical fields, or equivalent practical experience.
  • 2+ years of experience holding an industry, faculty, academic, or government researcher position.
  • Research publications reflecting experience in related research fields: audio (speech, sound, or music) generation, text-to-speech (TTS) synthesis, text-to-music generation, text-to-sound generation, speech recognition, speech/audio representation learning, vision perception, image/video generation, video-to-audio generation, audio-visual learning, audio language models, lip sync, lip movement generation/correction, lip reading, etc.
  • Familiarity with one or more deep learning frameworks (e.g., PyTorch, TensorFlow).
  • Experience with the Python programming language.
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment.

Responsibilities

  • Develop algorithms based on state-of-the-art machine learning and neural network methodologies.
  • Perform research to advance the science and technology of intelligent machines.
  • Conduct research that enables learning the semantics of data across multiple modalities (audio, speech, images, video, text, and other modalities).
  • Work towards long-term ambitious research goals, while identifying intermediate milestones.
  • Design and implement models and algorithms.
  • Work with large datasets; train, tune, and scale models; create benchmarks to evaluate performance; and open-source and publish the results.

Preferred Qualifications

  • First-author publications at peer-reviewed conferences such as ICML, NeurIPS, ICLR, ICASSP, Interspeech, ACL, EMNLP, CVPR, and similar venues.
  • Research and engineering experience demonstrated via publications, grants, fellowships, patents, internships, work experience, open-source code, and/or coding competitions.
  • Experience solving complex problems and comparing alternative solutions, trade-offs, and diverse points of view.
  • Experience working and communicating cross-functionally in a team environment.
  • Experience presenting research findings to public audiences of peers.