Research Scientist Manager – Generative AI – Llama Pre-training
| Company | Meta |
|---|---|
| Location | Menlo Park, CA, USA; New York, NY, USA |
| Salary | $177,000 – $251,000 |
| Type | Full-Time |
| Degrees | PhD |
| Experience Level | Senior |
Requirements
- 5+ years of hands-on experience with large language models, NLP, and Transformer modeling, in both research and engineering settings
- Track record of landing large research and/or product impact in a fast-paced environment
- 3+ years of hands-on experience supporting and leading teams of research scientists and software engineers
- Proven technical vision for where the field of generative AI is headed
- Experience with and knowledge of data curation techniques (training-set preparation, ablation experiments, etc.)
- Experience with cross-functional collaboration with other teams, including non-engineering functions
- Demonstrated experience recruiting, building, structuring, and leading technical organizations, including performance management
Responsibilities
- Drive end-to-end development of large language models, including data sourcing and curation, filtering, experiment design, evaluation, and more
- Drive efficiency gains on training and deployment of LLMs through novel techniques
- Lead a team of applied researchers to democratize Llama for Meta’s users
- Communicate, collaborate, and build relationships with clients and peer teams to facilitate cross-functional projects
- Remain up-to-date on ongoing research and software development activities in the team, help work through technical challenges, and be involved in design decisions
- Remain involved in the research community, both understanding trends and setting them
Preferred Qualifications
- PhD in deep learning, artificial intelligence, or a related technical field
- Experience with ML frameworks such as PyTorch and TensorFlow
- Experience with large-scale data platforms such as Spark and Hive
- Experience working with LLM frameworks such as LangChain
- Experience training LLMs and fine-tuning them on datasets, especially Llama