Applied Researcher II

Company: Capital One
Location: Cambridge, MA, USA; San Francisco, CA, USA; McLean, VA, USA; New York, NY, USA
Salary: $257,300 – $320,400
Type: Full-Time
Degrees: Master’s, PhD
Experience Level: Mid Level, Senior

Requirements

  • Currently has, or is in the process of obtaining, a PhD (with the expectation that the required degree will be obtained on or before the scheduled start date) plus at least 2 years of experience in Applied Research, or an M.S. plus at least 4 years of experience in Applied Research
  • Has a deep understanding of the foundations of AI methodologies
  • Experience building large deep learning models, whether on language, images, events, or graphs, as well as expertise in one or more of the following: training optimization, self-supervised learning, robustness, explainability, RLHF
  • An engineering mindset as shown by a track record of delivering models at scale both in terms of training data and inference volumes
  • Experience in delivering libraries, platform level code or solution level code to existing products
  • A professional with a track record of coming up with new ideas or improving upon existing ideas in machine learning, demonstrated by accomplishments such as first author publications or projects
  • Possesses the ability to own and pursue a research agenda, including choosing impactful research problems and autonomously carrying out long-running projects

Responsibilities

  • Partner with a cross-functional team of data scientists, software engineers, machine learning engineers and product managers to deliver AI-powered products that change how customers interact with their money
  • Leverage a broad stack of technologies (PyTorch, AWS UltraClusters, Hugging Face, Lightning, vector databases, and more) to reveal the insights hidden within huge volumes of numeric and textual data
  • Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation
  • Engage in high impact applied research to take the latest AI developments and push them into the next generation of customer experiences
  • Flex your interpersonal skills to translate the complexity of your work into tangible business goals

Preferred Qualifications

  • PhD in Computer Science, Machine Learning, Computer Engineering, Applied Mathematics, Electrical Engineering or related fields
  • PhD with a focus on NLP, or Master’s with 5 years of industry NLP research experience
  • Multiple publications on topics related to the pre-training of large language models (e.g. technical reports of pre-trained LLMs, SSL techniques, model pre-training optimization)
  • Member of a team that has trained a large language model from scratch (10B+ parameters, 500B+ tokens)
  • Publications in deep learning theory
  • Publications at ACL, NAACL, EMNLP, NeurIPS, ICML, or ICLR
  • PhD focused on topics related to optimizing training of very large deep learning models
  • Multiple years of experience and/or publications on one of the following topics: Model Sparsification, Quantization, Training Parallelism/Partitioning Design, Gradient Checkpointing, Model Compression
  • Experience optimizing training for a 10B+ model
  • Deep knowledge of deep learning algorithm and/or optimizer design
  • Experience with compiler design
  • PhD focused on topics related to guiding LLMs with further tasks (Supervised Finetuning, Instruction-Tuning, Dialogue-Finetuning, Parameter Tuning)
  • Demonstrated knowledge of principles of transfer learning, model adaptation and model guidance
  • Experience deploying a fine-tuned large language model