Applied Researcher II
Company | Capital One |
---|---|
Location | Cambridge, MA, USA; San Francisco, CA, USA; McLean, VA, USA; New York, NY, USA |
Salary | $257,300 – $320,400 |
Type | Full-Time |
Degrees | Master’s, PhD |
Experience Level | Mid Level, Senior |
Requirements
- Currently has, or is in the process of obtaining, a PhD (with the expectation that the required degree will be obtained on or before the scheduled start date) plus at least 2 years of experience in Applied Research; or an M.S. plus at least 4 years of experience in Applied Research
- Has a deep understanding of the foundations of AI methodologies
- Experience building large deep learning models, whether on language, images, events, or graphs, as well as expertise in one or more of the following: training optimization, self-supervised learning, robustness, explainability, RLHF
- An engineering mindset as shown by a track record of delivering models at scale both in terms of training data and inference volumes
- Experience delivering libraries, platform-level code, or solution-level code to existing products
- A professional with a track record of coming up with new ideas or improving upon existing ideas in machine learning, demonstrated by accomplishments such as first author publications or projects
- Possesses the ability to own and pursue a research agenda, including choosing impactful research problems and autonomously carrying out long-running projects
Responsibilities
- Partner with a cross-functional team of data scientists, software engineers, machine learning engineers and product managers to deliver AI-powered products that change how customers interact with their money
- Leverage a broad stack of technologies (PyTorch, AWS UltraClusters, Hugging Face, Lightning, VectorDBs, and more) to reveal the insights hidden within huge volumes of numeric and textual data
- Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation
- Engage in high-impact applied research to take the latest AI developments and push them into the next generation of customer experiences
- Flex your interpersonal skills to translate the complexity of your work into tangible business goals
Preferred Qualifications
- PhD in Computer Science, Machine Learning, Computer Engineering, Applied Mathematics, Electrical Engineering or related fields
- PhD focus on NLP or Master’s with 5 years of industry NLP research experience
- Multiple publications on topics related to the pre-training of large language models (e.g. technical reports of pre-trained LLMs, SSL techniques, model pre-training optimization)
- Member of a team that has trained a large language model from scratch (10B+ parameters, 500B+ tokens)
- Publications in deep learning theory
- Publications at ACL, NAACL, EMNLP, NeurIPS, ICML, or ICLR
- PhD focused on topics related to optimizing training of very large deep learning models
- Multiple years of experience and/or publications on one of the following topics: Model Sparsification, Quantization, Training Parallelism/Partitioning Design, Gradient Checkpointing, Model Compression
- Experience optimizing training for a 10B+ model
- Deep knowledge of deep learning algorithm and/or optimizer design
- Experience with compiler design
- PhD focused on topics related to guiding LLMs with further tasks (Supervised Finetuning, Instruction-Tuning, Dialogue-Finetuning, Parameter Tuning)
- Demonstrated knowledge of principles of transfer learning, model adaptation and model guidance
- Experience deploying a fine-tuned large language model