Global LLM Data – LLM Training Operation Coding Analyst
Company | ByteDance |
---|---|
Location | Los Angeles, CA, USA |
Salary | Not Provided |
Type | Full-Time |
Degrees | |
Experience Level | Mid Level, Senior |
Requirements
- 3+ years of experience in project or operations management roles.
- Experience with programming languages such as Python, Java, Go, or C, gained through coding projects or through technical, project management, or scrum master roles on software engineering teams.
- Strong communication and problem-solving skills with the ability to understand and convey code-related concepts effectively.
- Strong project management skills, with the ability to design, manage, and optimize complex workflows.
- Ability to balance independent judgment with collaborative teamwork in a fast-paced, project-based environment.
Responsibilities
- Lead and manage multiple coding-focused LLM training projects, ensuring timelines, quality standards, and objectives are met.
- Track project progress, identify risks, and implement corrective actions as necessary to keep projects on course.
- Build and maintain strong relationships with product managers, engineers, researchers, data annotators, and other cross-functional team members.
- Communicate project updates, address concerns, and align expectations to ensure successful project outcomes.
- Coordinate meetings and discussions with global teams to ensure seamless project execution, and work with external vendors and trainers as project demands require.
- Design, manage, and optimize workflows for coding-focused LLM training projects, including training design, QA processes, and performance tracking to meet project needs.
- Collaborate closely with product managers, engineers, and cross-functional teams to ensure alignment on quality metrics and project expectations.
- Conduct quality and productivity improvement experiments to enhance operational processes for code-related training data.
- Lead and support initiatives to improve general annotation operations across various data domains.
- Develop and maintain technical guidelines and casebooks to support consistent, high-quality data production.
- Design and implement data analysis strategies for LLM coding projects.
- Analyze annotation quality, model performance, and dataset coverage using statistical and programmatic methods.
- Identify data gaps and failure patterns through slice-based evaluations and error analysis.
- Use Python (Pandas, NumPy, Matplotlib) and SQL to generate insights and support model training operations.
- Collaborate with researchers to inform training strategies and data improvements.
- Provide mentorship and guidance to team members, helping to develop their skills and ensuring the delivery of high-quality outputs.
- Foster a collaborative environment where team members can share knowledge and best practices to improve overall performance.
Preferred Qualifications
- Experience in RLHF annotation and in working with leading AI/LLM companies on technical projects.
- Experience working with codebases and an understanding of software development processes, coding best practices, and version control systems (e.g., Git).
- Proven ability to lead and mentor junior team members in data-related or AI/LLM projects.
- Deep interest in LLMs and computational thinking, and the ability to adapt to a high-intensity work environment.
- Enthusiasm for learning, engaging with diverse technical case studies, and working with global teams, plus comfort with technology tools that enhance project performance.
- Proficiency in Mandarin Chinese (reading and speaking) to effectively communicate with Chinese-speaking global teams.