Safeguards Analyst – User Well-being
Company | Anthropic |
---|---|
Location | San Francisco, CA, USA; New York, NY, USA |
Salary | $170,000 – $200,000 |
Type | Full-Time |
Degrees | |
Experience Level | Junior, Mid Level |
Requirements
- Standing up and scaling policy enforcement and review workflows
- Using SQL and/or other data analysis tools to draw insights from large datasets
- Identifying emerging risks and threat actors, and providing feedback to a diverse set of stakeholders, such as the Product, Policy, Engineering, and Legal teams
- Working with generative AI products, including writing effective prompts for content review and enforcement
- Understanding the challenges of implementing product policies at scale, including in the content moderation space
- Maintaining strong collaboration with team members while navigating rapidly evolving priorities and workstreams
- Navigating evolving regulatory landscapes and enforcement best practices with regard to age assurance, CSAM/CSEM, NCII, and digital well-being
- Working as a trust & safety professional or subject matter expert in one or more of the following focus areas: child safety, human exploitation and abuse, and/or content classification systems
Responsibilities
- Design and architect automated enforcement systems and review workflows that scale effectively while maintaining high accuracy
- Partner with Engineering and Data Science teams to optimize detection models for policy violations and automated enforcement systems
- Review flagged content to drive enforcement and policy improvements
- Enforce usage policies with a focus on detecting and mitigating potential harmful use of AI systems
- Support the Safeguards policy design team by providing detailed feedback on policy gaps based on real enforcement scenarios
- Keep up to date with emerging AI policy enforcement best practices and use them to inform our decision-making and workflows
Preferred Qualifications
No preferred qualifications provided.