Safeguards Analyst – User Well-being

Company: Anthropic
Location: San Francisco, CA, USA; New York, NY, USA
Salary: $170,000 – $200,000
Type: Full-Time
Experience Level: Junior, Mid-Level

Requirements

  • Standing up and scaling policy enforcement and review workflows
  • Using SQL and/or other data analysis tools to draw insights from large datasets
  • Identifying emerging risks and threat actors, and providing feedback to a diverse set of stakeholders, such as Product, Policy, Engineering, and Legal teams
  • Working with generative AI products, including writing effective prompts for content review and enforcement
  • Understanding the challenges that exist in implementing product policies at scale, including in the content moderation space
  • Maintaining strong collaboration with team members while navigating rapidly evolving priorities and workstreams
  • Navigating evolving regulatory landscapes and enforcement best practices with regard to age assurance, CSAM/CSEM, NCII, and digital well-being
  • Working as a trust & safety professional or subject matter expert in one or more of the following focus areas: child safety, human exploitation and abuse, and/or content classification systems

Responsibilities

  • Design and architect automated enforcement systems and review workflows that scale effectively while maintaining high accuracy
  • Partner with Engineering and Data Science teams to optimize detection models for policy violations and automated enforcement systems
  • Review flagged content to drive enforcement and policy improvements
  • Enforce usage policies with a focus on detecting and mitigating potential harmful use of AI systems
  • Support the Safeguards policy design team by providing detailed feedback on policy gaps based on real enforcement scenarios
  • Keep up to date with emerging AI policy enforcement best practices, and use these to inform our decision-making and workflows
