Senior Principal Data Engineer

Company: Yahoo
Location: United States
Salary: $160,965 – $349,885
Type: Full-Time
Degrees:
Experience Level: Expert or higher

Requirements

  • 10+ years of software engineering experience, with a strong emphasis on system design and backend development.
  • 5+ years of experience developing large-scale analytics or ML systems for enterprise applications.
  • 2+ years hands-on experience with Google Cloud Platform ecosystem (BigQuery, Dataproc, Composer, Dataflow, BigTable) or AWS equivalent.
  • 2+ years working with Hadoop technologies and distributed computing frameworks (Spark, Kafka, Hive, HBase).
  • Proven track record implementing data pipelines that process and analyze petabyte-scale datasets.
  • Strong fundamentals: algorithms, distributed computing, data structures, and databases.
  • Fluency in at least one object-oriented programming language (Java, Python, or Scala) is highly desirable for developing robust applications and managing data workflows. SQL proficiency is also valued for database operations.

Responsibilities

  • Design and architect sophisticated data pipeline solutions for Yahoo’s Consumer Data Platform that effectively handle petabyte-scale user data across our cloud-native infrastructure, establishing a single, authoritative source for all user data across Yahoo properties.
  • Collaborate across horizontal and vertical business units with data scientists, ML engineers, product managers, and business stakeholders to define technical approaches that directly enhance experimentation, monetization, marketing, and personalization initiatives.
  • Demonstrate end-to-end ownership of complex distributed data systems from conception through production, making high-velocity, data-driven decisions that streamline how teams access, manage, and utilize customer information.
  • Apply deep technical expertise to diagnose and resolve critical data system bottlenecks, ensuring high availability while continuously optimizing for performance and data integrity across Yahoo properties.
  • Pioneer new consumer data applications by developing POCs that advance our data registration, discovery, and segmentation capabilities while strengthening regulatory compliance and enhancing the overall user experience.
  • Implement robust monitoring, alerting, and self-healing mechanisms for data pipelines that process billions of user interactions daily, consolidating our user data infrastructure with a focus on greater efficiency and reduced risk.
  • Provide architectural guidance to engineers across multiple teams, establish best practices for consumer data system design, and influence long-term technical roadmaps that enable teams to leverage user data with maximum efficiency and compliance.

Preferred Qualifications

  • Experience in audience platforms and managing large-scale data is a plus.