Senior Data Infrastructure Engineer

Experience building the infrastructure for large-scale data processing pipelines (both batch and streaming) using tools like Spark, Kafka, Apache Flink, and Apache Beam.
Experience designing and implementing large-scale data storage systems (feature stores, timeseries DBs) for ML use cases. Strong familiarity with relational databases, data warehouses, object storage, and timeseries data, and adept at DB schema design.
Experience building data pipelines for external data sources that are observable, debuggable, and verifiably correct, including dealing with challenges like data versioning, point-in-time correctness, and evolving schemas.
Strong distributed systems and infrastructure skills. Comfortable scaling and debugging Kubernetes services, writing Terraform, and working with orchestration tools like Flyte, Airflow, or Temporal.
Strong software engineering skills. Able to write easy-to-extend, well-tested code.

Owning and scaling our data infrastructure by several orders of magnitude. This includes our data pipelines, distributed data processing, and data storage.
Building a unified feature store for all our ML models.
Efficiently storing and loading hundreds of terabytes of weather data for use in AI-based weather models.
Processing and storing predictions and evaluation metrics for large-scale forecasting models.

You have 4+ years of experience building data infrastructure or data platforms
You have experience with ML infrastructure and have worked at companies that use ML for core business functions
You’re comfortable with ambiguity and a fast-moving environment, and have a bias for action
You learn and pick up new skills quickly
You’re motivated to make a real-world impact on climate and energy