Staff Site Reliability Engineer – Storage

Company: Crusoe
Location: San Francisco, CA, USA
Salary: $250,000 – $250,000
Type: Full-Time
Degrees: not specified
Experience Level: Senior, Expert or higher

Requirements

  • 8+ years of professional experience in SRE, systems, or storage engineering.
  • Hands-on experience with distributed storage systems (e.g., Ceph, GlusterFS, OpenEBS) and deep understanding of object, block, and file storage paradigms.
  • Proficiency in a programming language such as Python, Go, Java, or C.
  • Experience with Infrastructure as Code and deployment tooling such as Terraform, Ansible, or Puppet.
  • Deep knowledge of Linux internals with a focus on I/O subsystems, memory management, and storage scheduling.
  • Familiarity with storage protocols like NFS, SMB, iSCSI, or NVMe-oF.
  • Strong experience working with containerized workloads and orchestration platforms (e.g., Kubernetes, Docker).
  • Excellent incident response and troubleshooting skills, with strong documentation practices.
  • Excellent communication skills.
  • Must be able to pass a background check.

Responsibilities

  • Build automation and self-healing tools to monitor and maintain Crusoe’s distributed cloud storage infrastructure, which includes block, file, and object storage systems.
  • Drive reliability initiatives focused on data replication, encryption, backup and restore strategies, and robust failover mechanisms.
  • Collaborate closely with storage engineers to implement and maintain high-performance NVMe- and SSD-backed volumes that support large-scale AI compute clusters.
  • Support user-facing storage services with a focus on availability, performance tuning, and adherence to error budgets.
  • Investigate and resolve storage-related incidents using deep telemetry, logs, and performance profiling.
  • Partner with hardware and kernel teams to diagnose low-level I/O issues and optimize I/O paths, cache policies, and file systems.
  • Contribute to the architecture of fault-tolerant, scalable storage backends tailored for AI-first cloud environments.
