Staff Site Reliability Engineer – Storage
Field | Details |
---|---|
Company | Crusoe |
Location | San Francisco, CA, USA |
Salary | $250,000 – $250,000 |
Type | Full-Time |
Degrees | |
Experience Level | Senior, Expert or higher |
Requirements
- 8+ years of professional experience in SRE, systems, or storage engineering.
- Hands-on experience with distributed storage systems (e.g., Ceph, GlusterFS, OpenEBS) and deep understanding of object, block, and file storage paradigms.
- Proficiency in a programming language such as Python, Go, Java, or C.
- Experience with Infrastructure as Code and deployment tooling such as Terraform, Ansible, or Puppet.
- Deep knowledge of Linux internals with a focus on I/O subsystems, memory management, and storage scheduling.
- Familiarity with storage protocols like NFS, SMB, iSCSI, or NVMe-oF.
- Strong experience working with containerized workloads and orchestration platforms (e.g., Kubernetes, Docker).
- Excellent incident response, troubleshooting, and documentation practices.
- Excellent communication skills.
- Must be able to pass a background check.
Responsibilities
- Build automation and self-healing tools to monitor and maintain Crusoe’s distributed cloud storage infrastructure, which includes block, file, and object storage systems.
- Drive reliability initiatives focused on data replication, encryption, backup and restore strategies, and robust failover mechanisms.
- Collaborate closely with storage engineers to implement and maintain high-performance NVMe- and SSD-backed volumes that support large-scale AI compute clusters.
- Support user-facing storage services with a focus on availability, performance tuning, and adherence to error budgets.
- Investigate and resolve storage-related incidents using deep telemetry, logs, and performance profiling.
- Partner with hardware and kernel teams to diagnose low-level I/O issues and optimize I/O paths, cache policies, and file systems.
- Contribute to the architecture of fault-tolerant, scalable storage backends tailored for AI-first cloud environments.
Preferred Qualifications
No preferred qualifications provided.