Incident Manager

Company: Crusoe
Location: San Francisco, CA, USA
Salary: $140,000 – $165,000
Type: Full-Time
Degrees:
Experience Level: Mid Level, Senior

Requirements

  • Strong technical experience with Linux, virtualization, and Kubernetes, and with handling customer incidents.
  • Solid understanding of the TCP/IP stack.
  • Understanding of Infrastructure-as-Code (IaC) practices.
  • Excellent communication skills, both written and verbal.
  • Proven problem-solving mindset with the ability to diagnose and resolve complex technical issues.
  • 3-5+ years’ experience in a team leadership role while acting as a liaison with external/internal customers.
  • 4-5 years of customer-facing experience.

Responsibilities

  • Diagnose and resolve complex technical issues related to InfiniBand, containerization, and distributed training, ensuring minimal disruption to customer operations.
  • Guide and assist customers in implementing and optimizing their HPC infrastructure to achieve maximum performance and efficiency.
  • Develop and deliver training materials, including internal training sessions, documentation, and knowledge base articles, to empower customers to effectively utilize our solutions.
  • Work closely with internal engineering and product teams to provide valuable customer feedback and contribute to the improvement of product quality and the overall customer experience.

Preferred Qualifications

  • Proficiency in one or more programming languages.