Incident Response Engineer

Company: DigiCert
Location: Lehi, UT, USA
Salary: Not provided
Type: Full-Time
Degrees: Not specified
Experience Level: Senior

Requirements

  • 5+ years of experience in IT, Service Operations, or Development Operations related roles.
  • 3+ years of experience with deployment tools: Salt, Kubernetes, Docker, Jenkins.
  • 5+ years of experience with multiple operating environments: Linux, AWS.
  • 5+ years of experience in the high-tech industry.
  • 2+ years of experience with database environments: MySQL, Cassandra.
  • 2+ years of experience with multiple programming languages.

Responsibilities

  • Perform proactive daily monitoring of our services, including reviewing system and application logs, and manage the incident life cycle (detection, confirmation, notification, repair/isolation, escalation, resolution, and reporting) to ensure quick service restoration.
  • Repair and recover from hardware or software failures. Coordinate and communicate with impacted stakeholders and clients, escalating where appropriate.
  • Work closely with development and engineering teams to help build, maintain, and extend support for all production services.
  • Review the entire environment and execute initiatives to reduce failures and defects and improve overall performance.
  • Monitor and troubleshoot issues across the entire stack – hardware, software, application and network.
  • Demonstrate technical leadership with incident handling and troubleshooting.
  • Document current and future configuration processes and policies.
  • Assist with the implementation and development of SRE tools and applications.
  • Manage and support SRE tools and applications.
  • Perform periodic on-call duty as part of a global team.
  • Install and manage web certificates (SSL, client authentication).
  • Apply prior working knowledge of Salt, Splunk, JIRA, Atlassian Wiki, and New Relic.

Preferred Qualifications

    No preferred qualifications provided.