Incident Response Engineer
Company | DigiCert |
---|---|
Location | Lehi, UT, USA |
Salary | Not Provided |
Type | Full-Time |
Degrees | |
Experience Level | Senior |
Requirements
- 5+ years of experience in IT, Service Operations, or Development Operations related roles.
- 3+ years of experience with deployment tools: Salt, Kubernetes, Docker, Jenkins.
- 5+ years of experience with multiple operating systems and platforms: Linux, AWS.
- 5+ years of experience in the high-tech industry.
- 2+ years of experience with database environments: MySQL, Cassandra.
- 2+ years of experience with multiple programming languages.
Responsibilities
- Proactively monitor our services daily, including reviewing system and application logs, and manage the incident life cycle (detection, confirmation, notification, repair/isolation, escalation, resolution, and reporting) to ensure quick service restoration.
- Repair and recover from hardware or software failures. Coordinate and communicate with impacted stakeholders and clients, escalating where appropriate.
- Work closely with development and engineering teams to help build, maintain, and extend support for all production services.
- Review the entire environment and execute initiatives to reduce failures and defects and improve overall performance.
- Monitor and troubleshoot issues across the entire stack – hardware, software, application and network.
- Demonstrate technical leadership with incident handling and troubleshooting.
- Document current and future configuration processes and policies.
- Assist with the implementation and development of SRE tools and applications.
- Manage and support SRE tools and applications.
- Perform periodic on-call duty as part of a global team.
- Install and manage web certificates (SSL, client authentication).
- Prior working knowledge of Salt, Splunk, Jira, Atlassian Wiki, New Relic.
Preferred Qualifications
No preferred qualifications provided.