Site Reliability Engineer - Cloud

MS or BS in Computer Science/Engineering or a related field, or equivalent experience.
5+ years of experience supporting technical operations in a live-site production environment with a real passion for automation and tooling.
Experience building and running critical production services, packaged or custom (Python/Java), on Windows or Linux.
Strong knowledge of the Kubernetes platform, deployments, and automation.
SRE on-call experience is a must.
Advanced experience with scripting and development in Python.
Demonstrated strength in problem-solving and root-cause analysis.

Rapidly debug and triage user-reported issues across the Digital Marketing Organization.
Onboard new applications and services onto AWS infrastructure.
Make valuable contributions to the overall health, performance, and uptime of our services running on Linux and Windows.
Implement monitors, alerts, and SOPs to ensure early detection of, and accurate response to, service-impacting issues.
Take ownership of automation, scripting, and tooling, for both new and existing scripts, to help the team achieve 100% automation of daily tasks.

Strong experience with the AWS cloud platform and with Kubernetes as a platform.
Excellent communication, presentation, social, and analytical skills; the ability to communicate sophisticated interaction concepts clearly and persuasively across different audiences and varying levels of the organization.