Cloud Systems Engineer
| Company | MasterControl |
|---|---|
| Location | Salt Lake City, UT, USA |
| Salary | $150,000 – $180,000 |
| Type | Full-Time |
| Degrees | Bachelor’s, Master’s |
| Experience Level | Mid Level, Senior |
Requirements
- MS or BS in Computer Science (or equivalent experience).
- 4-5+ years of strong hands-on DevOps experience with complex cloud-based CI/CD pipelines (e.g., GitHub Actions, Spinnaker).
- 3-5+ years of experience with AWS or Google Cloud.
- 3+ years of experience with Kubernetes (EKS, ArgoCD, Istio)
- 10+ years building and scaling distributed systems leveraging web-scale technologies like Linux, Apache, Nginx, NoSQL, RDBMS, Redis, Postgres, and Vault.
- Experience with containerized workloads and the management and deployment of Kubernetes (K8s).
- Experience with Linux/Unix internals and system services such as DNS, DHCP, iptables, and SMTP.
- Experience with cloud-based networking and routing, including troubleshooting tools such as ping and tracert.
- Experience with observability systems (monitoring, tracing, logging) to instrument and manage large-scale global systems and 24×7 availability.
- Experience building and maintaining application stacks in AWS cloud environments.
- Experience with cloud-based IAM.
- Professional programming experience in one or more of the following languages or tools: Go, Java, Python, Ruby, Terraform, Terragrunt, Shell, PowerShell.
- Experience with one or more of the following CI/CD tools: ArgoCD, Spinnaker, Harness, GoCD, Travis, Drone, Jenkins.
- Network troubleshooting skills (ping, tracert, etc.).
- Experience with or knowledge of developing system requirements, documentation, diagrams, implementation plans, and troubleshooting and operational procedures.
- Strong, service-oriented team player who can also work independently.
- Time management, attention to detail, expectation setting, and organizational skills to provide solid service management and process improvement.
- Ability to work with minimal supervision, making decisions based upon priorities, schedules and an understanding of business decisions.
Responsibilities
- Architect, design, build, and implement automation, tooling, and processes to ensure the consistency, reliability, availability, and scalability of MasterControl’s production cloud infrastructure, systems, and networking.
- Instrument observability in systems for reliability, performance, and efficiency of MasterControl’s products.
- Influence, design, and create new architectures, standards and methods for large-scale enterprise systems.
- Test, certify, and document new systems, environments, and cloud services.
- Drive root-cause analysis sessions of complex problems involving multiple parties, networks, hardware, and software related to end-to-end flow (value stream mapping), scaling, and performance.
- Define standards for deployments at scale, infrastructure, reliability, and scalability, then iterate on and continually improve them.
- Influence internal customers such as product, operations, and engineering teams across MasterControl by putting customer focus first, demonstrating world-class quality, communicating effectively, acting decisively, working well in a fast-moving environment, and resolving conflict quickly and constructively.
- Help support the management of MasterControl’s always-available infrastructure and runtime, deployment pipelines, and platform tooling to eliminate downtime and improve the manageability of services and systems.
- Manage service availability and scalability through well-thought-out processes, custom tooling, and an eye for automation. Perform blameless post-mortems and help iterate and optimize incident response processes by helping bridge development teams and operations.
- Lead incident response for production incidents, driving investigations and working with teams through analysis, troubleshooting, discovery, and resolution.
- Champion continuous improvements using *Accelerate* metrics to systematically drive down detection (MTTD) and mitigation/restoration (MTTR) measurements.
- Build infrastructure as code, drive automated configurations, and maintain the automation.
- Design, build, and automate new solutions centered around release stability and readiness.
Preferred Qualifications
- Experience with hybrid environments and other cloud vendors, such as Google Cloud Platform (GCP) and Azure, is a plus.
- Security and compliance training such as ITIL, PCI, ISO 27001/27017, SOX, and SOC is a plus.
- Experience with operations management and monitoring tools (Grafana, PagerDuty, Nagios, Icinga, etc.).