Cloud Systems Engineer

Company: MasterControl
Location: Salt Lake City, UT, USA
Salary: $150,000 – $180,000
Type: Full-Time
Degrees: Bachelor’s, Master’s
Experience Level: Mid Level, Senior

Requirements

  • MS or BS in Computer Science (or equivalent experience).
  • 4-5+ years of strong, hands-on DevOps experience working on complex cloud-based CI/CD pipelines (e.g., GitHub Actions, Spinnaker).
  • 3-5+ years of experience with AWS or Google Cloud.
  • 3+ years of experience with Kubernetes (EKS, ArgoCD, Istio)
  • 10+ years building and scaling distributed systems leveraging web-scale technologies like Linux, Apache, Nginx, NoSQL, RDBMS, Redis, Postgres, and Vault.
  • Experience with containerized workloads and the management and deployment of Kubernetes (K8s).
  • Experience with Linux/Unix internals and system services such as DNS, DHCP, iptables, and SMTP.
  • Experience with cloud-based networking and routing, including protocols and troubleshooting tools such as ping and tracert.
  • Experience with observability systems (monitoring, tracing, logging) to instrument and manage large-scale global systems and 24×7 availability.
  • Experience building and maintaining application stacks in AWS cloud environments.
  • Experience with cloud-based IAM.
  • Professional programming experience in one or more of the following languages and tools: Go, Java, Python, Ruby, Terraform, Terragrunt, Shell, PowerShell.
  • Experience with one or more of the following CI/CD tools: ArgoCD, Spinnaker, Harness, GoCD, Travis, Drone, Jenkins.
  • Network troubleshooting skills (ping, tracert, etc.).
  • Experience with or knowledge of developing system requirements, documentation, diagrams, implementation plans, and troubleshooting and operational procedures.
  • Strong, service-oriented team player who can also work independently.
  • Time management, attention to detail, expectation setting, and organizational skills to provide solid service management and process improvement.
  • Ability to work with minimal supervision, making decisions based upon priorities, schedules and an understanding of business decisions.

Responsibilities

  • Architect, design, build, and implement automation, tooling, and processes to ensure the consistency, reliability, availability, and scalability of MasterControl’s production cloud infrastructure, systems, and networking.
  • Instrument observability in systems for reliability, performance, and efficiency of MasterControl’s products.
  • Influence, design, and create new architectures, standards and methods for large-scale enterprise systems.
  • Test, certify, and document new systems, environments, and cloud services.
  • Drive root-cause analysis sessions of complex problems involving multiple parties, networks, hardware, and software related to end-to-end flow (value stream mapping), scaling, and performance.
  • Define standards for deployments at scale, infrastructure, reliability, and scalability, then iterate on and continually improve them.
  • Influence internal customers such as product, operations, and engineering teams across MasterControl by putting customer focus first, demonstrating world-class quality, communicating effectively, acting decisively, working well in a fast-moving environment, and resolving conflict quickly and constructively.
  • Help support the management of MasterControl’s always-available infrastructure and runtime, deployment pipelines, and platform tooling to eliminate downtime and improve the manageability of services and systems.
  • Manage service availability and scalability through well-thought-out processes, custom tooling, and an eye for automation. Perform blameless post-mortems and help iterate and optimize incident response processes by helping bridge development teams and operations.
  • Lead incident response for production incidents, helping drive investigations and working with teams through analysis, troubleshooting, discovery, and resolution.
  • Champion continuous improvements using *Accelerate* metrics to systematically drive down detection (MTTD) and mitigation/restoration (MTTR) measurements.
  • Build infrastructure as code, drive automated configurations, and maintain the automation.
  • Design, build, and automate new solutions centered around release stability and readiness.

Preferred Qualifications

  • Experience with hybrid environments and other cloud vendors, such as Google Cloud Platform (GCP) and Azure, is a plus.
  • Security and compliance training such as ITIL, PCI, ISO 27001/27017, SOX, and SOC is a plus.
  • Experience with operations management and monitoring tools (Grafana, PagerDuty, Nagios, Icinga, etc.).