Staff DevOps Engineer
Company | BuildOps |
---|---|
Location | Los Angeles, CA, USA |
Salary | $155,000 – $190,000 |
Type | Full-Time |
Degrees | Bachelor’s, Master’s, PhD |
Experience Level | Senior, Expert or higher |
Requirements
- A minimum of seven years of professional experience in DevOps, Site Reliability Engineering, or Infrastructure Engineering roles is required.
- A minimum of five years of practical experience with Amazon Web Services (AWS) services and architecture is required.
- Comprehensive understanding of Infrastructure as Code principles and tools, specifically Terraform and CloudFormation.
- Extensive experience with containerization technologies, including Docker, Kubernetes, Elastic Container Service (ECS), and Elastic Kubernetes Service (EKS).
- Demonstrated experience in implementing and managing Continuous Integration/Continuous Delivery (CI/CD) pipelines using GitHub Actions, CodePipeline, and CodeBuild.
- Expertise in monitoring and observability tools, such as Prometheus, Grafana, CloudWatch, and Datadog.
- Proficiency in scripting languages such as Python and Bash.
- Thorough understanding of networking concepts, including Virtual Private Clouds (VPCs), subnets, security groups, and load balancing.
- Proven ability to lead technical initiatives and manage complex projects.
- Experience implementing security best practices within cloud environments.
- Knowledge of database administration and optimization, specifically Amazon Aurora and Relational Database Service (RDS) for MySQL and PostgreSQL.
- Excellent communication skills with the ability to articulate complex technical concepts clearly and concisely.
- Experience with infrastructure automation and configuration management tools.
- Proficiency in programming languages (e.g., Python, Java/Spring Boot, Node.js) with the ability to debug complex applications and develop infrastructure automation tools.
- Advanced knowledge of Linux systems administration, including performance tuning, troubleshooting, and security hardening.
- Experience with kernel-level operations, system calls, and low-level Linux internals.
- Proficiency in the general networking stack and the ability to diagnose and resolve complex system issues.
- A Bachelor’s, Master’s, or Doctoral degree from a leading university in computer science, engineering, or related fields is required.
Responsibilities
- Design and implement robust, scalable infrastructure as code using tools like Terraform and CloudFormation.
- Build and maintain CI/CD pipelines that enable fast, reliable software delivery.
- Architect and implement software services across infrastructure, deployment, networking, monitoring, alerting, and observability to ensure system reliability and resilience.
- Develop and enforce infrastructure security best practices across all environments.
- Implement cost optimization strategies for cloud resources without compromising performance.
- Own and drive the SRE culture, focusing on DORA metrics, SLOs, SLIs, and error budgets.
- Lead incident response for critical production issues and conduct thorough post-mortems where applicable.
- Mentor and provide technical guidance to other engineers on infrastructure and reliability practices.
- Build automated tooling to reduce operational overhead and improve developer productivity.
- Establish and document infrastructure standards and best practices.
- Collaborate effectively with software engineers, product managers, and other stakeholders.
Preferred Qualifications
- Excels in high-growth startup environments.
- Demonstrates strong strategic thinking.
- Shows problem-solving abilities.
- Displays self-motivation.
- Works as a collaborative team player.
- Is curious and eager to learn.
- Possesses a strong work ethic.
- Maintains integrity.
- Exhibits grit.