Distinguished Engineer
| Company | Geico |
|---|---|
| Location | Bethesda, MD, USA |
| Salary | $150,000 – $300,000 |
| Type | Full-Time |
| Degrees | |
| Experience Level | Expert or higher |
Requirements
- Deep hands-on experience building complex distributed systems that process large-scale telemetry, and designing architectures that support that scale and performance, with strong knowledge of Docker and Kubernetes
- Advanced knowledge of at least two OOP languages such as Java, Go, or Python
- Strong understanding of open-source databases such as MySQL and PostgreSQL, and a strong foundation in NoSQL databases such as ClickHouse, Cassandra, and Apache Trino. Knowledge of big data formats such as Parquet or Avro
- Experience architecting, designing, and building observability platform solutions; advanced data analytics using open-source technologies is a big plus
- Experience building distributed systems
- Excellent communication skills, with the ability to lead projects from the front and interact regularly with clients and sponsors
- Experience partnering with engineering teams and transferring research to production
- Experience with continuous delivery (CI/CD) and Infrastructure as Code
- In-depth knowledge of CS data structures and algorithms
- Experience solving analytical problems with quantitative approaches
- Experience with Windows Server Administration and Windows Event Log
- Ability to excel in a fast-paced, startup-like environment
- Willingness to work in both fast-paced development and operations environments
- Knowledge of developer tooling across the software development life cycle (task management, source code, building, deployment, test automation and related tools, operations, real-time communication)
- Knowledge of big data and streaming data pipeline architectures (Lambda/Kappa) and Kubernetes (K8s) clusters
- Experience with open-source tools such as Git, Jenkins, and CircleCI; knowledge of Terraform or Ansible is a plus
- Knowledge of ML and AI technologies
- Knowledge of open-source monitoring software such as Grafana and Prometheus
Responsibilities
- Develop and drive the overall tech strategy for the Reliability and Observability Tools organization, reporting to the Senior Director
- Focus on multiple areas and provide technical and thought leadership as Observability Domain Technical Champion
- Collaborate with product managers, team members, customers, and other engineering teams to solve our toughest problems
- Develop and execute technical software development strategy for the Observability Engineering domain
- Accountable for the quality, usability, and performance of the solutions
- Be a role model and mentor, helping to coach and strengthen the technical expertise and know-how of our engineering and product community. Influence and educate executives
- Consistently share best practices and improve processes within and across teams
- Lead the design and architecture of resilient and scalable systems, considering both on-premises and cloud-based solutions
- Develop and maintain comprehensive incident response plans to address various disaster scenarios on our backup/restore systems
- Conduct regular simulations and drills to ensure the readiness of the organization in the event of a disaster
- Apply hands-on software engineering and SDLC best practices (Technical Review Documents, Architecture, Software Development, Software Reviews, Testing, Production Readiness Reviews, among others)
- Evaluate, select, and implement cutting-edge technologies and tools to enhance our data safeguard capabilities including but not limited to processes, compliance, and visibility
- Stay current with industry best practices and emerging technologies to continuously improve our data safeguard capabilities
- Work closely with executive leadership, IT teams, and other stakeholders to communicate the importance of data safeguarding and foster a culture of resilience
- Act as a trusted advisor, providing guidance on backup/restore and data security matters to technical and non-technical stakeholders
- Practice FinOps discipline: analyze and forecast costs, and incorporate them into business plans
- Determine and support resource requirements, evaluate operational processes, measure outcomes to ensure desired results, demonstrate adaptability, and sponsor continuous learning
Preferred Qualifications
- Experience in open-source frameworks
- 4+ years of experience with AWS, GCP, Azure, or another cloud service