Principal Software Engineer - Sustaining

Bachelor’s degree in computer science or a related field
10+ years in software development or reliability engineering
Strong coding skills in Python
Experience in a fast-paced, agile environment
Demonstrated ability to: investigate and triage production issues end-to-end; analyze logs, metrics, and telemetry to pinpoint root causes; develop fixes or workarounds under tight SLAs; ship stable patches and rollouts with minimal disruption; drive post-mortems and follow through on corrective action plans; communicate status and technical tradeoffs clearly to stakeholders
Comfortable with: Linux (Ubuntu), version control (Git), and issue tracking (Jira)

Lead investigation of field and lab failures; own root-cause analysis and drive fixes
Instrument code with metrics/logs; develop health checks and self-healing routines
Design, build, test, and deploy hotfixes and maintenance releases
Identify recurring issues; propose and implement design or process changes to raise MTBF and lower MTTR
Work with development teams to bake reliability into new features; train support teams on diagnostics
Maintain clear runbooks; track and report on reliability KPIs
Define and drive our sustaining engineering strategy and architecture
Mentor and coach other sustaining engineers on best practices for reliability and incident response
Collaborate with product leadership to integrate reliability objectives into the product roadmap
Own the development and scaling of our platform monitoring, tracing, and alerting

Master’s degree in computer science, robotics, or a related field
Familiarity with: monitoring stacks (Elastic/Kibana, Prometheus/Grafana); distributed tracing frameworks (OpenTelemetry); containers and orchestration (Docker, Kubernetes); automated test frameworks (pytest, unit/system tests); chaos engineering and resilience testing methodologies
Hands-on experience with robotic applications or other high-uptime systems
Data-driven mindset: comfortable with profiling, statistics, and pandas