Software Engineer – Sustaining
Company | Berkshire Grey |
---|---|
Location | Bedford, MA, USA |
Salary | Not provided |
Type | Full-Time |
Degrees | Bachelor’s, Master’s |
Experience Level | Junior, Mid Level |
Requirements
- Bachelor’s degree in computer science or a related field
- 2+ years in software development or reliability engineering
- Strong coding skills in Python
- Experience in a fast-paced, agile environment
- Demonstrated ability to:
  - Investigate and triage production issues end-to-end
  - Analyze logs, metrics, and telemetry to pinpoint root causes
  - Develop fixes or workarounds under tight SLAs
  - Ship stable patches and rollouts with minimal disruption
  - Communicate status and technical tradeoffs clearly to stakeholders
- Comfortable with: Linux (Ubuntu), Version control (Git), Issue tracking (Jira)
Responsibilities
- Lead investigation of field and lab failures; own root-cause analysis and drive fixes
- Instrument code with metrics/logs; develop health checks and self-healing routines
- Design, build, test, and deploy hotfixes and maintenance releases
- Identify recurring issues; propose and implement design or process changes to raise MTBF and lower MTTR
- Work with development teams to bake reliability into new features; train support teams on diagnostics
- Maintain clear runbooks; track and report on reliability KPIs
Preferred Qualifications
- Master’s degree in CS, robotics, or a related field
- Familiarity with:
  - Monitoring stacks (Elastic/Kibana, Prometheus/Grafana)
  - Distributed in-code tracing frameworks (OpenTelemetry)
  - Container orchestration (Docker, Kubernetes)
  - Automated test frameworks (pytest, unit/system tests)
- Hands-on experience with robotic applications or other high-uptime systems
- Data-driven mindset: profiling, statistics, pandas