Software Engineer – Sustaining

Company: Berkshire Grey
Location: Bedford, MA, USA
Salary: Not provided
Type: Full-Time
Degrees: Bachelor’s, Master’s
Experience Level: Junior, Mid Level

Requirements

  • Bachelor’s degree in computer science or a related field
  • 2+ years of experience in software development or reliability engineering
  • Strong coding skills in Python
  • Experience in a fast-paced, agile environment
  • Demonstrated ability to:
      • Investigate and triage production issues end-to-end
      • Analyze logs, metrics, and telemetry to pinpoint root causes
      • Develop fixes or workarounds under tight SLAs
      • Ship stable patches and rollouts with minimal disruption
      • Communicate status and technical tradeoffs clearly to stakeholders
  • Comfortable with Linux (Ubuntu), version control (Git), and issue tracking (Jira)

Responsibilities

  • Lead investigation of field and lab failures; own root-cause analysis and drive fixes
  • Instrument code with metrics/logs; develop health checks and self-healing routines
  • Design, build, test, and deploy hotfixes and maintenance releases
  • Identify recurring issues; propose and implement design or process changes to raise MTBF and lower MTTR
  • Work with development teams to bake reliability into new features; train support teams on diagnostics
  • Maintain clear runbooks; track and report on reliability KPIs

Preferred Qualifications

  • Master’s degree in computer science, robotics, or a related field
  • Familiarity with:
      • Monitoring stacks (Elastic/Kibana, Prometheus/Grafana)
      • Distributed tracing frameworks (OpenTelemetry)
      • Containers and container orchestration (Docker, Kubernetes)
      • Automated test frameworks (pytest, unit/system tests)
  • Hands-on experience with robotic applications or other high-uptime systems
  • Data-driven mindset: profiling, statistics, pandas