Production Support Engineering LMTS
Company | Salesforce |
---|---|
Location | Boston, MA, USA; San Francisco, CA, USA; Bellevue, WA, USA |
Salary | $184,000 – $276,100 |
Type | Full-Time |
Degrees | Bachelor’s |
Experience Level | Senior, Expert or higher |
Requirements
- 8+ years of experience in an SRE role or related field (DevOps, Production Operations, etc.)
- Experience in Public Cloud environments, specifically with AWS
- Experience with New Relic, collectd, Splunk, Sumo Logic, Grafana, Terraform, Jenkins, Kubernetes, Spinnaker or related tools
- Excellent knowledge of Internet technologies and protocols (TCP/IP, DNS, HTTP, SSL, etc.)
- Strong experience with API fundamentals (SOAP, REST, RAML or OAS)
- Ability to identify the root causes of instability in high-traffic, large-scale distributed systems
- Solid knowledge of large-scale complex systems from a reliability perspective
- Passion for resolving reliability issues and identifying strategies to prevent repeat issues
- Development experience in Python, Go, Bash, or a related language
- Experience with FedRAMP environments
- A related technical degree is required
Responsibilities
- Maintain and improve service reliability, availability, and performance across distributed systems and applications.
- Design, build, and maintain comprehensive monitoring, logging, and alerting systems to detect and address issues proactively.
- Respond to production incidents, perform root cause analysis, and implement preventative measures.
- Automate repetitive tasks using scripts, configuration management, and infrastructure-as-code (IaC) tools to improve efficiency and consistency.
- Monitor usage trends, forecast growth, and scale systems to meet future demands while controlling costs.
- Maintain and improve continuous integration and continuous delivery pipelines; ensure safe and frequent deployments.
- Collaborate with security teams to ensure systems adhere to best practices and compliance requirements, and participate in security assessments for onboarding and maintaining FedRAMP services.
- Work closely with development teams to design resilient and scalable systems; participate in architectural decisions.
- Create and maintain clear, detailed documentation for runbooks, systems, and processes.
Preferred Qualifications
No preferred qualifications provided.