Technical Lead – Services Reliability & Management

Company: ServiceNow
Location: Orlando, FL, USA
Salary: Not provided
Type: Full-Time
Experience Level: Expert or higher

Requirements

  • Experience in leveraging or critically thinking about how to integrate AI into work processes, decision-making, or problem-solving. This may include using AI-powered tools, automating workflows, analyzing AI-driven insights, or exploring AI’s potential impact on the function or industry.
  • 10+ years of professional software delivery experience, with a focus on microservices architecture and support.
  • Strong proficiency in Java and/or Python.
  • Extensive experience with containerization technologies such as Docker and orchestration tools like Kubernetes.
  • Deep understanding of RESTful APIs and API gateway technologies.
  • Strong knowledge of SQL, NoSQL, and in-memory databases.
  • Familiarity with CI/CD tools such as Jenkins or GitLab CI.
  • Knowledge of event-driven architectures and messaging systems such as Kafka or RabbitMQ.
  • Excellent problem-solving skills and the ability to think critically and analytically.
  • Experience with cloud platforms such as Azure or Google Cloud.
  • Experience with monitoring and logging tools such as Prometheus, Grafana, ELK stack, or Splunk.
  • Experience defining and rolling out key support and operational processes (essential).
  • Strong communication skills, with the ability to articulate complex technical concepts to both technical and non-technical stakeholders.
  • Team leadership experience, with a mindset of celebrating successes and acknowledging the team's hard work and dedication.

Responsibilities

  • Oversee and ensure high-performance support for deployed microservices, including AI/ML, foundational, and integration services.
  • Collaborate closely with different internal departments to understand their business requirements and manage expectations clearly.
  • Engage with peers across various departments to comprehend their critical needs and provide reliable support services that enhance business efficiency.
  • Proactively monitor and troubleshoot service performance, ensuring high availability and reliability to minimize business impact.
  • Engage with partner teams to gather comprehensive details on service alerts.
  • Analyze each incident thoroughly to identify the root cause, and develop and provide technical solutions as needed to resolve it.
  • Prepare and communicate operational metrics both within the organization and to external stakeholders.
  • Lead and mentor a team of engineers, fostering a culture of continuous improvement.
  • Define and roll out best practices for technical support and operational activities.
  • Stay updated with the latest industry trends and technologies, advocating for their adoption where appropriate.
