Senior System Software Engineer – DC Platform Software Tools
Company | NVIDIA |
---|---|
Location | Santa Clara, CA, USA |
Salary | $184,000 – $356,500 |
Type | Full-Time |
Degrees | Bachelor’s, Master’s, PhD |
Experience Level | Expert or higher |
Requirements
- BS, MS, or PhD in EE/CS or related field of education (or equivalent experience) with 10+ years of experience
- Proven track record of working on management solutions for large-scale clusters in data centers
- Strong and demonstrable skill in Python
- Programming and debugging experience for large-scale data centers
- Experience in SCM (e.g., Git, Perforce) and project management tools like Jira
- Excellent written and oral communication skills, a strong work ethic, a deep sense of teamwork, a love of producing quality work, and a commitment to finishing your tasks every single day
- You are a self-starter who loves finding creative solutions to complicated problems and is hands-on with coding.
Responsibilities
- Drive next-generation GPU server software manageability workflows for scaling AI infrastructure in data centers
- Work with internal and external customers to understand requirements for various tools to improve debuggability, serviceability and runtime of data center firmware and software
- Contribute to all phases of product development, from product definition, architecture, and design, through implementation, debugging, testing and early customer support
- Maintain detailed documentation of tool designs, capabilities, and usage guidelines
- Provide regular reports and technical insights to internal teams on the effectiveness and improvements of developed tools
- Define KPIs for tools and work with various stakeholders to improve them over time.
Preferred Qualifications
- Experience working on data center deployment and management projects
- Hands-on experience with x86 or ARM system architecture
- Familiarity with processor microarchitecture, such as caches, pipelining, memory hierarchy, and instruction set architecture (ISA)
- Experience with code coverage and static analysis tools.