NVIDIA Ltd.
Location

Santa Clara, California,
United States

Job Type
Full Time
Apply Deadline


Details

NVIDIA is looking for a senior engineer to design and develop the data center infrastructure automation frameworks that are core to large-scale supercomputer and cloud deployments. The role covers reliable, scalable, and efficient automation that supports core infrastructure-as-code workflows and tools, including CI/CD pipelines, compute resource management flows, and developer productivity tools, for environments that incorporate Data Processing Units (DPUs) as well as other devices and resources. We are looking for an engineer who has a deep understanding of data center automation tools (such as Ansible, Puppet, Chef, or Salt) in a distributed systems environment, outstanding design skills, a track record of building and delivering large-scale software infrastructure, and experience administering embedded devices, their operating systems, and firmware.

What You Will Be Doing

- Design and develop DC automation frameworks, workflows, and data models for automating processes.
- Create workflows to automate DC processes such as provisioning, patching, packaging, and change management activities for DPUs.
- Perform integrations between various DC systems such as DCIM, monitoring and reporting, and the infrastructure control plane (storage, compute, and networking).
- Develop DC automation optimized to the application.

What We Need To See

- 8+ years of experience with DC automation frameworks supporting highly available, large-scale cloud service environments.
- BS degree or an equivalent combination of education, technical training, and work experience.
- Expertise in data center automation processes: provisioning, patching, packaging, and change management activities.
- Deep knowledge of DCIM, asset DB design and implementation, and DC automation tools such as Ansible, Puppet, Chef, or Salt.
- Expertise in CI/CD tools such as Jenkins or AWX.
- Experience deploying and maintaining embedded devices such as SmartNICs, DPUs, and other devices that run Linux on less-than-traditional servers. Experience with ARM CPU-based systems.
- Background working with Mellanox networking technologies or InfiniBand environments preferred.
- Good communication and soft skills; able to present to senior management in a sensible and persuasive manner.
- Love to influence and establish relationships with other software and functional groups such as development, server, storage, and security teams.

Ways To Stand Out From The Crowd

- You have architected, built, and deployed a DC automation framework at large scale (1000s of machines), used by 1000s of people.
- Passionate about innovating and investing in groundbreaking technologies.

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for great people like you to help us accelerate the next wave of artificial intelligence.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.
