Senior System Software Engineer, NCCL

4 days ago


Zürich, Zürich, Switzerland NVIDIA Full-time CHF 120'000 - CHF 180'000 per year

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions from artificial intelligence to autonomous cars.

Come work for the team that brought you NCCL, NVSHMEM & GPUDirect. Our GPU communication libraries are crucial for scaling Deep Learning and HPC applications. We are looking for a motivated Partner Enablement Engineer to guide our key partners and customers with NCCL. Most DL/HPC applications run on large clusters with high-speed networking (InfiniBand, RoCE, Ethernet). This is an outstanding opportunity to gain an end-to-end understanding of the AI networking stack. Are you ready to contribute to the development of innovative technologies and help realize NVIDIA's vision?

What you will be doing:

  • Engage with our partners and customers to root-cause functional and performance issues reported with NCCL

  • Conduct performance characterization and analysis of NCCL and DL applications on groundbreaking GPU clusters

  • Develop tools and automation to isolate issues on new systems and platforms, including cloud platforms (Azure, AWS, GCP, etc.)

  • Guide our customers and support teams on HPC knowledge and standard methodologies for running applications on multi-node clusters

  • Document and conduct trainings/webinars for NCCL

  • Engage with internal teams in different time zones on networking, GPUs, storage, infrastructure and support

What we need to see:

  • B.S./M.S. degree in CS/CE or equivalent experience with 5+ years of relevant experience. Experience with parallel programming and at least one communication runtime (MPI, NCCL, UCX, NVSHMEM)

  • Excellent C/C++ programming skills, including debugging, profiling, code optimization, performance analysis, and test design

  • Experience working with engineering or academic research community supporting HPC or AI

  • Practical experience with high-performance networking: InfiniBand/RoCE/Ethernet networks, RDMA, topologies, congestion control

  • Expert in Linux fundamentals and a scripting language, preferably Python

  • Familiar with containers, cloud provisioning and scheduling tools (Docker, Docker Swarm, Kubernetes, SLURM, Ansible)

  • Adaptability and passion to learn new areas and tools

  • Flexibility to work and communicate effectively across different teams and time zones

Ways to stand out from the crowd:

  • Experience conducting performance benchmarking and developing infrastructure on HPC clusters. Prior system administration experience, especially for large clusters. Experience debugging network configuration issues in large-scale deployments

  • Familiarity with CUDA programming and/or GPUs. Good understanding of Machine Learning concepts and experience with Deep Learning frameworks such as PyTorch and TensorFlow

  • Deep understanding of technology and passion for what you do

NVIDIA is at the forefront of breakthroughs in Artificial Intelligence, High-Performance Computing, and Visualization. Our teams are composed of driven, innovative professionals dedicated to pushing the boundaries of technology. We offer highly competitive salaries, an extensive benefits package, and a work environment that promotes diversity, inclusion, and flexibility. As an equal opportunity employer, we are committed to fostering a supportive and empowering workplace for all.


