[Hiring] Senior Systems Software Engineer, GPU Compute @Nebius
Senior Systems Software Engineer, GPU Compute @Nebius
Software Development
Salary $170k-$300k + e..
Remote Location
🇺🇸 USA Only
Employment Type full-time
Posted 2mths ago

2mths ago - Nebius is hiring a remote Senior Systems Software Engineer, GPU Compute. 💸 Salary: $170k-$300k + equity 📍 Location: USA

Role Description

We're looking for a Senior Systems Software Engineer to join our team and play a key role in the development of our cutting-edge hyperscaler platform. The GPU & InfiniBand team is responsible for enhancing and optimizing the core components of our Cloud platform, with a specific focus on GPU computing, InfiniBand networks, and the KVM/QEMU stack. You'll work closely with hardware virtualization and device emulation technologies, ensuring high performance and security in multi-GPU, HPC environments. The role involves analyzing, troubleshooting, and improving infrastructure to support new hardware, fine-tuning system performance, and automating fault detection and resolution in a complex system.

  • Tuning the performance of GPU clusters and InfiniBand networks to ensure optimal operation in HPC and GPU-based environments.
  • Analyzing and troubleshooting the root cause of issues related to GPUs and InfiniBand networks, and proposing corrective actions.
  • Integrating new hardware into the existing infrastructure, including support for new GPU hardware through software stacks like Kubernetes, QEMU, and KVM.
  • Enhancing automation systems for proactive monitoring, detecting, and resolving issues in GPU and InfiniBand environments.
  • Configuring and managing GPU devices and InfiniBand fabrics, ensuring efficient and reliable operation.

Qualifications

  • 5+ years of professional experience in system-level software development (focused on performance optimization, low-level programming).
  • 3+ years of hands-on experience with Linux systems (administration, troubleshooting, and performance tuning).
  • In-depth understanding of server architecture, including PCIe devices, NICs, Linux OS/Kernel, and high-performance computing (HPC) systems.
  • Strong proficiency in one or more performance-oriented programming languages (C/C++, Go, Python).

Requirements

  • Experience with GPU end-to-end testing in a cluster environment using InfiniBand networking.
  • Proven track record of analyzing and optimizing the performance of HPC workloads (e.g., simulations, data analysis, AI/ML workloads).
  • Familiarity with RDMA, RoCE, and InfiniBand protocols for high-performance communication.
  • Background in Software-Defined Networking (SDN) and experience with HPC cluster networking.
  • Understanding of QEMU/KVM virtualization and managing virtualized environments.
  • Experience with deep learning frameworks such as PyTorch and TensorFlow, and their integration with HPC systems.
  • Familiarity with collective communication libraries like MPI and NCCL for distributed computing.

Benefits

  • Competitive compensation ranging from $170k-$300k + equity based on your experience.
  • Career growth and learning opportunities.
  • Flexibility and work-life balance.
  • Collaborative and innovative culture.
  • Opportunity to work on impactful AI projects.
  • International environment and talented teams.