NVIDIA Shifts To Open-Source GPU Kernel Modules

NVIDIA takes a significant step forward in its commitment to open-source software. The company announces a full transition to open-source GPU kernel modules with its upcoming R560 driver release. This move marks a major shift in NVIDIA's approach to driver development and distribution.

Background

In May 2022, NVIDIA introduced open-source Linux GPU kernel modules with the R515 driver. These modules, released under dual GPL and MIT licenses, initially targeted datacenter compute GPUs. Support for GeForce and Workstation GPUs was in an alpha state at the time.

Progress and Improvements

Over the past two years, NVIDIA has made substantial progress:

Performance: Open-source modules now match or exceed the performance of closed-source drivers.
New capabilities:
- Heterogeneous memory management (HMM) support,
- Confidential computing features,
- Support for coherent memory architectures on Grace platforms.

Supported GPUs

The transition to open-source modules affects different GPU generations differently:

Cutting-edge Platforms: Grace Hopper and Blackwell platforms require open-source modules.
Supported GPUs: Newer architectures such as Turing, Ampere, Ada Lovelace, and Hopper are fully supported by the open-source modules.
Unsupported GPUs: Older GPUs from Maxwell, Pascal, and Volta architectures require continued use of the proprietary drivers due to compatibility limitations.
Mixed Deployments: Systems that combine older and newer GPUs should continue using the proprietary driver, because the open-source modules cannot drive the older cards.
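
The support list above can be read as a simple decision rule. Here is a minimal sketch of it in shell. The function name recommend_module_type and the architecture spellings it accepts are hypothetical, chosen only for illustration, and it expects you to pass the architecture names yourself rather than detecting them.

```shell
#!/bin/sh
# Hypothetical helper (not an NVIDIA tool): map GPU architecture names to a
# kernel module flavor, following the support list in this article.
recommend_module_type() {
    if [ "$#" -eq 0 ]; then
        echo "unknown"
        return 1
    fi
    result="open"
    for arch in "$@"; do
        # Compare case-insensitively.
        case "$(printf '%s' "$arch" | tr '[:upper:]' '[:lower:]')" in
            maxwell|pascal|volta)
                # Any older GPU in the system means the proprietary driver,
                # which also covers mixed deployments.
                echo "proprietary"
                return 0
                ;;
            turing|ampere|ada|"ada lovelace"|hopper|"grace hopper"|blackwell)
                ;;
            *)
                # Architecture not mentioned in the article: do not guess.
                result="unknown"
                ;;
        esac
    done
    echo "$result"
}
```

For example, recommend_module_type turing ampere prints open, while recommend_module_type hopper pascal prints proprietary because one older card is enough to rule out the open-source modules.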

If you're unsure about which driver to install, do not worry! NVIDIA provides a detection helper script to guide users in selecting the appropriate driver.

Installer Changes

NVIDIA is changing the default from the proprietary driver to the open-source GPU kernel modules across all installation methods.

1. Package Managers with CUDA Metapackage

When using a package manager to install CUDA Toolkit, a top-level cuda package installs both the CUDA Toolkit and the associated driver release. For example, installing cuda during the CUDA 12.5 release provided the proprietary NVIDIA driver 555 and CUDA Toolkit 12.5.

Previously, using open-source GPU kernel modules required installing the distribution-specific NVIDIA driver open package alongside the chosen cuda-toolkit-X-Y package.

Starting with CUDA 12.6, this process changes. The default installation now includes the open-source driver.

2. Runfile Installation

The .run file installer for CUDA or NVIDIA drivers now:

- Queries your hardware,
- Automatically installs the best-fit driver,
- Offers UI toggles to choose between proprietary and open-source drivers.

For command-line or automated installations (e.g., Ansible), use these overrides:

# For CUDA installation
sh ./cuda_12.6.0_560.22_linux.run --override --kernel-module-type=proprietary

# For NVIDIA driver installation
sh ./NVIDIA-Linux-x86_64-560.run --kernel-module-type=proprietary
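
For automation, it can help to build the command in one place instead of hard-coding it in each playbook or script. The sketch below only prints the command so it is safe to try. The function name runfile_command is hypothetical, and accepting open as a value for --kernel-module-type is an assumption based on the installer offering both choices; the commands above only show proprietary.

```shell
#!/bin/sh
# Hypothetical wrapper: print (not run) the runfile command for a module flavor.
# Assumption: "open" is accepted by --kernel-module-type alongside "proprietary".
runfile_command() {
    installer=$1
    module_type=$2
    case "$module_type" in
        open|proprietary) ;;
        *)
            echo "module type must be 'open' or 'proprietary'" >&2
            return 1
            ;;
    esac
    case "$installer" in
        # The CUDA runfile example in this article also passes --override.
        *cuda_*) echo "sh $installer --override --kernel-module-type=$module_type" ;;
        *)       echo "sh $installer --kernel-module-type=$module_type" ;;
    esac
}
```

Calling runfile_command ./NVIDIA-Linux-x86_64-560.run proprietary prints the same driver command shown above, which you can then review or pass to your automation tool.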

3. Installation Helper Script

NVIDIA provides a helper script to guide driver selection. To use it, first install the nvidia-driver-assistant package and then run the script:

$ nvidia-driver-assistant

4. Package Manager Details

NVIDIA recommends using package managers for consistent CUDA Toolkit and driver installation. Here are distribution-specific instructions:

Debian-based systems:

Install the open-source driver:

$ sudo apt-get install nvidia-open

To upgrade using the cuda metapackage on Ubuntu 20.04, first switch to the open kernel modules and then install the open-source driver:

$ sudo apt-get install -V nvidia-kernel-source-open
$ sudo apt-get install nvidia-open

RHEL-based Systems:

Install the open-source driver:

$ sudo dnf module install nvidia-driver:open-dkms

To upgrade using the CUDA metapackage, disable module streams:

$ echo "module_hotfixes=1" | sudo tee -a /etc/yum.repos.d/cuda*.repo
$ sudo dnf install --allowerasing nvidia-open
$ sudo dnf module reset nvidia-driver

SUSE or OpenSUSE:

Choose the appropriate command based on your kernel:

# Default kernel flavor
$ sudo zypper install nvidia-open

# Azure kernel flavor (sles15/x86_64)
$ sudo zypper install nvidia-open-azure

# 64kb kernel flavor (sles15/sbsa) for Grace-Hopper
$ sudo zypper install nvidia-open-64k
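
If you are scripting the SUSE install, the package choice above can be derived from the running kernel. This is a minimal sketch that assumes the kernel flavor appears as the suffix of the kernel release string (as printed by uname -r), for example -default, -azure, or -64kb. The function name suse_open_package is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: pick the nvidia-open package for a SUSE kernel flavor.
# Assumption: the flavor is the suffix of the kernel release string.
suse_open_package() {
    # Use the first argument if given, otherwise the running kernel's release.
    kernel_release=${1:-$(uname -r)}
    case "$kernel_release" in
        *-azure)   echo "nvidia-open-azure" ;;
        *-64kb)    echo "nvidia-open-64k" ;;
        *-default) echo "nvidia-open" ;;
        *)
            echo "unrecognized kernel flavor: $kernel_release" >&2
            return 1
            ;;
    esac
}
```

You could then run sudo zypper install "$(suse_open_package)" once you have checked that the printed package name is the one you expect.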

5. Windows Subsystem for Linux

WSL users do not need to take any action, because WSL uses the NVIDIA kernel driver from the host Windows system.

6. CUDA Toolkit Installation

The CUDA Toolkit installation process remains unchanged. Install it with your distribution's package manager (apt-get, dnf, or zypper) as before. For example, on a Debian-based system:

$ sudo apt-get install cuda-toolkit

For more details on driver installation or CUDA Toolkit setup, users can refer to the CUDA Installation Guide.
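
Once a driver is installed by any of the methods above, you may want to confirm which flavor is actually loaded. One possible check is the license field of the nvidia kernel module. The sketch below assumes the open-source modules report a dual MIT/GPL license string and the proprietary module reports NVIDIA; the function name classify_module_license is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: classify a kernel module license string.
# Assumption: open modules report a dual MIT/GPL license, the proprietary
# module reports "NVIDIA". Anything else is left as unknown.
classify_module_license() {
    case "$1" in
        *MIT*GPL*|*GPL*MIT*) echo "open" ;;
        NVIDIA*)             echo "proprietary" ;;
        *)                   echo "unknown" ;;
    esac
}

# Usage on a machine with the driver installed
# (modinfo -F prints a single field of the module):
#   classify_module_license "$(modinfo -F license nvidia)"
```

Keeping the classification separate from the modinfo call makes the rule easy to adjust if the license strings on your system differ from the assumption here.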

Conclusion

NVIDIA's move towards open-source GPU kernel modules represents a significant change in the company's approach to driver development.

I really hope that this will improve compatibility, performance, and user choice across various GPU generations and Linux distributions.

Resource:

NVIDIA Transitions Fully Towards Open-Source GPU Kernel Modules

Featured Image by Mizter_X94 from Pixabay.
