Red Hat and Rusticl developer Karol Herbst has opened a new Mesa merge request introducing “CLUDA,” a compute-only Gallium3D driver that runs on top of NVIDIA’s CUDA driver API. Implemented over the proprietary libcuda.so library, it lets Mesa’s compute framework operate on NVIDIA hardware through the vendor’s own stack.
Herbst describes CLUDA as a driver that “implements the Gallium API on top of the CUDA driver API.” The project, still experimental, currently targets compute workloads such as OpenCL and uses the CUDA driver library (libcuda.so) shipped with NVIDIA’s proprietary stack. No runtime CUDA packages are required beyond the driver, though build-time CUDA development headers are needed.
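As a rough illustration of that dependency split, the sketch below (not code from the merge request) shows how a user-space driver can resolve CUDA driver-API entry points straight from libcuda.so at runtime, so only the NVIDIA driver package needs to be present on the running system; the library name and cuInit are standard, the rest is illustrative.

    /* Illustrative sketch: resolving the CUDA driver API from libcuda.so at
     * runtime. Only the NVIDIA driver package (which ships libcuda.so.1) is
     * needed at run time; the CUDA toolkit headers are a build-time-only
     * dependency. */
    #include <dlfcn.h>
    #include <stdio.h>

    typedef int (*cu_init_fn)(unsigned int flags); /* CUresult cuInit(unsigned int) */

    int main(void)
    {
        void *lib = dlopen("libcuda.so.1", RTLD_NOW | RTLD_LOCAL);
        if (!lib) {
            fprintf(stderr, "no CUDA driver library: %s\n", dlerror());
            return 1;
        }

        cu_init_fn cu_init = (cu_init_fn)dlsym(lib, "cuInit");
        if (!cu_init || cu_init(0) != 0 /* CUDA_SUCCESS */) {
            fprintf(stderr, "cuInit failed\n");
            return 1;
        }

        printf("CUDA driver API initialized via libcuda.so\n");
        return 0;
    }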
Learning CUDA the Hard Way
Herbst said the idea came after a hallway conversation at XDC (X.Org Developers’ Conference): “Somebody mentioned to me at XDC … that implementing OpenCL on top of CUDA in Mesa could help out with something.”
He began coding shortly after returning home, where he had access to an NVIDIA GPU. The result is CLUDA, a blend of C and Rust: the main driver is written in C, while the PTX-generation logic uses Rust to simplify string handling. Mesa’s NIR intermediate representation is lowered to PTX, which is then compiled at runtime by CUDA’s built-in PTX JIT.
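For a sense of what that last step involves, here is a minimal, stand-alone sketch of the CUDA driver-API path the generated PTX would travel: a PTX string is handed to cuModuleLoadDataEx, JIT-compiled by the driver, and launched with cuLaunchKernel. The trivial kernel and surrounding code are illustrative, not taken from CLUDA.

    /* Minimal PTX JIT round trip through the CUDA driver API (illustrative,
     * not CLUDA code). CLUDA would feed PTX generated from NIR into the same
     * cuModuleLoadDataEx entry point. Error checking omitted for brevity. */
    #include <cuda.h>
    #include <stdio.h>

    static const char ptx_src[] =
        ".version 7.1\n"
        ".target sm_86\n"
        ".address_size 64\n"
        ".visible .entry noop()\n"
        "{\n"
        "    ret;\n"
        "}\n";

    int main(void)
    {
        CUdevice dev;
        CUcontext ctx;
        CUmodule mod;
        CUfunction fn;

        cuInit(0);
        cuDeviceGet(&dev, 0);
        cuCtxCreate(&ctx, 0, dev);

        /* The driver JIT-compiles the PTX text to native code for this GPU. */
        cuModuleLoadDataEx(&mod, ptx_src, 0, NULL, NULL);
        cuModuleGetFunction(&fn, mod, "noop");

        /* One block of 64 threads, no parameters, default stream. */
        cuLaunchKernel(fn, 1, 1, 1, 64, 1, 1, 0, 0, NULL, NULL);
        cuCtxSynchronize();

        printf("PTX kernel JIT-compiled and launched\n");

        cuModuleUnload(mod);
        cuCtxDestroy(ctx);
        return 0;
    }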
What Works — and What Doesn’t Yet
CLUDA already supports general kernel launches, memory operations, and several OpenCL extensions missing from NVIDIA’s proprietary OpenCL driver. The supported list includes features like cl_khr_fp16, cl_khr_integer_dot_product, multiple subgroup extensions, and full SPIR-V support through cl_khr_il_program.
Herbst joked about the unusually long list: “Some of you might look at this list and go ‘wait a second … are those … really?’ and my answer is: ‘yes they are.’”
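Since these are all standard OpenCL extensions, applications can check for them from the host side with ordinary OpenCL calls; the snippet below (illustrative, not tied to CLUDA) queries the device’s extension string for cl_khr_il_program, the extension that enables SPIR-V program ingestion.

    /* Sketch: probing for cl_khr_il_program (SPIR-V ingestion) on the first
     * GPU device of the first platform. Plain OpenCL host API, nothing
     * CLUDA-specific. */
    #define CL_TARGET_OPENCL_VERSION 300
    #include <CL/cl.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        cl_platform_id platform;
        cl_device_id device;
        char extensions[16384] = {0};

        clGetPlatformIDs(1, &platform, NULL);
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
        clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS,
                        sizeof(extensions) - 1, extensions, NULL);

        if (strstr(extensions, "cl_khr_il_program"))
            printf("SPIR-V programs supported (clCreateProgramWithIL is usable)\n");
        else
            printf("cl_khr_il_program not exposed by this device\n");
        return 0;
    }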
The driver is hard-coded for SM86 (Ampere, e.g. RTX 30 series) hardware for now. Missing pieces include image support, GL sharing (cl_khr_gl_sharing), double precision (FP64), and 64-bit atomics — features Herbst says could follow later “if motivation and time allow.”
Early Testing and Performance
Herbst’s internal Conformance Test Suite (CTS) run in “wimpy mode” passed 3,871 tests, failed 10, and crashed 4.
He noted minor precision issues with fp16 and denorm handling, and that timestamp queries currently rely on CUDA’s cuEventElapsedTime, which is not fully accurate.
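For reference, the sketch below shows the usual CUDA event-timing pattern that cuEventElapsedTime implies (illustrative, not CLUDA’s code): two events bracket the work, and the call returns a float in milliseconds with, per NVIDIA’s documentation, a resolution of roughly half a microsecond, which is why the resulting timestamps are only approximate.

    /* Illustrative CUDA event timing (not CLUDA code): bracket work with two
     * events and read back the elapsed time. cuEventElapsedTime reports a
     * float in milliseconds with roughly 0.5 µs resolution. */
    #include <cuda.h>
    #include <stdio.h>

    int main(void)
    {
        CUdevice dev;
        CUcontext ctx;
        CUdeviceptr buf;
        CUevent start, end;
        float ms = 0.0f;

        cuInit(0);
        cuDeviceGet(&dev, 0);
        cuCtxCreate(&ctx, 0, dev);

        cuEventCreate(&start, CU_EVENT_DEFAULT);
        cuEventCreate(&end, CU_EVENT_DEFAULT);
        cuMemAlloc(&buf, 1 << 20);

        cuEventRecord(start, 0);
        cuMemsetD8(buf, 0, 1 << 20);    /* the work being timed */
        cuEventRecord(end, 0);
        cuEventSynchronize(end);

        cuEventElapsedTime(&ms, start, end);
        printf("elapsed: %.3f ms\n", ms);

        cuMemFree(buf);
        cuEventDestroy(start);
        cuEventDestroy(end);
        cuCtxDestroy(ctx);
        return 0;
    }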
Performance already approaches NVIDIA’s proprietary stack. On an RTX A6000, CLUDA achieved a LuxMark score of 57,702, compared with 64,009 under NVIDIA’s own OpenCL driver — roughly 90% of the native result. Herbst attributes the performance gap to NIR → PTX translation overhead and less-optimized generated code.
A Hack That Works
Herbst calls CLUDA an impromptu project: “I kept writing code and it kinda worked.” He released the merge request partly to gauge community interest: “Would be nice to know if people are interested at all … I don’t really have any concrete plans myself for this.” Herbst also noted that “CLUDA” is just a working name, saying he is open to changing it if a better suggestion comes along.