This is the radix sort implementation from Google's Fuchsia operating system, but in a way that is easier to integrate into CMake codebases. My modifications are in moving the build system to CMake and changes in shader loading (they are compiled into the binary.) You can treat my changes as CC0, so you are just bound by the original LICENSE of this radix sort library.
Below is the original README:
RadixSort/VK is a high-performance sorting library for Vulkan 1.2.
Features include:
- Ultra-fast stable sorting of 32‑bit or 64‑bit keyvals
- Key size is declared at sort time
- Indirectly dispatchable
- Simple to integrate in a Vulkan 1.2 environment
See radix_sort_vk.h
.
The following architectures are supported:
Vendor | Architecture | 32‑bit Keyvals | 64‑bit Keyvals | Notes |
---|---|---|---|---|
NVIDIA | sm_35+ | ✔ | ✔ | |
AMD | GCN3+ | ✔ | ✔ | |
AMD | RDNA+ | ✔ | ✔ | 64-wide subgroup |
ARM | Bifrost4 | ✔ | ✔ | |
ARM | Bifrost8 | ✔ | ✔ | |
Intel | GEN8+ | ✔ | ✔ | |
Intel | Xe+ | ✔ | ✔ |
- 96-bit and 128-bit keyval support.
- Possibly reduce GMEM transactions by accumulating stores that share the same transaction boundary.
- Merrill, Duane and Michael Garland. "Single-pass Parallel Prefix Scan with Decoupled Lookback." (2016).
- Adinets, Andy. "A Faster Radix Sort Implementation." NVIDIA GTC 2020.