This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
Breaking Changes
- #553: Deprecate the `CUB_USE_COOPERATIVE_GROUPS` macro, as all supported CTK distributions provide CG. This macro will be removed in a future version of CUB.
New Features
- #359: Add new `DeviceBatchMemcpy` algorithm.
- #565: Add `DeviceMergeSort::StableSortKeysCopy` API. Thanks to David Wendt (@davidwendt) for this contribution.
- #585: Add SM90 tuning policy for `DeviceRadixSort`. Thanks to Andy Adinets (@canonizer) for this contribution.
- #586: Introduce a new mechanism to opt out of compiling CDP support in CUB algorithms by defining `CUB_DISABLE_CDP`.
- #589: Support 64-bit indexing in `DeviceReduce`.
- #607: Support 128-bit integers in radix sort.
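To illustrate why 64-bit indexing (#589) matters, the following host-only sketch checks whether an item count fits in a given offset type. It does not call CUB; `fits_in_offset`, `truncate_to_u32`, and the input size are hypothetical names and values chosen for illustration only.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of CUB): true if num_items can be
// represented exactly by the offset type OffsetT.
template <typename OffsetT>
constexpr bool fits_in_offset(std::uint64_t num_items)
{
  return num_items <=
         static_cast<std::uint64_t>(std::numeric_limits<OffsetT>::max());
}

// Hypothetical helper: what remains of a count after it is narrowed to
// 32 bits. Unsigned narrowing is well-defined and keeps the low 32 bits.
constexpr std::uint32_t truncate_to_u32(std::uint64_t num_items)
{
  return static_cast<std::uint32_t>(num_items);
}

// Illustrative input size (an assumption): 2^32 + 5 items.
constexpr std::uint64_t example_num_items = (std::uint64_t{1} << 32) + 5;

// A 32-bit signed offset cannot hold this count; a 64-bit one can.
static_assert(!fits_in_offset<std::int32_t>(example_num_items),
              "count exceeds the 32-bit signed range");
static_assert(fits_in_offset<std::int64_t>(example_num_items),
              "count fits in a 64-bit signed offset");

// Narrowing silently drops the high bits: only 5 items would be seen.
static_assert(truncate_to_u32(example_num_items) == 5,
              "low 32 bits of 2^32 + 5");
```

An algorithm that stores its item count in a 32-bit type would process only a fraction of such an input, which is the kind of problem a 64-bit offset type avoids.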
Bug Fixes
- #547: Resolve several long-running issues resulting from using multiple versions of CUB within the same process. Adds an inline namespace that encodes CUB version and targeted PTX architectures.
- #562: Fix bug in `BlockShuffle` resulting from an invalid thread offset. Thanks to @sjfeng1999 for this contribution.
- #564: Fix bug in `BlockRadixRank` when used with blocks that are not a multiple of 32 threads.
- #579: Ensure that all threads in the logical warp participate in the index-shuffle for `BlockRadixRank`. Thanks to Andy Adinets (@canonizer) for this contribution.
- #582: Fix reordering in CUB member initializer lists.
- #589: Fix `DeviceSegmentedSort` when used with `bool` keys.
- #590: Fix CUB’s CMake install rules. Thanks to Robert Maynard (@robertmaynard) for this contribution.
- #592: Fix overflow in `DeviceReduce`.
- #598: Fix `DeviceRunLengthEncode` when the first item is a `NaN`.
- #611: Fix `WarpScanExclusive` for vector types.
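The inline-namespace approach mentioned in #547 can be sketched in plain C++. This is a minimal illustration of the general technique, not CUB's actual code: the library name `mylib`, the namespace name `v2_1_0_sm80`, and both functions are hypothetical.

```cpp
#include <string>

// Hypothetical library, used only to illustrate the technique.
namespace mylib {

// The inline namespace name encodes a (made-up) version and target
// architecture. It becomes part of every mangled symbol name, so two
// builds of the library with different tags produce distinct symbols.
inline namespace v2_1_0_sm80 {

// Returns the tag this build was compiled with.
inline std::string version_tag() { return "v2_1_0_sm80"; }

// Toy algorithm standing in for a library entry point.
inline int sum(const int* data, int n)
{
  int total = 0;
  for (int i = 0; i < n; ++i)
  {
    total += data[i];
  }
  return total;
}

} // inline namespace v2_1_0_sm80
} // namespace mylib
```

Callers keep writing `mylib::sum(...)`, because members of an inline namespace are visible in the enclosing namespace, while the linker sees the fully qualified name `mylib::v2_1_0_sm80::sum`. Objects built against a different tag therefore do not silently share definitions.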
Other Enhancements
- #537: Add detailed and expanded version of a [CUB developer overview](https://github.com/NVIDIA/cub/blob/main/docs/developer_overview.md).
- #549: Fix `BlockReduceRaking` docs for non-commutative operations. Thanks to Tobias Ribizel (@upsj) for this contribution.
- #606: Optimize CUB’s decoupled-lookback implementation.