This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
CUB 1.17.0 #475
alliepiper announced in Announcements
CUB 1.17.0
Summary
CUB 1.17.0 is the final minor release of the 1.X series. It provides a variety of bug fixes and miscellaneous enhancements, detailed below.
Known Issues
“Run-to-run” Determinism Broken
Several CUB device algorithms are documented to provide deterministic results (per device) for non-associative reduction operators (e.g. floating-point addition). Unfortunately, the implementations of these algorithms contain performance optimizations that violate this guarantee. The `DeviceReduce::ReduceByKey` and `DeviceScan` algorithms are known to be affected. We’re currently evaluating the scope and impact of correcting this in a future CUB release. See NVIDIA/cub#471 for details.
Bug Fixes
- Fixed `DeviceSelect` to work with discard iterators and mixed input/output types.
- Fixed an issue that occurred when `CMAKE_INSTALL_LIBDIR` contained nested directories. Thanks to @robertmaynard for this contribution.
- Fixed an issue with `DeviceSegmentedSort` on sm_61 and sm_70.
- Fixed `DeviceSelect::Flagged` so that flags are normalized to 0 or 1.
- Fixed an issue with `DeviceRadixSort` given `num_items` close to 2^32. Thanks to @canonizer for this contribution.
Other Enhancements
- Improved `DeviceSegmentedSort` when launched via CDP.
- `BlockDiscontinuity`: Replaced recursive-template loop unrolling with `#pragma unroll`. Thanks to @kshitij12345 for this contribution.
- Replaced the `TexRefInputIterator` implementation with an alias to `TexObjInputIterator`. This fully removes all usages of the deprecated CUDA texture reference APIs from CUB.
- `BlockAdjacentDifference`: Replaced recursive-template loop unrolling with `#pragma unroll`. Thanks to @kshitij12345 for this contribution.
- The `cub::DeviceAdjacentDifference` API has been updated to use the new `OffsetT` deduction approach described in “Transparent support for 64-bit indexing in device algorithms” (#212).
This discussion was created from the release CUB 1.17.0.
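As background for the run-to-run determinism issue listed under Known Issues: floating-point addition is not associative, so the value of a reduction depends on how partial results are grouped. The following is a minimal host-side C++ sketch of that effect. It is not CUB code and does not reproduce CUB's implementation; the function names `sum_left_to_right` and `sum_pairwise` are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sequential reduction, grouped as ((v0 + v1) + v2) + ...
float sum_left_to_right(const std::vector<float>& v) {
    float acc = 0.0f;
    for (float x : v) {
        acc += x;
    }
    return acc;
}

// Hypothetical helper: tree-shaped reduction over the half-open range
// [first, last), grouped as (v0 + v1) + (v2 + v3) for four elements.
float sum_pairwise(const std::vector<float>& v, std::size_t first, std::size_t last) {
    if (last == first) {
        return 0.0f;      // empty range
    }
    if (last - first == 1) {
        return v[first];  // single element, nothing to combine
    }
    const std::size_t mid = first + (last - first) / 2;
    return sum_pairwise(v, first, mid) + sum_pairwise(v, mid, last);
}
```

With the input `{1e20f, 1.0f, -1e20f, 1.0f}`, the left-to-right sum is 1 while the pairwise sum is 0, because each `1.0f` added directly to a value of magnitude `1e20f` is lost to rounding. Any change in grouping between runs can therefore change the result for such operators.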