Thrust 1.3.0
Thrust 1.3.0 provides support for CUDA Toolkit 3.2 in addition to many feature and performance enhancements. Performance of the sort and sort_by_key algorithms is improved by as much as 3x in certain situations. The performance of stream compaction algorithms, such as copy_if, is improved by as much as 2x. CUDA errors are now converted to runtime exceptions using the system_error interface. Combined with a debug mode, also new in 1.3, runtime errors can be located with greater precision. Lastly, a few header files have been consolidated or renamed for clarity. See the deprecations section below for additional details.
Breaking Changes
- Promotions
  - thrust::experimental::inclusive_segmented_scan has been renamed thrust::inclusive_scan_by_key and exposes a different interface
  - thrust::experimental::exclusive_segmented_scan has been renamed thrust::exclusive_scan_by_key and exposes a different interface
  - thrust::experimental::partition_copy has been renamed thrust::partition_copy and exposes a different interface
  - thrust::next::gather has been renamed thrust::gather
  - thrust::next::gather_if has been renamed thrust::gather_if
  - thrust::unique_copy_by_key has been renamed thrust::unique_by_key_copy
- Deprecations
  - thrust::copy_when has been renamed thrust::deprecated::copy_when
  - thrust::absolute_value has been renamed thrust::deprecated::absolute_value
  - The header thrust/set_intersection.h is now deprecated; use thrust/set_operations.h instead
  - The header thrust/utility.h is now deprecated; use thrust/swap.h instead
  - The header thrust/swap_ranges.h is now deprecated; use thrust/swap.h instead
- Eliminations
  - thrust::deprecated::gather
  - thrust::deprecated::gather_if
  - thrust/experimental/arch.h and the functions therein
  - thrust/sorting/merge_sort.h
  - thrust/sorting/radix_sort.h
- NVCC 2.3 is no longer supported
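
For readers migrating from the segmented scans, the behavior of a scan-by-key can be sketched in plain, sequential C++. This is only a reference illustration of the semantics, assuming that consecutive equal keys form one segment and that the scan operator is addition; the helper names `inclusive_scan_by_key_ref` and `exclusive_scan_by_key_ref` are hypothetical and not part of Thrust, and the code says nothing about how Thrust implements these algorithms.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential reference for an inclusive scan-by-key: a running sum that
// restarts whenever the key differs from the previous key.
// Hypothetical helper for illustration only; not part of Thrust.
std::vector<int> inclusive_scan_by_key_ref(const std::vector<int>& keys,
                                           const std::vector<int>& values)
{
    assert(keys.size() == values.size());
    std::vector<int> out(values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
        const bool same_segment = (i > 0) && (keys[i] == keys[i - 1]);
        out[i] = same_segment ? out[i - 1] + values[i] : values[i];
    }
    return out;
}

// Exclusive variant: every segment starts at `init`, and each output
// excludes the input value at its own position.
// Hypothetical helper for illustration only; not part of Thrust.
std::vector<int> exclusive_scan_by_key_ref(const std::vector<int>& keys,
                                           const std::vector<int>& values,
                                           int init = 0)
{
    assert(keys.size() == values.size());
    std::vector<int> out(values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
        const bool same_segment = (i > 0) && (keys[i] == keys[i - 1]);
        out[i] = same_segment ? out[i - 1] + values[i - 1] : init;
    }
    return out;
}
```

With keys `{0, 0, 0, 1, 1, 2}` and values `{1, 2, 3, 4, 5, 6}`, the inclusive sketch yields `{1, 3, 6, 4, 9, 6}` and the exclusive sketch yields `{0, 1, 3, 0, 4, 0}`.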
New Features
- Algorithms:
  - thrust::exclusive_scan_by_key
  - thrust::find
  - thrust::find_if
  - thrust::find_if_not
  - thrust::inclusive_scan_by_key
  - thrust::is_partitioned
  - thrust::is_sorted_until
  - thrust::mismatch
  - thrust::partition_point
  - thrust::reverse
  - thrust::reverse_copy
  - thrust::stable_partition_copy
- Types:
  - thrust::system_error and related types
  - thrust::experimental::cuda::ogl_interop_allocator
  - thrust::bit_and, thrust::bit_or, and thrust::bit_xor
- Device Support:
  - GF104-based GPUs
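
Several of the new algorithms share names with C++ standard library algorithms. Assuming the Thrust versions follow the same semantics as their `std::` namesakes, the sketch below shows what `is_partitioned`, `partition_point`, `mismatch`, and `is_sorted_until` compute, using only the standard library on host vectors. The wrapper names (`even_partition_boundary`, `first_difference`, `sorted_prefix_length`) are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the boundary between the leading even elements and the rest.
// Requires the input to already be partitioned by the "is even" predicate.
std::size_t even_partition_boundary(const std::vector<int>& v)
{
    auto is_even = [](int x) { return x % 2 == 0; };
    assert(std::is_partitioned(v.begin(), v.end(), is_even));
    return static_cast<std::size_t>(
        std::partition_point(v.begin(), v.end(), is_even) - v.begin());
}

// Index of the first position where `a` and `b` differ
// (a.size() if `a` matches the beginning of `b`).
std::size_t first_difference(const std::vector<int>& a,
                             const std::vector<int>& b)
{
    assert(a.size() <= b.size());
    return static_cast<std::size_t>(
        std::mismatch(a.begin(), a.end(), b.begin()).first - a.begin());
}

// Length of the longest sorted prefix of `v`.
std::size_t sorted_prefix_length(const std::vector<int>& v)
{
    return static_cast<std::size_t>(
        std::is_sorted_until(v.begin(), v.end()) - v.begin());
}
```

For example, `even_partition_boundary({2, 4, 6, 1, 3})` is 3, `first_difference({1, 2, 3, 4}, {1, 2, 9, 4})` is 2, and `sorted_prefix_length({1, 2, 5, 3, 4})` is 3.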
New Examples
- opengl_interop.cu
- repeated_range.cu
- simple_moving_average.cu
- sparse_vector.cu
- strided_range.cu
Other Enhancements
- Performance of thrust::sort and thrust::sort_by_key is substantially improved for primitive key types
- Performance of thrust::copy_if is substantially improved
- Performance of thrust::reduce and related reductions is improved
- THRUST_DEBUG mode added
- Callers of Thrust functions may detect error conditions by catching thrust::system_error, which derives from std::runtime_error
- The number of compiler warnings generated by Thrust has been substantially reduced
- Comparison sort now works correctly for input sizes > 32M
- min & max usage no longer collides with <windows.h> definitions
- Compiling against the OpenMP backend no longer requires nvcc
- Performance of device_vector initialized in .cpp files is substantially improved in common cases
- Performance of thrust::sort_by_key on the host is substantially improved
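
Because thrust::system_error derives from std::runtime_error, a caller can handle Thrust failures with an ordinary `catch` of the base class. The sketch below illustrates that pattern in standard C++ only, using `std::system_error` (which also derives from `std::runtime_error`) as a stand-in so the example compiles without the CUDA toolkit. `failing_operation`, `succeeding_operation`, and `run_and_report` are hypothetical names; in real Thrust code the operation would be a Thrust call and the handler could catch thrust::system_error directly.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <system_error>

// Hypothetical stand-in for a library call that fails with an error code,
// much as a Thrust call might when the underlying CUDA runtime reports an error.
void failing_operation()
{
    throw std::system_error(std::make_error_code(std::errc::not_enough_memory),
                            "allocation failed");
}

// Hypothetical stand-in for a call that completes normally.
void succeeding_operation() {}

// Runs `operation` and returns the exception message, or an empty string
// if no exception was thrown. Catching the std::runtime_error base class
// also catches any exception type derived from it.
std::string run_and_report(void (*operation)())
{
    assert(operation != nullptr);
    try {
        operation();
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return std::string();
}
```

Catching by reference to the base class keeps the handler independent of the concrete exception type, at the cost of losing direct access to the error code; a handler that needs the code would catch the derived type instead.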
Bug Fixes
- Debug device code now compiles correctly
- thrust::uninitialized_copy and thrust::uninitialized_fill now dispatch constructors on the device rather than the host
Known Issues
- #212 set_intersection is known to fail for large input sizes
- partition_point is known to fail for 64-bit types with nvcc 3.2
Acknowledgments
- Thanks to Duane Merrill for contributing a fast CUDA radix sort implementation
- Thanks to Erich Elsen for contributing an implementation of find_if
- Thanks to Andrew Corrigan for contributing changes which allow the OpenMP backend to compile in the absence of nvcc
- Thanks to Andrew Corrigan, Cliff Woolley, David Coeurjolly, Janick Martinez Esturo, John Bowers, Maxim Naumov, Michael Garland, and Ryuta Suzuki for bug reports
- Thanks to Cliff Woolley for help with testing