Refactor radix_sort tuning #3657

Open
wants to merge 1 commit into base: main


@bernhardmgruber (Contributor) commented Feb 3, 2025

With:

cmake --preset cub-benchmark -DCMAKE_CUDA_ARCHITECTURES="50;60;61;62;70;80;90;100"
  • No SASS changes in cub.bench.radix_sort.keys.base
  • No SASS changes in cub.bench.radix_sort.pairs.base
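
"No SASS changes" means the machine code generated for the two benchmark binaries is the same before and after the refactoring. The PR does not say how the comparison was made. A minimal sketch of one way to do it, assuming the SASS of each build has already been dumped to text (for example with `cuobjdump -sass`), is shown here. The helper name `sass_diff` and the sample dump contents are hypothetical.

```python
import difflib


def sass_diff(before, after):
    """Return unified-diff lines between two SASS text dumps; empty if they match."""

    def normalize(text):
        # Ignore blank lines and trailing whitespace, which carry no code.
        return [line.rstrip() for line in text.splitlines() if line.strip()]

    return list(
        difflib.unified_diff(
            normalize(before), normalize(after),
            fromfile="before", tofile="after", lineterm="",
        )
    )


# Hypothetical dump contents; real input would be read from the two dump files.
old = "MOV R1, R2 ;\nEXIT ;\n"
new = "MOV R1, R2 ;   \n\nEXIT ;\n"
print("No SASS changes" if not sass_diff(old, new) else "SASS changed")
```

An empty result is what a pure refactoring should produce; any returned lines point at the instructions that differ.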

copy-pr-bot bot commented Feb 3, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@bernhardmgruber (Contributor, Author) commented

/ok to test

github-actions bot commented Feb 3, 2025

🟩 CI finished in 1h 44m: Pass: 100%/90 | Total: 1d 20h | Avg: 29m 30s | Max: 1h 10m | Hits: 384%/12730
  • 🟩 cub: Pass: 100%/44 | Total: 1d 08h | Avg: 43m 52s | Max: 1h 10m | Hits: 485%/3500

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  1d 06h | Avg: 43m 32s | Max:  1h 10m | Hits: 485%/3500  
      🟩 arm64              Pass: 100%/2   | Total:  1h 41m | Avg: 50m 57s | Max: 52m 01s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 04m | Avg: 48m 54s | Max:  1h 01m | Hits: 485%/875   
      🟩 12.5               Pass: 100%/2   | Total:  1h 39m | Avg: 49m 56s | Max: 50m 20s
      🟩 12.8               Pass: 100%/37  | Total:  1d 02h | Avg: 42m 52s | Max:  1h 10m | Hits: 485%/2625  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 52m | Avg: 56m 04s | Max: 56m 29s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 04m | Avg: 48m 54s | Max:  1h 01m | Hits: 485%/875   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 39m | Avg: 49m 56s | Max: 50m 20s
      🟩 nvcc12.8           Pass: 100%/35  | Total:  1d 00h | Avg: 42m 07s | Max:  1h 10m | Hits: 485%/2625  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 52m | Avg: 56m 04s | Max: 56m 29s
      🟩 nvcc               Pass: 100%/42  | Total:  1d 06h | Avg: 43m 18s | Max:  1h 10m | Hits: 485%/3500  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 08m | Avg: 47m 03s | Max: 49m 02s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 29m | Avg: 44m 37s | Max: 45m 16s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 29m | Avg: 44m 38s | Max: 44m 53s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 26m | Avg: 43m 12s | Max: 43m 52s
      🟩 Clang18            Pass: 100%/7   | Total:  4h 50m | Avg: 41m 32s | Max: 56m 29s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 27m | Avg: 43m 39s | Max: 44m 18s
      🟩 GCC8               Pass: 100%/1   | Total: 45m 12s | Avg: 45m 12s | Max: 45m 12s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 28m | Avg: 44m 27s | Max: 45m 07s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 34m | Avg: 47m 18s | Max: 49m 09s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 30m | Avg: 45m 26s | Max: 46m 47s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 34m | Avg: 47m 23s | Max: 49m 01s
      🟩 GCC13              Pass: 100%/10  | Total:  5h 22m | Avg: 32m 17s | Max: 56m 19s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 05m | Avg:  1h 02m | Max:  1h 03m | Hits: 485%/1750  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 17m | Avg:  1h 08m | Max:  1h 10m | Hits: 485%/1750  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 39m | Avg: 49m 56s | Max: 50m 20s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 12h 23m | Avg: 43m 45s | Max: 56m 29s
      🟩 GCC                Pass: 100%/21  | Total: 13h 44m | Avg: 39m 16s | Max: 56m 19s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 22m | Avg:  1h 05m | Max:  1h 10m | Hits: 485%/3500  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 39m | Avg: 49m 56s | Max: 50m 20s
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 43m 12s | Avg: 21m 36s | Max: 23m 35s
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 03h | Avg: 49m 14s | Max:  1h 10m | Hits: 485%/3500  
      🟩 rtxa6000           Pass: 100%/8   | Total:  3h 33m | Avg: 26m 39s | Max: 44m 10s
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 05h | Avg: 48m 09s | Max:  1h 10m | Hits: 485%/3500  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 21m 15s | Avg: 21m 15s | Max: 21m 15s
      🟩 GraphCapture       Pass: 100%/1   | Total: 17m 00s | Avg: 17m 00s | Max: 17m 00s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 09m | Avg: 23m 18s | Max: 24m 12s
      🟩 TestGPU            Pass: 100%/2   | Total: 40m 44s | Avg: 20m 22s | Max: 22m 19s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 43m 12s | Avg: 21m 36s | Max: 23m 35s
      🟩 90;90a;100         Pass: 100%/1   | Total: 56m 19s | Avg: 56m 19s | Max: 56m 19s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 16h 14m | Avg: 48m 44s | Max:  1h 06m | Hits: 485%/2625  
      🟩 20                 Pass: 100%/24  | Total: 15h 55m | Avg: 39m 49s | Max:  1h 10m | Hits: 485%/875   
    
  • 🟩 thrust: Pass: 100%/43 | Total: 11h 30m | Avg: 16m 02s | Max: 37m 39s | Hits: 346%/9230

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 23m 50s | Avg: 11m 55s | Max: 12m 43s
    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total: 11h 05m | Avg: 16m 13s | Max: 37m 39s | Hits: 346%/9230  
      🟩 arm64              Pass: 100%/2   | Total: 25m 03s | Avg: 12m 31s | Max: 12m 35s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 23m | Avg: 16m 47s | Max: 30m 34s | Hits: 341%/1846  
      🟩 12.5               Pass: 100%/2   | Total: 56m 18s | Avg: 28m 09s | Max: 28m 09s
      🟩 12.8               Pass: 100%/36  | Total:  9h 09m | Avg: 15m 16s | Max: 37m 39s | Hits: 347%/7384  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 25m 23s | Avg: 12m 41s | Max: 12m 43s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 23m | Avg: 16m 47s | Max: 30m 34s | Hits: 341%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 56m 18s | Avg: 28m 09s | Max: 28m 09s
      🟩 nvcc12.8           Pass: 100%/34  | Total:  8h 44m | Avg: 15m 25s | Max: 37m 39s | Hits: 347%/7384  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 25m 23s | Avg: 12m 41s | Max: 12m 43s
      🟩 nvcc               Pass: 100%/41  | Total: 11h 04m | Avg: 16m 12s | Max: 37m 39s | Hits: 346%/9230  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 50m 18s | Avg: 12m 34s | Max: 13m 05s
      🟩 Clang15            Pass: 100%/2   | Total: 25m 56s | Avg: 12m 58s | Max: 13m 30s
      🟩 Clang16            Pass: 100%/2   | Total: 28m 40s | Avg: 14m 20s | Max: 14m 30s
      🟩 Clang17            Pass: 100%/2   | Total: 26m 09s | Avg: 13m 04s | Max: 13m 08s
      🟩 Clang18            Pass: 100%/7   | Total:  1h 22m | Avg: 11m 46s | Max: 14m 05s
      🟩 GCC7               Pass: 100%/2   | Total: 27m 08s | Avg: 13m 34s | Max: 13m 40s
      🟩 GCC8               Pass: 100%/1   | Total: 12m 27s | Avg: 12m 27s | Max: 12m 27s
      🟩 GCC9               Pass: 100%/2   | Total: 29m 11s | Avg: 14m 35s | Max: 14m 41s
      🟩 GCC10              Pass: 100%/2   | Total: 29m 57s | Avg: 14m 58s | Max: 15m 19s
      🟩 GCC11              Pass: 100%/2   | Total: 26m 31s | Avg: 13m 15s | Max: 13m 35s
      🟩 GCC12              Pass: 100%/2   | Total: 30m 20s | Avg: 15m 10s | Max: 15m 30s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 39m | Avg: 12m 26s | Max: 15m 14s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 03m | Avg: 31m 49s | Max: 33m 05s | Hits: 341%/3692  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  1h 41m | Avg: 33m 52s | Max: 37m 39s | Hits: 349%/5538  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 56m 18s | Avg: 28m 09s | Max: 28m 09s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  3h 33m | Avg: 12m 33s | Max: 14m 30s
      🟩 GCC                Pass: 100%/19  | Total:  4h 15m | Avg: 13m 25s | Max: 15m 30s
      🟩 MSVC               Pass: 100%/5   | Total:  2h 45m | Avg: 33m 03s | Max: 37m 39s | Hits: 346%/9230  
      🟩 NVHPC              Pass: 100%/2   | Total: 56m 18s | Avg: 28m 09s | Max: 28m 09s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/33  | Total:  8h 51m | Avg: 16m 06s | Max: 33m 05s | Hits: 341%/5538  
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 38m | Avg: 15m 50s | Max: 37m 39s | Hits: 353%/3692  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total: 10h 10m | Avg: 16m 29s | Max: 37m 39s | Hits: 341%/7384  
      🟩 TestCPU            Pass: 100%/3   | Total: 47m 39s | Avg: 15m 53s | Max: 31m 51s | Hits: 365%/1846  
      🟩 TestGPU            Pass: 100%/3   | Total: 32m 07s | Avg: 10m 42s | Max: 11m 09s
    🟩 sm
      🟩 90;90a;100         Pass: 100%/1   | Total: 15m 10s | Avg: 15m 10s | Max: 15m 10s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  5h 41m | Avg: 17m 05s | Max: 33m 05s | Hits: 341%/5538  
      🟩 20                 Pass: 100%/21  | Total:  5h 24m | Avg: 15m 27s | Max: 37m 39s | Hits: 353%/3692  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 7m 07s | Avg: 3m 33s | Max: 5m 00s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  7m 07s | Avg:  3m 33s | Max:  5m 00s
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total:  7m 07s | Avg:  3m 33s | Max:  5m 00s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total:  7m 07s | Avg:  3m 33s | Max:  5m 00s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  7m 07s | Avg:  3m 33s | Max:  5m 00s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  7m 07s | Avg:  3m 33s | Max:  5m 00s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  7m 07s | Avg:  3m 33s | Max:  5m 00s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  7m 07s | Avg:  3m 33s | Max:  5m 00s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 07s | Avg:  2m 07s | Max:  2m 07s
      🟩 Test               Pass: 100%/1   | Total:  5m 00s | Avg:  5m 00s | Max:  5m 00s
    
  • 🟩 python: Pass: 100%/1 | Total: 27m 06s | Avg: 27m 06s | Max: 27m 06s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 27m 06s | Avg: 27m 06s | Max: 27m 06s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 27m 06s | Avg: 27m 06s | Max: 27m 06s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 27m 06s | Avg: 27m 06s | Max: 27m 06s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 27m 06s | Avg: 27m 06s | Max: 27m 06s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 27m 06s | Avg: 27m 06s | Max: 27m 06s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 27m 06s | Avg: 27m 06s | Max: 27m 06s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 27m 06s | Avg: 27m 06s | Max: 27m 06s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 27m 06s | Avg: 27m 06s | Max: 27m 06s
    

👃 Inspect Changes

Modifications in project?

      Project
      CCCL Infrastructure
      libcu++
+/-   CUB
      Thrust
      CUDA Experimental
      python
      CCCL C Parallel Library
      Catch2Helper

Modifications in project or dependencies?

      Project
      CCCL Infrastructure
      libcu++
+/-   CUB
+/-   Thrust
      CUDA Experimental
+/-   python
+/-   CCCL C Parallel Library
+/-   Catch2Helper

🏃‍ Runner counts (total jobs: 90)

# Runner
65 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1
1 linux-amd64-gpu-h100-latest-1

@bernhardmgruber marked this pull request as ready for review on February 4, 2025 16:22
@bernhardmgruber requested a review from a team as a code owner on February 4, 2025 16:22

github-actions bot commented Feb 4, 2025

🟩 CI finished in 1h 42m: Pass: 100%/90 | Total: 2d 11h | Avg: 39m 45s | Max: 1h 16m | Hits: 307%/12742
  • 🟩 cub: Pass: 100%/44 | Total: 1d 14h | Avg: 52m 01s | Max: 1h 16m | Hits: 366%/3512

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  1d 12h | Avg: 51m 29s | Max:  1h 16m | Hits: 366%/3512  
      🟩 arm64              Pass: 100%/2   | Total:  2h 06m | Avg:  1h 03m | Max:  1h 03m
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 11m | Avg: 50m 14s | Max:  1h 04m | Hits: 367%/878   
      🟩 12.5               Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 05m
      🟩 12.8               Pass: 100%/37  | Total:  1d 07h | Avg: 51m 35s | Max:  1h 16m | Hits: 365%/2634  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 53m | Avg: 56m 31s | Max: 58m 01s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 11m | Avg: 50m 14s | Max:  1h 04m | Hits: 367%/878   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 05m
      🟩 nvcc12.8           Pass: 100%/35  | Total:  1d 05h | Avg: 51m 18s | Max:  1h 16m | Hits: 365%/2634  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 53m | Avg: 56m 31s | Max: 58m 01s
      🟩 nvcc               Pass: 100%/42  | Total:  1d 12h | Avg: 51m 48s | Max:  1h 16m | Hits: 366%/3512  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 25m | Avg: 51m 17s | Max: 56m 02s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 55m | Avg: 57m 46s | Max: 59m 29s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 51m | Avg: 55m 41s | Max: 58m 26s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 49m | Avg: 54m 32s | Max: 55m 57s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 35m | Avg: 47m 58s | Max:  1h 03m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 37m | Avg: 48m 48s | Max: 53m 26s
      🟩 GCC8               Pass: 100%/1   | Total: 55m 30s | Avg: 55m 30s | Max: 55m 30s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 43m | Avg: 51m 58s | Max: 58m 48s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 59m | Avg: 59m 40s | Max: 59m 58s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 54m | Avg: 57m 16s | Max: 57m 21s
      🟩 GCC12              Pass: 100%/2   | Total:  2h 03m | Avg:  1h 01m | Max:  1h 04m
      🟩 GCC13              Pass: 100%/10  | Total:  6h 22m | Avg: 38m 16s | Max:  1h 10m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 10m | Hits: 366%/1756  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 29m | Avg:  1h 14m | Max:  1h 16m | Hits: 365%/1756  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 05m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 14h 36m | Avg: 51m 35s | Max:  1h 03m
      🟩 GCC                Pass: 100%/21  | Total: 16h 37m | Avg: 47m 30s | Max:  1h 10m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 45m | Avg:  1h 11m | Max:  1h 16m | Hits: 366%/3512  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 05m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 46m 52s | Avg: 23m 26s | Max: 23m 35s
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 09h | Avg: 58m 44s | Max:  1h 16m | Hits: 366%/3512  
      🟩 rtxa6000           Pass: 100%/8   | Total:  4h 05m | Avg: 30m 38s | Max:  1h 01m
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 11h | Avg: 57m 52s | Max:  1h 16m | Hits: 366%/3512  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 21m 10s | Avg: 21m 10s | Max: 21m 10s
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 05s | Avg: 16m 05s | Max: 16m 05s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 10m | Avg: 23m 28s | Max: 24m 19s
      🟩 TestGPU            Pass: 100%/2   | Total: 40m 03s | Avg: 20m 01s | Max: 21m 32s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 46m 52s | Avg: 23m 26s | Max: 23m 35s
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 10m | Avg:  1h 10m | Max:  1h 10m
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 19h 20m | Avg: 58m 01s | Max:  1h 16m | Hits: 366%/2634  
      🟩 20                 Pass: 100%/24  | Total: 18h 48m | Avg: 47m 01s | Max:  1h 12m | Hits: 364%/878   
    
  • 🟩 thrust: Pass: 100%/43 | Total: 20h 55m | Avg: 29m 11s | Max: 1h 02m | Hits: 285%/9230

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 38m 35s | Avg: 19m 17s | Max: 27m 06s
    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total: 19h 56m | Avg: 29m 11s | Max:  1h 02m | Hits: 285%/9230  
      🟩 arm64              Pass: 100%/2   | Total: 58m 30s | Avg: 29m 15s | Max: 30m 18s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 40m | Avg: 20m 00s | Max: 48m 07s | Hits: 273%/1846  
      🟩 12.5               Pass: 100%/2   | Total:  1h 38m | Avg: 49m 25s | Max: 50m 25s
      🟩 12.8               Pass: 100%/36  | Total: 17h 36m | Avg: 29m 20s | Max:  1h 02m | Hits: 288%/7384  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 25m 54s | Avg: 12m 57s | Max: 13m 02s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 40m | Avg: 20m 00s | Max: 48m 07s | Hits: 273%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 38m | Avg: 49m 25s | Max: 50m 25s
      🟩 nvcc12.8           Pass: 100%/34  | Total: 17h 10m | Avg: 30m 18s | Max:  1h 02m | Hits: 288%/7384  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 25m 54s | Avg: 12m 57s | Max: 13m 02s
      🟩 nvcc               Pass: 100%/41  | Total: 20h 29m | Avg: 29m 58s | Max:  1h 02m | Hits: 285%/9230  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 27m | Avg: 21m 51s | Max: 32m 21s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 00m | Avg: 30m 24s | Max: 30m 50s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 01m | Avg: 30m 39s | Max: 33m 01s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 05m | Avg: 32m 30s | Max: 33m 30s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 11m | Avg: 18m 44s | Max: 29m 13s
      🟩 GCC7               Pass: 100%/2   | Total: 43m 28s | Avg: 21m 44s | Max: 30m 58s
      🟩 GCC8               Pass: 100%/1   | Total: 30m 58s | Avg: 30m 58s | Max: 30m 58s
      🟩 GCC9               Pass: 100%/2   | Total: 49m 03s | Avg: 24m 31s | Max: 34m 36s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 04m | Avg: 32m 10s | Max: 35m 00s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 03m | Avg: 31m 39s | Max: 33m 24s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 03m | Avg: 31m 42s | Max: 32m 06s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 08m | Avg: 23m 37s | Max: 35m 12s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 39m | Avg: 49m 40s | Max: 51m 14s | Hits: 268%/3692  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 27m | Avg: 49m 15s | Max:  1h 02m | Hits: 297%/5538  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 38m | Avg: 49m 25s | Max: 50m 25s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  6h 45m | Avg: 23m 52s | Max: 33m 30s
      🟩 GCC                Pass: 100%/19  | Total:  8h 23m | Avg: 26m 29s | Max: 35m 12s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 07m | Avg: 49m 25s | Max:  1h 02m | Hits: 285%/9230  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 38m | Avg: 49m 25s | Max: 50m 25s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/33  | Total: 17h 00m | Avg: 30m 55s | Max: 51m 55s | Hits: 266%/5538  
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 54m | Avg: 23m 27s | Max:  1h 02m | Hits: 314%/3692  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total: 19h 31m | Avg: 31m 39s | Max:  1h 02m | Hits: 265%/7384  
      🟩 TestCPU            Pass: 100%/3   | Total: 49m 19s | Avg: 16m 26s | Max: 33m 23s | Hits: 365%/1846  
      🟩 TestGPU            Pass: 100%/3   | Total: 34m 36s | Avg: 11m 32s | Max: 11m 56s
    🟩 sm
      🟩 90;90a;100         Pass: 100%/1   | Total: 35m 12s | Avg: 35m 12s | Max: 35m 12s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 10h 26m | Avg: 31m 18s | Max: 51m 55s | Hits: 266%/5538  
      🟩 20                 Pass: 100%/21  | Total:  9h 50m | Avg: 28m 07s | Max:  1h 02m | Hits: 314%/3692  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 7m 23s | Avg: 3m 41s | Max: 5m 15s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  7m 23s | Avg:  3m 41s | Max:  5m 15s
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total:  7m 23s | Avg:  3m 41s | Max:  5m 15s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total:  7m 23s | Avg:  3m 41s | Max:  5m 15s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  7m 23s | Avg:  3m 41s | Max:  5m 15s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  7m 23s | Avg:  3m 41s | Max:  5m 15s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  7m 23s | Avg:  3m 41s | Max:  5m 15s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  7m 23s | Avg:  3m 41s | Max:  5m 15s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 08s | Avg:  2m 08s | Max:  2m 08s
      🟩 Test               Pass: 100%/1   | Total:  5m 15s | Avg:  5m 15s | Max:  5m 15s
    
  • 🟩 python: Pass: 100%/1 | Total: 26m 05s | Avg: 26m 05s | Max: 26m 05s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 26m 05s | Avg: 26m 05s | Max: 26m 05s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 26m 05s | Avg: 26m 05s | Max: 26m 05s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 26m 05s | Avg: 26m 05s | Max: 26m 05s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 26m 05s | Avg: 26m 05s | Max: 26m 05s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 26m 05s | Avg: 26m 05s | Max: 26m 05s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 26m 05s | Avg: 26m 05s | Max: 26m 05s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 26m 05s | Avg: 26m 05s | Max: 26m 05s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 26m 05s | Avg: 26m 05s | Max: 26m 05s
    

👃 Inspect Changes

Modifications in project?

      Project
      CCCL Infrastructure
      libcu++
+/-   CUB
      Thrust
      CUDA Experimental
      python
      CCCL C Parallel Library
      Catch2Helper

Modifications in project or dependencies?

      Project
      CCCL Infrastructure
      libcu++
+/-   CUB
+/-   Thrust
      CUDA Experimental
+/-   python
+/-   CCCL C Parallel Library
+/-   Catch2Helper

🏃‍ Runner counts (total jobs: 90)

# Runner
65 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1
1 linux-amd64-gpu-h100-latest-1

Labels: None yet
Projects: Status: In Review