Update to CUDA 12.4.1 [14.0.x] #9051

Open
wants to merge 10 commits into
base: IB/CMSSW_14_0_X/master

@fwyzard
Contributor

fwyzard commented Mar 6, 2024

Update CMake to version 3.28.3.

  • backport a fix for CMake 3.27 and later from Geant4 v11.2.0

Update to CUDA 12.4.1:

  • CUDA runtime version 12.4.127
  • NVIDIA drivers version 550.54.15

See https://docs.nvidia.com/cuda/archive/12.4.1/cuda-toolkit-release-notes/index.html for the full CUDA 12.4.1 release notes and change log.

Force CUDA to consider only the major/minor version of GCC, to work around a problem in the GCC version checks inside cudafe++, where GCC 12.3.1 is not recognised as equivalent to GCC 12.3.0.
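
To illustrate the idea behind this workaround, here is a minimal Python sketch of a version comparison that only looks at the major and minor components. It is not how cudafe++ implements its check, and the helper names `major_minor` and `same_major_minor` are hypothetical.

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return the (major, minor) components of a dotted version string."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def same_major_minor(a: str, b: str) -> bool:
    """Treat two versions as equivalent when major and minor agree."""
    return major_minor(a) == major_minor(b)


if __name__ == "__main__":
    # The patch level is ignored, so GCC 12.3.1 matches GCC 12.3.0.
    print(same_major_minor("12.3.1", "12.3.0"))
    # A different minor version is still reported as a mismatch.
    print(same_major_minor("12.3.1", "12.2.0"))
```

With this kind of comparison a patch-level bump of the host compiler no longer trips the check, while a change of minor version still does.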

Rename the CUDA-related flags used in the spec files to avoid potential problems when adding,
removing or reordering them.

Update cuDNN to version 8.9:

See https://docs.nvidia.com/deeplearning/cudnn/archives/cudnn-897/release-notes/index.html for the release notes and change log.

Update NVIDIA gdrcopy to v2.4.1

Add support for new hardware (Grace Hopper, BlueField 3 NICs and DPUs).
Add support for recent OS (kernel, RHEL, SUSE).
Various bug fixes.

See https://github.com/NVIDIA/gdrcopy/releases/tag/v2.4 and https://github.com/NVIDIA/gdrcopy/releases/tag/v2.4.1 for the release notes and change log.

Update ONNX runtime to version 1.17.1:

Older versions of ONNX runtime fail to compile with CUDA 12.4.
On the other hand, ONNX runtime 1.17.1 requires CMake 3.26 or later, and cuDNN 8.9 or later.
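
The version constraints above can be expressed as a small check. This is only a sketch: the minimum versions are the ones stated in this description, while the `MINIMUMS` table and the `parse` and `unmet` helpers are hypothetical and not part of cmsdist.

```python
from typing import Dict, List, Tuple

# Minimum versions stated in this description for ONNX runtime 1.17.1.
MINIMUMS: Dict[str, Tuple[int, ...]] = {"cmake": (3, 26), "cudnn": (8, 9)}


def parse(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '3.28.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def unmet(installed: Dict[str, str]) -> List[str]:
    """Return the names of the tools whose version is below the minimum."""
    return sorted(
        name
        for name, minimum in MINIMUMS.items()
        if parse(installed[name]) < minimum
    )


if __name__ == "__main__":
    # Versions proposed in this pull request.
    print(unmet({"cmake": "3.28.3", "cudnn": "8.9"}))
    # An older cuDNN, used here only as an example of a failing input.
    print(unmet({"cmake": "3.28.3", "cudnn": "8.8.0.121"}))
```

Tuples compare element by element, so `(8, 8, 0, 121)` sorts below `(8, 9)` without any special handling of the extra components.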

This version requires two more workarounds:

  • -Wno-error=maybe-uninitialized is needed to avoid a (hopefully false positive) warning about a potentially uninitialised variable with cuDNN 8.9 and 9.0
  • -Donnxruntime_NVCC_THREADS=0 is needed because the default is ON, causing nvcc to be called as nvcc ... --threads ON ..., which causes an error.
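
The second workaround comes down to `--threads` expecting a number rather than a boolean-like string. A minimal sketch of that validation follows; the helper `nvcc_threads_args` is hypothetical and is not part of the ONNX runtime build system.

```python
from typing import List


def nvcc_threads_args(value: str) -> List[str]:
    """Build the nvcc `--threads` arguments, rejecting non-numeric values."""
    try:
        count = int(value)
    except ValueError:
        raise ValueError(
            f"--threads expects a non-negative integer, got {value!r}"
        ) from None
    if count < 0:
        raise ValueError(f"--threads expects a non-negative integer, got {count}")
    return ["--threads", str(count)]


if __name__ == "__main__":
    # The value passed by the workaround in this pull request.
    print(nvcc_threads_args("0"))
    # The problematic default is rejected instead of reaching nvcc.
    try:
        nvcc_threads_args("ON")
    except ValueError as error:
        print(error)
```

Passing the value explicitly, as `-Donnxruntime_NVCC_THREADS=0` does, avoids relying on the default.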

This version requires an update inside CMSSW, implemented in cms-sw/cmssw#44355.

Add missing thrust include in PyTorch

Backport of pytorch/pytorch#121421.

Backport status

Backport of #9046 and #9126 to the data-taking release.

@fwyzard
Contributor Author

fwyzard commented Mar 6, 2024

backport #9046

@fwyzard
Contributor Author

fwyzard commented Mar 6, 2024

enable gpu

@cmsbuild
Contributor

cmsbuild commented Mar 6, 2024

A new Pull Request was created by @fwyzard for branch IB/CMSSW_14_0_X/master.

@cmsbuild, @iarspider, @aandvalenzuela, @smuzaffar can you please review it and eventually sign? Thanks.
@antoniovilela, @sextonkennedy, @rappoccio you are the release manager for this.
cms-bot commands are listed here

@cmsbuild
Contributor

cmsbuild commented Mar 6, 2024

cms-bot internal usage

@fwyzard
Contributor Author

fwyzard commented Mar 6, 2024

please test

@cmsbuild
Contributor

cmsbuild commented Mar 6, 2024

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-14648e/37929/summary.html
COMMIT: 271ef55
CMSSW: CMSSW_14_0_X_2024-03-06-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9051/37929/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found a compilation error when building:

[961/1148] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.4.0-db00bd44f20c40655446378926308f3f/bin/nvcc -forward-unknown-to-host-compiler -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DENABLE_CPU_FP16_TRAINING_OPS -DNSYNC_ATOMIC_CPP11 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DPROTOBUF_USE_DLLS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -Donnxruntime_providers_cuda_EXPORTS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/include/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/include/onnxruntime/core/session -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/pytorch_cpuinfo-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/google_nsync-src/public -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/abseil_cpp-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/safeint-src 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/gsl-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/onnx-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/onnx-build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/protobuf/3.21.9-999e041f1a53b3ff94ee65a9cc8b7a2c/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/flatbuffers-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cudnn/8.8.0.121-acaa18b242f7c97b443d981c456c94c4/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/cutlass-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/cutlass-src/examples -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/eigen-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.4.0-db00bd44f20c40655446378926308f3f/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/mp11-src/include -cudart shared --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe "--diag_suppress=bad_friend_decl" -Xcudafe "--diag_suppress=unsigned_compare_with_zero" -Xcudafe 
"--diag_suppress=expr_has_no_effect" -O3 -DNDEBUG --generate-code=arch=compute_60,code=[compute_60,sm_60] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_80,code=[compute_80,sm_80] --generate-code=arch=compute_89,code=[compute_89,sm_89] -Xcompiler=-fPIC --diag-suppress 554 --compiler-options -Wall --compiler-options -Wno-deprecated-copy --compiler-options -Wno-nonnull-compare -Xcompiler -Wno-nonnull-compare --threads "" -Xcompiler -Wno-reorder -Xcompiler -Wno-error=sign-compare -Werror all-warnings -MD -MT CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/ngram_repeat_block_impl.cu.o -MF CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/ngram_repeat_block_impl.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/ngram_repeat_block_impl.cu -o CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/ngram_repeat_block_impl.cu.o
[962/1148] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.4.0-db00bd44f20c40655446378926308f3f/bin/nvcc -forward-unknown-to-host-compiler -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DENABLE_CPU_FP16_TRAINING_OPS -DNSYNC_ATOMIC_CPP11 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DPROTOBUF_USE_DLLS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -Donnxruntime_providers_cuda_EXPORTS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/include/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/include/onnxruntime/core/session -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/pytorch_cpuinfo-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/google_nsync-src/public -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/abseil_cpp-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/safeint-src 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/gsl-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/onnx-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/onnx-build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/protobuf/3.21.9-999e041f1a53b3ff94ee65a9cc8b7a2c/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/flatbuffers-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cudnn/8.8.0.121-acaa18b242f7c97b443d981c456c94c4/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/cutlass-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/cutlass-src/examples -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/eigen-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.4.0-db00bd44f20c40655446378926308f3f/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/mp11-src/include -cudart shared --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe "--diag_suppress=bad_friend_decl" -Xcudafe "--diag_suppress=unsigned_compare_with_zero" -Xcudafe 
"--diag_suppress=expr_has_no_effect" -O3 -DNDEBUG --generate-code=arch=compute_60,code=[compute_60,sm_60] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_80,code=[compute_80,sm_80] --generate-code=arch=compute_89,code=[compute_89,sm_89] -Xcompiler=-fPIC --diag-suppress 554 --compiler-options -Wall --compiler-options -Wno-deprecated-copy --compiler-options -Wno-nonnull-compare -Xcompiler -Wno-nonnull-compare --threads "" -Xcompiler -Wno-reorder -Xcompiler -Wno-error=sign-compare -Werror all-warnings -MD -MT CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/relative_attn_bias_impl.cu.o -MF CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/relative_attn_bias_impl.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/relative_attn_bias_impl.cu -o CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/relative_attn_bias_impl.cu.o
[963/1148] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.4.0-db00bd44f20c40655446378926308f3f/bin/nvcc -forward-unknown-to-host-compiler -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DENABLE_CPU_FP16_TRAINING_OPS -DNSYNC_ATOMIC_CPP11 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DPROTOBUF_USE_DLLS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -Donnxruntime_providers_cuda_EXPORTS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/include/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/include/onnxruntime/core/session -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/pytorch_cpuinfo-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/google_nsync-src/public -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/abseil_cpp-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/safeint-src 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/gsl-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/onnx-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/onnx-build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/protobuf/3.21.9-999e041f1a53b3ff94ee65a9cc8b7a2c/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/flatbuffers-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cudnn/8.8.0.121-acaa18b242f7c97b443d981c456c94c4/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/cutlass-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/cutlass-src/examples -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/eigen-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.4.0-db00bd44f20c40655446378926308f3f/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/mp11-src/include -cudart shared --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe "--diag_suppress=bad_friend_decl" -Xcudafe "--diag_suppress=unsigned_compare_with_zero" -Xcudafe 
"--diag_suppress=expr_has_no_effect" -O3 -DNDEBUG --generate-code=arch=compute_60,code=[compute_60,sm_60] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_80,code=[compute_80,sm_80] --generate-code=arch=compute_89,code=[compute_89,sm_89] -Xcompiler=-fPIC --diag-suppress 554 --compiler-options -Wall --compiler-options -Wno-deprecated-copy --compiler-options -Wno-nonnull-compare -Xcompiler -Wno-nonnull-compare --threads "" -Xcompiler -Wno-reorder -Xcompiler -Wno-error=sign-compare -Werror all-warnings -MD -MT CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/skip_layer_norm_impl.cu.o -MF CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/skip_layer_norm_impl.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/skip_layer_norm_impl.cu -o CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/skip_layer_norm_impl.cu.o
[964/1148] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.4.0-db00bd44f20c40655446378926308f3f/bin/nvcc -forward-unknown-to-host-compiler -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DENABLE_CPU_FP16_TRAINING_OPS -DNSYNC_ATOMIC_CPP11 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DPROTOBUF_USE_DLLS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -Donnxruntime_providers_cuda_EXPORTS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/include/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/include/onnxruntime/core/session -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/pytorch_cpuinfo-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/google_nsync-src/public -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/abseil_cpp-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/safeint-src 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/gsl-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/onnx-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/onnx-build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/protobuf/3.21.9-999e041f1a53b3ff94ee65a9cc8b7a2c/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/flatbuffers-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cudnn/8.8.0.121-acaa18b242f7c97b443d981c456c94c4/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/cutlass-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/cutlass-src/examples -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/eigen-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.4.0-db00bd44f20c40655446378926308f3f/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/build/_deps/mp11-src/include -cudart shared --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe "--diag_suppress=bad_friend_decl" -Xcudafe "--diag_suppress=unsigned_compare_with_zero" -Xcudafe 
"--diag_suppress=expr_has_no_effect" -O3 -DNDEBUG --generate-code=arch=compute_60,code=[compute_60,sm_60] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_80,code=[compute_80,sm_80] --generate-code=arch=compute_89,code=[compute_89,sm_89] -Xcompiler=-fPIC --diag-suppress 554 --compiler-options -Wall --compiler-options -Wno-deprecated-copy --compiler-options -Wno-nonnull-compare -Xcompiler -Wno-nonnull-compare --threads "" -Xcompiler -Wno-reorder -Xcompiler -Wno-error=sign-compare -Werror all-warnings -MD -MT CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/tensorrt_fused_multihead_attention/mha_runner.cu.o -MF CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/tensorrt_fused_multihead_attention/mha_runner.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/tensorrt_fused_multihead_attention/mha_runner.cu -o CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.14.1-fcb60177485aaddd6dfade8bf135aeb9/onnxruntime-1.14.1/onnxruntime/contrib_ops/cuda/bert/tensorrt_fused_multihead_attention/mha_runner.cu.o
ninja: build stopped: subcommand failed.
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.vDonL5 (%build)


RPM build errors:
line 37: It's not recommended to have unversioned Obsoletes: Obsoletes: external+onnxruntime+1.14.1-fcb60177485aaddd6dfade8bf135aeb9
Macro expanded in comment on line 365: %{pkginstroot}/${PYTHON3_LIB_SITE_PACKAGES}


@fwyzard
Contributor Author

fwyzard commented Mar 6, 2024

please test

@cmsbuild
Contributor

cmsbuild commented Mar 6, 2024

Pull request #9051 was updated.

@cmsbuild
Contributor

cmsbuild commented Mar 6, 2024

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-14648e/37945/summary.html
COMMIT: 82ae32b
CMSSW: CMSSW_14_0_X_2024-03-06-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9051/37945/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found a compilation error when building:

-- Check if compiler accepts -pthread - yes
-- Found Geant4: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/geant4/11.1.2-f1592688174d5183c351e58fb91ca8c3/lib64/cmake/Geant4/Geant4Config.cmake (found version "11.1.2") 
-- Found nlohmann_json: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/json/3.11.3-627865c7a90221218546ea41a85dfb9a/share/cmake/nlohmann_json/nlohmann_jsonConfig.cmake (found suitable version "3.11.3", minimum required is "3.7.0") 
-- Found Python: /usr/bin/python3.6 (found suitable version "3.6.8", minimum required is "3.6") found components: Interpreter 
-- Configuring incomplete, errors occurred!
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.Kldb1Y (%build)


RPM build errors:
line 37: It's not recommended to have unversioned Obsoletes: Obsoletes: external+celeritas+v0.4.1-565cdff813b11a60cde5971b97a7a139
Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.Kldb1Y (%build)


@fwyzard
Contributor Author

fwyzard commented Mar 6, 2024

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-14648e/37945/summary.html
COMMIT: 82ae32b
CMSSW: CMSSW_14_0_X_2024-03-06-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9051/37945/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found a compilation error when building:

-- Check if compiler accepts -pthread - yes
-- Found Geant4: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/geant4/11.1.2-f1592688174d5183c351e58fb91ca8c3/lib64/cmake/Geant4/Geant4Config.cmake (found version "11.1.2")
-- Found nlohmann_json: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/json/3.11.3-627865c7a90221218546ea41a85dfb9a/share/cmake/nlohmann_json/nlohmann_jsonConfig.cmake (found suitable version "3.11.3", minimum required is "3.7.0")
-- Found Python: /usr/bin/python3.6 (found suitable version "3.6.8", minimum required is "3.6") found components: Interpreter
-- Configuring incomplete, errors occurred!
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.Kldb1Y (%build)


RPM build errors:
line 37: It's not recommended to have unversioned Obsoletes: Obsoletes: external+celeritas+v0.4.1-565cdff813b11a60cde5971b97a7a139
Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.Kldb1Y (%build)

Looking at the log, this seems due to

  • The action "build-external+celeritas+v0.4.1-565cdff813b11a60cde5971b97a7a139" was not completed successfully because Failed to build celeritas. Log file in /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/celeritas/v0.4.1-565cdff813b11a60cde5971b97a7a139/log. Final lines of the log file:
    geant4_set_and_check_package_variable Macro invoked with incorrect
    arguments for macro named: geant4_set_and_check_package_variable
    Call Stack (most recent call first):
    /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/geant4/11.1.2-f1592688174d5183c351e58fb91ca8c3/lib64/cmake/Geant4/Geant4Config.cmake:254 (include)
    cmake/FindGeant4.cmake:17 (find_package)
    CMakeLists.txt:256 (find_package)

Who is responsible for "celeritas"?
Can they check what is going on and propose a fix?

@fwyzard
Contributor Author

fwyzard commented Mar 6, 2024

-- Found Python: /usr/bin/python3.6 (found suitable version "3.6.8", minimum required is "3.6") found components: Interpreter

Also, why does it find Python 3.6?
We are using and including Python 3.9 in CMSSW.

@cmsbuild
Contributor

cmsbuild commented Mar 7, 2024

Pull request #9051 was updated.

@fwyzard
Contributor Author

fwyzard commented Mar 7, 2024

please test

@cmsbuild
Contributor

Pull request #9051 was updated.

@fwyzard
Contributor Author

fwyzard commented Mar 10, 2024

please test with cms-sw/cmssw#44355

@antoniovilela

hold

  • Hold backport as discussed at ORP.

@cmsbuild
Contributor

Pull request has been put on hold by @antoniovilela
They need to issue an unhold command to remove the hold state or L1 can unhold it for all

@cmsbuild added the hold label on Mar 12, 2024

@fwyzard
Contributor Author

fwyzard commented Mar 12, 2024

I missed the ORP (on vacation), but yes, please hold this.

My intent is to use the 14.1.x build and externals to evaluate the impact on the HLT performance, and request the backport only if that check is positive.

@smuzaffar
Contributor

please test with cms-sw/cmssw#44355

@cmsbuild
Contributor

-1

Failed Tests: GpuUnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-14648e/38213/summary.html
COMMIT: c5a8a5f
CMSSW: CMSSW_14_0_X_2024-03-17-2300/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9051/38213/install.sh to create a dev area with all the needed externals and cmssw changes.

GPU Unit Tests

I found 1 errors in the following unit tests:

---> test testTFVisibleDevicesCUDA had ERRORS

Comparison Summary

Summary:

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 48 differences found in the comparisons
  • DQMHistoTests: Total files compared: 3
  • DQMHistoTests: Total histograms compared: 39740
  • DQMHistoTests: Total failures: 1241
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 38499
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 2 files compared)
  • Checked 8 log files, 10 edm output root files, 3 DQM output files
  • TriggerResults: no differences found

Update to CUDA 12.4.1:

  • CUDA runtime version 12.4.127
  • NVIDIA drivers version 550.54.15

See https://docs.nvidia.com/cuda/archive/12.4.1/cuda-toolkit-release-notes/index.html for the full CUDA 12.4.1 release notes and change log.

@cmsbuild
Contributor

Pull request #9051 was updated.

@fwyzard changed the title from "Update to CUDA 12.4.0 [14.0.x]" to "Update to CUDA 12.4.1 [14.0.x]" on Apr 11, 2024

@fwyzard
Contributor Author

fwyzard commented Apr 11, 2024

please test with cms-sw/cmssw#44355

@cmsbuild
Contributor

-1

Failed Tests: UnitTests GpuUnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-14648e/38801/summary.html
COMMIT: fc1f061
CMSSW: CMSSW_14_0_X_2024-04-11-2300/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/9051/38801/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 1 errors in the following unit tests:

---> test TestGeneratorInterfacePythia8ConcurrentGeneratorFilter had ERRORS

GPU Unit Tests

I found 1 errors in the following unit tests:

---> test testTFVisibleDevicesCUDA had ERRORS

Comparison Summary

Summary:

  • You potentially added 16 lines to the logs
  • Reco comparison results: 53 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3308451
  • DQMHistoTests: Total failures: 12
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3308419
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 3
  • DQMHistoTests: Total histograms compared: 39740
  • DQMHistoTests: Total failures: 21
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 39719
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 2 files compared)
  • Checked 8 log files, 10 edm output root files, 3 DQM output files
  • TriggerResults: no differences found
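
As a sanity check on the DQMHistoTests counters reported in this thread, the failures, nulls, successes and skipped entries should add up to the total number of histograms compared. That the categories are mutually exclusive is an inference from the numbers, not something the bot states. The sketch below uses the counters from the three summaries above; the dictionary keys and the `consistent` helper are hypothetical labels.

```python
from typing import Dict

# Counters copied from the three DQMHistoTests summaries in this thread.
SUMMARIES: Dict[str, Dict[str, int]] = {
    "gpu_first_run": {
        "total": 39740, "failures": 1241, "nulls": 0,
        "successes": 38499, "skipped": 0,
    },
    "standard_last_run": {
        "total": 3308451, "failures": 12, "nulls": 0,
        "successes": 3308419, "skipped": 20,
    },
    "gpu_last_run": {
        "total": 39740, "failures": 21, "nulls": 0,
        "successes": 39719, "skipped": 0,
    },
}


def consistent(counters: Dict[str, int]) -> bool:
    """Check that the per-category counters add up to the reported total."""
    parts = ("failures", "nulls", "successes", "skipped")
    return sum(counters[name] for name in parts) == counters["total"]


if __name__ == "__main__":
    for label, counters in SUMMARIES.items():
        print(label, consistent(counters))
```

Between the two GPU runs the number of failing histograms goes from 1241 to 21 out of the same 39740 compared.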

4 participants