Torch, CUDA, and App SDK compatibility #460
MMelQin announced in Announcements
Replies: 2 comments 1 reply
- Issue #461, also raised by @vikashg, was due to the compatibility issues mentioned here.
- Thank you @MMelQin for your investigation and write-up. I faced this same error today.
It is great that torch 2.1.0 was released on Oct 4, 2023, but it has issues detecting the GPU with older CUDA toolkit and driver versions, so PyTorch falls back to the CPU device for inference and application performance suffers.
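To illustrate the failure mode, here is a minimal sketch of the usual device-selection pattern in a PyTorch inference app; the model is a stand-in for illustration, not App SDK code:

```python
import torch

# Typical device selection: if torch cannot detect the GPU, this
# silently falls back to the CPU, which is the slowdown described above.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"torch {torch.__version__}, built for CUDA {torch.version.cuda}")
print(f"running inference on: {device}")

model = torch.nn.Identity().to(device)  # stand-in for a real model
```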
Here are some of the combinations I have tested:
- torch==2.1.0 could not detect the GPU with CUDA 11.7 on my desktop, so the application's PyTorch inference fell back to the CPU device; the rest of the app worked fine, but overall execution time was much longer (see the first diagnostic sketch after this list).
- Pinned torch==2.0.1, still with CUDA 11.7: the GPU was detected, and the PyTorch inference app built on App SDK v0.6 worked fine.
- Updated to CUDA 12.2 (the underlying App SDK v0.6 supports CUDA 11.8, but CUDA is backward compatible):
  - With torch==2.1.0, a libcudart.so error occurred in App SDK v0.6. When this happens, installing the CUDA runtime for the version the app complains about fixed the issue (see the second sketch after this list), e.g. sudo apt-get install cuda-runtime-11.8
  - With torch==2.0.1, the app and torch inference on the GPU worked fine when run with the Python interpreter, but the Packager, monai-deploy package, failed with the same libcudart.so error. Installing the CUDA runtime fixed the Packager issue as well: sudo apt-get install cuda-runtime-11.8
- Did not try out the torch==2.1.0 with CUDA 11.8 combo: after installing CUDA 12.2, multiple attempts to completely remove CUDA from the system (including reboots) and then install CUDA 11.8 still ended up with CUDA 12.2.
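The first diagnostic sketch referenced above compares the CUDA version the installed torch wheel was built against with the driver-reported version; it assumes nvidia-smi is on the PATH and only prints versions, it does not fix anything:

```python
import subprocess
import torch

# Compare the CUDA version the torch wheel was built against with the
# driver-reported version; a mismatch here explains the CPU fallback.
print(f"torch {torch.__version__}, built for CUDA {torch.version.cuda}")
print(f"torch.cuda.is_available(): {torch.cuda.is_available()}")

# driver_version is a documented nvidia-smi query field
smi = subprocess.run(
    ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
    capture_output=True, text=True,
)
print(f"NVIDIA driver: {smi.stdout.strip()}")
```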
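The second sketch checks whether the CUDA runtime library can be loaded at all, which is what the libcudart.so error above is about; the soname list is an assumption for illustration (CUDA 11.x typically ships libcudart.so.11.0, CUDA 12.x ships libcudart.so.12):

```python
import ctypes

# Try to dlopen the CUDA runtime sonames; a failure here matches the
# libcudart.so error seen from the app and the Packager. The listed
# sonames are assumptions for illustration.
for soname in ("libcudart.so.11.0", "libcudart.so.12"):
    try:
        ctypes.CDLL(soname)
        print(f"{soname}: loadable")
    except OSError as exc:
        print(f"{soname}: not found ({exc})")
```

If the expected soname is missing, installing the matching runtime package (sudo apt-get install cuda-runtime-11.8 above) puts it on the loader path.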