Torch, CUDA, and App SDK compatibility #460
MMelQin announced in Announcements
Replies: 2 comments 1 reply
- Issue #461, also raised by @vikashg, was due to the compatibility issues mentioned here.
- Thank you @MMelQin for your investigation and write-up. I faced this same error today.
It is great that torch 2.1.0 was released on Oct 4, 2023, but it has issues detecting the GPU with older CUDA toolkit and driver versions, so PyTorch falls back to the CPU device for inference and application performance suffers.
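To illustrate the failure mode, here is a minimal sketch of the usual device-selection pattern in a PyTorch inference app; the model is a stand-in for illustration, not App SDK code:

```python
import torch

# Typical device selection: if torch cannot detect the GPU, this
# silently falls back to the CPU, which is the slowdown described above.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"torch {torch.__version__}, built for CUDA {torch.version.cuda}")
print(f"running inference on: {device}")

model = torch.nn.Identity().to(device)  # stand-in for a real model
```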
Here are some of the combinations I have tested:
- torch==2.1.0 could not detect the GPU with CUDA 11.7 on my desktop, so the application's PyTorch inference fell back to the CPU device; the rest of the app worked fine, but overall execution time was much longer (see the first diagnostic sketch after this list).
- Pinned torch==2.0.1, still with CUDA 11.7: the GPU was detected, and the PyTorch inference app built on App SDK v0.6 worked fine.
- Updated to CUDA 12.2 (the underlying App SDK v0.6 supports CUDA 11.8, but CUDA is backward compatible):
  - With torch==2.1.0, a libcudart.so error occurred in App SDK v0.6. When this happens, installing the CUDA runtime for the version the app complains about fixed the issue (see the second sketch after this list), e.g. sudo apt-get install cuda-runtime-11.8
  - With torch==2.0.1, the app and torch inference on the GPU worked fine when run with the Python interpreter, but the Packager, monai-deploy package, failed with the same libcudart.so error. Installing the CUDA runtime fixed the Packager issue as well: sudo apt-get install cuda-runtime-11.8
- Did not try out the torch==2.1.0 with CUDA 11.8 combo: after installing CUDA 12.2, multiple attempts to completely remove CUDA from the system (including reboots) and then install CUDA 11.8 still ended up with CUDA 12.2.
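The first diagnostic sketch referenced above compares the CUDA version the installed torch wheel was built against with the driver-reported version; it assumes nvidia-smi is on the PATH and only prints versions, it does not fix anything:

```python
import subprocess
import torch

# Compare the CUDA version the torch wheel was built against with the
# driver-reported version; a mismatch here explains the CPU fallback.
print(f"torch {torch.__version__}, built for CUDA {torch.version.cuda}")
print(f"torch.cuda.is_available(): {torch.cuda.is_available()}")

# driver_version is a documented nvidia-smi query field
smi = subprocess.run(
    ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
    capture_output=True, text=True,
)
print(f"NVIDIA driver: {smi.stdout.strip()}")
```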
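The second sketch checks whether the CUDA runtime library can be loaded at all, which is what the libcudart.so error above is about; the soname list is an assumption for illustration (CUDA 11.x typically ships libcudart.so.11.0, CUDA 12.x ships libcudart.so.12):

```python
import ctypes

# Try to dlopen the CUDA runtime sonames; a failure here matches the
# libcudart.so error seen from the app and the Packager. The listed
# sonames are assumptions for illustration.
for soname in ("libcudart.so.11.0", "libcudart.so.12"):
    try:
        ctypes.CDLL(soname)
        print(f"{soname}: loadable")
    except OSError as exc:
        print(f"{soname}: not found ({exc})")
```

If the expected soname is missing, installing the matching runtime package (sudo apt-get install cuda-runtime-11.8 above) puts it on the loader path.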