
A warning message showing that MultiScaleDeformableAttention.so is not found in /root/.cache/torch_extensions if ninja is installed with transformers #35349

cainmagi opened this issue Dec 19, 2024 · 12 comments

@cainmagi

cainmagi commented Dec 19, 2024

System Info

  • transformers: 4.47.1
  • torch: 2.5.1
  • timm: 1.0.12
  • ninja: 1.11.1.3
  • python: 3.10.14
  • pip: 23.0.1
  • CUDA runtime installed by torch: nvidia-cuda-runtime-cu12==12.4.127
  • OS (in container): Debian GNU/Linux 12 (bookworm)
  • OS (native device): Windows 11 Enterprise 23H2 (10.0.22631 Build 22631)
  • Docker version: 27.3.1, build ce12230
  • NVIDIA Driver: 565.57.02

Who can help?

I am asking for help with DeformableDetrModel.
vision models: @amyeroberts, @qubvel

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Start a new docker container by
    docker run --gpus all -it --rm --shm-size=1g python:3.10-slim bash
  2. Install dependencies
    pip install transformers[torch] requests pillow timm
  3. Run the following script (copied from the documentation); it works fine and does not show any message.
    from transformers import AutoImageProcessor, DeformableDetrModel
    from PIL import Image
    import requests
    
    url = "http://images.cocodataset.org/val2017/000000039769.jpg"
    image = Image.open(requests.get(url, stream=True).raw)
    
    image_processor = AutoImageProcessor.from_pretrained("SenseTime/deformable-detr")
    model = DeformableDetrModel.from_pretrained("SenseTime/deformable-detr")
    
    inputs = image_processor(images=image, return_tensors="pt")
    
    outputs = model(**inputs)
    
    last_hidden_states = outputs.last_hidden_state
    list(last_hidden_states.shape)
  4. Install ninja:
    pip install ninja
  5. Run the same script again. This time, the following warning messages appear:
        
                                   !! WARNING !!
    
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    Your compiler (c++) is not compatible with the compiler Pytorch was
    built with for this platform, which is g++ on linux. Please
    use g++ to to compile your extension. Alternatively, you may
    compile PyTorch from source using c++, and then you can also use
    c++ to compile your extension.
    
    See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
    with compiling PyTorch from source.
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    
                                  !! WARNING !!
    
      warnings.warn(WRONG_COMPILER_WARNING.format(
    Could not load the custom kernel for multi-scale deformable attention: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
    Could not load the custom kernel for multi-scale deformable attention: /root/.cache/torch_extensions/py310_cu124/MultiScaleDeformableAttention/MultiScaleDeformableAttention.so: cannot open shared object file: No such file or directory
    (the line above is repeated 11 times in total)
    
    Sure enough, /root/.cache/torch_extensions/py310_cu124/MultiScaleDeformableAttention/ is empty.

The issue happens only when both ninja and transformers are installed. I believe the following PR may be related:

https://app.semanticdiff.com/gh/huggingface/transformers/pull/32834/overview

Expected behavior

It seems that installing ninja causes DeformableDetrModel to emit unexpected warning messages (even though the script still works). That may be because I am using a container without any compiler or CUDA toolkit preinstalled (only the CUDA runtime is installed by pip).

I think there should be a check that automatically turns off the ninja-related functionality, even if ninja is installed by pip, whenever the other requirements (a compatible compiler, a CUDA install path, and so on) are not fulfilled.
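The check I have in mind could look something like the following minimal sketch. The function name and logic are my own (hypothetical, not part of transformers): before attempting a ninja JIT build, verify that a C++ compiler, ninja, and a CUDA dev root all appear to be present, and fall back to the pure-PyTorch implementation otherwise.

```python
import os
import shutil


def can_jit_build_cuda_kernel() -> bool:
    """Hypothetical preflight check: only attempt the ninja JIT build
    when a compiler, ninja, and the CUDA dev files all seem present."""
    # A C++ compiler must be on PATH for torch.utils.cpp_extension to work.
    has_compiler = any(shutil.which(cc) is not None for cc in ("g++", "c++"))
    # ninja is what triggers the JIT build path in the first place.
    has_ninja = shutil.which("ninja") is not None
    # The CUDA dev files (headers, nvcc) live under CUDA_HOME or /usr/local/cuda;
    # the pip-installed runtime wheels do not provide them.
    cuda_home = (
        os.environ.get("CUDA_HOME")
        or os.environ.get("CUDA_PATH")
        or ("/usr/local/cuda" if os.path.isdir("/usr/local/cuda") else None)
    )
    return has_compiler and has_ninja and cuda_home is not None
```

If this returned False, the model could silently use the native PyTorch fallback instead of printing repeated warnings.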

@cainmagi cainmagi added the bug label Dec 19, 2024
@Rocketknight1
Member

This seems like an interaction between ninja and the DeformableDETR library, rather than transformers specifically.

@cainmagi
Author

cainmagi commented Jan 2, 2025

Thank you for letting me know that! I am not familiar with CUDA. It may take me a few days to check how to reproduce this issue with that library. I will try to do that and submit another issue to that repository.

@pspdada

pspdada commented Jan 10, 2025

Hello, I've encountered a similar issue. May I ask if there has been any progress? This is the warning I received when loading DINO:

[WARNING|modeling_grounding_dino.py:628] 2025-01-11 00:50:18,888 >> Could not load the custom kernel for multi-scale deformable attention: Error building extension 'MultiScaleDeformableAttention': [1/4] /usr/bin/nvcc --generate-dependencies-with-compile --dependency-output ms_deform_attn_cuda.cuda.o.d -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/TH -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/THC -isystem ~/anaconda3/envs/psp/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_80,code=sm_80 --compiler-options '-fPIC' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -std=c++17 -c ~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/cuda/ms_deform_attn_cuda.cu -o ms_deform_attn_cuda.cuda.o 
FAILED: ms_deform_attn_cuda.cuda.o 
/usr/bin/nvcc --generate-dependencies-with-compile --dependency-output ms_deform_attn_cuda.cuda.o.d -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/TH -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/THC -isystem ~/anaconda3/envs/psp/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_80,code=sm_80 --compiler-options '-fPIC' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -std=c++17 -c ~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/cuda/ms_deform_attn_cuda.cu -o ms_deform_attn_cuda.cuda.o 
<command-line>: fatal error: cuda_runtime.h: No such file or directory
compilation terminated.
[2/4] c++ -MMD -MF ms_deform_attn_cpu.o.d -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/TH -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/THC -isystem ~/anaconda3/envs/psp/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -DWITH_CUDA=1 -c ~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/cpu/ms_deform_attn_cpu.cpp -o ms_deform_attn_cpu.o 
[3/4] c++ -MMD -MF vision.o.d -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/TH -isystem ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/THC -isystem ~/anaconda3/envs/psp/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -DWITH_CUDA=1 -c ~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/vision.cpp -o vision.o 
In file included from ~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/vision.cpp:11:
~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/ms_deform_attn.h: In function ‘at::Tensor ms_deform_attn_forward(const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, int)’:
~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/ms_deform_attn.h:29:19: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
   29 |     if (value.type().is_cuda())
      |         ~~~~~~~~~~^~
In file included from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/ATen/core/Tensor.h:3,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/ATen/Tensor.h:3,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/autograd/function_hook.h:3,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/autograd/cpp_hook.h:2,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/autograd/variable.h:6,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/autograd/autograd.h:3,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/autograd.h:3,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:7,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/extension.h:5,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/cpu/ms_deform_attn_cpu.h:12,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/ms_deform_attn.h:13,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/vision.cpp:11:
~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/ATen/core/TensorBody.h:225:30: note: declared here
  225 |   DeprecatedTypeProperties & type() const {
      |                              ^~~~
In file included from ~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/vision.cpp:11:
~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/ms_deform_attn.h: In function ‘std::vector<at::Tensor> ms_deform_attn_backward(const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, int)’:
~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/ms_deform_attn.h:51:19: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
   51 |     if (value.type().is_cuda())
      |         ~~~~~~~~~~^~
In file included from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/ATen/core/Tensor.h:3,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/ATen/Tensor.h:3,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/autograd/function_hook.h:3,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/autograd/cpp_hook.h:2,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/autograd/variable.h:6,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/autograd/autograd.h:3,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/autograd.h:3,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:7,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/torch/extension.h:5,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/cpu/ms_deform_attn_cpu.h:12,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/ms_deform_attn.h:13,
                 from ~/anaconda3/envs/psp/lib/python3.10/site-packages/transformers/kernels/deformable_detr/vision.cpp:11:
~/anaconda3/envs/psp/lib/python3.10/site-packages/torch/include/ATen/core/TensorBody.h:225:30: note: declared here
  225 |   DeprecatedTypeProperties & type() const {
      |                              ^~~~
ninja: build stopped: subcommand failed.

[WARNING|modeling_grounding_dino.py:628] 2025-01-11 00:50:18,893 >> Could not load the custom kernel for multi-scale deformable attention: ~/.cache/torch_extensions/py310_cu124/MultiScaleDeformableAttention/MultiScaleDeformableAttention.so: cannot open shared object file: No such file or directory
(the warning above is repeated 10 more times with successive timestamps)

My system env info:

- `transformers` version: 4.48.0
- Platform: Linux-6.8.0-49-generic-x86_64-with-glibc2.35
- Python version: 3.10.0
- Huggingface_hub version: 0.27.0
- Safetensors version: 0.4.5
- Accelerate version: 1.1.0
- Accelerate config:    not found
- PyTorch version (GPU?): 2.5.1+cu124 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: <fill in>
- Using GPU in script?: Yes
- GPU type: NVIDIA A800 80GB PCIe

@suraj-srinivas

I'm encountering a similar issue. Any updates appreciated!

@Rocketknight1
Member

Can someone open an issue with DeformableDETR? I'm not sure what we can do at our end!

@cainmagi
Author

Can someone open an issue with DeformableDETR? I'm not sure what we can do at our end!

@Rocketknight1 I am preparing some testing scripts now. After finishing them, I will report an update here and submit the issue to DeformableDETR.

@cainmagi
Author

cainmagi commented Jan 13, 2025

@Rocketknight1

My conclusion

Currently, the multi-scale deformable attention kernel seems to require the CUDA dev files. It needs the CUDA_HOME environment variable, and:

  • If the CUDA dev files (/usr/local/cuda) exist, nothing goes wrong, whether or not ninja is installed.
  • If ninja is not installed, nothing goes wrong.
  • If ninja is installed but the CUDA dev files do not exist, there will be two warning messages: one about the missing CUDA_HOME environment variable, the other about the missing MultiScaleDeformableAttention.so file. If we supply a proper MultiScaleDeformableAttention.so file as a substitute, the second warning disappears, but the warning requiring CUDA_HOME remains.
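The three cases above can be verified directly in a Python session. This small diagnostic (stdlib only; the function name is my own) reports which of the prerequisites are present in a given environment:

```python
import importlib.util
import os
import shutil


def diagnose_kernel_prereqs() -> dict:
    """Report which prerequisites for building the custom kernel are present."""
    return {
        "ninja installed": importlib.util.find_spec("ninja") is not None,
        "CUDA_HOME set": "CUDA_HOME" in os.environ,
        "/usr/local/cuda exists": os.path.isdir("/usr/local/cuda"),
        "nvcc on PATH": shutil.which("nvcc") is not None,
    }


for check, ok in diagnose_kernel_prereqs().items():
    print(f"{check}: {'yes' if ok else 'NO'}")
```

In a bare python:3.10-slim container with ninja installed by pip, only the first check should pass, which matches the warnings above.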

A workaround

I have tried to use a different docker image

docker run --gpus all -it --rm --shm-size=1g nvcr.io/nvidia/pytorch:24.12-py3

and repeated the same steps mentioned above. This time, the error messages do not appear even though ninja is installed.

Tip

nvcr.io/nvidia/pytorch:24.12-py3 is based on Ubuntu 24.04. For those who want to use Debian 12, I have made two images:

  • cuda: only with CUDA 12.6 or 12.4.
  • deformable-detr: with PyTorch and DeformableDETR installed.

I believe a key reason is that this image contains the CUDA dev files and $CUDA_HOME is properly configured (the image ships Python 3.12). If I use a bare image like python:3.10-slim or python:3.12-slim, pip installs only the CUDA runtime files, so the build does not work as expected.

A further try by copying the missing library file

After successfully using nvcr.io/nvidia/pytorch:24.12-py3, I tried the following steps:

  1. Use nvcr.io/nvidia/pytorch:24.12-py3 to launch a container, and use the Git source code to build the DeformableDETR package. After that, I got the built library file: deformable_detr/models/ops/build/lib.linux-x86_64-cpython-312/MultiScaleDeformableAttention.cpython-312-x86_64-linux-gnu.so.
  2. Make a backup of the library file MultiScaleDeformableAttention.cpython-312-x86_64-linux-gnu.so.
  3. Launch another container python:3.12-slim, and copy the library file as MultiScaleDeformableAttention.cpython-312-x86_64-linux-gnu.so -> /root/.cache/torch_extensions/py312_cu124/MultiScaleDeformableAttention/MultiScaleDeformableAttention.so.
  4. Run the test again. This time, I got
    Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
    /usr/local/lib/python3.12/site-packages/torch/utils/cpp_extension.py:361: UserWarning:
    
                                   !! WARNING !!
    
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    Your compiler (c++) is not compatible with the compiler Pytorch was
    built with for this platform, which is g++ on linux. Please
    use g++ to to compile your extension. Alternatively, you may
    compile PyTorch from source using c++, and then you can also use
    c++ to compile your extension.
    
    See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
    with compiling PyTorch from source.
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    
                                  !! WARNING !!
    
      warnings.warn(WRONG_COMPILER_WARNING.format(
    Could not load the custom kernel for multi-scale deformable attention: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
    Could not load the custom kernel for multi-scale deformable attention: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /root/.cache/torch_extensions/py312_cu124/MultiScaleDeformableAttention/MultiScaleDeformableAttention.so)
    ...
  5. Apparently, the above issue appears because python:3.12-slim is based on Debian, which ships a different libstdc++ (with older GLIBCXX symbol versions) than NVIDIA's Ubuntu-based image.

Another test by copying the missing library file

Well, it is difficult to make Debian's libstdc++ provide the same GLIBCXX symbol versions as Ubuntu's. I think I should use a Debian image to build the package. So, this time, I tried the following things:

  1. Build a customized Docker Image for Python 3.12, CUDA 12.4, and Debian 12. The image can be found here: https://hub.docker.com/r/cainmagi/deformable-detr
  2. Repeat the same steps for testing DeformableDetrModel.
  3. The script works fine, just as it did with nvcr.io/nvidia/pytorch:24.12-py3. Only the following messages are shown:
    /usr/local/lib/python3.12/site-packages/torch/utils/cpp_extension.py:1964: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
    If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
      warnings.warn(
  4. Use the Git source codes to build the library file MultiScaleDeformableAttention.cpython-312-x86_64-linux-gnu.so. Make a backup for this file.
  5. Launch another container python:3.12-slim, where CUDA is not installed. Copy the library file as MultiScaleDeformableAttention.cpython-312-x86_64-linux-gnu.so -> /root/.cache/torch_extensions/py312_cu124/MultiScaleDeformableAttention/MultiScaleDeformableAttention.so.
  6. Run the test in this container. This time, the warning messages only contain these:
    Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
    /usr/local/lib/python3.12/site-packages/torch/utils/cpp_extension.py:361: UserWarning:
    
                                   !! WARNING !!
    
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    Your compiler (c++) is not compatible with the compiler Pytorch was
    built with for this platform, which is g++ on linux. Please
    use g++ to to compile your extension. Alternatively, you may
    compile PyTorch from source using c++, and then you can also use
    c++ to compile your extension.
    
    See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
    with compiling PyTorch from source.
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    
                                  !! WARNING !!
    
      warnings.warn(WRONG_COMPILER_WARNING.format(
    Could not load the custom kernel for multi-scale deformable attention: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
    
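The copy step in the procedure above can be scripted. The example paths are from my setup and must be adjusted (the cache directory name depends on the Python and CUDA versions, e.g. py312_cu124); the helper function name is my own:

```python
import shutil
from pathlib import Path


def install_prebuilt_kernel(built_so: Path, cache_dir: Path) -> Path:
    """Copy a prebuilt MultiScaleDeformableAttention library into the
    torch extension cache under the file name the loader expects."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / "MultiScaleDeformableAttention.so"
    shutil.copy(built_so, target)
    return target


# Example paths (adjust to your build output and environment):
# built = Path("deformable_detr/models/ops/build/lib.linux-x86_64-cpython-312/"
#              "MultiScaleDeformableAttention.cpython-312-x86_64-linux-gnu.so")
# cache = Path.home() / ".cache/torch_extensions/py312_cu124/MultiScaleDeformableAttention"
# install_prebuilt_kernel(built, cache)
```

Remember that the .so must be built against the same libstdc++ (and Python/CUDA versions) as the target container, otherwise you get the GLIBCXX error shown earlier.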

@pspdada I think the issue you encountered is mainly due to the missing CUDA dev files. This line tells you why your build fails:

fatal error: cuda_runtime.h: No such file or directory

However, deprecation warnings like the following should not cause the build to fail:

if (value.type().is_cuda())
~~~~~~~~~~^~

I have not used conda for very long, so I am not sure whether a conda-installed CUDA is complete. However, I am quite sure that the CUDA installed by pip contains only the runtime, not the dev files.

I highly recommend using Docker with the image nvcr.io/nvidia/pytorch:24.12-py3. It has almost everything we need (CUDA runtime, CUDA dev files, PyTorch, ...).

@cainmagi
Author

I have submitted the issue at fundamentalvision/Deformable-DETR#244.

Hopefully, my tests can provide usable information.

@jmmfcoutinho

Having the same problem here! Any updates?

@qubvel
Member

qubvel commented Jan 30, 2025

I also faced an issue using DeformableAttention and am trying to figure out how it could be fixed.

In my environment, I'm getting Segmentation fault (core dumped).

Meanwhile, you can disable the custom kernels to avoid the error/warning:

# Grounding DINO 
model = AutoModelForZeroShotObjectDetection.from_pretrained(checkpoint, disable_custom_kernels=True)

@haofanwang

I met the same problem.

grounding_model = AutoModelForZeroShotObjectDetection.from_pretrained("IDEA-Research/grounding-dino-tiny")

gets stuck.

@qubvel
Member

qubvel commented Feb 3, 2025

Hey, if you're using the latest torch, it breaks compilation of the custom kernels; there is a PR to fix it:
#35979
