Debugging "CUDA kernel not found in registries for Op type" #23434
My model runs much slower in ONNX Runtime than in PyTorch. During session initialization, I get several messages of the form "CUDA kernel not found in registries for Op type".
I'm wondering if this might be the cause. Does it mean that the operations Equal, Resize, and GridSample are being executed on the CPU? If so, how can I debug this? According to https://github.com/microsoft/onnxruntime/blob/rel-1.20.0/docs/OperatorKernels.md, all of these kernels should be implemented for the CUDA execution provider.
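One way to narrow this down is to turn on verbose logging for the session and collect the op types named in those messages. The sketch below is only illustrative: `open_verbose_session` and `fallback_op_types` are hypothetical helper names, and the `: <OpType>` tail after the quoted message is an assumed log format, so the regular expression may need adjusting to match the actual output.

```python
import re

# Message prefix taken from the discussion title; the ": <OpType>" tail that
# follows it is an ASSUMED format and may differ between ONNX Runtime versions.
_FALLBACK = re.compile(r"kernel not found in registries for Op type:?\s*(\w+)")


def fallback_op_types(log_text):
    """Collect the op types named in 'kernel not found' lines, in first-seen order."""
    seen = []
    for match in _FALLBACK.finditer(log_text):
        op_type = match.group(1)
        if op_type not in seen:
            seen.append(op_type)
    return seen


def open_verbose_session(model_path):
    """Create a session that logs at the most detailed level (hypothetical helper)."""
    import onnxruntime as ort  # imported lazily so the parser above works without it

    options = ort.SessionOptions()
    options.log_severity_level = 0  # 0 = VERBOSE
    return ort.InferenceSession(
        model_path,
        sess_options=options,
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )
```

ONNX Runtime writes its log from native code, so the text passed to `fallback_op_types` would typically be captured from the console and pasted in or read from a file.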
Replies: 2 comments
Cross-posted to https://stackoverflow.com/questions/79372568/debugging-cuda-kernel-not-found-in-registries-for-op-type-in-onnxruntime for more visibility.
The issue had to do with operator versioning. I exported with opset=20, which caused operators such as GridSample to be exported at version=20. However, the CUDA provider has only implemented GridSample for version=16+, which apparently does not cover the revised version=20 definition. Re-exporting at opset=17 fixed the issue.
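For reference, a minimal sketch of pinning the opset at export time and confirming what ended up in the file. `export_at_opset` and `default_opset` are hypothetical helper names, and the model and example inputs are placeholders for whatever is being exported; only `torch.onnx.export(..., opset_version=...)` and `onnx.load(...).opset_import` are library calls.

```python
def default_opset(opset_imports):
    """Return the default ONNX domain's version from (domain, version) pairs."""
    for domain, version in opset_imports:
        # The default operator domain is stored as "" (sometimes "ai.onnx").
        if domain in ("", "ai.onnx"):
            return version
    return None


def export_at_opset(model, example_inputs, path, opset=17):
    """Export a torch.nn.Module at a pinned opset and return the opset written."""
    import onnx  # imported lazily; both packages are needed only for the export
    import torch

    torch.onnx.export(model, example_inputs, path, opset_version=opset)
    exported = onnx.load(path)
    pairs = [(entry.domain, entry.version) for entry in exported.opset_import]
    return default_opset(pairs)
```

If the returned value is higher than the version the execution provider supports for an operator, that operator is a candidate for the "kernel not found" message.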