Debugging "CUDA kernel not found in registries for Op type" #23434

axbycc-mark · 2025-01-20T19:51:12Z

axbycc-mark
Jan 20, 2025

My model runs much slower in ONNX Runtime than in PyTorch. During session initialization, I get messages like these:

2025-01-20 14:42:10.647579188 [I:onnxruntime:, cuda_execution_provider.cc:2517 GetCapability] CUDA kernel not found in registries for Op type: Equal node name: /Equal
2025-01-20 14:42:10.647694414 [I:onnxruntime:, cuda_execution_provider.cc:2517 GetCapability] CUDA kernel not found in registries for Op type: Equal node name: /Equal_1
2025-01-20 14:42:10.647722106 [I:onnxruntime:, cuda_execution_provider.cc:2517 GetCapability] CUDA kernel not found in registries for Op type: ConstantOfShape node name: /ConstantOfShape_2
2025-01-20 14:42:10.647764466 [I:onnxruntime:, cuda_execution_provider.cc:2517 GetCapability] CUDA kernel not found in registries for Op type: Equal node name: /Equal_2
2025-01-20 14:42:10.647860586 [I:onnxruntime:, cuda_execution_provider.cc:2517 GetCapability] CUDA kernel not found in registries for Op type: Equal node name: /Equal_3
2025-01-20 14:42:10.649066349 [I:onnxruntime:, cuda_execution_provider.cc:2517 GetCapability] CUDA kernel not found in registries for Op type: Resize node name: /image_encoder/image_encoder.1/Resize
2025-01-20 14:42:10.649137533 [I:onnxruntime:, cuda_execution_provider.cc:2517 GetCapability] CUDA kernel not found in registries for Op type: Resize node name: /image_encoder/image_encoder.1/Resize_1
2025-01-20 14:42:10.649199519 [I:onnxruntime:, cuda_execution_provider.cc:2517 GetCapability] CUDA kernel not found in registries for Op type: Resize node name: /image_encoder/image_encoder.1/Resize_2
2025-01-20 14:42:10.649272877 [I:onnxruntime:, cuda_execution_provider.cc:2517 GetCapability] CUDA kernel not found in registries for Op type: GridSample node name: /GridSample
2025-01-20 14:42:10.649420975 [I:onnxruntime:, cuda_execution_provider.cc:2517 GetCapability] CUDA kernel not found in registries for Op type: Equal node name: /Equal_4

I'm wondering if this might be the cause. Does it mean that the operations Equal, Resize, and GridSample are being executed on the CPU? If so, how can I debug this? Looking at https://github.com/microsoft/onnxruntime/blob/rel-1.20.0/docs/OperatorKernels.md it looks like all these kernels should be implemented for the CUDA execution provider.
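A quick way to see which op types are affected, and how many nodes of each, is to tally these log lines. Below is a minimal sketch using only the Python standard library. The function name `summarize_missing_kernels` and the shortened sample text are illustrative assumptions, not part of onnxruntime.

```python
import re
from collections import Counter

# Matches the informational lines emitted by the CUDA execution provider,
# as quoted in the log excerpt above.
PATTERN = re.compile(
    r"CUDA kernel not found in registries for Op type: (\S+) node name: (\S+)"
)


def summarize_missing_kernels(log_text):
    """Return a Counter mapping op type -> number of nodes lacking a CUDA kernel."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Shortened sample; in practice, paste the captured session log here.
    sample = (
        "[I:onnxruntime:, cuda_execution_provider.cc:2517 GetCapability] "
        "CUDA kernel not found in registries for Op type: Equal node name: /Equal\n"
        "[I:onnxruntime:, cuda_execution_provider.cc:2517 GetCapability] "
        "CUDA kernel not found in registries for Op type: Resize node name: /image_encoder/image_encoder.1/Resize\n"
        "[I:onnxruntime:, cuda_execution_provider.cc:2517 GetCapability] "
        "CUDA kernel not found in registries for Op type: Equal node name: /Equal_1\n"
    )
    for op_type, count in summarize_missing_kernels(sample).most_common():
        print(op_type, count)
```

Op types with many affected nodes, or expensive ones such as Resize and GridSample, are the first candidates to compare against the versions listed in OperatorKernels.md.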


axbycc-mark · 2025-01-20T20:24:36Z

axbycc-mark
Jan 20, 2025
Author

Cross-posted at https://stackoverflow.com/questions/79372568/debugging-cuda-kernel-not-found-in-registries-for-op-type-in-onnxruntime for more visibility.

axbycc-mark · 2025-01-21T22:42:49Z

axbycc-mark
Jan 21, 2025
Author

The issue was operator versioning. I exported with opset=20, so GridSample, for example, was exported at version=20. However, the CUDA provider has only implemented it for version=16+, and that kernel evidently does not cover the version-20 operator. Re-exporting at opset=17 fixed the issue.
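For reference, a re-export at a fixed opset might look like the sketch below. It assumes the model is a `torch.nn.Module` exported through `torch.onnx.export`. The helper name `export_at_opset`, the output path, and the example inputs are placeholders, not taken from the original model.

```python
def export_at_opset(model, example_inputs, path, opset_version=17):
    """Export a torch.nn.Module to ONNX at a fixed opset version.

    example_inputs is a tuple of example tensors matching the model's
    forward() signature (placeholder; depends on the actual model).
    """
    # Imported lazily so the helper can be defined without torch installed.
    import torch

    model.eval()
    with torch.no_grad():
        torch.onnx.export(
            model,
            example_inputs,
            path,
            # Pin the opset so operators are emitted at versions the
            # CUDA execution provider has kernels for.
            opset_version=opset_version,
        )
    return path
```

Usage would be along the lines of `export_at_opset(model, (dummy_image,), "model_opset17.onnx")`, followed by re-creating the inference session and checking that the "CUDA kernel not found" messages for the affected ops are gone.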