Cache GPU objects on second hit #1876

phaniarnab · 2023-08-08T07:27:25Z

No description provided.

This patch defers loading the cuDNN and cuSPARSE libraries until they are required. In addition, we now record the GPU memory that is unusable due to fragmentation and use that value to avoid unnecessary cudaMalloc failures.
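The deferred loading described above can be illustrated with a minimal Python sketch. It is not the code from this patch: `LazyHandle`, `GpuContext`, and the `lambda: object()` factories are hypothetical stand-ins for whatever actually creates the cuDNN and cuSPARSE handles.

```python
class LazyHandle:
    """Creates an expensive resource on first use instead of at startup."""

    def __init__(self, name, factory):
        self.name = name
        self._factory = factory  # hypothetical stand-in for real handle creation
        self._handle = None
        self.loads = 0           # how many times the factory actually ran

    def get(self):
        # Load only when a caller first asks for the handle.
        if self._handle is None:
            self._handle = self._factory()
            self.loads += 1
        return self._handle


class GpuContext:
    """Hypothetical context that registers libraries without loading them."""

    def __init__(self):
        # Nothing is loaded here; only the factories are stored.
        self.cudnn = LazyHandle("cuDNN", lambda: object())
        self.cusparse = LazyHandle("cuSPARSE", lambda: object())
```

With this shape, a workload that never needs cuSPARSE never pays for loading it, and repeated calls to `get()` return the same handle.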

This patch updates the reuse logic for GPU objects to skip the first reference and cache on the second hit. This filters out many never-repeating intermediates, which in turn reduces GPU memory pressure and the number of allocations and deallocations.
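A minimal Python sketch of a cache-on-second-hit policy follows. It is an illustration under assumptions, not the implementation in this patch: `SecondHitCache`, `get_or_compute`, and the use of string keys are all hypothetical.

```python
class SecondHitCache:
    """Retains a value only when its key is referenced a second time."""

    def __init__(self):
        self._seen = set()  # keys referenced once; their values are not kept
        self._cache = {}    # keys referenced at least twice, with values

    def get(self, key):
        # Returns the cached value, or None on a miss.
        return self._cache.get(key)

    def put(self, key, value):
        # Returns True if the value is held in the cache after this call.
        if key in self._cache:
            return True
        if key in self._seen:
            # Second reference: promote the key and keep the value.
            self._seen.discard(key)
            self._cache[key] = value
            return True
        # First reference: remember the key only, drop the value.
        self._seen.add(key)
        return False


def get_or_compute(cache, key, compute):
    """Reuse a cached value if present, otherwise compute and offer it."""
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.put(key, value)
    return value
```

Under this policy an intermediate that appears three times is computed twice and reused once, while an intermediate that appears only once never occupies cache memory. The trade-off is one extra recomputation for every repeating intermediate.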

phaniarnab added 2 commits August 8, 2023 09:23

[MINOR] Push loading Cuda libraries until when required

8033619


phaniarnab closed this in eb1a697 Aug 8, 2023

phaniarnab deleted the gpuOps3 branch August 8, 2023 09:05
