clustering_qr.kmeans_plusplus explicit tensors deletion #774

@RobertoDF RobertoDF commented Sep 3, 2024

This PR extends clear_cache to clustering_qr.kmeans_plusplus.
In one particular session I was still getting an out-of-memory (OOM) error at

vexp = 2 * Xg @ Xc.T - (Xc**2).sum(1)
With this explicit cache clearing, I can now process that session.

The error occurs with a specific Xg matrix that takes 5 GB of GPU memory. My RTX 4070 has 12 GB of memory.
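
One possible contributor to the memory peak, offered as a sketch rather than a confirmed diagnosis: in Python, `*` and `@` share the same precedence and associate left to right, so `2 * Xg @ Xc.T` is evaluated as `(2 * Xg) @ Xc.T` and first materializes a scaled copy the size of `Xg`. The example below uses small CPU tensors with assumed shapes and a hypothetical helper `nbytes` to show that scaling the product instead gives the same values. Whether this temporary is what actually triggers the OOM in this session is not established here.

```python
import torch


def nbytes(t: torch.Tensor) -> int:
    """Bytes occupied by the tensor's elements (hypothetical helper)."""
    return t.element_size() * t.nelement()


torch.manual_seed(0)
# Small stand-ins for the real matrices; the shapes are assumptions.
Xg = torch.randn(1000, 64)  # data points x features
Xc = torch.randn(10, 64)    # candidate centers x features

# As written in the PR description: parsed as (2 * Xg) @ Xc.T, so a
# scaled copy with the same size as Xg exists temporarily.
vexp_a = 2 * Xg @ Xc.T - (Xc**2).sum(1)

# Scaling the (much smaller) product avoids the Xg-sized temporary.
vexp_b = 2 * (Xg @ Xc.T) - (Xc**2).sum(1)

temp_bytes = nbytes(Xg)        # size of the (2 * Xg) temporary
result_bytes = nbytes(vexp_a)  # size of the (n_points, n_centers) result
same = torch.allclose(vexp_a, vexp_b, atol=1e-4)
```

With a 5 GB `Xg`, the temporary alone would be another 5 GB, on top of `Xg` itself and the old and new `vexp`.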

Explicitly deleting vexp and dexp slows down the loop, so it is done only when the tensor is larger than 4 GB.

dexp and vexp are reassigned on every iteration of the loop, but somehow the GPU does not free the previous tensor immediately. Deleting the variables is enough; calling torch.cuda.empty_cache() directly is not necessary and would slow down the loop further.
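
The approach described above might look roughly like the following sketch. It is not the PR's actual diff: the function name `run_loop`, the `dexp` computation, and the shapes are hypothetical, and only the 4 GB threshold and the delete-before-reassign idea come from the description. One likely reason the old tensor lingers is that in `vexp = <expr>` the right-hand side is fully evaluated while `vexp` still refers to the previous result, so both allocations are alive at the peak.

```python
import torch

THRESHOLD_BYTES = 4 * 1024**3  # 4 GB, as stated in the description


def nbytes(t: torch.Tensor) -> int:
    """Bytes occupied by the tensor's elements (hypothetical helper)."""
    return t.element_size() * t.nelement()


def run_loop(Xg, centers_per_iter, threshold=THRESHOLD_BYTES):
    """Hypothetical loop that recomputes vexp/dexp on every iteration and
    drops the previous ones first when vexp exceeds the threshold."""
    vexp = dexp = None
    n_freed = 0
    for Xc in centers_per_iter:
        if vexp is not None and nbytes(vexp) > threshold:
            # Drop the references before computing the new tensors, so the
            # old and new results are not both alive at the same time.
            del vexp, dexp
            n_freed += 1
        vexp = 2 * Xg @ Xc.T - (Xc**2).sum(1)
        # Placeholder: the real dexp computation is not shown on this page.
        dexp = torch.relu(vexp)
    return dexp, n_freed


device = "cuda" if torch.cuda.is_available() else "cpu"
torch.manual_seed(0)
Xg = torch.randn(500, 32, device=device)
centers = [torch.randn(8, 32, device=device) for _ in range(3)]

# threshold=0 only forces the deletion branch on these tiny tensors.
dexp, n_freed = run_loop(Xg, centers, threshold=0)
```

Dropping the last reference returns the block to PyTorch's caching allocator, which can reuse it for the next allocation without a call to `torch.cuda.empty_cache()`.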

Related to #746, possibly #771

@RobertoDF RobertoDF closed this Sep 3, 2024
@RobertoDF RobertoDF deleted the kmeans_plusplus_explicit_tensors_deletion branch September 4, 2024 08:08
@RobertoDF
Contributor Author

Oh! I thought drafts were only visible to me, sorry. I still have the problem, but I found a much better solution. I'll open another pull request soon.
