MLTable - AzureML - Cache Environment variables #3143
Comments
I have the same bug: data caching eats up all the space on a 64 GB disk, so I can't store training checkpoints.
You can use the workaround I used: set the DATASET_MOUNT_CACHE_SIZE environment variable to a size limit and it should work. But the underlying issue should still be fixed.
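For anyone landing here, a minimal sketch of that workaround with the Python SDK v2 (`azure-ai-ml`), assuming a command job; the source folder, environment, and compute names are placeholders, and the `-40GB` value mirrors the one discussed in this issue:

```python
from azure.ai.ml import MLClient, command
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())

job = command(
    command="python train.py",
    code="./src",  # placeholder source folder
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",  # placeholder environment
    compute="cpu-cluster",  # placeholder compute target
    environment_variables={
        # Cap the rslex mount cache; a negative value keeps that much disk free.
        "DATASET_MOUNT_CACHE_SIZE": "-40GB",
    },
)

ml_client.jobs.create_or_update(job)
```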
Operating System
Linux
Version Information
mltable-1.6.1
azureml-dataprep-rslex~=2.22.2dev0
Steps to reproduce
For example, in Azure Machine Learning:
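The original reproduction code is not included as rendered here; as an assumed, minimal stand-in, a training script that reads a mounted MLTable input (which is what fills the rslex on-disk cache) would look roughly like this, with `--data` receiving the mounted input path:

```python
# train.py -- runs inside the AzureML job.
import argparse

import mltable

parser = argparse.ArgumentParser()
parser.add_argument("--data", type=str, required=True)  # path of the mounted MLTable input
args = parser.parse_args()

tbl = mltable.load(args.data)    # load the MLTable definition from the mount point
df = tbl.to_pandas_dataframe()   # reading the data populates the mount cache on local disk
print(df.shape)
```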
To work around the issue, I need to add extra mount settings:
https://learn.microsoft.com/en-us/azure/machine-learning/how-to-read-write-data-v2?view=azureml-api-2&tabs=python#available-mount-settings
I use a wrapper class to apply these settings across multiple storage accounts and containers.
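The wrapper itself isn't shown in the issue; purely as an illustration (not the original class, and every asset name, environment, and compute target below is a placeholder), a helper that attaches the same mount settings to a job reading MLTables from several containers could look like this:

```python
from azure.ai.ml import Input, MLClient, command
from azure.ai.ml.constants import AssetTypes, InputOutputModes
from azure.identity import DefaultAzureCredential

# Mount settings to apply to every job (see the documented settings linked above).
MOUNT_SETTINGS = {
    "DATASET_MOUNT_CACHE_SIZE": "-40GB",  # keep 40 GB of disk free
}

def mltable_mounts(assets: dict) -> dict:
    """Build read-only mount inputs for several MLTable assets (input name -> azureml: path)."""
    return {
        name: Input(type=AssetTypes.MLTABLE, path=path, mode=InputOutputModes.RO_MOUNT)
        for name, path in assets.items()
    }

job = command(
    command="python train.py --a ${{inputs.container_a}} --b ${{inputs.container_b}}",
    code="./src",  # placeholder source folder
    inputs=mltable_mounts({
        "container_a": "azureml:table_from_container_a:1",  # placeholder asset
        "container_b": "azureml:table_from_container_b:1",  # placeholder asset
    }),
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",  # placeholder environment
    compute="cpu-cluster",  # placeholder compute target
    environment_variables=MOUNT_SETTINGS,
)

ml_client = MLClient.from_config(credential=DefaultAzureCredential())
ml_client.jobs.create_or_update(job)
```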
I also tried to add the environment variable directly in the YAML job definition. But none of these solutions works.
Expected behavior
I expect the disk cache to be pruned once it reaches the configured -40GB limit (a negative value meaning "leave this much disk space free") on the compute machine.
Actual behavior
Currently, the cache continues to grow until the job fails.
This happens even if I set the environment variables in the YAML job or in code.
I can confirm that the environment variables are present in the job, but it seems MLTable is ignoring them.
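For completeness, a few lines like the following inside the training script (not taken from the issue itself) show which mount-related variables actually reach the job process:

```python
import os

# Print every DATASET_MOUNT* setting visible to the job process.
for key in sorted(os.environ):
    if key.startswith("DATASET_MOUNT"):
        print(f"{key}={os.environ[key]}")
```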
Additional information
No response