[BUG] torch.OutOfMemoryError: CUDA out of memory when ingesting JSON/CSV Files #2058
Labels: bug
Pre-check
Description
When I try to ingest JSON or CSV files, CUDA always throws an OOM error during ingestion.
CUDA out of memory. Tried to allocate 21.00 GiB. GPU 0 has a total capacity of 11.72 GiB of which 9.67 GiB is free. Process 1204 has 152.00 MiB memory in use. Including non-PyTorch memory, this process has 1.89 GiB memory in use. Of the allocated memory 1.51 GiB is allocated by PyTorch, and 210.18 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
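I have not yet tried the PYTORCH_CUDA_ALLOC_CONF suggestion from the error message itself; for reference, this is roughly how I would set it when starting the server (untested, and assuming the usual PGPT_PROFILES=local make run invocation):
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True PGPT_PROFILES=local make run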
The interesting thing is that the amount it tries to allocate does not correspond to the size of the JSON.
This is a JSON file with around 800 lines; I have also tried larger ones with several thousand lines, and for those it only wants to allocate about 7 GiB of memory.
I also tried running ingestion with the "mock" profile as stated in the docs, and even disabling CUDA with
CMAKE_ARGS='-DLLAMA_CUBLAS=off' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
does not make the error go away.
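As a possible workaround that I have not tried yet, I could split the JSON into smaller files before ingesting, so each ingestion call has less text to process at once (untested sketch; data.json and the 100-record chunk size are placeholders):

# Untested sketch: split a large JSON array into smaller files so each ingest
# call has less text to embed at once. Assumes the top-level JSON value is an array.
import json
from pathlib import Path

records = json.loads(Path("data.json").read_text())
chunk = 100
for i in range(0, len(records), chunk):
    part = records[i:i + chunk]
    Path(f"data_part_{i // chunk}.json").write_text(json.dumps(part))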
I have an NVIDIA RTX 4070 with 12 GB of memory, running on Ubuntu 23.10 (without Docker) with Python 3.11.
I'm using the local profile with the default lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF model and the default nomic-ai/nomic-embed-text-v1.5 embedding model.
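My (unverified) suspicion is that the large allocation comes from the PyTorch embedding step (the nomic model) rather than from llama.cpp, which would explain why rebuilding llama-cpp-python without cuBLAS changes nothing. A minimal sketch to test the embedding model in isolation, assuming it can be loaded through sentence-transformers as described on its model card (the file name and single-batch encode are placeholders):

# Untested sketch: embed the whole file as one batch on the GPU to check whether
# the embedding step alone reproduces a similar torch.OutOfMemoryError.
# "data.json" is a placeholder for the file that fails to ingest.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5",
                            trust_remote_code=True, device="cuda")
with open("data.json") as f:
    text = f.read()
embeddings = model.encode([text], batch_size=1)
print(embeddings.shape)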
Steps to Reproduce
1. Run PrivateGPT with the local profile (default Meta-Llama-3.1-8B-Instruct-GGUF and nomic-embed-text-v1.5 models).
2. Ingest a JSON or CSV file (e.g. one with around 800 lines).
3. The CUDA out-of-memory error above is thrown during ingestion.
Expected Behavior
The JSON file is ingested without errors.
Actual Behavior
Throws a CUDA out-of-memory error.
Environment
NVIDIA RTX 4070 12GB on Ubuntu 23.10
Additional Information
No response
Version
0.6.?
Setup Checklist
NVIDIA GPU Setup Checklist
NVIDIA drivers are installed and working (run nvidia-smi to verify).
Docker can access the GPU (sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi).