[BUG] torch.OutOfMemoryError: CUDA out of memory when ingesting JSON/CSV Files #2058
Labels: bug
Pre-check
Description
When I try to ingest JSON or CSV files, CUDA always throws an OOM error during ingestion.
CUDA out of memory. Tried to allocate 21.00 GiB. GPU 0 has a total capacity of 11.72 GiB of which 9.67 GiB is free. Process 1204 has 152.00 MiB memory in use. Including non-PyTorch memory, this process has 1.89 GiB memory in use. Of the allocated memory 1.51 GiB is allocated by PyTorch, and 210.18 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
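I have not yet tried the PYTORCH_CUDA_ALLOC_CONF suggestion from the error message itself; for reference, this is roughly how I would set it when starting the server (untested, and assuming the usual PGPT_PROFILES=local make run invocation):
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True PGPT_PROFILES=local make run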
The interesting thing is that the amount it tries to allocate does not correspond to the size of the JSON.
This is a JSON file with around 800 lines; I have also tried larger ones with several thousand lines, and for those it only wants to allocate about 7 GiB of memory.
I also tried running ingestion with the "mock" profile as stated in the docs, and even disabling CUDA with
CMAKE_ARGS='-DLLAMA_CUBLAS=off' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
does not make the error go away.
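As a possible workaround that I have not tried yet, I could split the JSON into smaller files before ingesting, so each ingestion call has less text to process at once (untested sketch; data.json and the 100-record chunk size are placeholders):

# Untested sketch: split a large JSON array into smaller files so each ingest
# call has less text to embed at once. Assumes the top-level JSON value is an array.
import json
from pathlib import Path

records = json.loads(Path("data.json").read_text())
chunk = 100
for i in range(0, len(records), chunk):
    part = records[i:i + chunk]
    Path(f"data_part_{i // chunk}.json").write_text(json.dumps(part))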
I have an NVIDIA RTX 4070 with 12 GB of memory, running on Ubuntu 23.10 (without Docker) with Python 3.11.
I'm using the local profile with the default lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF model and the default nomic-ai/nomic-embed-text-v1.5 embedding model.
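My (unverified) suspicion is that the large allocation comes from the PyTorch embedding step (the nomic model) rather than from llama.cpp, which would explain why rebuilding llama-cpp-python without cuBLAS changes nothing. A minimal sketch to test the embedding model in isolation, assuming it can be loaded through sentence-transformers as described on its model card (the file name and single-batch encode are placeholders):

# Untested sketch: embed the whole file as one batch on the GPU to check whether
# the embedding step alone reproduces a similar torch.OutOfMemoryError.
# "data.json" is a placeholder for the file that fails to ingest.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5",
                            trust_remote_code=True, device="cuda")
with open("data.json") as f:
    text = f.read()
embeddings = model.encode([text], batch_size=1)
print(embeddings.shape)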
Steps to Reproduce
1. Run PrivateGPT with the local profile (default Meta-Llama-3.1-8B-Instruct-GGUF and nomic-embed-text-v1.5 models).
2. Ingest a JSON or CSV file (e.g. one with around 800 lines).
3. The CUDA out-of-memory error above is thrown during ingestion.
Expected Behavior
The JSON file is ingested without errors.
Actual Behavior
Throws a CUDA out-of-memory error.
Environment
NVIDIA RTX 4070 12GB on Ubuntu 23.10
Additional Information
No response
Version
0.6.?
Setup Checklist
NVIDIA GPU Setup Checklist
NVIDIA drivers are installed and working (run nvidia-smi to verify).
Docker can access the GPU (sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi).