I have been trying to ingest about 1000 PDFs through PGPT. After testing, I found that pipeline mode with 1 worker is the fastest option on my system (adding more workers slows things down). However, the 8 GB VRAM and 32 GB (out of 64 GB) of shared memory on my system quickly get occupied even if I only ingest 10 PDFs at a time. I tried to circumvent the memory hogging by restarting the pipeline each time. Below is the chunking solution I built using LocalIngestWorker from ingest_folder.py.
files = get_list_of_combined_files(folders)
print(len(files))

# Split the file list into batches of 10.
def split_into_chunks(lst, n):
    return [lst[i:i + n] for i in range(0, len(lst), n)]

chunks = split_into_chunks(files, 10)

for index, chunk in enumerate(chunks):
    print("Chunk number", index, "of", len(chunks))
    destination = r"\Temp\\"
    copy_new_files(destination, chunk)
    # Build a fresh worker per batch, hoping its memory is released afterwards.
    ingest_service = global_injector.get(IngestService)
    settings = global_injector.get(Settings)
    worker = LocalIngestWorker(ingest_service, settings)
    worker.ingest_folder(Path(destination), ignored)
    del worker
    del ingest_service
    del settings
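Since deleting the worker inside the same process did not help me, another way to actually "restart the pipeline every time" is to run each batch in a fresh child process, so the OS reclaims all of its memory (including any GPU allocations) when the process exits. A minimal sketch; the child's placeholder code here just reports the batch size, and in the real script it would copy the files and call LocalIngestWorker.ingest_folder instead:

```python
import json
import subprocess
import sys

def ingest_batch_in_subprocess(chunk):
    """Run one batch in a fresh Python process; all memory the child
    allocates is returned to the OS when it exits."""
    # Placeholder child script: the real one would build the worker and
    # ingest the copied files. Here it just echoes the batch size.
    child_code = (
        "import json, sys\n"
        "chunk = json.load(sys.stdin)\n"
        "print(json.dumps(len(chunk)))\n"
    )
    proc = subprocess.run(
        [sys.executable, "-c", child_code],
        input=json.dumps(chunk), capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)

def ingest_all(files, batch_size=10):
    # One subprocess per batch; slow, but memory resets between batches.
    return [ingest_batch_in_subprocess(files[i:i + batch_size])
            for i in range(0, len(files), batch_size)]
```

The per-batch process startup (and model reload) costs time, but it sidesteps any in-process leak entirely.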
However, this does not release the memory at the end of the for loop, and the same problem persists (I even tried del with no luck). I searched around for potential memory leak issues with the huggingface text embeddings solution and found this memory leak issue.

Is it just me, or is anyone else facing the same issue with the pipeline ingest mode and huggingface embeddings on an NVIDIA GPU? I would appreciate any solutions or suggestions.
I have found a workaround: ingest 5 PDFs at a time, then clear the torch CUDA cache and restart the process (pipeline mode, mock profile, huggingface embedding model). It is slow, but it works; the memory is reset after every batch. It takes time to write the results to the database, and the GPU is idle in the meantime, but it is the most efficient approach I could find on my hardware. I added the following at the end of my code, adapted from ingest_folder.py.
import gc

import torch

# ... at the end of each batch:
del worker
del settings
del ingest_service
with torch.no_grad():
    torch.cuda.empty_cache()
gc.collect()
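One note on why the del statements matter: torch.cuda.empty_cache() only returns cached, unused blocks to the driver; it cannot free tensors that are still referenced. A minimal CPU-only sketch (with a hypothetical FakeWorker standing in for LocalIngestWorker) showing how to verify that the objects really do get collected after del:

```python
import gc
import weakref

class FakeWorker:
    """Hypothetical stand-in for LocalIngestWorker, just to observe collection."""
    pass

worker = FakeWorker()
# A weak reference lets us check liveness without keeping the object alive.
ref = weakref.ref(worker)

del worker
gc.collect()  # break any remaining reference cycles

print(ref() is None)  # → True once no references remain
```

If `ref()` still returns the object after this, something else (a cache, a closure, a global) is holding a reference, and no amount of empty_cache() will release that memory.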