Replies: 4 comments 6 replies
-
To persist a keyword index and load it back, consider Elasticsearch. Elasticsearch handles large datasets efficiently and avoids the limitations and potential issues of pickle-based persistence [1].
-
What if I don't want to use Elasticsearch?
-
The bottleneck is in `texts_processed` and `vectorizer = BM25Okapi(texts_processed, **bm25_params)` inside the `from_texts` method. You can persist the fitted vectorizer and avoid re-processing the texts on every load.
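A stdlib-only sketch of that idea, using a toy inverted index as a stand-in for the fitted `BM25Okapi` vectorizer (in practice you would pickle the `BM25Okapi` object itself): build the index once, persist the fitted object, and reload it instead of re-tokenizing the corpus.

```python
# Sketch of the "persist the fitted vectorizer" idea from the comment above.
# ToyIndex stands in for rank_bm25.BM25Okapi; the persistence pattern
# (build once, dump, reload) is the same.
import os
import pickle
import tempfile

class ToyIndex:
    """Stand-in for an expensive-to-build keyword index (e.g. BM25Okapi)."""

    def __init__(self, texts):
        # Expensive step: tokenize every document once at build time.
        self.postings = {}
        for doc_id, text in enumerate(texts):
            for token in text.lower().split():
                self.postings.setdefault(token, set()).add(doc_id)

    def get_doc_ids(self, query):
        """Return ids of documents containing every query token."""
        sets = [self.postings.get(t, set()) for t in query.lower().split()]
        return set.intersection(*sets) if sets else set()

texts = ["the quick brown fox", "a lazy brown dog", "the quick dog"]

# Build once, persist, and reload instead of re-processing the corpus.
index = ToyIndex(texts)
path = os.path.join(tempfile.mkdtemp(), "index.pkl")
with open(path, "wb") as f:
    pickle.dump(index, f)
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(sorted(loaded.get_doc_ids("quick dog")))  # -> [2]
```

This avoids the repeated tokenization cost, though it still uses pickle for the dump itself; the bm25s approach mentioned below in the thread replaces that with on-disk arrays.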
-
Hi, I opened PR #28123, which adds an alternative, persistable implementation based on bm25s.
-
Description
I am working on a hybrid search implementation and use LangChain's BM25 retriever for the keyword-search side. However, this does not scale to large datasets because persistence relies on pickle. What's the right way to persist a keyword index?
System Info
langchain==0.2.15
langchain-aws==0.1.7
langchain-community==0.2.10
langchain-core==0.2.36
langchain-google-vertexai==1.0.8
langchain-milvus==0.1.4
langchain-openai==0.1.23
langchain-text-splitters==0.2.2