-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add multi vector support #11
feat: add multi vector support #11
Conversation
Hi @omriel1 thanks for the PR! FYI, the reviewer @zc277584121 is out for a week. Is this PR urgent? Would prefer to wait till the reviewer is back, or we could ask @efriis to take a look first. |
Hi! Thanks for the reply. This is important for us (it was delayed due to the repo separation), but I completely understand and agree that we should wait for @zc277584121. But in the meantime, if you or @efriis have any comments, I'd be happy to start addressing them! Thanks! |
dense_embeddings_func_2, | ||
], | ||
texts=fake_texts, | ||
connection_args={"uri": "./milvus_demo.db"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this uri be passed in the same way as ut above using temp_milvus_db? Just for the code consistency
embedding=[FakeEmbeddings(), FakeEmbeddings()], | ||
index_params=[index_param_1, index_param_2], | ||
vector_field=["vec_field_1", "vec_field_2"], | ||
connection_args={"uri": "./milvus_demo.db"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
embedding=[embedding_1, embedding_2], | ||
texts=fake_texts, | ||
index_params=[index_param_1, index_param_2], | ||
connection_args={"uri": "./milvus_demo.db"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
for i, embeddings_func in enumerate(embeddings_functions): | ||
if not self._get_index(vector_fields[i]): | ||
try: | ||
# If no index params, use a default HNSW based one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should keep the comment If no index params, use a default *AutoIndex* based one
unchanged
using=self.alias, | ||
) | ||
|
||
# If default did not work, most likely on Zilliz Cloud |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
|
||
# If default did not work, most likely on Zilliz Cloud | ||
except MilvusException: | ||
# Use AUTOINDEX based index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
@omriel1 sorry for delay. thanks for your great contributing. and i have left some small comments here. all others LGTM |
Description
This PR introduces multi-vector support!
@efriis @zc277584121 I've moved and modified the PR previously submitted to the main langhcain repo: langchain-ai/langchain#26500
Milvus 2.4 introduced the option for multi-vector support, which is becoming increasingly popular, especially for use cases like hybrid search (dense + sparse embeddings).
Lately, @ohadeytan introduced the option to use sparse embeddings in this PR.
Additionally, @zc277584121 already introduced the
MilvusCollectionHybridSearchRetriever
, which enables hybrid search against pre-defined collections directly via pymilvus.However, this method doesn't take full advantage of the many useful features offered by
langchain_milvus
when building a collection: automatic schema creation, indexing and search parameter creation etc.This PR intend to make developers life easier, by allowing them to use single-vector or multi-vector with a single
langchain
interface, that create, connect, and searchMilvus
. For example, at IBM and IBM Research this feature is requested by many of our developers and researchers and will be very useful for us.Changes
This PR addresses the limitations described above by introducing the following changes:
We are eager to have this merged into
langchain-milvus
as we utilize many of langchain features, particularlylangchain-milvus
. We want to continue benefiting from the many valuable features this package provides. We'll make any required changes and will be glad to get any guidance to make it happen!Twitter handle: @EliyahuOmri, @ohadeytan