How to change FAISS distance_strategy ? #16224
So, currently I have a vector object
After reading the documentation, I know that FAISS uses Euclidean distance by default. I want to experiment with Cosine and Inner Product. I also saw this in faiss.py:
How can I change the distance_strategy before initializing it using from_documents? As I understand, once you initialize it, you cannot change the distance strategy anymore. Is that correct?
Replies: 5 comments 11 replies
🤖 You're correct in your understanding that the distance strategy is set when the FAISS object is initialized. Here is an example of how you can do this:
from langchain_community.embeddings.openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.vectorstores.faiss import DistanceStrategy
embeddings = OpenAIEmbeddings()
texts = ["FAISS is an important library", "LangChain supports FAISS"]
# Initialize the FAISS object with a different distance strategy
faiss = FAISS(
    embedding_function=embeddings,
    index=<your_index>,
    docstore=<your_docstore>,
    index_to_docstore_id=<your_index_to_docstore_id>,
    distance_strategy=DistanceStrategy.MAX_INNER_PRODUCT,  # Change this to your desired strategy
)
Please replace <your_index>, <your_docstore>, and <your_index_to_docstore_id> with your own objects. You can find more details about this in the FAISS vector store source code. I hope this helps! If you have any other questions, feel free to ask.
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
🤖 To change the distance strategy, pass it in when you initialize the FAISS object:
from langchain_community.embeddings.openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.vectorstores.faiss import DistanceStrategy
embeddings = OpenAIEmbeddings()
texts = ["FAISS is an important library", "LangChain supports FAISS"]
# Initialize your index, docstore, and index_to_docstore_id here
your_index = ...
your_docstore = ...
your_index_to_docstore_id = ...
faiss = FAISS(
    embedding_function=embeddings,
    index=your_index,
    docstore=your_docstore,
    index_to_docstore_id=your_index_to_docstore_id,
    distance_strategy=DistanceStrategy.MAX_INNER_PRODUCT,  # Change the distance strategy here
)
In the above code, replace your_index, your_docstore, and your_index_to_docstore_id with your own objects. Regarding your second question, here is the method signature of from_embeddings:
@classmethod
def from_embeddings(
    cls,
    text_embeddings: Iterable[Tuple[str, List[float]]],
    embedding: Embeddings,
    metadatas: Optional[Iterable[dict]] = None,
    ids: Optional[List[str]] = None,
    **kwargs: Any,
) -> FAISS:
The method first separates the texts and their embeddings from the text_embeddings argument, then passes them to __from along with the remaining arguments. Here is the relevant code:
texts = [t[0] for t in text_embeddings]
embeddings = [t[1] for t in text_embeddings]
return cls.__from(
    texts,
    embeddings,
    embedding,
    metadatas=metadatas,
    ids=ids,
    **kwargs,
)
This method is useful when you already have precomputed embeddings for your documents and you want to create a FAISS vector store from them.
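To make the input shape concrete, here is a minimal sketch of the split step on its own. The texts and vectors are made-up values for illustration, and no LangChain or FAISS call is involved.

```python
from typing import List, Tuple

# Hypothetical precomputed (text, vector) pairs; the numbers are made up.
text_embeddings: List[Tuple[str, List[float]]] = [
    ("FAISS is an important library", [0.1, 0.2, 0.3]),
    ("LangChain supports FAISS", [0.4, 0.5, 0.6]),
]

# Same split as in the snippet above: the first element of each pair
# is the text, the second is its embedding vector.
texts = [t[0] for t in text_embeddings]
embeddings = [t[1] for t in text_embeddings]
```

The two resulting lists stay aligned by position, which is what lets the store map each vector back to its text.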
🤖 To change the distance strategy, pass it as a keyword argument when you create the vector store:
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores.faiss import DistanceStrategy
embeddings = OpenAIEmbeddings()
faiss = FAISS.from_texts(texts, embeddings, distance_strategy=DistanceStrategy.MAX_INNER_PRODUCT)
In the above example, the distance strategy is passed to from_texts as a keyword argument. Please note that the variable names used in the actual implementation might be different. You can find more details in the FAISS implementation in the LangChain repository.
@eyurtsev hello. I'd like to ask a follow up question. the is there a reason for that? am I missing something in my implementation? thanks. this is my code snippet
When using the OpenAI Embedding model, the vectors are normalized after the sentences have been embedded.
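One consequence of normalization is worth spelling out: for unit-length vectors, squared Euclidean distance equals 2 - 2 * (inner product), and cosine similarity equals the inner product, so all three metrics rank neighbours in the same order. The sketch below checks this on random vectors using only the standard library. It assumes the embeddings really are unit length, as the comment above states, and the helper names are made up for illustration.

```python
import math
import random
from typing import List


def normalize(v: List[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a: List[float], b: List[float]) -> float:
    """Inner product; equals cosine similarity for unit vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2(a: List[float], b: List[float]) -> float:
    """Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


random.seed(0)
query = normalize([random.gauss(0, 1) for _ in range(8)])
docs = [normalize([random.gauss(0, 1) for _ in range(8)]) for _ in range(5)]

# Nearest first: smallest distance vs. largest inner product.
by_l2 = sorted(range(len(docs)), key=lambda i: l2(query, docs[i]))
by_ip = sorted(range(len(docs)), key=lambda i: -dot(query, docs[i]))

# Both orderings agree because ||a - b||^2 = 2 - 2 * (a . b) for unit vectors.
same_ranking = by_l2 == by_ip
```

If the vectors were not normalized, the identity would not hold and the rankings could differ.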
from_documents calls from_texts, and from_texts calls __from. I think @dosu-bot got this one correct. Pass in the distance strategy.
FWIW changing the distance strategy is unlikely to make a dramatic improvement in the quality of RAG.
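The call chain described above can be sketched with a toy class. This is not LangChain's code: ToyStore, its fields, and the dict-based documents are all hypothetical stand-ins, meant only to show how a keyword argument given to from_documents travels through from_texts and __from into the constructor.

```python
from typing import Any, List


class ToyStore:
    """Toy stand-in for a vector store; not LangChain's actual implementation."""

    def __init__(self, texts: List[str], distance_strategy: str = "EUCLIDEAN_DISTANCE") -> None:
        self.texts = texts
        self.distance_strategy = distance_strategy

    @classmethod
    def __from(cls, texts: List[str], **kwargs: Any) -> "ToyStore":
        # Innermost builder: extra keyword arguments reach the constructor here.
        return cls(texts, **kwargs)

    @classmethod
    def from_texts(cls, texts: List[str], **kwargs: Any) -> "ToyStore":
        # Forwards every extra keyword argument unchanged.
        return cls.__from(texts, **kwargs)

    @classmethod
    def from_documents(cls, documents: List[dict], **kwargs: Any) -> "ToyStore":
        # Pulls the text out of each document, then delegates to from_texts.
        texts = [doc["page_content"] for doc in documents]
        return cls.from_texts(texts, **kwargs)


docs = [{"page_content": "FAISS is an important library"}]
store = ToyStore.from_documents(docs, distance_strategy="MAX_INNER_PRODUCT")
default_store = ToyStore.from_documents(docs)
```

Because each layer forwards **kwargs, the strategy given at the outermost call is the one the object is built with, and omitting it falls back to the constructor default.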