Replies: 23 comments 2 replies
-
Right, it appears to be an implementation gap. I will push an update shortly. Thanks for raising the issue!
-
@danpak6 the version that supports querying with metadata filtering has been published: https://www.nuget.org/packages/Pinecone.NET/1.2.0

    using var index = await pinecone.GetIndex("my-test-index");
    var filter = new MetadataMap
    {
        ["genre"] = "horror",
        ["year"] = 1963
    };
    var scored = await index.Query("birds", topK: 5, filter);

or

    var scored = await index.Query("birds", topK: 5, new()
    {
        ["genre"] = "horror",
        ["year"] = 1963
    });

Standard limitations / rules for working with Pinecone's metadata type apply.
-
Hi @neon-sunset, thank you for the quick response.
FYI, per the Pinecone docs, queries support both vectors and metadata filtering.
-
Pinecone allows filtering by metadata only, and you should just pass JSON for the metadata, as in the Pinecone example:

    {
        "genre": {"$eq": "documentary"},
        "year": 2019
    }

    index.query(
        filter={
            "genre": {"$eq": "documentary"},
            "year": 2019
        },
        top_k=1,
        include_metadata=True
    )
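To make the filter semantics concrete, here is a minimal local sketch in plain Python. It assumes, based on the example above, that a bare value such as `"year": 2019` is shorthand for `{"$eq": 2019}`. The `matches` function is hypothetical and only illustrates how such a filter could be read; it is not how Pinecone evaluates filters, and it covers only `$eq`, `$in`, `$and` and `$or`.

```python
def matches(metadata, flt):
    """Return True if a metadata dict satisfies a MongoDB-style filter dict.

    Hypothetical local helper for illustration only.
    """
    for key, cond in flt.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in cond):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in cond):
                return False
        else:
            # Assumption: a bare value is shorthand for {"$eq": value}.
            if not isinstance(cond, dict):
                cond = {"$eq": cond}
            value = metadata.get(key)
            for op, operand in cond.items():
                if op == "$eq":
                    ok = value == operand
                elif op == "$in":
                    ok = value in operand
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
    return True


doc = {"genre": "documentary", "year": 2019}
explicit = {"genre": {"$eq": "documentary"}, "year": {"$eq": 2019}}
shorthand = {"genre": {"$eq": "documentary"}, "year": 2019}
same_result = matches(doc, explicit) == matches(doc, shorthand)
```

Under that assumption, the explicit and shorthand forms select the same records, which is why both styles can appear in the same filter.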
-
I don't think that's true. The metadata declaration format in C# is subject to a strong type system, which makes it somewhat clunkier to work with compared to JS or Python, therefore an expression

    var filter = new MetadataMap
    {
        ["$and"] = new MetadataValue[]
        {
            new MetadataMap { ["genre"] = "comedy" },
            new MetadataMap { ["genre"] = "documentary" }
        }
    };

MongoDB-style conditionals just happen to hit the worst-case scenario when it comes to syntax. If you look at how the
-
I understand your point, and I wanted to provide an explanation of the Pinecone SDK's functionality.
-
Thanks, I'm going to close the issue as done. Should you have any questions, feel free to open a new one or start a discussion (enabled now) 😄
Closed by #9
-
I've been searching for a C# version for quite some time, and you're the first one to provide it. This is fantastic! In fact, I had to create an Azure Function just to be able to interact with Pinecone. Keep up the good work! 👍
-
One more thing: how can I filter all the metadata without the Id?
-
I'm not sure; after briefly double-checking the official documentation, there does not seem to be an apparent way to do so. It is a similarity search after all. Theoretically speaking, you could apply a metadata filter and then provide a dummy vector (i.e.
-
I am using this in my Azure Function with the Pinecone Python SDK.

Query an index

    import pinecone

    pinecone.init(api_key="YOUR_API_KEY", environment="us-west1-gcp")
    index = pinecone.Index("example-index")
    query_response = index.query(
        namespace="example-namespace",
        top_k=10,
        include_values=True,
        include_metadata=True,
        vector=[0.1, 0.2, 0.3, 0.4],
        filter={
            "genre": {"$in": ["comedy", "documentary", "drama"]}
        }
    )
-
Using namespaces vs. metadata filtering
-
This would directly translate to C# code with this library:

    using Pinecone;

    using var pinecone = new PineconeClient("YOUR_API_KEY", "us-west1-gcp");
    using var index = await pinecone.GetIndex("example-index");
    var queryResponse = await index.Query(
        new[] { 0.1f, 0.2f, 0.3f, 0.4f },
        topK: 10,
        filter: new()
        {
            ["genre"] = new MetadataMap { ["$in"] = new MetadataValue[] { "comedy", "documentary", "drama" } }
        },
        indexNamespace: "example-namespace",
        includeMetadata: true);

the only "creative liberty" I took is
-
Your code example has
-
Pinecone allows two modes of querying: by vector values and by the id of an existing vector.
The "birds" example assumes the index already has a vector with id "birds". This is just to showcase what methods the library has (and that they match the official API).
-
If I have a KB with metadata like tenantId A and B, and I want to filter documents for tenant A, how can I do that without having to use namespacing?
-
I'm not sure, but this is probably best answered either by the Pinecone documentation (see the examples section) or the community portal. I'm still learning vector DBs myself and wrote this library mostly because there was none in C#, which I love and wanted to use over Python 🙂
-
Got it. It will be hard to find or partition data in Pinecone without the metadata, especially if anybody is using it for semantic search and needs to reference a document or page.
-
Have you checked https://docs.pinecone.io/docs/semantic-text-search ? It does seem like namespacing would be a relatively safe and logical choice to achieve tenant separation. You could try tagging vectors with a metadata value such as "tenant": "tenantName", but such an approach does not seem to be safe enough and would run into scaling issues, same as with namespacing. If I were to implement a knowledge base, I would probably (depending on the estimated dataset size) choose to use per-tenant indexes instead. That way, for example, you can archive data by snapshotting it into a collection and then terminating the index, etc.
-
Here is sample code that I am using with the Pinecone SDK and Azure Functions. It's a KB system, and I have several companies uploading their documentation.
I need to filter public documents, stored as '00000000-0000-0000-0000-000000000000', and their private docs, stored as 'XXXXXX-XXXXXXX'. Below is the code.

    def search_company_documents(index, text, company_id, folder_ids, top_k=50, include_metadata=True):
        # Match both public documents (all-zero id) and the company's private ones.
        company_ids = ['00000000-0000-0000-0000-000000000000', company_id]
        search_response = index.query(
            top_k=top_k,
            vector=vectorize_query(text),
            filter={'company_id': {'$in': company_ids}, 'folder_id': {'$in': folder_ids}},
            include_metadata=include_metadata)
        return search_response['matches']
-
The above code is expressible in C#, feel free to refer to one of the previous examples on how to work with metadata. Converting this to a discussion because the feature support has been solved by 1.2.0.
-
@neon-sunset As I mentioned earlier, the library is great and I hope it will continue to evolve. For now, I will continue to use the Pinecone SDK. Good luck!
-
@neon-sunset First of all, thank you for all the help and the new functionality.

    using Pinecone;

    using var pinecone = new PineconeClient("YOUR_API_KEY", "us-west1-gcp");
    using var index = await pinecone.GetIndex("example-index");
    List<Guid> ids = Enumerable.Range(1, 10).Select(x => Guid.NewGuid()).ToList();
    await index.Delete(
        filter: new()
        {
            ["BookID"] = new MetadataMap { ["$in"] = new MetadataValue[] { ?? } }
        }
    );
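One possible way to think about the `??` placeholder: Pinecone metadata values are strings, numbers, booleans, or lists of strings, so each id would need to be converted to its string form before it goes into `$in`. The sketch below shows that shape with plain Python dicts; `build_in_filter` is a hypothetical helper, not part of any SDK. In C#, the equivalent would presumably be mapping each `Guid` to a string and relying on the string-to-`MetadataValue` conversion that the earlier examples use, but that is an assumption and has not been checked against the library.

```python
import uuid


def build_in_filter(field, ids):
    """Build a {field: {"$in": [...]}} filter, stringifying each id.

    Hypothetical helper for illustration; not part of the Pinecone SDK.
    """
    return {field: {"$in": [str(i) for i in ids]}}


# Ten random ids, mirroring the Enumerable.Range(1, 10) example above.
book_ids = [uuid.uuid4() for _ in range(10)]
delete_filter = build_in_filter("BookID", book_ids)
```

The resulting dict has the same structure as the `$in` filters used in the earlier query examples, with every element a plain string.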
-
How can I query by metadata filter?
https://docs.pinecone.io/docs/metadata-filtering
Thank you!