This is a Kafka sink connector for Milvus. It allows you to stream vector data from Kafka to Milvus.
The current version supports connections from Confluent Cloud (hosted Kafka) and open-source Kafka to Milvus (self-hosted or Zilliz Cloud).
Zilliz Cloud and Milvus are vector databases where you can ingest, store and search vector data. An entity in Zilliz Cloud or Milvus contains one vector field and multiple scalar fields such as string, integer and float. To stream data, the Kafka message schema must match the schema of the Milvus collection, containing one and only one vector field, and zero to many scalar fields. The name of each field must be exactly the same on both sides.
In this quick start guide, we show how to set up Confluent Cloud and Zilliz Cloud to ingest vector data. For self-hosted Milvus, the setup is similar.
- If you want to use this connector with open source Kafka, see README_OSS.md
Complete the following steps to download the kafka-connect-milvus plugin.
- Download the latest plugin zip file `zilliz-kafka-connect-milvus-xxx.zip` from here.
Ensure you have Confluent Cloud and Zilliz Cloud set up and properly configured.
- If you don't already have a topic in Confluent Cloud, create a topic (e.g. `topic_0`) in Confluent Cloud.
- If you don't already have a collection in Zilliz Cloud, create a collection with a vector field (in this example the vector has `dimension=8`). You can use the following example schema on Zilliz Cloud:
Note: Make sure the schemas on both sides match. The schema must contain exactly one vector field, and the field names must be exactly the same on both sides.
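The matching rule can be checked locally before any data is produced. The following is a minimal sketch, not part of the connector: `COLLECTION_SCHEMA` and `check_message` are hypothetical names, the field names are taken from the sample message later in this guide, and the field kinds are assumptions. It also treats extra message fields as a mismatch, which may be stricter than the connector itself.

```python
from numbers import Real

# Hypothetical description of the collection schema (field name -> kind).
# Field names follow the sample message in this guide; the kinds are assumed.
COLLECTION_SCHEMA = {
    "id": {"kind": "int"},
    "title": {"kind": "string"},
    "title_vector": {"kind": "float_vector", "dim": 8},
    "link": {"kind": "string"},
}


def check_message(message, schema):
    """Return a list of problems; an empty list means the message matches."""
    problems = []

    # The schema must contain exactly one vector field.
    vector_fields = [n for n, f in schema.items() if f["kind"] == "float_vector"]
    if len(vector_fields) != 1:
        problems.append(
            f"schema must have exactly one vector field, found {len(vector_fields)}"
        )

    # Field names must be identical on both sides.
    missing = sorted(set(schema) - set(message))
    extra = sorted(set(message) - set(schema))
    if missing:
        problems.append(f"missing fields: {missing}")
    if extra:
        problems.append(f"unexpected fields: {extra}")

    # The vector must be a list of numbers with the expected dimension.
    for name in vector_fields:
        if name not in message:
            continue
        value = message[name]
        dim = schema[name]["dim"]
        is_numeric_list = isinstance(value, list) and all(
            isinstance(x, Real) and not isinstance(x, bool) for x in value
        )
        if not is_numeric_list:
            problems.append(f"{name} must be a list of numbers")
        elif len(value) != dim:
            problems.append(f"{name} has dimension {len(value)}, expected {dim}")

    return problems
```

Calling `check_message(message, COLLECTION_SCHEMA)` on a parsed Kafka message returns an empty list when the field names and the vector dimension agree with the assumed schema.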
- Go to the Connectors section in your Confluent Cloud cluster.
- Click on `Add Plugin`.
- Upload the `zilliz-kafka-connect-milvus-xxx.zip` file you downloaded in Step 1.
- Fill in Connector plugin details
  - Set `com.milvus.io.kafka.MilvusSinkConnector` as Connector class.
  - For Sensitive properties, fill in the config fields you want to hide, such as the `token` field. Please see custom-connector for more details.
- Go to the `Connectors` section in your Confluent Cloud cluster.
- Click on `Get Started`.
- Enter Confluent Cloud API Key and Secret
- Add Configuration for the connector

  Provide the Milvus connector config as follows:

  {
    "public.endpoint": "https://<public.endpoint>:port",
    "token": "*****************************************",
    "collection.name": "topic_0",
    "topics": "topic_0"
  }

  - The token field is either the API token or `<username>:<password>`, depending on the instance type of your collection in Milvus or Zilliz Cloud.
- Enter Connection endpoints, which is the public endpoint of your Milvus or Zilliz Cloud instance, for example `in01-034b8444ab99cab.aws-us-west-2.vectordb.zillizcloud.com`
- Choose size, the number of tasks to be run
- Review and launch
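The four keys of the connector config shown in the steps above can be assembled and sanity-checked in a script before pasting them into Confluent Cloud. This is a rough sketch under stated assumptions: `build_connector_config` and `redact` are hypothetical helpers, and the checks (non-empty values, an `http` or `https` endpoint with a host) are illustrative rather than rules enforced by the connector.

```python
import json
from urllib.parse import urlparse

# The four config keys used in this guide.
REQUIRED_KEYS = ("public.endpoint", "token", "collection.name", "topics")


def build_connector_config(endpoint, token, collection, topic):
    """Assemble the connector config and run a few illustrative checks."""
    config = {
        "public.endpoint": endpoint,
        "token": token,  # API token or "<username>:<password>"
        "collection.name": collection,
        "topics": topic,
    }
    empty = [key for key in REQUIRED_KEYS if not config[key]]
    if empty:
        raise ValueError(f"empty config values: {empty}")

    # Assumption: the endpoint is a URL with a scheme and a host.
    parsed = urlparse(endpoint)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError("public.endpoint should look like https://<host>:<port>")
    return config


def redact(config):
    """Return a copy that is safe to print or log (token masked)."""
    safe = dict(config)
    safe["token"] = "*" * 8
    return safe


if __name__ == "__main__":
    # The endpoint and credentials below are placeholders, not real values.
    cfg = build_connector_config(
        "https://example-endpoint.invalid:443", "user:password", "topic_0", "topic_0"
    )
    print(json.dumps(redact(cfg), indent=2))
```

Masking the token before printing mirrors the reason for listing `token` under Sensitive properties: the value should not appear in logs or screenshots.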
The connector supports JSON, JSON Schema, Avro, and Protobuf formats for Kafka topics. If your topic uses a format other than plain JSON, choose the converter type in `Auto Configure Schema Registry`.
- For a sample config for a Kafka topic with a schema, check here for more details.
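For orientation, the sketch below maps each supported format to the standard Kafka Connect converter class that is commonly used for it. The class names are the stock Apache Kafka and Confluent converters; whether this connector expects exactly these settings is an assumption, and on Confluent Cloud the `Auto Configure Schema Registry` option may set them for you. The helper name `converter_settings` is hypothetical, and the linked sample config remains the authoritative reference.

```python
# Standard Kafka Connect converter classes, keyed by topic format.
CONVERTERS = {
    "json": "org.apache.kafka.connect.json.JsonConverter",
    "json_schema": "io.confluent.connect.json.JsonSchemaConverter",
    "avro": "io.confluent.connect.avro.AvroConverter",
    "protobuf": "io.confluent.connect.protobuf.ProtobufConverter",
}


def converter_settings(fmt):
    """Return value-converter properties for the given topic format."""
    if fmt not in CONVERTERS:
        raise ValueError(f"unsupported format: {fmt!r}")
    settings = {"value.converter": CONVERTERS[fmt]}
    if fmt == "json":
        # Plain JSON messages carry no embedded schema envelope.
        settings["value.converter.schemas.enable"] = "false"
    return settings
```

The schema-based converters additionally need a Schema Registry URL and credentials, which are omitted here because they depend on your cluster.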
Start the connector to begin streaming data from Kafka to Milvus.
- Try producing a message to the Kafka topic you just created in Confluent Cloud:
{
  "id": 0,
  "title": "The Reported Mortality Rate of Coronavirus Is Not Important",
  "title_vector": [0.041732933, 0.013779674, -0.027564144, -0.013061441, 0.009748648, 0.00082446384, -0.00071647146, 0.048612226],
  "link": "https://medium.com/swlh/the-reported-mortality-rate-of-coronavirus-is-not-important-369989c8d912"
}
- Topic Page on Confluent Cloud
- Check if the entity has been inserted into the collection in Zilliz Cloud. Here is what it looks like on Zilliz Cloud if the insertion succeeds:
If you require any assistance or have questions regarding the Kafka Connect Milvus Connector, please feel free to reach out to our support team: Email: [email protected]
Regarding Kafka hosting, this connector currently only supports Confluent Cloud. If it is used with other forms of Kafka hosting, including but not limited to Confluent Platform and self-hosted Kafka instances, the security and usability of this connector have not been fully verified. Please use it at your own discretion, or contact us if you intend to use it in such settings.
Regarding Milvus, both the self-hosted Milvus and Zilliz Cloud are supported.