feat: add LiteLLM vectorizer integration
This commit adds support for configuring the vectorizer to use [LiteLLM]
to obtain vector embeddings.

This adds support for the following vector embedding providers:
- Azure OpenAI
- AWS Bedrock
- Cohere
- HuggingFace
- Mistral
- Vertex AI

[LiteLLM]: https://www.litellm.ai/
JamesGuthrie committed Jan 24, 2025
1 parent fa0bf86 commit 0fb7e46
Showing 30 changed files with 3,501 additions and 326 deletions.
Binary file added docs/images/azure_openai.png
215 changes: 215 additions & 0 deletions docs/vectorizer-api-reference.md
@@ -251,10 +251,225 @@ generated for your data.

The embedding functions are:

- [ai.embedding_litellm](#aiembedding_litellm)
- [ai.embedding_openai](#aiembedding_openai)
- [ai.embedding_ollama](#aiembedding_ollama)
- [ai.embedding_voyageai](#aiembedding_voyageai)

### ai.embedding_litellm

You call the `ai.embedding_litellm` function to use LiteLLM to generate embeddings for models from multiple providers.

The purpose of `ai.embedding_litellm` is to:
- Define the embedding model to use.
- Specify the dimensionality of the embeddings.
- Configure optional, provider-specific parameters.
- Set the name of the environment variable that holds the value of your API key.

#### Example usage

Use `ai.embedding_litellm` to create an embedding configuration object that is passed as an argument to [ai.create_vectorizer](#create-vectorizers):

1. Set the required API key for your provider.

Set the API key as an environment variable that is available to either the
vectorizer worker or the Postgres process.

2. Create a vectorizer that uses LiteLLM to access the `microsoft/codebert-base` embedding model on Hugging Face:

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'huggingface/microsoft/codebert-base',
768,
api_key_name => 'HUGGINGFACE_API_KEY',
extra_options => '{"wait_for_model": true}'::jsonb
),
-- other parameters...
);
```

#### Parameters

The function takes several parameters to customize the LiteLLM embedding configuration:

| Name | Type | Default | Required | Description |
|---------------|-------|---------|----------|----------------------------------------------------------------------------------------------------------------------------------------------------------|
| model | text | - | ✔ | Specify the name of the embedding model to use. Refer to the [LiteLLM embedding documentation] for an overview of the available providers and models. |
| dimensions | int | - | ✔ | Define the number of dimensions for the embedding vectors. This should match the output dimensions of the chosen model. |
| api_key_name | text | - | ✖ | Set the name of the environment variable that contains the API key. This allows for flexible API key management without hardcoding keys in the database. |
| extra_options | jsonb | - | ✖ | Set provider-specific configuration options. |

[LiteLLM embedding documentation]: https://docs.litellm.ai/docs/embedding/supported_embedding


#### Returns

A JSON configuration object that you can use in [ai.create_vectorizer](#create-vectorizers).
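
As a sketch of the shape of that object, a hypothetical Python equivalent of the function (mirroring the `json_object(... absent on null)` call in the extension's SQL source, which omits keys whose value is NULL):

```python
# Hypothetical Python mirror of ai.embedding_litellm, for illustration only.
# Keys with a None value are dropped, matching the SQL "absent on null" clause.

def embedding_litellm(model, dimensions, api_key_name=None, extra_options=None):
    config = {
        "implementation": "litellm",
        "config_type": "embedding",
        "model": model,
        "dimensions": dimensions,
        "api_key_name": api_key_name,
        "extra_options": extra_options,
    }
    return {key: value for key, value in config.items() if value is not None}

config = embedding_litellm(
    "huggingface/microsoft/codebert-base",
    768,
    api_key_name="HUGGINGFACE_API_KEY",
)
# config contains implementation, config_type, model, dimensions, and
# api_key_name; extra_options is omitted because it was not provided.
```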

#### Provider-specific configuration examples

The following subsections show how to configure the vectorizer for all supported providers.

##### Cohere

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'cohere/embed-english-v3.0',
1024,
api_key_name => 'COHERE_API_KEY'
),
-- other parameters...
);
```

Note: The [Cohere documentation on input_type] specifies that the `input_type` parameter is required.
By default, LiteLLM sets this to `search_document`. The input type can be set
via `extra_options`, for example: `extra_options => '{"input_type": "search_document"}'::jsonb`.

[Cohere documentation on input_type]: https://docs.cohere.com/v2/docs/embeddings#the-input_type-parameter

##### Mistral

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'mistral/mistral-embed',
1024,
api_key_name => 'MISTRAL_API_KEY'
),
-- other parameters...
);
```

Note: Mistral limits the maximum input per batch to 16384 tokens.
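
If you ever need to pre-batch documents yourself to stay under that limit, the constraint can be sketched as follows (this is an illustration only, not what the vectorizer worker does internally, and the whitespace split is a crude stand-in for the model's real tokenizer):

```python
# Greedily group texts into batches whose estimated token count stays
# under a budget. 16384 is Mistral's per-batch limit from the note above.

def batch_by_tokens(texts, max_tokens=16384):
    batches, current, used = [], [], 0
    for text in texts:
        n = len(text.split())  # crude token estimate; real tokenizers differ
        if current and used + n > max_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(text)
        used += n
    if current:
        batches.append(current)
    return batches
```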

##### Azure OpenAI

To set up a vectorizer with Azure OpenAI, you need four values from the Azure AI Foundry console:
- deployment name
- base URL
- version
- API key

The deployment name is visible in the "Deployment info" section. The base URL and version are
extracted from the "Target URI" field in the "Endpoint" section. The Target URI has the form:
`https://your-resource-name.openai.azure.com/openai/deployments/your-deployment-name/embeddings?api-version=2023-05-15`.
In this example, the base URL is: `https://your-resource-name.openai.azure.com` and the version is `2023-05-15`.
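
Splitting the Target URI into those two pieces can be sketched with Python's standard `urllib.parse` module (using the example URI from the text above):

```python
# Extract the base URL and API version from an Azure "Target URI".
from urllib.parse import urlparse, parse_qs

target_uri = (
    "https://your-resource-name.openai.azure.com/openai/deployments/"
    "your-deployment-name/embeddings?api-version=2023-05-15"
)

parsed = urlparse(target_uri)
api_base = f"{parsed.scheme}://{parsed.netloc}"      # scheme + host only
api_version = parse_qs(parsed.query)["api-version"][0]

print(api_base)     # https://your-resource-name.openai.azure.com
print(api_version)  # 2023-05-15
```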

![Azure AI Foundry console example](./images/azure_openai.png)

Configure the vectorizer; note that the base URL and version are set through `extra_options`:

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'azure/<deployment name here>',
1024,
api_key_name => 'AZURE_API_KEY',
extra_options => '{"api_base": "<base URL here>", "api_version": "<version here>"}'::jsonb
),
-- other parameters...
);
```

##### AWS Bedrock

To set up a vectorizer with AWS Bedrock, you must ensure that the vectorizer
is authenticated to make API calls to the AWS Bedrock endpoint. The vectorizer
worker uses boto3 under the hood, so there are multiple ways to achieve this.

The simplest method is to provide the `AWS_ACCESS_KEY_ID`,
`AWS_SECRET_ACCESS_KEY`, and `AWS_REGION_NAME` environment variables to the
vectorizer worker. Consult the [boto3 credentials documentation] for more
options.

[boto3 credentials documentation]: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'bedrock/amazon.titan-embed-text-v2:0',
1024
),
-- other parameters...
);
```

Alternatively, you can store the secret in the database and provide the
`api_key_name` parameter to prompt the vectorizer worker to load the API key
from the database. When you do this, you may need to pass `aws_access_key_id`
and `aws_region_name` through the `extra_options` parameter:

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'bedrock/amazon.titan-embed-text-v2:0',
1024,
api_key_name => 'AWS_SECRET_ACCESS_KEY', -- optional
extra_options => '{"aws_access_key_id": "<access key id>", "aws_region_name": "<region name>"}'::jsonb -- optional
),
-- other parameters...
);
```

##### Vertex AI

To set up a vectorizer with Vertex AI, you must ensure that the vectorizer
can make API calls to the Vertex AI endpoint. The vectorizer worker uses
GCP's authentication under the hood, so there are multiple ways to achieve
this.

The simplest method is to provide the `VERTEX_PROJECT` and
`VERTEX_CREDENTIALS` environment variables to the vectorizer worker. These
correspond to the project ID and the path to a file containing credentials
for a service account. Consult [Authentication methods at Google] for more
options.

[Authentication methods at Google]: https://cloud.google.com/docs/authentication

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'vertex_ai/text-embedding-005',
768
),
-- other parameters...
);
```

Alternatively, you can store the secret in the database and provide the
`api_key_name` parameter to prompt the vectorizer worker to load the API key
from the database. When you do this, you may need to pass `vertex_project` and
`vertex_location` through the `extra_options` parameter.

Note: `VERTEX_CREDENTIALS` must contain the path to a file containing the
service account credentials; the vectorizer worker needs access to this file
in order to load them.

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'vertex_ai/text-embedding-005',
768,
api_key_name => 'VERTEX_CREDENTIALS', -- optional
extra_options => '{"vertex_project": "<project id>", "vertex_location": "<vertex location>"}'::jsonb -- optional
),
-- other parameters...
);
```

### ai.embedding_openai

You call the `ai.embedding_openai` function to use an OpenAI model to generate embeddings.
10 changes: 9 additions & 1 deletion docs/vectorizer.md
@@ -45,11 +45,19 @@ textual, data analysis, and semantic search:

## Select an embedding provider and set up your API Keys

Vectorizer supports the following vector embedding providers:
Vectorizer supports the following vector embedding providers as first-party integrations:
- [Ollama](https://ollama.com/)
- [Voyage AI](https://www.voyageai.com/)
- [OpenAI](https://openai.com/)

Additionally, the following providers are supported through the [LiteLLM](https://litellm.ai) integration:
- [Cohere](https://cohere.com/)
- [HuggingFace Inference Endpoints](https://endpoints.huggingface.co/)
- [Mistral](https://mistral.ai/)
- [Azure OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service)
- [AWS Bedrock](https://aws.amazon.com/bedrock/)
- [Vertex AI](https://cloud.google.com/vertex-ai)

When using an external embedding service, you need to set up your API keys to access
the service. To store several API keys, you give each key a name and reference them
in the `embedding` section of the Vectorizer configuration. The default API key
26 changes: 26 additions & 0 deletions projects/extension/sql/idempotent/008-embedding.sql
@@ -76,6 +76,30 @@ $func$ language plpgsql immutable security invoker
set search_path to pg_catalog, pg_temp
;

-------------------------------------------------------------------------------
-- embedding_litellm
create or replace function ai.embedding_litellm
( model pg_catalog.text
, dimensions pg_catalog.int4
, api_key_name pg_catalog.text default null
, extra_options pg_catalog.jsonb default null
) returns pg_catalog.jsonb
as $func$
begin
return json_object
( 'implementation': 'litellm'
, 'config_type': 'embedding'
, 'model': model
, 'dimensions': dimensions
, 'api_key_name': api_key_name
, 'extra_options': extra_options
absent on null
);
end
$func$ language plpgsql immutable security invoker
set search_path to pg_catalog, pg_temp
;

-------------------------------------------------------------------------------
-- _validate_embedding
create or replace function ai._validate_embedding(config pg_catalog.jsonb) returns void
@@ -100,6 +124,8 @@ begin
-- ok
when 'voyageai' then
-- ok
when 'litellm' then
-- ok
else
if _implementation is null then
raise exception 'embedding implementation not specified';
1 change: 1 addition & 0 deletions projects/extension/tests/contents/output16.expected
@@ -21,6 +21,7 @@ CREATE EXTENSION
function ai.create_vectorizer(regclass,name,jsonb,jsonb,jsonb,jsonb,jsonb,jsonb,name,name,name,name,name,name,name[],boolean)
function ai.disable_vectorizer_schedule(integer)
function ai.drop_vectorizer(integer,boolean)
function ai.embedding_litellm(text,integer,text,jsonb)
function ai.embedding_ollama(text,integer,text,jsonb,text)
function ai.embedding_openai(text,integer,text,text,text)
function ai.embedding_voyageai(text,integer,text,text)
1 change: 1 addition & 0 deletions projects/extension/tests/contents/output17.expected
@@ -21,6 +21,7 @@ CREATE EXTENSION
function ai.create_vectorizer(regclass,name,jsonb,jsonb,jsonb,jsonb,jsonb,jsonb,name,name,name,name,name,name,name[],boolean)
function ai.disable_vectorizer_schedule(integer)
function ai.drop_vectorizer(integer,boolean)
function ai.embedding_litellm(text,integer,text,jsonb)
function ai.embedding_ollama(text,integer,text,jsonb,text)
function ai.embedding_openai(text,integer,text,text,text)
function ai.embedding_voyageai(text,integer,text,text)
4 changes: 4 additions & 0 deletions projects/extension/tests/privileges/function.expected
@@ -168,6 +168,10 @@
f | bob | execute | no | ai | drop_vectorizer(vectorizer_id integer, drop_all boolean)
f | fred | execute | no | ai | drop_vectorizer(vectorizer_id integer, drop_all boolean)
f | jill | execute | YES | ai | drop_vectorizer(vectorizer_id integer, drop_all boolean)
f | alice | execute | YES | ai | embedding_litellm(model text, dimensions integer, api_key_name text, extra_options jsonb)
f | bob | execute | no | ai | embedding_litellm(model text, dimensions integer, api_key_name text, extra_options jsonb)
f | fred | execute | no | ai | embedding_litellm(model text, dimensions integer, api_key_name text, extra_options jsonb)
f | jill | execute | YES | ai | embedding_litellm(model text, dimensions integer, api_key_name text, extra_options jsonb)
f | alice | execute | YES | ai | embedding_ollama(model text, dimensions integer, base_url text, options jsonb, keep_alive text)
f | bob | execute | no | ai | embedding_ollama(model text, dimensions integer, base_url text, options jsonb, keep_alive text)
f | fred | execute | no | ai | embedding_ollama(model text, dimensions integer, base_url text, options jsonb, keep_alive text)