
Vespa sample applications - Simple hybrid search with ColBERT

This semantic search application uses a single-vector embedding model for retrieval and ColBERT (a multi-token vector representation) for re-ranking. The app demonstrates the colbert-embedder and the tensor expressions for ColBERT MaxSim.
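As a rough illustration of what MaxSim computes, here is a minimal pure-Python sketch: for each query token vector, take the highest dot product against any document token vector, then sum those maxima. The function name `max_sim` and the toy vectors are hypothetical, and the sketch uses plain floats rather than the compressed int8 document tensors and Vespa tensor expressions that the application itself uses.

```python
def max_sim(query_tokens, doc_tokens):
    """Sum, over query tokens, of the best dot product against any document token."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy 2-dimensional token vectors (hypothetical values, for illustration only).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_tokens = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]]

# First query token matches the second doc token best (1.0),
# second query token matches the first doc token best (0.5).
score = max_sim(query_tokens, doc_tokens)
print(score)
```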

It also features reciprocal rank fusion to fuse different rankings.
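Reciprocal rank fusion combines several ranked lists by summing `1 / (k + rank)` for each document across the lists. The sketch below is a generic illustration, not the application's rank profile: the function name, the document ids, and the two input rankings are hypothetical, and `k=60` is assumed here as a commonly used constant.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document ids; k=60 is an assumed, commonly used constant."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Two hypothetical rankings of the same three documents.
by_embedding = ["a", "b", "c"]
by_colbert = ["b", "c", "a"]

fused = reciprocal_rank_fusion([by_embedding, by_colbert])
print(fused)
```

Because only ranks enter the formula, the lists can be fused without normalizing their raw scores against each other.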

Requires Vespa 8.338.38 or later.

To try this application

Follow Vespa getting started through the vespa deploy step, cloning colbert instead of album-recommendation.

Feed documents (Vespa runs embedding inference as part of feeding):

vespa feed ext/*.json

Example queries:

vespa query 'yql=select * from doc where userQuery() or ({targetHits: 100}nearestNeighbor(embedding, q))' \
 'input.query(q)=embed(e5, @query)' \
 'input.query(qt)=embed(colbert, @query)' \
 'query=space contains many suns'

vespa query 'yql=select * from doc where userQuery() or ({targetHits: 100}nearestNeighbor(embedding, q))' \
 'input.query(q)=embed(e5, @query)' \
 'input.query(qt)=embed(colbert, @query)' \
 'query=shipping stuff over the sea'

vespa query 'yql=select * from doc where userQuery() or ({targetHits: 100}nearestNeighbor(embedding, q))' \
 'input.query(q)=embed(e5, @query)' \
 'input.query(qt)=embed(colbert, @query)' \
 'query=exchanging information by sound'
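The three queries above differ only in the query text, so the shared parameters can be built programmatically. The sketch below assembles the same parameters as a JSON body; the helper name `build_query` is hypothetical, and sending the body to `http://localhost:8080/search/` is an assumption about a default local deployment rather than something this README specifies.

```python
import json


def build_query(text, target_hits=100):
    """Return the query parameters used in the CLI examples, for a given query text."""
    return {
        "yql": "select * from doc where userQuery() or "
               f"({{targetHits: {target_hits}}}nearestNeighbor(embedding, q))",
        "query": text,
        # Both embedders read the same query text through the @query reference.
        "input.query(q)": "embed(e5, @query)",
        "input.query(qt)": "embed(colbert, @query)",
    }


params = build_query("space contains many suns")
body = json.dumps(params)
print(body)

# Assumed usage against a local container (not executed here):
#   POST http://localhost:8080/search/ with Content-Type: application/json and `body`
```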
 

Export ColBERT models from HF

See the model2onnx.py script for exporting the ColBERT model from Hugging Face to ONNX format.

Note that the three models below use different embedding dimensions.

Example usage:

The following is the recommended ColBERT model for this application, as it is optimized for speed and accuracy. See the blog post.

python3 model2onnx.py --hf_model answerdotai/answerai-colbert-small-v1 --dims 96 

Can be used with:

field colbert type tensor<int8>(dt{}, x[12])

python3 model2onnx.py --hf_model mixedbread-ai/mxbai-colbert-large-v1 --dims 128

python3 model2onnx.py --hf_model vespa-engine/col-minilm --dims 32

Can be used with:

field colbert type tensor<int8>(dt{}, x[4])
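The field sizes above are consistent with packing eight binarized dimensions into each int8 cell: 96 dims gives x[12] and 32 dims gives x[4]. The sketch below shows that arithmetic. Both function names are hypothetical, and the binarization rule (value > 0 becomes bit 1) and the bit order (most significant bit first) are assumptions for illustration, not taken from this README. By the same arithmetic a 128-dimensional model would need x[16], which the text above does not state.

```python
def packed_int8_size(dims):
    """Number of int8 cells needed when 8 binarized dimensions share one cell."""
    if dims % 8 != 0:
        raise ValueError("dims must be a multiple of 8")
    return dims // 8


def pack_bits(vector):
    """Binarize (assumed: value > 0 -> 1) and pack 8 bits per signed byte, MSB first."""
    if len(vector) % 8 != 0:
        raise ValueError("vector length must be a multiple of 8")
    packed = []
    for start in range(0, len(vector), 8):
        byte = 0
        for value in vector[start:start + 8]:
            byte = (byte << 1) | (1 if value > 0 else 0)
        # Reinterpret the unsigned byte as a signed int8 value.
        packed.append(byte - 256 if byte > 127 else byte)
    return packed


for dims in (96, 128, 32):
    print(dims, "->", packed_int8_size(dims))
```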

Terminate container

Remove the container after use:

$ docker rm -f vespa