At its core, Locality-Sensitive Hashing is a technique that allows for the efficient approximate nearest neighbor search in high-dimensional spaces. Unlike traditional methods that require exhaustive comparisons, LSH employs hashing functions to map similar items to the same bucket with high probability, thereby dramatically reducing the search space.
LSH operates under the premise of locality sensitivity, which implies that similar items are likely to remain close to each other in the hashed space. By exploiting this property, LSH enables rapid retrieval of approximate nearest neighbors, making it an invaluable tool in scenarios where exact matches are not necessary.
To use this library, you need to have Go installed. You can install the library using the following command:
go get github.com/agtabesh/lsh
Import the library in your Go code:
import (
"context"
"fmt"
"github.com/agtabesh/lsh"
"github.com/agtabesh/lsh/types"
)
To create an instance of LSH, you need to specify the configuration parameters using the LSHConfig
struct and call the NewLSH
function:
config := lsh.LSHConfig{
SignatureSize: 128, // It is better to be a power of 2
}
hashFamily := lsh.NewXXHASH64HashFamily(config.SignatureSize)
similarityMeasure := lsh.NewHammingSimilarity()
store := lsh.NewInMemoryStore()
instance, err := lsh.NewLSH(config, hashFamily, similarityMeasure, store)
if err != nil {
// Handle error
}
You can add vectors to the LSH service using the Add
method:
ctx := context.Background()
vectors := []types.Vector{
{"feat1": 1, "feat2": 1, "feat3": 1},
{"feat1": 1, "feat4": 1, "feat5": 1},
{"feat1": 1, "feat2": 1, "feat6": 1, "feat7": 1},
}
for i, vector := range vectors {
vectorID := types.VectorID(fmt.Sprint(i))
err := instance.Add(ctx, vectorID, vector)
if err != nil {
// Handle error
}
}
Note
Vectors can be in any dimension or length.
You can perform a query using a vector ID using the QueryByVectorID
method:
ctx := context.Background()
vectorID := types.VectorID("0")
count := 5
similarVectorsID, err := instance.QueryByVectorID(ctx, vectorID, count)
if err != nil {
// Handle error
}
fmt.Println("similarVectors", similarVectorsID)
You can perform a query using a vector using the QueryByVector
method:
ctx := context.Background()
vector := types.Vector{"feat1": 1, "feat2": 1, "feat3": 1}
count := 5
similarVectorsID, err := instance.QueryByVector(ctx, vector, count)
if err != nil {
// Handle error
}
fmt.Println("similarVectors", similarVectorsID)