Everything you always wanted to know about ANNS but were afraid to ask 🥰

AcKing-Sam/Awesome-Vector-Search

Awesome-Vector-ANNS

Everything you always wanted to know about ANNS but were afraid to ask 🥰 This repo is updated frequently. Advice and questions are welcome; feel free to email me.

Papers

Information Retrieval

[ICLR'21] Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. blog

Sparse Vector

[SIGIR'24] Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations.

[CIKM'24] Pairing Clustered Inverted Indexes with 𝜅-NN Graphs for Fast Approximate Retrieval over Learned Sparse Representations.

Search

[KDD'20] Embedding-based Retrieval in Facebook Search. blog

Disk or Second-tier Memory

[ATC'24] Scalable Billion-point Approximate Nearest Neighbor Search Using SmartSSDs.

[SIGMOD'24] Starling: An I/O-Efficient Disk-Resident Graph Index Framework for High-Dimensional Vector Similarity Search on Data Segment.

[CIKM'19] GRIP: Multi-Store Capacity-Optimized High-Performance Nearest Neighbor Search for Vector Search Engine.

[ArXiv'24] Characterizing the Dilemma of Performance and Index Size in Billion-Scale Vector Search and Breaking It with Second-Tier Memory.

Multi-core

[PPoPP'23] iQAN: Fast and Accurate Vector Search with Efficient Intra-Query Parallelism on Multi-Core Architectures.

[PPoPP'24] ParlayANN: Scalable and Deterministic Parallel Graph-Based Approximate Nearest Neighbor Search Algorithms.

Disk-memory

[CIKM'19] GRIP: Multi-Store Capacity-Optimized High-Performance Nearest Neighbor Search for Vector Search Engine. blog

[NIPS'20] HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory. blog

[BD'23] LM-DiskANN: Low Memory Footprint in Disk-Native Dynamic Graph-Based ANN Indexing.

Learned

[SIGMOD'18] The Case For Learned Index Structures. blog

[ICLR'20] Learning Space Partitions for Nearest Neighbor Search. blog

[TPAMI'19] Learning to Index for Nearest Neighbor Search.

[ICML'19] Learning to Route in Similarity Graphs.

Knowledge Distillation of Indexes

[NIPS'23] Knowledge Distillation for High Dimensional Search Index.

Learned Representation of Vectors

[NIPS'23] AdANNS: A Framework for Adaptive Semantic Search.

[NIPS'22] Matryoshka Representation Learning.

LSH

[TKDE'19] A Revisit of Hashing Algorithms for Approximate Nearest Neighbor Search.

[NIPS'15] Practical and Optimal LSH for Angular Distance. blog

[STOC'15] Optimal Data-Dependent Hashing for Approximate Near Neighbors. talk

[VLDB'07] Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search. blog

[VLDB'24] DET-LSH: A Locality-Sensitive Hashing Scheme with Dynamic Encoding Tree for Approximate Nearest Neighbor Search.

Graph

[WWW'11] Efficient K-Nearest Neighbor Graph Construction for Generic Similarity Measures. blog

[PR'19] Hierarchical Clustering-Based Graphs for Large Scale Approximate Nearest Neighbor Search. blog

[WSDM'22] GraSP: Optimizing Graph-based Nearest Neighbor Search with Subgraph Sampling and Pruning. blog

[VLDB'22] HVS: Hierarchical Graph Structure Based on Voronoi Diagrams for Solving Approximate Nearest Neighbor Search. blog

[MM'23] Relative NN-Descent: A Fast Index Construction for Graph-Based Approximate Nearest Neighbor Search.

[VLDB'21] A Comprehensive Survey and Experimental Comparison of Graph-Based Approximate Nearest Neighbor Search.

[ICMR'24] An Exploration Graph with Continuous Refinement for Efficient Multimedia Retrieval.

[ArXiv'24] Revisiting the Index Construction of Proximity Graph-Based Approximate Nearest Neighbor Search.

[CVPR'18] Link and Code: Fast Indexing with Graphs and Compact Regression Codes. blog

Quantization

[CVPR'12] The Inverted Multi-Index.

Tree

OOD (Out-of-Distribution Queries)

[VLDB'24] RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor Search.

MIPS (Maximum Inner Product Search)

[ICDE'24] Efficient Approximate Maximum Inner Product Search over Sparse Vectors.

[SIGIR'24]

Good blogs

https://zhuanlan.zhihu.com/p/133526632

Books

Blogs & Talks & Tutorials

Tools

tmux tutorial

CMake tutorial

SIMD

SIMD Programming (a little out of date; uses VMX and MMX)

CUDA

C++

https://eecs280staff.github.io/notes/

https://changkun.de/modern-cpp/

C++ Concurrency

Effective Modern C++

C++ Memory Model. blog

C++ Concurrency. blog

Linear Algebra

SVD

Implementation
