Everything you always wanted to know about ANNS but were afraid to ask 🥰 This repo will be updated frequently. Advice and questions are welcome; feel free to email me.
[ICLR'21] Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. blog
[SIGIR'24] Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations.
[CIKM'24] Pairing Clustered Inverted Indexes with 𝜅-NN Graphs for Fast Approximate Retrieval over Learned Sparse Representations.
[KDD'20] Embedding-based Retrieval in Facebook Search. blog
[ATC'24] Scalable Billion-point Approximate Nearest Neighbor Search Using SmartSSDs.
[SIGMOD'24] Starling: An I/O-Efficient Disk-Resident Graph Index Framework for High-Dimensional Vector Similarity Search on Data Segment.
[arXiv'24] Characterizing the Dilemma of Performance and Index Size in Billion-Scale Vector Search and Breaking It with Second-Tier Memory.
[PPoPP'23] iQAN: Fast and Accurate Vector Search with Efficient Intra-Query Parallelism on Multi-Core Architectures.
[PPoPP'24] ParlayANN: Scalable and Deterministic Parallel Graph-Based Approximate Nearest Neighbor Search Algorithms.
[CIKM'19] GRIP: Multi-Store Capacity-Optimized High-Performance Nearest Neighbor Search for Vector Search Engine. blog
[NeurIPS'20] HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory. blog
[BigData'23] LM-DiskANN: Low Memory Footprint in Disk-Native Dynamic Graph-Based ANN Indexing.
[SIGMOD'18] The Case for Learned Index Structures. blog
[ICLR'20] Learning Space Partitions for Nearest Neighbor Search. blog
[TPAMI'19] Learning to Index for Nearest Neighbor Search.
[ICML'19] Learning to Route in Similarity Graphs.
[NeurIPS'23] Knowledge Distillation for High Dimensional Search Index.
[NeurIPS'23] AdANNS: A Framework for Adaptive Semantic Search.
[NeurIPS'22] Matryoshka Representation Learning.
[TKDE'19] A Revisit of Hashing Algorithms for Approximate Nearest Neighbor Search.
[NIPS'15] Practical and Optimal LSH for Angular Distance. blog
[STOC'15] Optimal Data-Dependent Hashing for Approximate Near Neighbors. talk
[VLDB'07] Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search. blog
[VLDB'24] DET-LSH: A Locality-Sensitive Hashing Scheme with Dynamic Encoding Tree for Approximate Nearest Neighbor Search.
[WWW'11] Efficient K-Nearest Neighbor Graph Construction for Generic Similarity Measures. blog
[PR'19] Hierarchical Clustering-Based Graphs for Large Scale Approximate Nearest Neighbor Search. blog
[WSDM'22] GraSP: Optimizing Graph-based Nearest Neighbor Search with Subgraph Sampling and Pruning. blog
[VLDB'22] HVS: Hierarchical Graph Structure Based on Voronoi Diagrams for Solving Approximate Nearest Neighbor Search. blog
[MM'23] Relative NN-Descent: A Fast Index Construction for Graph-Based Approximate Nearest Neighbor Search.
[VLDB'21] A Comprehensive Survey and Experimental Comparison of Graph-Based Approximate Nearest Neighbor Search.
[ICMR'24] An Exploration Graph with Continuous Refinement for Efficient Multimedia Retrieval.
[arXiv'24] Revisiting the Index Construction of Proximity Graph-Based Approximate Nearest Neighbor Search.
[CVPR'18] Link and code: Fast indexing with graphs and compact regression codes. blog
[CVPR'12] The Inverted Multi-Index.
[VLDB'24] RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor Search.
[ICDE'24] Efficient Approximate Maximum Inner Product Search over Sparse Vectors.
[SIGIR'24]
https://zhuanlan.zhihu.com/p/133526632
SIMD Programming (a little out of date; uses VMX and MMX)
https://eecs280staff.github.io/notes/