FlagEmbedding provides a full curriculum covering retrieval, embedding models, RAG, and more. This section is actively being updated, and suggestions are very welcome. Whether you are new to NLP or a veteran, we hope you can find something helpful here!
If you are new to embedding and retrieval, check out the 5-minute quick start!
This module includes tutorials and demos on using BGE and Sentence Transformers, as well as other embedding-related topics. A minimal usage sketch follows the list below.
- Intro to embedding model
- BGE series
- Usage of BGE
- BGE-M3
- BGE-ICL
- ...
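To give a taste of what the embedding tutorials cover, here is a minimal sketch of encoding sentences with a BGE model via FlagEmbedding's `FlagModel` class. The checkpoint `BAAI/bge-base-en-v1.5` is just one choice among the BGE series, and the sentences are placeholders.

```python
import numpy as np
from FlagEmbedding import FlagModel

# Load a BGE embedding model; use_fp16 speeds up inference with a minor accuracy cost.
model = FlagModel(
    "BAAI/bge-base-en-v1.5",
    query_instruction_for_retrieval="Represent this sentence for searching relevant passages:",
    use_fp16=True,
)

sentences = [
    "FlagEmbedding is a toolkit for retrieval and RAG.",
    "BGE stands for BAAI General Embedding.",
]
embeddings = model.encode(sentences)  # one vector per sentence
print(embeddings.shape)

# Cosine similarity between the two sentence embeddings.
sim = embeddings[0] @ embeddings[1] / (
    np.linalg.norm(embeddings[0]) * np.linalg.norm(embeddings[1])
)
print(sim)
```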
In this part, we cover popular similarity functions and searching techniques, with a small example after the list.
- Similarity metrics
- Evaluation metrics
- ...
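As a quick illustration, the three similarity metrics most often used with embeddings can be written in a few lines of plain NumPy (the vectors below are made up for demonstration):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: scale-invariant, in [-1, 1]."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def dot_product(a, b):
    """Inner product: equals cosine similarity when both vectors are L2-normalized."""
    return np.dot(a, b)

def euclidean_distance(a, b):
    """L2 distance: smaller means more similar."""
    return np.linalg.norm(a - b)

q = np.array([0.1, 0.9, 0.2])
d = np.array([0.2, 0.8, 0.1])
print(cosine_similarity(q, d), dot_product(q, d), euclidean_distance(q, d))
```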
Although not covered in the quick start, indexing is essential in practical use cases. This module shows how to build indexes with popular libraries such as Faiss and Milvus; see the sketch after the list.
- Intro to Faiss
- Using GPU in Faiss
- Indexes
- Quantizers
- Faiss Index Choosing
- Milvus
- ...
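For a flavor of the indexing tutorials, here is a minimal Faiss sketch: build an exact inner-product index over random vectors and query it. The dimension and data are placeholders; the tutorials go into IVF, HNSW, quantizers, and GPU usage.

```python
import numpy as np
import faiss

d = 128  # embedding dimension
corpus = np.random.random((10_000, d)).astype("float32")
queries = np.random.random((5, d)).astype("float32")

# Normalize so that inner product equals cosine similarity.
faiss.normalize_L2(corpus)
faiss.normalize_L2(queries)

index = faiss.IndexFlatIP(d)  # exact (brute-force) inner-product search
index.add(corpus)

scores, ids = index.search(queries, 5)  # top-5 neighbors per query
print(ids[0], scores[0])
```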
In this module, we walk through the full pipeline of evaluating an embedding model, as well as popular benchmarks like MTEB and C-MTEB; a small example follows the list.
- Evaluate MSMARCO
- Intro to MTEB
- MTEB Leaderboard Eval
- C-MTEB
- ...
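Benchmarks like MTEB ultimately reduce to metrics such as Recall@k and MRR. Below is a small, self-contained sketch of computing both from a ranked retrieval result (the toy document IDs are made up for illustration):

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant documents that appear in the top-k retrieved list."""
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

def mrr(retrieved, relevant):
    """Reciprocal rank of the first relevant document (0 if none is retrieved)."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

# Toy example: doc ids ranked by the retriever vs. the ground-truth relevant set.
retrieved = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d1", "d5"}
print(recall_at_k(retrieved, relevant, k=5))  # 0.5
print(mrr(retrieved, relevant))               # 0.333...
```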
To balance the accuracy/efficiency tradeoff, many retrieval systems use an efficient retriever to quickly narrow down the candidates, then apply a more accurate model to rerank them for the final results. A reranking sketch follows the list.
- Intro to reranker
- ...
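As a preview, here is a minimal sketch of the second stage using FlagEmbedding's `FlagReranker` with a BGE reranker checkpoint. The model name, query, and candidate passages are examples only.

```python
from FlagEmbedding import FlagReranker

# Cross-encoder reranker: scores each (query, passage) pair jointly; slower but more accurate.
reranker = FlagReranker("BAAI/bge-reranker-base", use_fp16=True)

query = "What is BGE?"
candidates = [
    "BGE stands for BAAI General Embedding.",
    "Faiss is a library for similarity search.",
]

# Higher score means more relevant.
scores = reranker.compute_score([[query, passage] for passage in candidates])

# Reorder candidates by score, highest first.
reranked = [p for _, p in sorted(zip(scores, candidates), reverse=True)]
print(reranked[0])
```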
RAG is one of the most popular approaches to enhancing the capabilities of LLMs by integrating information retrieval with generation. In this module, we cover the implementation, popular tools and libraries, and more advanced techniques; a bare-bones sketch follows the list.
- RAG from scratch
- RAG with LangChain
- RAG with LlamaIndex
- ...
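To give a sense of the "RAG from scratch" idea, here is a bare-bones sketch: embed a tiny corpus, retrieve the closest passage for a question, and stuff it into a prompt. The corpus, question, and the commented-out `generate` call are placeholders for whatever data and LLM you plug in.

```python
import numpy as np
from FlagEmbedding import FlagModel

corpus = [
    "BGE-M3 supports dense, sparse, and multi-vector retrieval.",
    "Faiss is a library for efficient similarity search.",
]

model = FlagModel("BAAI/bge-base-en-v1.5", use_fp16=True)
corpus_emb = model.encode(corpus)

question = "What retrieval modes does BGE-M3 support?"
q_emb = model.encode_queries([question])[0]

# Retrieve the most similar passage by inner product.
best = int(np.argmax(corpus_emb @ q_emb))
prompt = (
    "Answer the question using the context.\n"
    f"Context: {corpus[best]}\n"
    f"Question: {question}"
)

# answer = generate(prompt)  # placeholder: call your LLM of choice here
print(prompt)
```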