update docs

FlagOpen · Jan 9, 2025 · 757db32
1 parent 4efa19d
commit 757db32

Showing 3 changed files with 39 additions and 1 deletion.
diff --git a/docs/source/API/index.rst b/docs/source/API/index.rst
@@ -2,7 +2,6 @@ API
 ===
 
 .. toctree::
-   :hidden:
    :maxdepth: 1
 
    abc

diff --git a/docs/source/Introduction/IR.rst b/docs/source/Introduction/IR.rst
@@ -0,0 +1,38 @@
+Information Retrieval
+=====================
+
+What is Information Retrieval?
+------------------------------
+
+Simply put, Information Retrieval (IR) is the science of searching and retrieving information from a large collection of data based on a user's query. 
+The goal of an IR system is not just to return a list of documents but to ensure that the most relevant ones appear at the top of the results.
+
+A straightforward example of IR is a library catalog. You want to find the book that best matches your query, but there are thousands or even millions of books on the shelves.
+The library's catalog system helps you find the best matches based on your search terms.
+In the modern digital world, search engines and databases work in a similar way, using sophisticated algorithms and models to retrieve, rank, and return the most relevant results.
+The kinds of resources being retrieved are also expanding beyond text to other modalities, such as images, videos, 3D objects, and music.
+
+IR and Embedding Model
+----------------------
+
+Traditional IR methods, like TF-IDF and BM25, rely on statistical and heuristic techniques to rank documents by how often query terms occur in a document and how rare those terms are across the collection.
+These methods are efficient and effective for keyword-based search but often struggle to capture the deeper context or semantics of the text.
+
+.. seealso::
+
+    Take a very simple example with two sentences:
+
+    .. code:: python
+
+        sentence_1 = "watch a play"
+        sentence_2 = "play with a watch"
+
+    Sentence 1 means going to see a show or performance; here *watch* is a verb and *play* is a noun.
+
+    Sentence 2, however, means someone is playing with a wristwatch; here *play* is a verb and *watch* is a noun.
+
+Traditional IR methods could regard these two sentences as very similar, because they share almost the same words, even though their meanings are completely different.
+How can we solve this? The best answer so far is embedding models.
+
+Embedding models have revolutionized IR by representing text as dense vectors in a high-dimensional space, capturing the semantic meaning of words, sentences, or even entire documents. 
+This allows for more sophisticated search capabilities, such as semantic search, where results are ranked based on meaning rather than simple keyword matching.
diff --git a/docs/source/Introduction/index.rst b/docs/source/Introduction/index.rst
@@ -24,5 +24,6 @@ Quickly get started with:
    :maxdepth: 1
    :caption: Concept
 
+   IR
    model
    retrieval_demo