harimkang/docsense

DocSense 📚

An intelligent document assistant powered by Open-Source Large Language Models 🤖

DocSense lets you query your documents in natural language. It uses the open-source Qwen language model (support for more open-source models is planned) to find relevant passages and answer questions about your documents in context, and it is completely free to use.

Features ✨

  • 🔍 Advanced semantic search using FAISS
  • 💡 Intelligent question answering with open-source LLMs (currently Qwen)
  • 📝 Support for multiple document formats (txt, md, rst, etc.)
  • ⚡ GPU acceleration for faster processing
  • 🔄 Batch processing for memory efficiency
  • 💾 Persistent vector storage

Installation 🛠️

CPU Version

pip install docsense

GPU Version (Recommended)

First, install PyTorch with CUDA support:

conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

Then install FAISS with GPU support:

conda install -c conda-forge faiss-gpu

Finally, install DocSense:

pip install docsense
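
Before relying on the GPU setup, it can help to confirm that PyTorch actually sees a CUDA device. The following is a minimal sketch, not part of DocSense: `pick_device` is a hypothetical helper name, and it only uses `torch.cuda.is_available()` from PyTorch. It returns one of the two values the `--device` option accepts.

```python
def pick_device():
    """Return "cuda" if PyTorch is importable and sees a CUDA device, else "cpu"."""
    try:
        # Treat a missing or broken PyTorch install as CPU-only.
        import torch
    except (ImportError, OSError):
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


# Print the value you could pass to --device.
print(pick_device())
```

If this prints `cpu` on a machine with an NVIDIA GPU, the CUDA build of PyTorch is probably not the one installed in the active environment.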

Usage 🚀

Creating Document Index

Index your documents directory:

docsense index /path/to/your/documents
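
To preview which files in a directory match the formats named in the feature list before indexing, a small script like the one below may be useful. It is a sketch under assumptions: `list_indexable_files` and `KNOWN_EXTENSIONS` are hypothetical names, only the txt, md, and rst extensions are checked (DocSense may accept others), and the recursive walk is an assumption rather than documented DocSense behavior.

```python
from pathlib import Path

# Assumption: only the extensions named in the feature list are checked.
# DocSense may support more formats (the list ends with "etc.").
KNOWN_EXTENSIONS = {".txt", ".md", ".rst"}


def list_indexable_files(root):
    """Return files under root whose extension is in KNOWN_EXTENSIONS, sorted."""
    root_path = Path(root)
    return sorted(
        path
        for path in root_path.rglob("*")  # assumption: subdirectories are included
        if path.is_file() and path.suffix.lower() in KNOWN_EXTENSIONS
    )
```

Calling `list_indexable_files("/path/to/your/documents")` returns a list of `Path` objects that you can inspect before running `docsense index`.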

Asking Questions

Ask a question about your indexed documents:

docsense ask "How to use this library?"

Interactive Mode

Start an interactive session for multiple questions:

docsense daemon

Command Line Options

All commands support the following options:

  • --model-name: Specify the Qwen model to use (default: "Qwen/Qwen2-7B")
  • --device: Choose computing device ("cuda" or "cpu", default: "cuda")
  • --index-path: Set custom path for the vector index

Example with options:

docsense index /path/to/your/documents --model-name "Qwen/Qwen2-7B" --device "cuda" --index-path /path/to/your/index
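
If you drive the CLI from a script, building the argument list in one place keeps the options consistent across commands. The sketch below is illustrative only: `build_docsense_command` is a hypothetical helper, and it uses nothing beyond the subcommands and options listed above.

```python
import shlex


def build_docsense_command(subcommand, target=None, model_name=None,
                           device=None, index_path=None):
    """Build an argument list for the docsense CLI from the documented options."""
    cmd = ["docsense", subcommand]
    if target is not None:
        cmd.append(target)  # documents directory for "index", question for "ask"
    if model_name is not None:
        cmd += ["--model-name", model_name]
    if device is not None:
        cmd += ["--device", device]
    if index_path is not None:
        cmd += ["--index-path", index_path]
    return cmd


index_cmd = build_docsense_command(
    "index",
    "/path/to/your/documents",
    model_name="Qwen/Qwen2-7B",
    device="cuda",
    index_path="/path/to/your/index",
)
# Show the command as it would be typed in a shell.
print(shlex.join(index_cmd))
```

To actually run the command, pass the list to `subprocess.run(index_cmd, check=True)`. Passing a list rather than a single string avoids shell quoting issues with questions that contain spaces or punctuation.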

License 📄

This project is licensed under the MIT License - see the LICENSE file for details.
