DocsGPT 🦖

Open-Source RAG Assistant

DocsGPT is an open-source genAI tool that helps users get reliable answers from any knowledge source, while avoiding hallucinations. It enables quick and reliable information retrieval, with tooling and agentic system capability built in.

Key Features:

  • 🗂️ Wide Format Support: Reads PDF, DOCX, CSV, XLSX, EPUB, MD, RST, HTML, MDX, JSON, PPTX, and images.
  • 🌐 Web & Data Integration: Ingests from URLs, sitemaps, Reddit, GitHub and web crawlers.
  • ✅ Reliable Answers: Get accurate, hallucination-free responses with source citations viewable in a clean UI.
  • 🔗 Actionable Tooling: Connect to APIs, tools, and other services to enable LLM actions.
  • 🧩 Pre-built Integrations: Use readily available HTML/React chat widgets, search tools, Discord/Telegram bots, and more.
  • 🔌 Flexible Deployment: Works with major LLMs (OpenAI, Google, Anthropic) and local models (Ollama, llama_cpp).
  • 🏢 Secure & Scalable: Run privately and securely with Kubernetes support, designed for enterprise-grade reliability.

Roadmap

You can find our roadmap here. Please don't hesitate to contribute or create issues; both help us improve DocsGPT!

Production Support / Help for Companies:

We're eager to provide personalized assistance when deploying your DocsGPT to a live environment.

Get a Demo 👋

Send Email 📧

Our Open-Source Models Optimized for DocsGPT:

| Name               | Base Model  | Requirements (or similar) |
|--------------------|-------------|---------------------------|
| Docsgpt-7b-mistral | Mistral-7b  | 1xA10G GPU                |
| Docsgpt-14b        | llama-2-14b | 2xA10 GPUs                |
| Docsgpt-40b-falcon | falcon-40b  | 8xA10G GPUs               |

If you don't have enough resources to run a model, you can use bitsandbytes to quantize it.

End to End AI Framework for Information Retrieval

Architecture chart

Useful Links

Project Structure

  • Application - Flask app (main application).

  • Extensions - Chrome extension.

  • Scripts - Script that creates similarity search index for other libraries.

  • Frontend - Frontend uses Vite and React.

QuickStart

Note

Make sure you have Docker installed.

On macOS or Linux, run:

./setup.sh

The script installs all the dependencies and lets you choose between downloading the local model, using OpenAI, or using our LLM API.

Otherwise, refer to this Guide for Windows:

  1. Download and open this repository with git clone https://github.com/arc53/DocsGPT.git

  2. Create a .env file in your root directory and set the environment variables, including VITE_API_STREAMING (true or false, depending on whether you want streaming answers). The file should look like this:

    LLM_NAME=[docsgpt or openai or others] 
    VITE_API_STREAMING=true
    API_KEY=[if LLM_NAME is openai]
    

    See optional environment variables in the /.env-template and /application/.env_sample files.

  3. Run ./run-with-docker-compose.sh.

  4. Navigate to http://localhost:5173/.

To stop, press Ctrl + C.
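
Before starting the containers, it can help to sanity-check the .env file from step 2. The following is a minimal sketch, not part of DocsGPT: the helper names parse_env and check_env are hypothetical, and the rules it enforces (LLM_NAME present, VITE_API_STREAMING set to true or false, API_KEY present when LLM_NAME is openai) are assumptions drawn only from the example file above.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_env(values: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the file looks usable.

    These checks are assumptions based on the README example, not on the
    application's own validation logic.
    """
    problems = []
    if not values.get("LLM_NAME"):
        problems.append("LLM_NAME is not set")
    if values.get("VITE_API_STREAMING") not in ("true", "false"):
        problems.append("VITE_API_STREAMING should be true or false")
    if values.get("LLM_NAME") == "openai" and not values.get("API_KEY"):
        problems.append("API_KEY is required when LLM_NAME is openai")
    return problems


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        for problem in check_env(parse_env(env_file.read_text())):
            print(problem)
    else:
        print(".env not found in the current directory")
```

Run it from the repository root; no output means none of the assumed checks failed.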

Development Environments

Spin up Mongo and Redis

For development, only two containers from docker-compose.yaml are used: Redis and Mongo. All other services are removed. See docker-compose-dev.yaml.

Run

docker compose -f docker-compose-dev.yaml build
docker compose -f docker-compose-dev.yaml up -d
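
Once the containers are up, a quick way to confirm that both services accept connections is to probe their TCP ports. This is a minimal sketch under an assumption: it uses the conventional default ports (27017 for MongoDB, 6379 for Redis), which may differ from the host ports mapped in docker-compose-dev.yaml. The helper is_port_open is hypothetical.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed defaults: MongoDB on 27017 and Redis on 6379. Adjust these if
# docker-compose-dev.yaml maps the services to different host ports.
SERVICES = {"mongo": 27017, "redis": 6379}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A successful connection only shows that something is listening on the port; it does not verify that the service is healthy.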

Run the Backend

Note

Make sure you have Python 3.12 installed.

  1. Export required environment variables or prepare a .env file in the project folder:

     (See application/core/settings.py for more config options.)

  2. (optional) Create a Python virtual environment. You can follow the official Python documentation for virtual environments.

     a) On macOS and Linux:

    python -m venv venv
    . venv/bin/activate

     b) On Windows:

    python -m venv venv
    venv\Scripts\activate

  3. Download the embedding model and save it in the model/ folder. You can use the script below, or download it manually from here, unzip it, and save it in the model/ folder.

    wget https://d3dg1063dc54p9.cloudfront.net/models/embeddings/mpnet-base-v2.zip
    unzip mpnet-base-v2.zip -d model
    rm mpnet-base-v2.zip

  4. Install dependencies for the backend:

    pip install -r application/requirements.txt

  5. Run the app using flask --app application/app.py run --host=0.0.0.0 --port=7091.

  6. Start the worker with celery -A application.app.celery worker -l INFO.
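
On systems without wget and unzip (for example, a stock Windows shell), the embedding model download step above can be approximated with the Python standard library. This is a sketch rather than a project-provided script: the function names fetch_model and extract_and_remove are hypothetical, and the URL is the one given in the download step.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the wget command in the download step.
MODEL_URL = "https://d3dg1063dc54p9.cloudfront.net/models/embeddings/mpnet-base-v2.zip"


def extract_and_remove(archive: Path, dest: Path) -> list[str]:
    """Unzip archive into dest, delete the archive, and return the member names."""
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
        zf.extractall(dest)
    archive.unlink()
    return names


def fetch_model(url: str = MODEL_URL, dest: Path = Path("model")) -> list[str]:
    """Download the zip into the current directory, then unpack it into dest."""
    archive = Path(url.rsplit("/", 1)[-1])
    urllib.request.urlretrieve(url, str(archive))
    return extract_and_remove(archive, dest)
```

Calling fetch_model() from the project folder mirrors the three shell commands: download, unzip into model/, and delete the archive.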

Start Frontend

Note

Make sure you have Node version 16 or higher.

  1. Navigate to the /frontend folder.

  2. Install the required packages husky and vite (ignore if already installed):

    npm install husky -g
    npm install vite -g

  3. Install dependencies by running npm install --include=dev.

  4. Run the app using npm run dev.
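
To confirm the Node requirement from the note above before installing packages, the installed version can be compared against the minimum. This is a minimal sketch; node_major and node_is_supported are hypothetical helpers, and it assumes node --version prints a string of the form v18.17.1.

```python
import shutil
import subprocess


def node_major(version: str) -> int:
    """Extract the major version from a string such as 'v18.17.1'."""
    return int(version.strip().lstrip("v").split(".")[0])


def node_is_supported(version: str, minimum: int = 16) -> bool:
    """Return True if the major version meets the README's minimum (16)."""
    return node_major(version) >= minimum


if __name__ == "__main__":
    if shutil.which("node") is None:
        print("node was not found on PATH")
    else:
        result = subprocess.run(
            ["node", "--version"], capture_output=True, text=True, check=True
        )
        version = result.stdout.strip()
        print(version, "ok" if node_is_supported(version) else "too old")
```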

Contributing

Please refer to the CONTRIBUTING.md file for information about how to get involved. We welcome issues, questions, and pull requests.

Code Of Conduct

We as members, contributors, and leaders, pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation. Please refer to the CODE_OF_CONDUCT.md file for more information about contributing.

Many Thanks To Our Contributors⚡

Contributors

License

The source code license is MIT, as described in the LICENSE file.

This project is supported by: