This project demonstrates how to use LangChain to build agents that can process natural language queries and interact with SQL databases. The agents leverage a language model to interpret user queries, translate them into SQL statements, execute these statements against a database, and present the results.
The project includes two example scripts:
quote.py
: Connects to an SQLite database (example.db
) containing quotes and allows users to query them using natural language.country.py
: Uses the Chinook sample database to answer queries about music sales data, customers, and countries.
+-----------+ +----------------+ +-------------------+
| Client | ---> | API Gateway | ---> | Lambda |
| (Request)| | (REST API) | | (Invoke Model) |
+-----------+ +----------------+ +-------------------+
|
v
+-----------------+
| GPT |
| Model |
+-----------------+
|
v
+-----------------+
| SQL to Snowflake|
+-----------------+
|
v
+-----------------+
| Response |
+-----------------+
- Python 3.7 or higher
- An OpenAI API key
- Required Python packages (listed in
requirements.txt
)
pip3 install --upgrade --quiet langchain-community
pip3 install --upgrade --quiet langchainhub langgraph
pip3 install -qU langchain-openai
pip3 install snowflake-sqlalchemy
-
Clone the repository:
git clone https://github.com/yourusername/your-repo-name.git cd your-repo-name
-
Create and activate a virtual environment (optional but recommended):
python -m venv venv source venv/bin/activate # On Windows, use venv\Scripts\activate
-
Install the required packages:
pip install -r requirements.txt
The
requirements.txt
file should include:langchain langchain_openai langchain_community langgraph sqlalchemy requests python-dotenv
-
Set up the environment variables:
Create a
.env
file in the project directory and add your OpenAI API key:OPENAI_API_KEY=your-openai-api-key
-
Prepare the SQLite database:
- Ensure you have an
example.db
SQLite database in the project directory. - If you don't have one, you can create it using SQLite tools or scripts.
- Ensure you have an
-
Run the script:
python quote.py
-
Interact with the agent:
-
When prompted, enter your natural language query.
-
Example:
Enter your query (or 'q' to quit): Tell me the quotes which are pending.
-
-
Run the script:
python country.py
-
Interact with the agent:
-
Enter your natural language query when prompted.
-
Example:
Enter your query (or 'q' to quit): Which country's customers spent the most?
-
-
"Tell me the quotes which are pending."
- Retrieves all quotes from the database with a status of pending.
-
"Show me the pending quotes that were created in February."
- Finds all pending quotes created in the month of February.
-
"Which country's customers have spent the most?"
- Calculates the total spending per country and identifies the country with the highest total.
-
"List the top 5 countries by total sales."
- Generates a list of the top 5 countries based on total sales amounts.
-
Changing the Database for
quote.py
:- Replace
example.db
with your own SQLite database file. - Ensure the database schema matches the queries you intend to run.
- Replace
-
Modifying the Language Model:
-
Adjust the
ChatOpenAI
initialization to use a different model (e.g.,gpt-3.5-turbo
orgpt-4
).llm = ChatOpenAI(model="gpt-3.5-turbo")
-
-
Customizing the Prompt Template:
- Modify the system prompt by adjusting the
prompt_template
or creating your own prompt.
- Modify the system prompt by adjusting the
-
Module Not Found Errors:
- Ensure all required packages are installed.
- If you encounter
ModuleNotFoundError
, verify the package name and installation.
-
API Errors:
- Verify that your OpenAI API key is correct and has the necessary permissions.
- Check your API usage limits and quotas.
-
Database Errors:
- Ensure the database files are correctly set up and accessible.
- Verify that the tables and data exist as expected.
This project is licensed under the MIT License.
- LangChain for providing tools to build language model applications.
- The Chinook Database for the sample database used in
country.py
.