Get Started with Cerebras API: Simple Chatbot

This guide will help you set up and use the Cerebras API to interact with generative AI models. You’ll learn how to configure your environment, install the necessary library, and get a simple chatbot running!

What You'll Learn

Setting up your developer environment
Installing the Cerebras Inference library
Running a simple chatbot script powered by Cerebras

Step 1: Set up your API Key

Obtain Your API Key: Log in to your Cerebras account, navigate to the “API Keys” section, and generate a new API key.
Set the API Key as an Environment Variable: For security, store your API key as a secret in your repl under the name CEREBRAS_API_KEY, which is the variable the script reads. Here's more information on how to do that.

This ensures that your API key is available to your script without hardcoding it directly.
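If the secret is missing, the client will only fail later when a request is sent. A minimal sketch of an early check is shown here; the helper name load_api_key is hypothetical and not part of the Cerebras library, and it assumes the key is stored under CEREBRAS_API_KEY as in the script.

```python
import os


def load_api_key(env_var: str = "CEREBRAS_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of the Cerebras library.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"{env_var} is not set. Add it as a secret or environment "
            "variable before running the chatbot."
        )
    return api_key
```

The returned value can then be passed to the client as api_key instead of calling os.environ.get directly.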

Step 2: Install dependencies

Let's make sure we have all of the requirements for this project installed!

pip install -r requirements.txt

Step 3: Start Chatting

With the API key set and the library installed, you’re ready to run the chatbot. This script will allow you to interact with the Cerebras API and process chat completions.

API Key Initialization:
```
client = Cerebras(api_key=os.environ.get("CEREBRAS_API_KEY"))
```
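The Cerebras client is initialized with an API key fetched from environment variables. This key is necessary for authenticating requests to the Cerebras API. The snippet assumes that os and the Cerebras class (from cerebras.cloud.sdk) are imported at the top of the script.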
User Input and Response Handling:
```
user_input = input("User: ")
user_message = {"role": "user", "content": user_input}
```
User input is collected and formatted into a message object.
Model Interaction:
```
response = client.chat.completions.create(
    messages=[user_message],
    model="llama3.1-8b"
)
```
This API call sends the user message to the specified model (e.g., "llama3.1-8b") and retrieves the assistant’s response.
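The input and model steps above can be combined into one function. This is a sketch rather than the script's actual code: the function name ask is hypothetical, and reading the reply from response.choices[0].message.content assumes the response follows the usual chat-completions shape.

```python
def ask(client, user_input: str, model: str = "llama3.1-8b") -> str:
    """Send one user message and return the assistant's reply text.

    `client` is expected to be a Cerebras client created as shown earlier.
    Assumption: the reply text lives at response.choices[0].message.content.
    """
    # Format the raw input as a chat message, as in the snippet above.
    user_message = {"role": "user", "content": user_input}

    # Send the single message to the chosen model.
    response = client.chat.completions.create(
        messages=[user_message],
        model=model,
    )

    # Return only the text of the first choice.
    return response.choices[0].message.content
```

Passing the client in as an argument keeps the function easy to try with a stand-in object before spending real API calls.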
Performance Metrics Calculation:
```
total_tokens = response.usage.total_tokens
total_time = response.time_info.total_time
tokens_per_second = total_tokens / total_time
```
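After receiving the response, the total tokens used and the total processing time are extracted from the response object. Tokens per second are then calculated by dividing the total tokens by the total time. Because total_tokens typically counts both prompt and completion tokens, the result reflects overall throughput for the request rather than generation speed alone.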
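The division above fails if the reported time is zero. A small sketch that guards against that and formats the result for printing is shown here; both function names are hypothetical, and the time is assumed to be in seconds.

```python
def tokens_per_second(total_tokens: int, total_time: float) -> float:
    """Return throughput in tokens per second, or 0.0 if time is not positive."""
    if total_time <= 0:
        return 0.0
    return total_tokens / total_time


def format_metrics(total_tokens: int, total_time: float) -> str:
    """Build a one-line summary suitable for printing after each reply."""
    rate = tokens_per_second(total_tokens, total_time)
    return f"{total_tokens} tokens in {total_time:.2f}s ({rate:.1f} tokens/s)"
```

For example, format_metrics(response.usage.total_tokens, response.time_info.total_time) could be printed after each assistant reply.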
