Cerebras Inference API Demos

Welcome to the Cerebras Inference API demo repository! This repository contains various examples showcasing the power of the Cerebras Wafer-Scale Engines and CS-3 systems for AI model inference.

🚀 Introduction

The Cerebras API offers developers a low-latency solution for AI model inference powered by Cerebras Wafer-Scale Engines and CS-3 systems. We invite developers to explore the new possibilities that our high-speed inferencing solution unlocks.

Currently, the Cerebras API provides access to two models: Meta’s Llama 3.1 8B and 70B models. Both models are instruction-tuned and can be used for conversational applications.

🧠 Models Available

Llama-3.1-8B
- Parameters: 8 billion
- Knowledge Cutoff: March 2023
- Context Length: 8192
- Training Tokens: 15 trillion
Llama-3.1-70B
- Parameters: 70 billion
- Knowledge Cutoff: December 2023
- Context Length: 8192
- Training Tokens: 15 trillion

📚 Resources

📁 Projects Overview

This repository contains multiple example projects, each demonstrating different capabilities of the Cerebras Inference API. Each project is located in its own folder and contains a detailed README.

🔗 Example Projects

Getting Started with Cerebras Inference API
- Learn how to get started with the Cerebras Inference API for your AI projects.
Conversational Memory with Langchain
- Explore how to build conversational memory for LLMs using Langchain.
RAG with Pinecone + Docker
- Implement Retrieval-Augmented Generation (RAG) using Pinecone and Docker.
RAG with Weaviate + HuggingFace
- Implement Retrieval-Augmented Generation (RAG) using Weaviate and HuggingFace.
Getting Started with Cerebras + Streamlit
- Learn how to integrate Cerebras with Streamlit to build interactive applications.
AI Agentic Workflow with LlamaIndex
- Build an AI agentic workflow using LlamaIndex.
AI Agentic Workflow with Langchain
- Build an AI agentic workflow using Langchain.
Multi AI Agentic Workflow
- Create a multi-agentic AI workflow with Langgraph and LangSmith.

🌟 Getting Started

To explore each project, simply navigate to the corresponding folder and follow the instructions in the README. Happy coding!

🛠️ Requirements

Python 3.7+
Docker (for RAG examples)
Streamlit (for Cerebras + Streamlit example)
Other dependencies as noted in each project’s README.

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

👥 Contributors

We welcome contributions! Feel free to submit a pull request or open an issue.

Name		Name	Last commit message	Last commit date
Latest commit History 90 Commits
ai-workflow-langchain		ai-workflow-langchain
ai-workflow-llamaindex		ai-workflow-llamaindex
cerebras-streamlit		cerebras-streamlit
cerebras-wandb-weave		cerebras-wandb-weave
conversational-memory-langchain		conversational-memory-langchain
getting-started		getting-started
gist		gist
marketing-agent		marketing-agent
multi-ai-workflow		multi-ai-workflow
rag-pinecone-docker		rag-pinecone-docker
rag-weaviate-huggingface		rag-weaviate-huggingface
synthetic-data		synthetic-data
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Cerebras Inference API Demos

🚀 Introduction

🧠 Models Available

📚 Resources

📁 Projects Overview

🔗 Example Projects

🌟 Getting Started

🛠️ Requirements

📝 License

👥 Contributors

About

Releases

Packages

Contributors 4

Languages

Cerebras/inference-examples

Folders and files

Latest commit

History

Repository files navigation

Cerebras Inference API Demos

🚀 Introduction

🧠 Models Available

📚 Resources

📁 Projects Overview

🔗 Example Projects

🌟 Getting Started

🛠️ Requirements

📝 License

👥 Contributors

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 4

Languages

Packages