This is a basic demo that shows how to use picoLLM in web browsers, using the IIFE version of the library (i.e., via an HTML script tag). It instantiates a picoLLM inference engine on a web worker and enables a back-and-forth conversation with the LLM, similar to ChatGPT.
picoLLM requires a valid Picovoice `AccessKey` at initialization. The `AccessKey` acts as your credentials when using Picovoice SDKs. You can get your `AccessKey` for free, but make sure to keep it secret. Sign up or log in to Picovoice Console to get your `AccessKey`.
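To give a sense of what the demo does under the hood, here is a minimal sketch of initialization. Assumptions: the IIFE bundle registers a `PicollmWeb` global (the pattern other Picovoice web SDKs follow), and `PicoLLMWorker.create` accepts a `modelFile` option; check the demo source for the exact names and bundle path.

```html
<!-- Load the IIFE build of picoLLM; the global name `PicollmWeb` is assumed here. -->
<script src="node_modules/@picovoice/picollm-web/dist/iife/index.js"></script>
<script>
  async function initPicoLLM(modelFile) {
    // Runs the inference engine on a web worker so the page stays responsive.
    return await PicollmWeb.PicoLLMWorker.create(
      "YOUR_ACCESS_KEY", // AccessKey obtained from Picovoice Console
      { modelFile }      // a .pllm model file selected by the user
    );
  }
</script>
```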
picoLLM Inference Web Engine supports the following open-weight models. The models are available on Picovoice Console; a sketch after the note below shows how a downloaded model file is loaded.

- Gemma
  - `gemma-2b`
  - `gemma-2b-it`
- Llama-2
  - `llama-2-7b`
  - `llama-2-7b-chat`
- Llama-3
  - `llama-3-8b`
  - `llama-3-8b-instruct`
- Mistral
  - `mistral-7b-v0.1`
  - `mistral-7b-instruct-v0.1`
  - `mistral-7b-instruct-v0.2`
- Phi-2
  - `phi2`
NOTE: Only the Gemma and Phi-2 models have been tested on multiple browsers across different platforms. Whether the remaining models run properly depends on the user's system.
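Since browsers cannot read arbitrary local paths, the demo takes the model as a user-selected file. A hedged sketch of the wiring, assuming a hypothetical `<input type="file" id="model-file">` element and the `initPicoLLM` helper from the sketch above:

```javascript
// Hypothetical wiring: hand the user-selected .pllm file to the engine.
const fileInput = document.getElementById("model-file");
fileInput.addEventListener("change", async (event) => {
  const modelFile = event.target.files[0];      // model downloaded from Picovoice Console
  const picollm = await initPicoLLM(modelFile); // see the sketch above
  console.log("picoLLM is ready");
});
```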
- Use `yarn` or `npm` to install the dependencies.
- Run the `start` script to start a local web server hosting the demo.

```console
yarn
yarn start
```

(or)

```console
npm install
npm run start
```
- Open `localhost:5000` in your web browser, as hinted at in the output:

```console
Available on:
  http://localhost:5000
Hit CTRL-C to stop the server
```
- Enter your `AccessKey`, select a model file, and press `Init picoLLM`. Wait until loading completes and start chatting with picoLLM.
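Under the hood, the chat loop amounts to accumulating the conversation in a dialog object and streaming completions back to the page. A rough sketch, assuming `getDialog()`, `addHumanRequest`/`addLLMResponse`, and a `streamCallback` generate option like picoLLM's SDKs on other platforms expose (treat the names as illustrative, and `chat-output` as a hypothetical element):

```javascript
// Maintain conversation history in a dialog object and stream the reply token by token.
const dialog = picollm.getDialog();
const output = document.getElementById("chat-output");

async function chat(userMessage) {
  dialog.addHumanRequest(userMessage);
  const result = await picollm.generate(dialog.prompt(), {
    completionTokenLimit: 256, // cap the length of each reply
    streamCallback: (token) => { output.textContent += token; }, // render tokens as they arrive
  });
  dialog.addLLMResponse(result.completion); // keep the reply in the dialog history
  return result.completion;
}
```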