Proxies API calls to FastChat.

## Request Format
API requests should include `history`, a list of tuples of strings, and the current `query`.

Example Request:
```json
{
  "history": [["user", "hello"], ["llm", "hi"]],
  "query": "how are you?"
}
```

## Response Format
Responses are returned as dictionaries and should contain the following:
- `response` - String LLM response to the query
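This service is configured against an MQ broker (see the configuration below), so one way to exercise it from Python is via `neon_mq_connector`'s `send_mq_request` helper. The sketch below is a minimal client under that assumption; the `/llm` vhost and `fastchat_input` queue names are illustrative assumptions, so check your deployment for the actual routing.

```python
# Minimal client sketch. The vhost and queue names below are assumptions;
# verify them against the running service's configuration.
from neon_mq_connector.utils.client_utils import send_mq_request

request = {
    # Prior conversation turns as [speaker, utterance] pairs, oldest first
    "history": [["user", "hello"], ["llm", "hi"]],
    # The current user input to answer
    "query": "how are you?",
}

reply = send_mq_request("/llm", request, "fastchat_input", timeout=30)
print(reply.get("response"))  # String LLM response to the query
```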
## Docker Configuration
When running this as a docker container, the `XDG_CONFIG_HOME` envvar is set to `/config`.
A configuration file at `/config/neon/diana.yaml` is required and should look like:
```yaml
MQ:
  port: <MQ Port>
  server: <MQ Hostname or IP>
  users:
    neon_llm_fastchat:
      password: <neon_fastchat user's password>
      user: neon_fastchat
LLM_FASTCHAT:
  context_depth: 3
  max_tokens: 256
  num_parallel_processes: 2
  num_threads_per_process: 4
```
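Since a malformed or misplaced file is a common startup failure, it can help to sanity-check the file before mounting it. The sketch below is a minimal check assuming only PyYAML; the path resolution mirrors the `XDG_CONFIG_HOME` behavior described above.

```python
# Minimal sanity check for diana.yaml before mounting it into the container.
# Assumes only PyYAML (pip install pyyaml); the required keys mirror the
# example configuration above.
import os
import yaml

# Outside the container, configs default to ~/.config; inside, the image
# sets XDG_CONFIG_HOME=/config
config_home = os.environ.get("XDG_CONFIG_HOME",
                             os.path.expanduser("~/.config"))
config_path = os.path.join(config_home, "neon", "diana.yaml")

with open(config_path) as f:
    config = yaml.safe_load(f)

assert "MQ" in config, "missing MQ section"
assert "LLM_FASTCHAT" in config, "missing LLM_FASTCHAT section"
assert "neon_llm_fastchat" in config["MQ"].get("users", {}), \
    "missing neon_llm_fastchat MQ credentials"
print(f"{config_path} looks structurally valid")
```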
For example, if your configuration resides in `~/.config`:
```shell
export CONFIG_PATH="/home/${USER}/.config"
docker run -v ${CONFIG_PATH}:/config neon_llm_fastchat
```
> Note: If connecting to a local MQ server, you may need to specify `--network host`.
## System setup
```shell
# Nvidia Docker
sudo apt install curl
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```
## Run docker
```shell
export CONFIG_PATH="/home/${USER}/.config"
docker run --gpus all -v ${CONFIG_PATH}:/config neon_llm_fastchat
```
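If the model fails to load, it is worth confirming the container can actually see a GPU. The sketch below assumes PyTorch is present in the image (FastChat depends on it) and can be run with `docker exec -it <container> python3`:

```python
# Quick GPU visibility check from inside the running container.
# Assumes PyTorch is installed, which FastChat requires.
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```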