Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have the basics of ollama #4

Open
wants to merge 39 commits into
base: fix-and-improve
Choose a base branch
from

Conversation

DanielMarchand
Copy link

@DanielMarchand DanielMarchand commented Jul 14, 2024

The basics work. The problem is the code base is not very well-designed to handled custom prompting depending on the model. For example wake up dates require longer token limits with the llama3 models than with opean ai ones. Also I had to switch from system to assistant in the chat complete to get better answers, there are other subtle differences in how the prompts need to be set up, would be nice to discuss an overall architecture for this. Otherwise I think this is a really cool direction letting people with decent GPUs (tested on 3080, i'm sure 4090 would be even more special) get nice results at no cost.

This is heavily based on joonspk-research#155 by ketsapiwiq I had do some aspects differently but much of the logic is the same

chowington referenced this pull request in crcresearch/agentic_collab Sep 30, 2024
Support vLLM on EC2 instances
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant