I understand that LangChain already supports several LLM models out of the box, as documented here.
But I'm a bit lost about which one to choose if I'm looking for fast inference/prediction to run on my desktop computer.
In particular, I want to get my hands dirty coding some ideas using agents, but using ChatGPT is outside my budget for now.
I'm interested in researching the capabilities of several agents interacting with each other (as in Generative Agents), so a fast response would be really useful, beyond the quality of the response (later on I can afford to use better models).
Any idea of which direction I should search in? I'm now testing GPT4All as described here (and using this script to fix the weights), and the speed is about 1/3 or less of that of the GPT-4 web interface.
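To make the multi-agent idea concrete, here is a minimal sketch of the kind of loop I have in mind, with a stubbed function standing in for a local model (`fake_llm`, `Agent`, and the memory list are my own illustrative names, not LangChain APIs):

```python
# Minimal sketch of two agents talking to each other.
# fake_llm is a stand-in for a call into a local model
# (e.g. a GPT4All wrapper); a real implementation would
# generate text here instead of echoing the prompt.

def fake_llm(prompt: str) -> str:
    # Placeholder for local-model inference.
    return f"reply to: {prompt[-30:]}"

class Agent:
    def __init__(self, name: str):
        self.name = name
        self.memory: list[str] = []  # naive conversation memory

    def respond(self, message: str) -> str:
        self.memory.append(message)
        reply = fake_llm(f"{self.name} hears: {message}")
        self.memory.append(reply)
        return reply

alice, bob = Agent("Alice"), Agent("Bob")
message = "Hello!"
for _ in range(3):  # three back-and-forth turns
    message = bob.respond(alice.respond(message))

# Each agent has stored 3 incoming messages and 3 replies.
print(len(alice.memory), len(bob.memory))
```

Every turn in this loop is one (or more) model calls, which is why per-token latency dominates everything else for this kind of experiment.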
llama_print_timings:        load time = 14159.32 ms
llama_print_timings:      sample time =   226.36 ms /   194 runs   (  1.17 ms per run)
llama_print_timings: prompt eval time =     0.00 ms /     1 tokens (  0.00 ms per token)
llama_print_timings:        eval time = 120506.00 ms /  237 runs   (508.46 ms per run)
llama_print_timings:       total time = 191409.42 ms
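For reference, the eval figures above work out to roughly 2 tokens per second. A quick back-of-the-envelope check (the numbers are copied from the timings output):

```python
# Convert the llama_print_timings eval figures into tokens per second.
eval_ms = 120506.00   # total eval time in ms (from the output above)
runs = 237            # tokens generated during eval

ms_per_token = eval_ms / runs        # matches the 508.46 ms per run line
tokens_per_sec = 1000.0 / ms_per_token

print(f"{ms_per_token:.2f} ms/token, {tokens_per_sec:.2f} tokens/sec")
# → 508.46 ms/token, 1.97 tokens/sec
```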
My computer specs are: