I understand that LangChain already supports several LLM models out of the box, as documented here.
But I'm a bit lost about which one to choose if I'm looking for fast inference/prediction to run on my desktop computer.
In particular, I want to get my hands dirty coding some ideas using agents, but using ChatGPT is outside my budget for now.
I'm interested in researching the capabilities of several agents interacting with each other (as in Generative Agents), so a fast response would be really useful, beyond the quality of the response (later on I can afford to use better models).
Any idea of which direction I should search in? I'm now testing GPT4All as described here (and using this script to fix the weights), and the speed is about 1/3 or less of that of the GPT-4 web interface.
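To make the multi-agent idea concrete, here is a minimal sketch of the kind of loop I have in mind, with a stubbed function standing in for a local model (`fake_llm`, `Agent`, and the memory list are my own illustrative names, not LangChain APIs):

```python
# Minimal sketch of two agents talking to each other.
# fake_llm is a stand-in for a call into a local model
# (e.g. a GPT4All wrapper); a real implementation would
# generate text here instead of echoing the prompt.

def fake_llm(prompt: str) -> str:
    # Placeholder for local-model inference.
    return f"reply to: {prompt[-30:]}"

class Agent:
    def __init__(self, name: str):
        self.name = name
        self.memory: list[str] = []  # naive conversation memory

    def respond(self, message: str) -> str:
        self.memory.append(message)
        reply = fake_llm(f"{self.name} hears: {message}")
        self.memory.append(reply)
        return reply

alice, bob = Agent("Alice"), Agent("Bob")
message = "Hello!"
for _ in range(3):  # three back-and-forth turns
    message = bob.respond(alice.respond(message))

# Each agent has stored 3 incoming messages and 3 replies.
print(len(alice.memory), len(bob.memory))
```

Every turn in this loop is one (or more) model calls, which is why per-token latency dominates everything else for this kind of experiment.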
llama_print_timings:        load time = 14159.32 ms
llama_print_timings:      sample time =   226.36 ms /   194 runs   (  1.17 ms per run)
llama_print_timings: prompt eval time =     0.00 ms /     1 tokens (  0.00 ms per token)
llama_print_timings:        eval time = 120506.00 ms /  237 runs   (508.46 ms per run)
llama_print_timings:       total time = 191409.42 ms
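For reference, the eval figures above work out to roughly 2 tokens per second. A quick back-of-the-envelope check (the numbers are copied from the timings output):

```python
# Convert the llama_print_timings eval figures into tokens per second.
eval_ms = 120506.00   # total eval time in ms (from the output above)
runs = 237            # tokens generated during eval

ms_per_token = eval_ms / runs        # matches the 508.46 ms per run line
tokens_per_sec = 1000.0 / ms_per_token

print(f"{ms_per_token:.2f} ms/token, {tokens_per_sec:.2f} tokens/sec")
# → 508.46 ms/token, 1.97 tokens/sec
```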
My computer specs are: