Resolves #2905 openai compatible model provider add llama.cpp rerank support #2906

ziyu4huang · 2024-10-20T03:33:27Z

What problem does this PR solve?

Resolve #2905

Because token counts are inconsistent, I hard-code a safe limit of 500 in the code, since there is no config parameter to control it.

My llama.cpp server runs with -ub set to 1024:

${llama_path}/bin/llama-server --host 0.0.0.0 --port 9901 -ub 1024 -ngl 99 -m $gguf_file --reranking "$@"
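
For illustration, here is a minimal sketch of the truncate-then-rerank flow described above. It is not the code merged in this PR. The helper names (`truncate_text`, `build_rerank_payload`, `scores_from_response`) are hypothetical, whitespace-separated words are used as a rough stand-in for model tokens, and the request and response field names (`query`, `documents`, `top_n`, `results`, `index`, `relevance_score`) are assumed to follow the Jina-style rerank format.

```python
from typing import Any, Dict, List

MAX_TOKENS = 500  # the fixed limit mentioned in this PR


def truncate_text(text: str, max_tokens: int = MAX_TOKENS) -> str:
    """Keep at most max_tokens whitespace-separated words.

    Assumption: word count only approximates the model's real token count.
    """
    words = text.split()
    return " ".join(words[:max_tokens])


def build_rerank_payload(model: str, query: str, documents: List[str]) -> Dict[str, Any]:
    """Build a rerank request body with every document truncated first."""
    docs = [truncate_text(d) for d in documents]
    # Field names are assumed (Jina-style), not confirmed by this PR.
    return {"model": model, "query": query, "documents": docs, "top_n": len(docs)}


def scores_from_response(response: Dict[str, Any], n_docs: int) -> List[float]:
    """Map the returned results back to the original document order."""
    scores = [0.0] * n_docs
    for item in response.get("results", []):
        scores[item["index"]] = item["relevance_score"]
    return scores
```

The payload would then be sent with an HTTP POST to the server's `/rerank` endpoint (port 9901 in the command above). Documents missing from the response keep a score of 0.0.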

Type of change

New Feature (non-breaking change which adds functionality)

Here is my test of RAGFlow using llama.cpp:

slot update_slots: id  0 | task 458 | prompt done, n_past = 416, n_tokens = 416
slot      release: id  0 | task 458 | stop processing: n_past = 416, truncated = 0
slot launch_slot_: id  0 | task 459 | processing task
slot update_slots: id  0 | task 459 | tokenizing prompt, len = 2
slot update_slots: id  0 | task 459 | prompt tokenized, n_ctx_slot = 8192, n_keep = 0, n_prompt_tokens = 111
slot update_slots: id  0 | task 459 | kv cache rm [0, end)
slot update_slots: id  0 | task 459 | prompt processing progress, n_past = 111, n_tokens = 111, progress = 1.000000
slot update_slots: id  0 | task 459 | prompt done, n_past = 111, n_tokens = 111
slot      release: id  0 | task 459 | stop processing: n_past = 111, truncated = 0
srv  update_slots: all slots are idle
request: POST /rerank 172.23.0.4 200

Resolves infiniflow#2905

77e57c6

ziyu4huang changed the title ~~Resolves #2905~~ Resolves #2905 openai compatible model provider add llama.cpp rerank support Oct 20, 2024

yingfeng added the ci Continue Integration label Oct 21, 2024

KevinHuSh merged commit e5f7733 into infiniflow:main Oct 21, 2024
3 checks passed
