
Resolves #2905 openai compatible model provider add llama.cpp rerank support #2906

Merged: 1 commit into infiniflow:main on Oct 21, 2024

Conversation

ziyu4huang (Contributor) commented on Oct 20, 2024

What problem does this PR solve?

Resolves #2905.

Because token sizes vary between setups and there is no config parameter to control them, I capped the input at 500 tokens in code to be safe.

My llama.cpp server is started with -ub set to 1024:

${llama_path}/bin/llama-server --host 0.0.0.0 --port 9901 -ub 1024 -ngl 99 -m $gguf_file --reranking "$@"
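
As a hedged sketch (not the PR's actual code), the cap could look like the helper below: each document is trimmed to at most 500 tokens before it is sent for reranking, so requests stay within the server's -ub batch size. The tiktoken tokenizer here is an illustrative assumption, not necessarily what RAGFlow uses.

```python
import tiktoken

MAX_RERANK_TOKENS = 500  # hard cap described above; no config param exposes it

def truncate_for_rerank(text: str, limit: int = MAX_RERANK_TOKENS) -> str:
    # Illustrative tokenizer choice (assumption): any tokenizer with
    # encode/decode would work; the point is the 500-token safety cap.
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    return text if len(tokens) <= limit else enc.decode(tokens[:limit])
```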

Type of change

  • New Feature (non-breaking change which adds functionality)

Here is my test of RAGFlow using llama.cpp:

slot update_slots: id  0 | task 458 | prompt done, n_past = 416, n_tokens = 416
slot      release: id  0 | task 458 | stop processing: n_past = 416, truncated = 0
slot launch_slot_: id  0 | task 459 | processing task
slot update_slots: id  0 | task 459 | tokenizing prompt, len = 2
slot update_slots: id  0 | task 459 | prompt tokenized, n_ctx_slot = 8192, n_keep = 0, n_prompt_tokens = 111
slot update_slots: id  0 | task 459 | kv cache rm [0, end)
slot update_slots: id  0 | task 459 | prompt processing progress, n_past = 111, n_tokens = 111, progress = 1.000000
slot update_slots: id  0 | task 459 | prompt done, n_past = 111, n_tokens = 111
slot      release: id  0 | task 459 | stop processing: n_past = 111, truncated = 0
srv  update_slots: all slots are idle
request: POST /rerank 172.23.0.4 200
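
For reference, here is a minimal client sketch of the kind of call that produces the POST /rerank line above. The JSON shape follows llama.cpp's /rerank endpoint, the host and port are assumed from the launch command earlier in this comment, and the query/documents are made up:

```python
import requests

# Host/port assumed from the llama-server command above; payload shape
# follows llama.cpp's /rerank endpoint (a query plus candidate documents).
resp = requests.post(
    "http://localhost:9901/rerank",
    json={
        "query": "What does llama.cpp's reranking mode do?",
        "documents": [
            "llama-server scores query/document pairs when started with --reranking.",
            "Unrelated text about something else entirely.",
        ],
    },
    timeout=30,
)
resp.raise_for_status()
# Each result pairs a document index with its relevance score.
for r in resp.json()["results"]:
    print(r["index"], r["relevance_score"])
```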

@ziyu4huang changed the title from "Resolves #2905" to "Resolves #2905 openai compatible model provider add llama.cpp rerank support" on Oct 20, 2024
@yingfeng added the ci Continue Integration label on Oct 21, 2024
@KevinHuSh merged commit e5f7733 into infiniflow:main on Oct 21, 2024
3 checks passed
Labels
ci Continue Integration

Projects
None yet

Development
Successfully merging this pull request may close these issues:
[Feature Request]: add rerank support to llama.cpp rerank

3 participants