Accelerating GLM-4V inference with vLLM #705

Open
elesun2018 opened this issue Jan 17, 2025 · 4 comments

@elesun2018

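Accelerating GLM-4V inference with vLLM

The code is as follows:

Image

How can I run batch inference, processing 4 images in one pass?

Inference on a single image runs at 38 tokens/s, but when I loop over a folder and run inference on each image, the speed drops to 8 tokens/s.
Has anyone else run into this?
Thanks!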

@zRzRzRzRzRzRzR
Member

This model only supports a single image.

zRzRzRzRzRzRzR self-assigned this Jan 20, 2025
@elesun2018
Author

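Do you mean it does not support running inference on multiple image-text conversations at the same time?

Also:
Inference on a single image runs at 38 tokens/s, but when I loop over a folder and run inference on each image, the speed drops to 8 tokens/s.
What might be causing this?
Thanks!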

@zRzRzRzRzRzRzR
Member

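Right, multiple images are not supported. It is one image at a time, but the conversation itself can continue over multiple turns.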
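If the goal is to work through a folder of images, one way to respect the one-image limit is to build a separate request for each image and group those requests into batches of 4. The following is only a minimal sketch under that assumption: the helper names (`list_images`, `chunked`, `build_requests`) and the request dictionary layout are hypothetical, and the thread does not confirm that batching this way improves throughput for GLM-4V.

```python
from pathlib import Path
from typing import Iterator

# Assumed set of image extensions; adjust to the files actually in the folder.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def list_images(folder: str) -> list[Path]:
    """Return the image files in `folder`, sorted for a reproducible order."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def chunked(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_requests(paths: list, question: str) -> list[dict]:
    """Build one request per image, so each conversation carries exactly one image.

    The dictionary layout is hypothetical; convert it to whatever input
    format your inference engine expects.
    """
    return [{"prompt": question, "image_path": str(p)} for p in paths]
```

Each batch from `chunked(list_images(folder), 4)` could then be passed through `build_requests` and converted to the engine's input format. In vLLM's offline API, `LLM.generate` accepts a list of prompts, so several single-image requests can be submitted in one call; the prompt template GLM-4V expects is not given in this thread and would need to be checked separately.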

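Inference on a single image runs at 38 tokens/s, but when I loop over a folder and run inference on each image, the speed drops to 8 tokens/s.

Just print the length of your input_ids. The longer it is, the slower the output speed.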
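To make that check concrete, the input length can be recorded next to the measured speed for every request. This is a hedged sketch rather than the maintainer's code: `RunStats` and `report` are hypothetical names, and the three numbers per request have to be filled in from your own run.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Measurements for one request (all values come from your own run)."""
    name: str               # e.g. the image file name
    num_input_tokens: int   # len(input_ids) for this request
    num_output_tokens: int  # number of generated tokens
    elapsed_s: float        # wall-clock generation time in seconds

    @property
    def tokens_per_s(self) -> float:
        """Output speed in generated tokens per second."""
        if self.elapsed_s <= 0:
            raise ValueError("elapsed_s must be positive")
        return self.num_output_tokens / self.elapsed_s


def report(runs: list[RunStats]) -> list[str]:
    """Format one line per request, ordered by input length."""
    return [
        f"{r.name}: input_ids={r.num_input_tokens} "
        f"output={r.num_output_tokens} speed={r.tokens_per_s:.1f} tok/s"
        for r in sorted(runs, key=lambda r: r.num_input_tokens)
    ]
```

With vLLM's offline API, the values can be read from each returned `RequestOutput`: `len(out.prompt_token_ids)` for the input length and `len(out.outputs[0].token_ids)` for the generated tokens. If the slow requests in the folder loop also show the largest `input_ids` counts, that matches the explanation above.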

@kyle-kw

kyle-kw commented Jan 25, 2025

@elesun2018 Which version of vLLM are you using? When I deploy glm-4v-9b with the latest version of vLLM, it reports that there is no message template.
