### System Info

I tried to deploy a fine-tuned Qwen2-VL model with both TGI and vLLM, and found that some results differ between the two frameworks. TGI seems to consume more tokens than vLLM for the same request. I checked TGI's code, and the image resize logic appears to be missing. In the Qwen2-VL pipeline, images are supposed to be resized based on the two arguments max_pixels and min_pixels.
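For reference, this is how those bounds are normally wired up on the transformers side; a minimal sketch, assuming the stock Qwen/Qwen2-VL-7B-Instruct checkpoint (the pixel budgets here are the example values from the model card, not the ones from my fine-tuned model's preprocessor_config.json):

```python
from transformers import AutoProcessor

# min_pixels / max_pixels bound the image area the processor will keep;
# images outside the range are rescaled before being turned into
# vision tokens, which caps the number of image tokens per request.
processor = AutoProcessor.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct",
    min_pixels=256 * 28 * 28,   # lower bound on image area
    max_pixels=1280 * 28 * 28,  # upper bound on image area
)
```

vLLM picks these values up from preprocessor_config.json, which is why the same image costs fewer tokens there.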
### Information

- Docker
- The CLI directly

### Tasks

- An officially supported command
- My own modifications

### Reproduction
Deploy a Qwen2-VL-7B model on an inference endpoint and upload a large image; this triggers an error saying the input tokens exceed 32768.
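A minimal script to reproduce; the endpoint URL and the image dimensions are placeholders (any image large enough to blow past the pixel budget should do):

```python
import base64
import io

import requests
from PIL import Image

# Build an artificially large image (~4000x4000) and inline it as a
# base64 data URI, which TGI accepts in the markdown image syntax.
img = Image.new("RGB", (4000, 4000), color="white")
buf = io.BytesIO()
img.save(buf, format="PNG")
b64 = base64.b64encode(buf.getvalue()).decode()

payload = {
    "inputs": f"![](data:image/png;base64,{b64})Describe this image.",
    "parameters": {"max_new_tokens": 64},
}
resp = requests.post("http://localhost:8080/generate", json=payload)
# Without the resize step this fails with a validation error saying the
# input tokens exceed the model's 32768 limit.
print(resp.status_code, resp.text)
```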
### Expected behavior
The server should resize the image based on preprocessor_config.json (max_pixels and min_pixels), ensuring that a single request never produces too many image tokens.
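For comparison, the resize rule the Qwen2-VL image processor applies is roughly the smart_resize helper from transformers, paraphrased here (the real implementation also validates the aspect ratio):

```python
import math

def smart_resize(height: int, width: int, factor: int = 28,
                 min_pixels: int = 56 * 56,
                 max_pixels: int = 14 * 14 * 4 * 1280) -> tuple[int, int]:
    """Rescale (height, width) so both are multiples of `factor` and the
    total pixel count lands inside [min_pixels, max_pixels]."""
    h_bar = round(height / factor) * factor
    w_bar = round(width / factor) * factor
    if h_bar * w_bar > max_pixels:
        # Too big: shrink while preserving the aspect ratio.
        beta = math.sqrt((height * width) / max_pixels)
        h_bar = math.floor(height / beta / factor) * factor
        w_bar = math.floor(width / beta / factor) * factor
    elif h_bar * w_bar < min_pixels:
        # Too small: enlarge while preserving the aspect ratio.
        beta = math.sqrt(min_pixels / (height * width))
        h_bar = math.ceil(height * beta / factor) * factor
        w_bar = math.ceil(width * beta / factor) * factor
    return h_bar, w_bar

# e.g. a 4000x4000 upload gets clamped to a much smaller multiple of 28,
# so frameworks that apply this step spend far fewer image tokens.
```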
AHEADer changed the title from "Different result when eval with TGI and vLLM, need to know if the preprocessing is right or not" to "Does tgi support image resize for qwen2-vl pipeline?" on Jan 22, 2025.