multi images inference preprocess method #390

zyandtom · 2025-01-12T10:50:23Z

Hi, I found that the default image preprocessing method is designed only for single-image input.

In the process_images function, process_anyres_image is used as the default preprocessor. With multi-image input this causes a huge increase in the number of image input tokens (exceeding 32768). My solution for inference is to switch to the preprocessor used in training:

LLaVA-NeXT/llava/train/train.py, line 1147 in 79ef45a:

    image = [self.process_image(f, "pad") for f in image_file]

Is that correct?
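
To make the token growth concrete, here is a rough back-of-the-envelope sketch. It is not code from the repository: the function names, the 336-pixel tile size, the 14-pixel patch size, and the 2x2 anyres grid are all assumptions chosen for illustration, and the estimate ignores any unpadding or newline tokens the real model may add or remove. It only shows why one base tile plus a grid of tiles per image adds up much faster than a single padded tile per image.

```python
def tokens_per_tile(tile_size: int = 336, patch_size: int = 14) -> int:
    """Visual tokens for one square tile (assumed ViT-style patching)."""
    side = tile_size // patch_size
    return side * side


def estimate_visual_tokens(
    num_images: int,
    mode: str,
    grid_rows: int = 2,
    grid_cols: int = 2,
    tile_size: int = 336,
    patch_size: int = 14,
) -> int:
    """Rough visual-token estimate for a multi-image prompt.

    mode == "pad":    each image is padded to a square and encoded as one tile.
    mode == "anyres": each image is encoded as one base tile plus a
                      grid_rows x grid_cols grid of tiles (hypothetical grid).
    """
    per_tile = tokens_per_tile(tile_size, patch_size)
    if mode == "pad":
        tiles_per_image = 1
    elif mode == "anyres":
        tiles_per_image = 1 + grid_rows * grid_cols
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return num_images * tiles_per_image * per_tile


if __name__ == "__main__":
    context_limit = 32768  # the limit mentioned in this issue
    for mode in ("anyres", "pad"):
        total = estimate_visual_tokens(num_images=12, mode=mode)
        print(mode, total, "exceeds limit" if total > context_limit else "fits")
```

Under these assumed numbers, 12 images with the anyres-style layout already pass the 32768 limit, while the padded single-tile layout stays well below it. The real counts depend on the vision tower and the grid settings in the model config.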
