You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
ValueError Traceback (most recent call last)
Cell In[4], line 7
4 image = Image.open("/teamspace/studios/this_studio/Screenshot 2024-12-22 132529.png") # Replace with your image file path
6 # Ensure both image and text are passed correctly
----> 7 inputs = processor(images=image, text="Extract the text from this image.", return_tensors="pt")
9 # Generate predictions
10 outputs = model.generate(**inputs)
File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/transformers/models/mllama/processing_mllama.py:309, in MllamaProcessor.call(self, images, text, audio, videos, **kwargs) 307 raise ValueError("No image were provided, but there are image tokens in the prompt") 308 else:
--> 309 raise ValueError( 310 f"The number of image token ({sum(n_images_in_text)}) should be the same as in the number of provided images ({sum(n_images_in_images)})" 311 ) 313 if images is not None: 314 image_features = self.image_processor(images, **images_kwargs)
ValueError: The number of image token (0) should be the same as in the number of provided images (1)
The text was updated successfully, but these errors were encountered:
Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText
processor = AutoProcessor.from_pretrained("unsloth/Llama-3.2-11B-Vision-Instruct-bnb-4bit")
model = AutoModelForImageTextToText.from_pretrained("unsloth/Llama-3.2-11B-Vision-Instruct-bnb-4bit")
from PIL import Image
Open the image
image = Image.open("/teamspace/studios/this_studio/Screenshot 2024-12-22 132529.png") # Replace with your image file path
Ensure both image and text are passed correctly
inputs = processor(images=image, text="Extract the text from this image.", return_tensors="pt")
Generate predictions
outputs = model.generate(**inputs)
Decode the model's output
extracted_text = processor.decode(outputs[0], skip_special_tokens=True)
print("Extracted Text:", extracted_text)
After running this code iam getting this error:
ValueError Traceback (most recent call last)
Cell In[4], line 7
4 image = Image.open("/teamspace/studios/this_studio/Screenshot 2024-12-22 132529.png") # Replace with your image file path
6 # Ensure both image and text are passed correctly
----> 7 inputs = processor(images=image, text="Extract the text from this image.", return_tensors="pt")
9 # Generate predictions
10 outputs = model.generate(**inputs)
File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/transformers/models/mllama/processing_mllama.py:309, in MllamaProcessor.call(self, images, text, audio, videos, **kwargs)
307 raise ValueError("No image were provided, but there are image tokens in the prompt")
308 else:
--> 309 raise ValueError(
310 f"The number of image token ({sum(n_images_in_text)}) should be the same as in the number of provided images ({sum(n_images_in_images)})"
311 )
313 if images is not None:
314 image_features = self.image_processor(images, **images_kwargs)
ValueError: The number of image token (0) should be the same as in the number of provided images (1)
The text was updated successfully, but these errors were encountered: