Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/mnt/home/miniforge3/envs/vek/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/home/miniforge3/envs/vek/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/home/miniforge3/envs/vek/lib/python3.11/site-packages/transformers/models/llava_onevision/modeling_llava_onevision.py", line 688, in forward
    raise ValueError(
ValueError: Image features and image tokens do not match: tokens: 7332, features 7261
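For context, the check that raises this ValueError compares the number of `<image>` placeholder tokens the processor inserted into the prompt against the number of vision feature vectors the model produced. A minimal, framework-free sketch of that guard (the token id and function name here are hypothetical, not the actual transformers code):

```python
IMAGE_TOKEN_ID = 32000  # hypothetical placeholder token id (assumption)

def check_image_tokens(input_ids, num_image_features):
    """Raise if the placeholder count and the feature count disagree."""
    n_image_tokens = sum(1 for t in input_ids if t == IMAGE_TOKEN_ID)
    if n_image_tokens != num_image_features:
        raise ValueError(
            f"Image features and image tokens do not match: "
            f"tokens: {n_image_tokens}, features {num_image_features}"
        )

# In the traceback above, the processor emitted 7332 placeholders but the
# vision tower produced 7261 feature vectors, so this check fires.
```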
Expected behavior
Expected: the forward pass completes and generates output without errors.
This is a follow-up issue of #34625, where the behavior is the same but for different reasons. The reproduction example is a slight modification of the one provided by @chchch0109.
I found the cause: the processor's and the model's vision-token unpadding differ by a rounding function, and floating-point precision issues sometimes make them produce different results. I added the rounding function in LlavaOnevisionProcessor to match the model's behavior. PR: #35779.
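To illustrate the kind of mismatch described above: when the unpadded size is computed from the aspect ratio in floating point, a result that is mathematically an integer can land just below it, so truncating and rounding disagree by one row of tokens. The sizes below are illustrative assumptions, not the actual transformers code:

```python
# Hypothetical image sizes chosen to expose the precision issue (assumption).
original_width, original_height = 870, 200  # aspect ratio 4.35
current_height = 100                        # resized grid height (assumption)

aspect_ratio = original_width / original_height  # 4.35 is not exactly representable
new_width = current_height * aspect_ratio        # 434.99999999999994, not 435.0

tokens_truncated = int(new_width)    # truncation: 434
tokens_rounded = round(new_width)    # rounding: 435

print(new_width, tokens_truncated, tokens_rounded)
```

An off-by-one like this in one dimension of the patch grid is enough to make the processor's placeholder count and the model's feature count disagree.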
I got the same issue (ValueError: Image features and image tokens do not match: tokens: 4589, features 4588) when using llava-v1.6-vicuna-7b-hf (a LLaVA-NeXT model) with transformers 4.47.0. I followed PR #35779 and modified processing_llava_next.py in the same way, but it doesn't fix the error.
System Info
transformers version: 4.48.0
- distributed_type: FSDP
- mixed_precision: bf16
- use_cpu: False
- debug: False
- num_processes: 1
- machine_rank: 0
- num_machines: 1
- rdzv_backend: static
- same_network: True
- main_training_function: main
- enable_cpu_affinity: False
- fsdp_config: {'fsdp_activation_checkpointing': True, 'fsdp_auto_wrap_policy': 'TRANSFORMER_BASED_WRAP', 'fsdp_backward_prefetch': 'BACKWARD_PRE', 'fsdp_cpu_ram_efficient_loading': True, 'fsdp_forward_prefetch': False, 'fsdp_offload_params': False, 'fsdp_sharding_strategy': 'FULL_SHARD', 'fsdp_state_dict_type': 'SHARDED_STATE_DICT', 'fsdp_sync_module_states': True, 'fsdp_transformer_layer_cls_to_wrap': '', 'fsdp_use_orig_params': True}
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env: []
Who can help?
@amyeroberts @qubvel @zucchini-nlp
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
Traceback as follows: