https://arxiv.org/abs/2408.08459 shows that you can avoid using a trained encoder. Having tried JPEG codes as input to my VLM, I can tell you it works wonderfully well. Besides, unlike in your paper, we do not see training instability caused by mixing different modalities.
Please, Meta, talk to each other and retrain Chameleon with JPEG codes, because it seems your teams don't know what other teams are doing.
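
To make "JPEG codes as input" concrete, here is a minimal sketch, assuming the raw bytes of the JPEG file are fed to the model as token ids placed after an existing text vocabulary. The names `TEXT_VOCAB_SIZE`, `jpeg_bytes_to_tokens`, and `tokens_to_jpeg_bytes` are hypothetical, and the tokenization in the linked paper or in my own setup may differ (for example, by merging bytes into larger units).

```python
# Hypothetical size of the text vocabulary; image tokens are placed after it.
TEXT_VOCAB_SIZE = 65536


def jpeg_bytes_to_tokens(jpeg_bytes: bytes, offset: int = TEXT_VOCAB_SIZE) -> list[int]:
    """Map each byte of a JPEG file to one token id (256 extra ids in total)."""
    # Every JPEG stream starts with the SOI marker 0xFFD8.
    if not jpeg_bytes.startswith(b"\xff\xd8"):
        raise ValueError("not a JPEG stream (missing SOI marker)")
    return [offset + b for b in jpeg_bytes]


def tokens_to_jpeg_bytes(tokens: list[int], offset: int = TEXT_VOCAB_SIZE) -> bytes:
    """Invert the mapping so generated tokens can be written back as a .jpg file."""
    return bytes(t - offset for t in tokens)


# Smallest possible stand-in for a file: SOI marker followed by EOI marker.
stub = b"\xff\xd8\xff\xd9"
tokens = jpeg_bytes_to_tokens(stub)
restored = tokens_to_jpeg_bytes(tokens)
```

No trained image encoder is involved in this sketch: the standard JPEG codec does the compression, and the language model only sees one flat token sequence.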