Thank you for your amazing contributions and for sharing such an exciting project.
As I understand it, the llava-onevision-qwen2-7b-ov-chat model is built upon the llava-onevision-qwen2-7b-ov model, with preference data generated by LLaVA-Critic during each iteration.
I found the DPO training script [dpo_ov7b.sh], but I need additional code or a guide for generating the preference data with LLaVA-Critic.
Is there a detailed guide for reproducing the preference data used to train the llava-onevision-qwen2-7b-ov-chat model?
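For context, here is my current understanding of one iteration of the loop as a rough sketch. The helpers `generate_candidates` and `critic_score` are hypothetical placeholders (not actual LLaVA-NeXT APIs), and the best/worst selection rule for building chosen/rejected pairs is my assumption:

```python
import json

def generate_candidates(policy_model, image, question, n=5):
    """Sample n diverse responses from the current policy model
    (placeholder: e.g. run llava inference with do_sample=True)."""
    raise NotImplementedError

def critic_score(critic_model, image, question, response):
    """Ask LLaVA-Critic for a pointwise quality score of one response
    (placeholder: prompt the critic with its pointwise scoring template
    and parse the numeric score from its output)."""
    raise NotImplementedError

def build_preference_pairs(policy_model, critic_model, dataset, out_path):
    """One iteration: sample candidates, score them with the critic,
    and keep the best/worst responses as a chosen/rejected pair for DPO."""
    pairs = []
    for ex in dataset:  # each ex: {"image": ..., "question": ...}
        cands = generate_candidates(policy_model, ex["image"], ex["question"])
        scored = sorted(
            cands,
            key=lambda c: critic_score(critic_model, ex["image"], ex["question"], c),
        )
        pairs.append({
            "image": ex["image"],
            "prompt": ex["question"],
            "chosen": scored[-1],   # highest critic score
            "rejected": scored[0],  # lowest critic score
        })
    with open(out_path, "w") as f:
        json.dump(pairs, f, indent=2)
```

Is this roughly how the preference data was produced, or does the released pipeline differ (e.g., pairwise critic judgments instead of pointwise scores)?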
Thank you so much,
Jeehye