Great work on this LLM project!

I noticed that the Tulu 3 preference data pipeline described in the report (https://arxiv.org/abs/2411.15124) looks inconsistent with this repo's docs in docs/algorithms/synthetic_preference_dataset.md and docs/algorithms/rejection_sampling.md.

As I understand the Tulu 3 preference data pipeline from the docs:

- docs/algorithms/synthetic_preference_dataset.md: open_instruct/rejection_sampling/generation.py samples 3 responses from the model, then open_instruct/rejection_sampling/synthetic_preference_dataset.py (line 79 at commit c0dcdaf) runs the preference annotation.
- docs/algorithms/rejection_sampling.md: open_instruct/rejection_sampling/generation.py samples 3 responses from the model, then open_instruct/rejection_sampling/rejection_sampling.py uses an LLM-as-judge plus reward-model voting to get the preference data (sketched below).

Are there any experiments reported on this difference?

Looking forward to your reply!
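To make the voting step concrete, here is a minimal sketch of how I read the rejection-sampling selection in the second bullet: an LLM judge and several reward models each cast one vote over the sampled responses, and the most- and least-voted responses become the chosen/rejected pair. All names below (select_by_voting, judge_vote, the toy models) are hypothetical placeholders, not the repo's actual API.

```python
from collections import Counter

def select_by_voting(responses, judge_vote, reward_models):
    """Pick a (chosen, rejected) pair from sampled responses by voting.

    judge_vote(responses) -> index of the response the LLM judge prefers.
    Each reward model scores every response; its argmax counts as one vote.
    """
    votes = Counter([judge_vote(responses)])
    for rm in reward_models:
        scores = [rm(r) for r in responses]
        votes[scores.index(max(scores))] += 1
    chosen_idx = votes.most_common(1)[0][0]  # most votes -> chosen
    rejected_idx = min(range(len(responses)), key=lambda i: votes[i])
    return responses[chosen_idx], responses[rejected_idx]

if __name__ == "__main__":
    # Toy stand-ins: a judge that always prefers index 1, and two reward
    # models that score by (signed) response length.
    responses = ["short", "a medium answer", "the longest answer of all"]
    judge = lambda rs: 1
    rms = [lambda r: len(r), lambda r: -len(r)]
    print(select_by_voting(responses, judge, rms))
```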
Hi @scattw! The code for the synthetic preference pipeline for that particular section of the paper can be found in this path. I think you're referencing a different set of experiments (related to rejection sampling), which is why it looks inconsistent. Specifically, the code for response generation can be found in this file, and the preference annotation (plus parsing) here.
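For readers following along, here is a minimal sketch of the two-stage flow that reply describes: first sample candidate responses, then have an LLM judge annotate a pair and parse the verdict into chosen/rejected. Everything below (StubModel, the judge prompt, the parsing rule) is a hypothetical illustration, not the actual open-instruct implementation.

```python
import random

class StubModel:
    """Stand-in for a real LLM so the sketch runs end to end."""
    def __init__(self, name):
        self.name = name

    def generate(self, prompt):
        return f"[{self.name} output {random.randint(0, 9)}] {prompt[:40]}"

def sample_responses(model, prompt, n):
    # Stage 1 (the generation.py step): sample n candidate completions.
    return [model.generate(prompt) for _ in range(n)]

def annotate_preference(judge, prompt, responses):
    # Stage 2 (the synthetic_preference_dataset.py step): an LLM judge
    # compares a pair, and its verdict is parsed into (chosen, rejected).
    a, b = random.sample(responses, 2)
    verdict = judge.generate(
        f"Prompt: {prompt}\nA: {a}\nB: {b}\nAnswer 'A' or 'B':"
    )
    # Toy parsing rule; a real pipeline would parse the judge's exact format.
    chosen, rejected = (a, b) if "'A'" in verdict else (b, a)
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

if __name__ == "__main__":
    policy, judge = StubModel("policy"), StubModel("judge")
    prompt = "Explain rejection sampling in one sentence."
    candidates = sample_responses(policy, prompt, n=4)
    print(annotate_preference(judge, prompt, candidates))
```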