Tulu 3 preference data pipeline in the report is inconsistent with this code repo #491

Open
scattw opened this issue Dec 19, 2024 · 1 comment

Comments

scattw commented Dec 19, 2024

Great work on LLMs!

I noticed that the Tulu 3 preference data pipeline described in the report (https://arxiv.org/abs/2411.15124) is inconsistent with what this repo documents in either docs/algorithms/synthetic_preference_dataset.md or docs/algorithms/rejection_sampling.md.

The preference data pipeline as shown in the Tulu 3 report:

image

docs/algorithms/synthetic_preference_dataset.md

  1. open_instruct/rejection_sampling/generation.py: sample 3 completions from the model.
  2. open_instruct/rejection_sampling/synthetic_preference_dataset.py: annotate preferences with the prompt marked in the code by the comment
    # The prompt comes from https://arxiv.org/pdf/2203.02155, p. 37
    to get the preference data.

docs/algorithms/rejection_sampling.md

  1. open_instruct/rejection_sampling/generation.py: sample 3 completions from the model.
  2. open_instruct/rejection_sampling/rejection_sampling.py: use an LLM-as-judge and a reward model (RM) to vote on the completions and get the preference data.
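
To make the difference between the two flows concrete, here is a rough sketch. It is not code from open_instruct: `build_preference_pair` and the toy judges are hypothetical names, and summing per-judge scores is only an assumed stand-in for whatever the actual scripts do. With a single judge it resembles the one-annotator flow; with several judges it resembles the voting flow.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical judge type: given a prompt and its completions,
# return one numeric score per completion (higher is better).
Judge = Callable[[str, Sequence[str]], List[float]]


def build_preference_pair(
    prompt: str, completions: Sequence[str], judges: Sequence[Judge]
) -> Dict[str, str]:
    """Combine judge scores and return a chosen/rejected pair."""
    if len(completions) < 2:
        raise ValueError("need at least two completions to form a pair")
    if not judges:
        raise ValueError("need at least one judge")

    # Sum the scores each completion receives across all judges.
    totals = [0.0] * len(completions)
    for judge in judges:
        scores = judge(prompt, completions)
        if len(scores) != len(completions):
            raise ValueError("judge must score every completion")
        for i, score in enumerate(scores):
            totals[i] += score

    # Highest total becomes "chosen", lowest becomes "rejected".
    best = max(range(len(totals)), key=totals.__getitem__)
    worst = min(range(len(totals)), key=totals.__getitem__)
    return {
        "prompt": prompt,
        "chosen": completions[best],
        "rejected": completions[worst],
    }


# Toy judges for illustration only (assumptions, not real annotators).
def length_judge(prompt: str, completions: Sequence[str]) -> List[float]:
    return [float(len(c)) for c in completions]


def last_item_judge(prompt: str, completions: Sequence[str]) -> List[float]:
    return [0.0] * (len(completions) - 1) + [5.0]


single = build_preference_pair("q", ["a", "bbb", "cc"], [length_judge])
voted = build_preference_pair("q", ["a", "bbb", "cc"], [length_judge, last_item_judge])
```

In this toy setup the single-judge call and the two-judge call can disagree on the chosen completion, which is the kind of difference the question below is about.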

Are there any experiments that report on this difference?

Looking forward to your reply!

@ljvmiranda921
Member

Hi @scattw! The code for the synthetic preference pipeline for that particular section of the paper can be found in this path. I think you're referencing a different set of experiments (related to rejection sampling), which is why it looks inconsistent. Specifically, the code for response generation can be found in this file and the preference annotation (+ parsing) here.

Hopefully that clears things up!
