Great work on this LLM project!

I noticed that the Tulu 3 preference data pipeline described in the report (https://arxiv.org/abs/2411.15124) looks inconsistent with this repo's docs in docs/algorithms/synthetic_preference_dataset.md and docs/algorithms/rejection_sampling.md.

As I understand the Tulu 3 preference data pipeline from the docs:

- docs/algorithms/synthetic_preference_dataset.md: open_instruct/rejection_sampling/generation.py samples 3 responses from the model, then open_instruct/rejection_sampling/synthetic_preference_dataset.py (line 79 at commit c0dcdaf) runs the preference annotation.
- docs/algorithms/rejection_sampling.md: open_instruct/rejection_sampling/generation.py samples 3 responses from the model, then open_instruct/rejection_sampling/rejection_sampling.py uses an LLM-as-judge plus reward-model voting to get the preference data (sketched below).

Are there any experiments reported on this difference?

Looking forward to your reply!
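To make the voting step concrete, here is a minimal sketch of how I read the rejection-sampling selection in the second bullet: an LLM judge and several reward models each cast one vote over the sampled responses, and the most- and least-voted responses become the chosen/rejected pair. All names below (select_by_voting, judge_vote, the toy models) are hypothetical placeholders, not the repo's actual API.

```python
from collections import Counter

def select_by_voting(responses, judge_vote, reward_models):
    """Pick a (chosen, rejected) pair from sampled responses by voting.

    judge_vote(responses) -> index of the response the LLM judge prefers.
    Each reward model scores every response; its argmax counts as one vote.
    """
    votes = Counter([judge_vote(responses)])
    for rm in reward_models:
        scores = [rm(r) for r in responses]
        votes[scores.index(max(scores))] += 1
    chosen_idx = votes.most_common(1)[0][0]  # most votes -> chosen
    rejected_idx = min(range(len(responses)), key=lambda i: votes[i])
    return responses[chosen_idx], responses[rejected_idx]

if __name__ == "__main__":
    # Toy stand-ins: a judge that always prefers index 1, and two reward
    # models that score by (signed) response length.
    responses = ["short", "a medium answer", "the longest answer of all"]
    judge = lambda rs: 1
    rms = [lambda r: len(r), lambda r: -len(r)]
    print(select_by_voting(responses, judge, rms))
```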
Hi @scattw! The code for the synthetic preference pipeline for that particular section of the paper can be found in this path. I think you're referencing a different set of experiments (related to rejection sampling), which is why it looks inconsistent. Specifically, the code for response generation can be found in this file, and the preference annotation (plus parsing) here.
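For readers following along, here is a minimal sketch of the two-stage flow that reply describes: first sample candidate responses, then have an LLM judge annotate a pair and parse the verdict into chosen/rejected. Everything below (StubModel, the judge prompt, the parsing rule) is a hypothetical illustration, not the actual open-instruct implementation.

```python
import random

class StubModel:
    """Stand-in for a real LLM so the sketch runs end to end."""
    def __init__(self, name):
        self.name = name

    def generate(self, prompt):
        return f"[{self.name} output {random.randint(0, 9)}] {prompt[:40]}"

def sample_responses(model, prompt, n):
    # Stage 1 (the generation.py step): sample n candidate completions.
    return [model.generate(prompt) for _ in range(n)]

def annotate_preference(judge, prompt, responses):
    # Stage 2 (the synthetic_preference_dataset.py step): an LLM judge
    # compares a pair, and its verdict is parsed into (chosen, rejected).
    a, b = random.sample(responses, 2)
    verdict = judge.generate(
        f"Prompt: {prompt}\nA: {a}\nB: {b}\nAnswer 'A' or 'B':"
    )
    # Toy parsing rule; a real pipeline would parse the judge's exact format.
    chosen, rejected = (a, b) if "'A'" in verdict else (b, a)
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

if __name__ == "__main__":
    policy, judge = StubModel("policy"), StubModel("judge")
    prompt = "Explain rejection sampling in one sentence."
    candidates = sample_responses(policy, prompt, n=4)
    print(annotate_preference(judge, prompt, candidates))
```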