Pull requests: huggingface/trl

Pull requests list

Update unsloth_integration.md
#2742 opened Feb 2, 2025 by AdeebAldkheel
5 tasks
📖Nit Fix in Documentation
#2740 opened Feb 2, 2025 by ParagEkbote
1 task done
feat: Add cliprange to GRPO loss
#2739 opened Feb 2, 2025 by joey00072 Draft
1 of 5 tasks
feat: Add vLLM dtype configuration for GRPO trainer
#2738 opened Feb 2, 2025 by joey00072
5 tasks
Dynamically load LoRA weights when using vLLM
#2730 opened Feb 1, 2025 by tgaddair
⚡ Fix GRPO PEFT
#2725 opened Jan 31, 2025 by qgallouedec Draft
5 tasks
WIP: RLOOV2
#2724 opened Jan 31, 2025 by mnoukhov Draft
3 tasks
Update ppo_trainer.md documentation
#2720 opened Jan 31, 2025 by JohnConnor123
5 tasks
🔁 🦈 Support iterative GRPO
#2700 opened Jan 30, 2025 by shirinyamani
4 of 5 tasks
[GRPO] add reward weight in multi-reward settings
#2676 opened Jan 28, 2025 by hesamsheikh
1 task
🔧 Optimize GRPO VRAM Usage by Computing Prompt Tokens Just Once
#2669 opened Jan 27, 2025 by andyl98
2 of 5 tasks
share parameters between model and ref model
#2668 opened Jan 27, 2025 by GeeeekExplorer
2 of 5 tasks
[SFT] add token accuracy metric
#2597 opened Jan 21, 2025 by kashif
5 tasks
🐍 Support Python 3.13
#2593 opened Jan 20, 2025 by qgallouedec Draft
5 tasks
[WIP] [Liger] liger JSD support
#2573 opened Jan 16, 2025 by Mecoli1219 Draft
5 tasks
Reduce memory consumption when training with PPO
#2571 opened Jan 15, 2025 by summerspringwei
5 tasks
[Liger] liger DPO support
#2568 opened Jan 14, 2025 by kashif
Add _compute_score method to PPOTrainer
#2560 opened Jan 11, 2025 by oliveiraeliel Draft
2 of 5 tasks