generated from fastai/nbdev_template
Pull requests: huggingface/trl
Pull requests list
Fix device placement for GRPO attention mask in compute_loss
#2747
opened Feb 3, 2025 by
tgaddair
feat: Add vLLM dtype configuration for GRPO trainer
#2738
opened Feb 2, 2025 by
joey00072
5 tasks
GRPO: Expose vllm_init_kwargs to enable vllm configuration
#2728
opened Feb 1, 2025 by
mirceapricop
5 tasks
[GRPO] add reward weight in multi-reward settings
#2676
opened Jan 28, 2025 by
hesamsheikh
1 task
🔧 Optimize GRPO VRAM Usage by Computing Prompt Tokens Just Once
#2669
opened Jan 27, 2025 by
andyl98
2 of 5 tasks
share parameters between model and ref model
#2668
opened Jan 27, 2025 by
GeeeekExplorer
2 of 5 tasks
Add Optional ZeRO-3 Weight Gathering for GRPO in Sequence Generation
#2667
opened Jan 27, 2025 by
SeungyounShin
5 tasks done
[Not meant to be merged] Support branch for Trainer refactor
#2594
opened Jan 20, 2025 by
qgallouedec
Draft
5 tasks
Reduce memory consumption when training with PPO
#2571
opened Jan 15, 2025 by
summerspringwei
5 tasks
Add _compute_score method to PPOTrainer
#2560
opened Jan 11, 2025 by
oliveiraeliel
Draft
2 of 5 tasks
Add generation caching in TextEnvironment and fix bugs in TextEnvironment
#2556
opened Jan 10, 2025 by
konrad-gerlach