fix rccl hip streams section in workload tuning guide (ROCm#4140)
(cherry picked from commit 78f9adc)
peterjunpark committed Dec 9, 2024
1 parent 5c25c3e commit b5955a8
Showing 1 changed file with 4 additions and 5 deletions.
9 changes: 4 additions & 5 deletions docs/how-to/tuning-guides/mi300x/workload.rst
@@ -2062,11 +2062,10 @@ collectives.
 Multi-node FSDP and RCCL settings
 ---------------------------------
-It's recommended to use high-priority HIP streams with RCCL.
-The simplest way to enable this is by using the nightly PyTorch wheels, as the required changes from
-`PR #122830 <https://github.com/pytorch/pytorch/pull/122830>`_ were not included in the PyTorch 2.3
-release but are available in the nightly builds.
+When using PyTorch's FSDP (Full Sharded Data Parallel) feature, the HIP
+streams used by RCCL and HIP streams used for compute kernels do not
+always overlap well. As a workaround, it's recommended to use
+high-priority HIP streams with RCCL.
 To configure high-priority streams: