[OptRed] Extend -tritonintelgpu-optimize-reduction-locality to support `repCluster[0] > 2`

Support `repCluster[0] > 2` by using 7-D tensors and adding a `convert_layout` operation before the final `reshape`.

See code for implementation details.
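
As a rough illustration of the reshape-based idea only (not the pass's actual implementation), the sketch below views a 2-D tensor through a 7-D index space, reduces the column-derived axes in two stages, and flattens the row-derived axes back with a final reshape. The function names, the 3 + 4 axis split, the extents, and the choice of `local_axes` are all hypothetical. The `convert_layout` step is not modeled, since it changes where data lives across threads rather than the values themselves.

```python
import math
from itertools import product


def split_index(index, extents):
    """Decompose a linear index into row-major coordinates for `extents`."""
    coords = []
    for extent in reversed(extents):
        index, coord = divmod(index, extent)
        coords.append(coord)
    return tuple(reversed(coords))


def staged_row_sum(matrix, row_extents, col_extents, local_axes):
    """Sum each row of `matrix` via a (hypothetical) multi-dimensional view.

    Rows are viewed through len(row_extents) axes and columns through
    len(col_extents) axes, e.g. 3 + 4 = 7 axes in total.
    """
    rows, cols = len(matrix), len(matrix[0])
    assert math.prod(row_extents) == rows
    assert math.prod(col_extents) == cols

    # Stage 1: reduce only the column axes listed in `local_axes`.
    partial = {}
    for r, c in product(range(rows), range(cols)):
        col_coords = split_index(c, col_extents)
        kept = tuple(x for axis, x in enumerate(col_coords)
                     if axis not in local_axes)
        key = (split_index(r, row_extents), kept)
        partial[key] = partial.get(key, 0) + matrix[r][c]

    # Stage 2: reduce the remaining column axes.
    totals = {}
    for (row_coords, _), value in partial.items():
        totals[row_coords] = totals.get(row_coords, 0) + value

    # Final "reshape": flatten the row coordinates back to a 1-D result.
    result = [0] * rows
    for row_coords, value in totals.items():
        linear = 0
        for coord, extent in zip(row_coords, row_extents):
            linear = linear * extent + coord
        result[linear] = value
    return result


# Hypothetical 8x16 input viewed as a 2x2x2 x 2x2x2x2 (7-D) tensor.
matrix = [[r * 16 + c for c in range(16)] for r in range(8)]
staged = staged_row_sum(matrix, (2, 2, 2), (2, 2, 2, 2), {1, 3})
```

The staged result matches a direct per-row sum, which is the property any such restructuring of the reduction has to preserve.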

Signed-off-by: victor-eds <[email protected]>
victor-eds committed Oct 21, 2024
1 parent 5ed11cb commit fdd0512
Showing 3 changed files with 356 additions and 191 deletions.