Move trace_link.py from facebookresearch/param to mlcommons/chakra #36

Closed
TaekyungHeo wants to merge 16 commits from trace-link

Conversation

TaekyungHeo
Contributor

Summary

Move trace_link.py from facebookresearch/param to mlcommons/chakra based on the discussion between Meta and NVIDIA.

Test Plan


The `handle_kineto_segmentation` function is intended to support Kineto traces
that span multiple iterations by splitting a trace into several segments
according to the provided annotations. It does not work as expected and leads
to errors, so it should be removed.

The multi-iteration support feature for PyTorch execution traces is intended to
handle traces that cover multiple iterations. It does not work as expected and
leads to errors, so it should be removed.

This commit introduces support for inter-thread dependencies within the Chakra
framework. By examining Kineto traces via chrome://tracing, one can observe
multiple CPU threads and their implicit dependencies. This update explicitly
encodes these dependencies in the output trace, so that downstream tools can
handle them accurately.
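
One way to make such implicit dependencies explicit is sketched below. This is not the code from this PR: the `CpuOp` type, its field names, and the heuristic itself (each operator depends on the operator from another thread that finished most recently before it started) are assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple, Optional


class CpuOp(NamedTuple):
    """Hypothetical CPU operator record; field names are illustrative."""
    op_id: int
    tid: int    # CPU thread ID
    start: int  # start timestamp (microseconds)
    end: int    # end timestamp (microseconds)


def inter_thread_deps(ops: List[CpuOp]) -> Dict[int, Optional[int]]:
    """Map each op to the op on another thread that ended last before it started."""
    deps: Dict[int, Optional[int]] = {}
    for op in ops:
        # Candidates: operators on other threads that ended before this one started.
        candidates = [o for o in ops if o.tid != op.tid and o.end <= op.start]
        latest = max(candidates, key=lambda o: o.end, default=None)
        deps[op.op_id] = latest.op_id if latest is not None else None
    return deps
```

The quadratic scan keeps the sketch short; a real linker would more likely sort operators by end time and use binary search.
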

This commit adds stream ID encoding to GPU operators. This ensures that all
operators within the same stream are executed in the correct order, supporting
intra-stream dependencies.
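
A consumer of the trace could use the stream ID roughly as follows. This is a minimal sketch, assuming each GPU operator is available as an `(op_id, stream_id, start_ts)` tuple; that tuple layout and the function name `intra_stream_deps` are hypothetical and not taken from `trace_link.py`.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def intra_stream_deps(gpu_ops: List[Tuple[int, int, int]]) -> Dict[int, int]:
    """Return op_id -> predecessor op_id for operators sharing a stream.

    gpu_ops holds (op_id, stream_id, start_ts) tuples (assumed layout).
    """
    by_stream: Dict[int, List[Tuple[int, int]]] = defaultdict(list)
    for op_id, stream_id, start_ts in gpu_ops:
        by_stream[stream_id].append((start_ts, op_id))

    deps: Dict[int, int] = {}
    for ops in by_stream.values():
        ops.sort()  # order operators within the stream by start timestamp
        # Each operator depends on the one launched just before it on its stream.
        for (_, prev_id), (_, cur_id) in zip(ops, ops[1:]):
            deps[cur_id] = prev_id
    return deps
```

The first operator on each stream has no entry in the result, since nothing precedes it on that stream.
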

This commit introduces exclusive duration calculation for Kineto operators in
the `TraceLinker` class. It distinguishes inclusive from exclusive durations in
the profiling data: the exclusive duration is the time actually spent in an
individual operator, excluding any time that overlaps with its child operators.
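
The distinction can be illustrated with a small sketch. The `Op` type and its fields are hypothetical and do not reflect the actual `TraceLinker` data structures; the sketch also assumes that child intervals may overlap each other, so overlapping time is counted only once.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """Hypothetical operator record; field names are illustrative."""
    name: str
    start: int      # start timestamp (microseconds)
    inclusive: int  # inclusive duration (microseconds)
    children: List["Op"] = field(default_factory=list)


def exclusive_duration(op: Op) -> int:
    """Inclusive duration minus the time covered by direct children."""
    op_end = op.start + op.inclusive
    # Clip each child interval to the parent's interval, then sort by start.
    intervals = sorted(
        (max(c.start, op.start), min(c.start + c.inclusive, op_end))
        for c in op.children
    )
    covered = 0
    cursor = op.start  # everything before the cursor is already counted
    for begin, end in intervals:
        begin = max(begin, cursor)
        if end > begin:
            covered += end - begin
            cursor = end
    return op.inclusive - covered
```

For a parent that runs for 100 µs with children covering 60 µs of that span in total, the exclusive duration is 40 µs. An operator without children has equal inclusive and exclusive durations.
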

github-actions bot commented Apr 8, 2024

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@TaekyungHeo TaekyungHeo closed this Apr 8, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Apr 8, 2024
@TaekyungHeo TaekyungHeo deleted the trace-link branch May 15, 2024 01:43