Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON format for Chakra ET #145

Merged
merged 27 commits into from
Dec 6, 2024
Merged

Conversation

rvinaybharadwaj
Copy link
Contributor

Summary

  • Added JSON format for Chakra ET
  • To maintain backward compatibility, a wrapper is included that presents a common API for Protobuf and JSON

Test Plan

  • Added wrapperTests.cpp and test data

@rvinaybharadwaj rvinaybharadwaj requested a review from a team as a code owner August 2, 2024 15:52
Copy link

github-actions bot commented Aug 2, 2024

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@JoongunPark
Copy link
Contributor

JoongunPark commented Sep 3, 2024

Hi @rvinaybharadwaj,

Thank you for the PR! I have a few requests:

  1. Could you ensure that the code passes all checks (file formatting, tests, etc.)?
  2. Could you provide step-by-step commands to help us reproduce the results?
  3. What is the expected outcome, and how can we verify it? (Could you share us example input and output?)

Thanks again!

@rvinaybharadwaj
Copy link
Contributor Author

Hi @rvinaybharadwaj,

Thank you for the PR! I have a few requests:

  1. Could you ensure that the code passes all checks (file formatting, tests, etc.)?
  2. Could you provide step-by-step commands to help us reproduce the results?
  3. What is the expected outcome, and how can we verify it? (Could you share us example input and output?)

Thanks again!

Hi @JoongunPark,

  1. I have run clang-format on the code and the tests are breaking on not finding a module during the setup phase. So, I don't think it's an issue related to my code. And I am not sure why the CLA check is failing even though I am a member of MLCommons.

  2. I have added wrapper tests, which are functionally equivalent to the feeder tests and have included test data. You can run it through the existing google test framework.

  3. The data in chakra.0.json is the same as the data in chakra.0.et You should be able to achieve the same result as the protobuf ET and the wrapper tests should all pass.

Let me know if you have any more questions.

@JoongunPark
Copy link
Contributor

Hi @rvinaybharadwaj,

Thank you for the update!

  1. It looks like this branch has new conflicts after the recent merge. Could you please resolve them?
  2. For the cpp_linter, you might want to update the GitHub workflow to fix it. Here’s the link: cpp_lint.yml.
  3. Regarding the CLA issue, I believe someone from MLCommons needs to add you. @tushar-krishna or @srinivas212 should be able to help with this.

I'll run some tests and share the results.

@rvinaybharadwaj
Copy link
Contributor Author

rvinaybharadwaj commented Sep 6, 2024

Hi @rvinaybharadwaj,

Thank you for the update!

  1. It looks like this branch has new conflicts after the recent merge. Could you please resolve them?
  2. For the cpp_linter, you might want to update the GitHub workflow to fix it. Here’s the link: cpp_lint.yml.
  3. Regarding the CLA issue, I believe someone from MLCommons needs to add you. @tushar-krishna or @srinivas212 should be able to help with this.

I'll run some tests and share the results.

Hi @JoongunPark ,
Thanks for the feedback. Merge conflicts have been fixed.

@JoongunPark
Copy link
Contributor

Hi @rvinaybharadwaj,

Thank you for updating your PR!
I have two last requests:

  1. Could you please ensure that the "CLA" test passes? It's mandatory, and unfortunately, I can't assist with this part.
  2. It seems you've successfully addressed the original issue with the C++ Link (the module installation problem). However, there are still a few minor errors. You can check them here: link. Could you please take a look and fix those as well?

Thank you for your efforts!

@rvinaybharadwaj rvinaybharadwaj force-pushed the jsonify branch 2 times, most recently from ae9ec40 to 333a9a3 Compare September 19, 2024 16:03
@rvinaybharadwaj
Copy link
Contributor Author

Hi @rvinaybharadwaj,

Thank you for updating your PR! I have two last requests:

  1. Could you please ensure that the "CLA" test passes? It's mandatory, and unfortunately, I can't assist with this part.
  2. It seems you've successfully addressed the original issue with the C++ Link (the module installation problem). However, there are still a few minor errors. You can check them here: link. Could you please take a look and fix those as well?

Thank you for your efforts!

Hi @JoongunPark,
All errors have been fixed.

@JoongunPark
Copy link
Contributor

JoongunPark commented Sep 24, 2024

@rvinaybharadwaj
Thank you for resolving all the issues! I've started reviewing your code in detail, and here’s my understanding so far:

1. New Classes

  • JSONNode: Handles execution trace (ET) parsing from JSON files.
  • WrapperNode: Abstracts ET handling for multiple formats (e.g., .et and .json), working within the ETFeeder component.

2. Dependency Changes

  • pyproject.toml: Updated to support protobuf==5.*. This might resolve some installation conflicts with the original version.

3. Test Coverage

  • Unit Tests (tests/feeder/wrapperTests.cpp): Verifies the JSON parsing functionality.
  • Test Data (tests/data): Includes new JSON-format test files.

4. Linting Updates

  • clang-format-lint: Clang format (11->16). Linter version (v0.11 -> v0.18.1)

5. New Third-Party Library

  • json.hpp (src/third_party/utils/json.hpp): Utilized by JSONNode for JSON parsing.

Also, I have a question regarding JSON format trace. This PR does not include trace conversion from ET to JSON.
Is this PR assuming to get JSON trace with exsting Jsonizer?

Thank you again!

@rvinaybharadwaj
Copy link
Contributor Author

rvinaybharadwaj commented Sep 24, 2024

@rvinaybharadwaj Thank you for resolving all the issues! I've started reviewing your code in detail, and here’s my understanding so far:

1. New Classes

  • JSONNode: Handles execution trace (ET) parsing from JSON files.
  • WrapperNode: Abstracts ET handling for multiple formats (e.g., .et and .json), working within the ETFeeder component.

2. Dependency Changes

  • pyproject.toml: Updated to support protobuf==5.*. This might resolve some installation conflicts with the original version.

3. Test Coverage

  • Unit Tests (tests/feeder/wrapperTests.cpp): Verifies the JSON parsing functionality.
  • Test Data (tests/data): Includes new JSON-format test files.

4. Linting Updates

  • clang-format-lint: Clang format (11->16). Linter version (v0.11 -> v0.18.1)

5. New Third-Party Library

  • json.hpp (src/third_party/utils/json.hpp): Utilized by JSONNode for JSON parsing.

Also, I have a question regarding JSON format trace. This PR does not include trace conversion from ET to JSON. Is this PR assuming to get JSON trace with exsting Jsonizer?

Thank you again!

@JoongunPark
Thank you for reviewing the code and summarizing. Your understanding of different parts of the code is correct. The existing JSONizer outputs a slightly different format, but I was assuming it is easy to do a JSON dump directly from Pytorch or any other tool.

@JoongunPark
Copy link
Contributor

@rvinaybharadwaj Thank you for resolving all the issues! I've started reviewing your code in detail, and here’s my understanding so far:

1. New Classes

  • JSONNode: Handles execution trace (ET) parsing from JSON files.
  • WrapperNode: Abstracts ET handling for multiple formats (e.g., .et and .json), working within the ETFeeder component.

2. Dependency Changes

  • pyproject.toml: Updated to support protobuf==5.*. This might resolve some installation conflicts with the original version.

3. Test Coverage

  • Unit Tests (tests/feeder/wrapperTests.cpp): Verifies the JSON parsing functionality.
  • Test Data (tests/data): Includes new JSON-format test files.

4. Linting Updates

  • clang-format-lint: Clang format (11->16). Linter version (v0.11 -> v0.18.1)

5. New Third-Party Library

  • json.hpp (src/third_party/utils/json.hpp): Utilized by JSONNode for JSON parsing.

Also, I have a question regarding JSON format trace. This PR does not include trace conversion from ET to JSON. Is this PR assuming to get JSON trace with exsting Jsonizer?
Thank you again!

@JoongunPark Thank you for reviewing the code and summarizing. Your understanding of different parts of the code is correct. The existing JSONizer outputs a slightly different format, but I was assuming it is easy to do a JSON dump directly from Pytorch or any other tool.

Regarding the JSON format trace, I believe we should discuss whether to align it with the existing JSONizer or consider using a different format. This decision might be crucial, as I recall your initial plan was to create a merged, unified JSON trace. We can also involve @TaekyungHeo for further discussion on this matter.

Copy link
Contributor

@TaekyungHeo TaekyungHeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left several comments and will leave more when I have more time.

.github/workflows/cpp_lint.yml Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
src/feeder/JSONNode.cpp Outdated Show resolved Hide resolved
src/feeder/JSONNode.h Outdated Show resolved Hide resolved
src/feeder/WrapperNode.cpp Outdated Show resolved Hide resolved
src/feeder/et_feeder.cpp Show resolved Hide resolved
tests/feeder/wrapperTests.cpp Outdated Show resolved Hide resolved
src/feeder/JSONNode.cpp Outdated Show resolved Hide resolved
src/feeder/JSONNode.cpp Outdated Show resolved Hide resolved
tests/data/jsonData.tar.gz Outdated Show resolved Hide resolved
tests/feeder/tests.cpp Show resolved Hide resolved
tests/feeder/wrapperTests.cpp Outdated Show resolved Hide resolved
tests/feeder/wrapperTests.cpp Outdated Show resolved Hide resolved
tests/feeder/wrapperTests.cpp Outdated Show resolved Hide resolved
tests/feeder/wrapperTests.cpp Outdated Show resolved Hide resolved
tests/feeder/wrapperTests.cpp Outdated Show resolved Hide resolved
tests/feeder/wrapperTests.cpp Outdated Show resolved Hide resolved
@TaekyungHeo TaekyungHeo mentioned this pull request Oct 7, 2024
@rvinaybharadwaj
Copy link
Contributor Author

rvinaybharadwaj commented Dec 2, 2024

Based on the discussion in the Chakra meeting on Dec 02, 2024 -

  1. I check for the file format in wrapper_node.cpp #L26 . Since, I use filename.find_last_of("."), renaming the JSON file to filename.et.json should work without any changes.
  2. In the future, if the protobuf ET files are renamed to filename.et.pb, we need to make the corresponding change in wrapper_node.cpp #L27

Copy link
Contributor

@tushar-krishna tushar-krishna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Vinay for the PR, and to Taekyung, Joongun and William for the review.
This looks good to merge.

@tushar-krishna tushar-krishna merged commit 9247489 into mlcommons:main Dec 6, 2024
9 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Dec 6, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants