[Utilization][S3 Replicator][fortesting db]Add replicator logic to insert util data into clickhouse #6217

Merged
merged 8 commits into from
Feb 3, 2025

@yangw-dev yangw-dev commented Jan 24, 2025

Currently, I add a debug folder for testing; the data will be inserted into tables in the fortesting database on ClickHouse.
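
A minimal sketch of the kind of routing described above, where objects under a debug folder are mapped to tables in the fortesting database. The key prefixes, table names, and helper functions here are hypothetical and are not taken from this PR; only the database name fortesting and the debug folder come from the description. The `INSERT ... FORMAT JSONEachRow` form is standard ClickHouse syntax, but whether the replicator uses it is an assumption.

```python
from typing import Optional

# Database named in the PR description.
DEBUG_DATABASE = "fortesting"

# Hypothetical S3 key prefixes and table names, for illustration only.
PREFIX_TO_TABLE = {
    "debug/util_metadata/": "util_metadata",
    "debug/util_timeseries/": "util_timeseries",
}


def target_table(object_key: str) -> Optional[str]:
    """Return the fully qualified table for an S3 object key, or None."""
    for prefix, table in PREFIX_TO_TABLE.items():
        if object_key.startswith(prefix):
            return f"{DEBUG_DATABASE}.{table}"
    return None


def insert_statement(object_key: str) -> Optional[str]:
    """Build the INSERT statement header for a key, or None if unmapped."""
    table = target_table(object_key)
    if table is None:
        return None
    # Assumes the uploaded files hold one JSON record per line.
    return f"INSERT INTO {table} FORMAT JSONEachRow"
```

Keys outside the mapped prefixes return None, so a replicator built this way would skip them rather than write to an unexpected table.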

Other changes:
add missing field tags

Related PRs:
setup composite action for data pipeline: pytorch/pytorch#145310
add permission for composite action to access S3 bucket: https://github.com/pytorch-labs/pytorch-gha-infra/pull/595
set up data pipeline script: pytorch/pytorch#145327

vercel bot commented Jan 24, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Updated (UTC)
torchci ⬜️ Ignored (Inspect) Visit Preview Jan 30, 2025 0:59am

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 24, 2025
@yangw-dev yangw-dev changed the title [S3 Replicator] Add replicator logic to insert util data into db table. [Utilization][S3 Replicator] Add replicator logic to insert util data into db table. Jan 24, 2025
@yangw-dev yangw-dev requested a review from huydhn January 27, 2025 20:25
@yangw-dev yangw-dev changed the title [Utilization][S3 Replicator] Add replicator logic to insert util data into db table. [Utilization][S3 Replicator][fortesting db]Add replicator logic to insert util data into clickhouse Jan 27, 2025

@huydhn huydhn left a comment

LGTM! You probably want to manually test the insert queries against the provided schemas, as described in https://github.com/pytorch/test-infra/wiki/How-to-add-a-new-custom-table-on-ClickHouse#testing

pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Jan 29, 2025
upload_utilization_script to generate db-ready-insert records to s3
- generate two files: metadata and timeseries in ossci-utilization buckets
- convert log records to DB-format records
- add unit test job for tools/stats/

Related PRs:
setup composite action for data pipeline: #145310
add permission for composite action to access S3 bucket: https://github.com/pytorch-labs/pytorch-gha-infra/pull/595
add insert logic in s3 replicator: pytorch/test-infra#6217
Pull Request resolved: #145327
Approved by: https://github.com/huydhn

Co-authored-by: Huy Do <[email protected]>
@yangw-dev yangw-dev marked this pull request as ready for review January 30, 2025 00:57
@yangw-dev yangw-dev merged commit cb9f062 into main Feb 3, 2025
6 checks passed
@yangw-dev yangw-dev deleted the addS3 branch February 3, 2025 23:05
3 participants