Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Update README to highlight Comet benefits #497

Merged
merged 6 commits into from
May 31, 2024

Conversation

andygrove
Copy link
Member

Which issue does this PR close?

N/A

Rationale for this change

By focusing on the benefits of Comet, and showing some performance results, we can hopefully make the project more compelling to users who may consider trying it out.

What changes are included in this PR?

New README. You can see the rendered version here

How are these changes tested?

README.md Outdated Show resolved Hide resolved

<a href="docs/source/_static/images/comet-overview.png"><img src="docs/source/_static/images/comet-overview.png" align="center" height="600" width="750" ></a>
Comet is not yet achieving full DataFusion speeds in all cases, but with future work we aim to provide a 2x-4x speedup
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if pure DF gives 3.9x, is that possible Comet built on top of DF to give 4x?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was rounding up to the nearest whole percent here, but of course it will be challenging. DataFusion isn't performing shuffle operations, but I think that DataFusion performance sets a hard limit on what we can do with Comet, so 2-4x seems to be our expected range (unless there are future optimizations in DataFusion).

README.md Outdated Show resolved Hide resolved
Co-authored-by: Oleks V <[email protected]>
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm thanks @andygrove

Copy link
Contributor

@kazuyukitanimura kazuyukitanimura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All my comments are nits

README.md Show resolved Hide resolved
@andygrove
Copy link
Member Author

Thanks for the reviews @viirya @kazuyukitanimura @comphead. I have addressed the feedback so will go ahead and merge this and we can follow up with future PRs if we want to refine this more.

@andygrove andygrove merged commit 1f81c38 into apache:main May 31, 2024
@andygrove andygrove deleted the update-readme2 branch May 31, 2024 16:36
himadripal pushed a commit to himadripal/datafusion-comet that referenced this pull request Sep 7, 2024
* Update README to show current performance info

* Update README.md

Co-authored-by: Oleks V <[email protected]>

* address feedback

* Add hardware info to benchmarking page

* add raw benchmark results

---------

Co-authored-by: Oleks V <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants