Avoid unnecessary merge #1001

Open
Tracked by #924
ShiKaiWi opened this issue Jun 19, 2023 · 0 comments
Labels
A-query-engine Area: Query engine feature New feature or request

Comments

@ShiKaiWi
Member

Describe This Problem

Currently, a merge sort is applied over all the SSTs in a given segment so that the subsequent dedup step can run. However, dedup requires far less than a global order over every row of every SST, so the current merge sort does more work than necessary.

Proposal

All dedup needs is for rows sharing the same primary key to be gathered together in the order of their sequence numbers. Consequently, there is no need to merge-sort SSTs whose key ranges do not overlap with one another.
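
As a rough illustration of the idea (not the engine's actual code), the sketch below partitions SSTs into groups whose key ranges overlap. `SstMeta`, its fields, and `group_overlapping` are hypothetical names, and keys are modelled as plain integers with inclusive bounds purely for simplicity. Under these assumptions, only groups containing more than one SST would need a merge sort before dedup; a group with a single SST can be read as-is.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SstMeta:
    # Hypothetical SST metadata; the real engine's types will differ.
    name: str
    min_key: int  # smallest primary key in the SST (inclusive)
    max_key: int  # largest primary key in the SST (inclusive)


def group_overlapping(ssts: List[SstMeta]) -> List[List[SstMeta]]:
    """Partition SSTs into groups of transitively overlapping key ranges."""
    groups: List[List[SstMeta]] = []
    current_max: Optional[int] = None  # largest max_key seen in the open group
    for sst in sorted(ssts, key=lambda s: (s.min_key, s.max_key)):
        if groups and current_max is not None and sst.min_key <= current_max:
            # Overlaps the open group, so it must be merged with it.
            groups[-1].append(sst)
            current_max = max(current_max, sst.max_key)
        else:
            # Starts beyond every key seen so far: open a new group.
            groups.append([sst])
            current_max = sst.max_key
    return groups


ssts = [
    SstMeta("a", 0, 10),
    SstMeta("b", 5, 20),
    SstMeta("c", 30, 40),
    SstMeta("d", 41, 50),
    SstMeta("e", 45, 60),
]
groups = group_overlapping(ssts)
group_names = [[s.name for s in g] for g in groups]
# Only groups with more than one SST would need a merge sort.
needs_merge = [len(g) > 1 for g in groups]
```

In this example `c` overlaps nothing, so it would skip the merge entirely, while `a`/`b` and `d`/`e` would each be merged separately. Because the groups come out in ascending key order, their outputs could simply be concatenated if a sorted result is still wanted.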

Additional Context

No response

@ShiKaiWi ShiKaiWi added the feature New feature or request label Jun 19, 2023
This was referenced Jun 19, 2023
@jiacai2050 jiacai2050 added A-analytic-engine Area: Analytic Engine A-query-engine Area: Query engine and removed A-analytic-engine Area: Analytic Engine labels Aug 2, 2023