Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ExprBuilder for Physical Aggregate Expr #11617

Merged
merged 9 commits into from
Jul 24, 2024

Conversation

jayzhan211
Copy link
Contributor

@jayzhan211 jayzhan211 commented Jul 23, 2024

Which issue does this PR close?

Part of #11543.

Since create_aggregate_expr is not stable yet, probably will change after refactor #11359 .

I only create builder API, but not mark create_aggregate_expr as deprecated for now.

I also update create_aggregate_expr to avoid breaking change on is_reverse field.

Ideally, we could have count_udaf().call_physical(args)....build() after #11359

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Signed-off-by: jayzhan211 <[email protected]>
@github-actions github-actions bot added the core Core DataFusion crate label Jul 23, 2024
Signed-off-by: jayzhan211 <[email protected]>
@jayzhan211 jayzhan211 force-pushed the physical-expr-builder branch from 953aceb to 315b8e9 Compare July 23, 2024 14:03
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
}))
}

pub fn name(mut self, name: impl Into<String>) -> Self {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not if it is necessary, maybe we can build it from function+args, so keep it out from new() so it is easier for deprecation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree creating the name automatically from existing information would be better. Perhaps we can do that as a follow on PR

self
}

pub fn schema(mut self, schema: SchemaRef) -> Self {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether we need schema or dfschema after refactor, so keep them out from new to prepare for possible deprecation

@jayzhan211 jayzhan211 marked this pull request as ready for review July 24, 2024 10:16
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @jayzhan211 -- I think it looks great

false,
)
.unwrap()
AggregateExprBuilder::new(count_udaf(), vec![self.column()])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is much easier to understand 💯

}))
}

pub fn name(mut self, name: impl Into<String>) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree creating the name automatically from existing information would be better. Perhaps we can do that as a follow on PR

None::<String>,
field_name,
))),
expr: Box::new(col(field_name)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹

@@ -50,6 +50,7 @@ chrono = { workspace = true }
datafusion = { workspace = true, default-features = true }
datafusion-common = { workspace = true, default-features = true }
datafusion-expr = { workspace = true }
datafusion-physical-expr-common = { workspace = true }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since proto already relies on the core datafusion crate this is not a new (direct) dependency.

Though maybe it highlights that it would be good to pub use the AggregateExprBuilder in datafusion so using it doesn't need a new explicit dependency 🤔

}

#[derive(Debug, Clone)]
pub struct AggregateExprBuilder {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub struct AggregateExprBuilder {
/// Builder for physical [`AggregateExpr`]
///
/// `AggregateExpr` contains the information necessary to call
/// an aggregate expression.
pub struct AggregateExprBuilder {

Signed-off-by: jayzhan211 <[email protected]>
@@ -545,6 +545,11 @@ pub mod optimizer {
pub use datafusion_optimizer::*;
}

/// re-export of [`datafusion_physical_expr`] crate
pub mod physical_expr_common {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems other crate import *, follow them

Signed-off-by: jayzhan211 <[email protected]>
@jayzhan211 jayzhan211 merged commit 1356934 into apache:main Jul 24, 2024
24 checks passed
@jayzhan211 jayzhan211 deleted the physical-expr-builder branch July 24, 2024 14:30
@jayzhan211
Copy link
Contributor Author

Thanks @alamb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants