refactor: move acos() to function crate #9297
Thanks @SteveLauC -- it appears there are some CI failures
Indeed, I am working on the sqllogictest failure
Self review
DataType::Float32 => {
    Arc::new(make_function_scalar_inputs_return_type!(
        &args[0],
        "x",
-"x",
+self.name(),
I forgot to change this one. BTW, should this be done? I saw that isnan() is using "x".
done
@@ -15,15 +15,18 @@
 // specific language governing permissions and limitations
 // under the License.

-//! "core" DataFusion functions
+//! "math" DataFusion functions
Seems to be a copy-paste error, so I changed it
@@ -15,7 +15,7 @@
 // specific language governing permissions and limitations
 // under the License.

+//! Encoding expressions
Also, seems to be a copy-paste error
@@ -589,34 +586,6 @@ async fn roundtrip_parquet_exec_with_table_partition_cols() -> Result<()> {
     roundtrip_test(Arc::new(ParquetExec::new(scan_config, None, None)))
 }

-#[test]
-fn roundtrip_builtin_scalar_function() -> Result<()> {
I removed this test because we are going to eliminate built-in functions
Until we have completed the migration, can you please simply change it to use a different built-in function so we don't lose coverage?
done
datafusion/proto/Cargo.toml
Outdated
@@ -47,6 +47,7 @@ chrono = { workspace = true }
 datafusion = { path = "../core", version = "36.0.0" }
 datafusion-common = { workspace = true }
 datafusion-expr = { workspace = true }
+datafusion-functions = { workspace = true }
I should remove this as it is unused
done
Hi @alamb, I think I might need some help on the sqllogictest CI error, with the
Honestly, I have no idea about the above behaviors...
Update: I just realized that it is a mistake that I made in
It feels that when we do
It seems that DataFusion will handle those special values for you.
datafusion/optimizer/Cargo.toml
Outdated
@@ -44,6 +44,7 @@ async-trait = { workspace = true }
 chrono = { workspace = true }
 datafusion-common = { workspace = true }
 datafusion-expr = { workspace = true }
+datafusion-functions = { workspace = true }
Is it only used in testing? Then it should be placed under dev-dependencies.
Good catch! Will do it.
FYI @Omega359 had a similar challenge when looking at timestamp functions. We are thinking it might be better to move the tests rather than add a new dependency -- see #9291 (comment)
I don't think we can add this as a dependency as it prevents the crates from being published to crates.io. See #9277
I think we should move these tests out of the optimizer crate
For this PR, however, perhaps we could simply update this test to use a function that has not yet been ported and maybe move the tests in another PR
For this PR, however, perhaps we could simply update this test to use a function that has not yet been ported and maybe move the tests in another PR
done
@@ -545,7 +545,7 @@ message InListNode {

 enum ScalarFunction {
   Abs = 0;
-  Acos = 1;
+  // 1 was Acos
Should we consider keeping it, perhaps by renaming it to Deprecated_Acos, and then decode it into the new ScalarUDF to maintain better backward compatibility?
We might need to mark the BuiltinScalarFunction::Acos as deprecated as well.
👌
(Sorry about the delay in review, I was out last week)
I agree that what @jonahgao suggests would allow better backwards compatibility.
However, given that we don't really provide any guarantee for compatibility now anyway (I recently made this explicit in the documentation, https://github.com/apache/arrow-datafusion/blob/8f3d1ef23f93cd4303745eba76c0850b39774d07/datafusion/proto/src/lib.rs#L37-L41), I don't think we should add complexity for these functions.
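For reference, the alternative that was suggested here and then declined (keeping the old enum value and decoding it into the new UDF) could look roughly like the sketch below. It is standalone and does not use the real datafusion-proto types; `Decoded` and `decode_scalar_function` are hypothetical names, and the tag numbers are the only detail taken from the diff.

```rust
/// Hypothetical result of decoding a serialized scalar function tag.
#[derive(Debug, PartialEq)]
pub enum Decoded {
    /// A function that is still a built-in.
    Builtin(&'static str),
    /// A function that has moved to the functions crate and is looked up by name.
    Udf(&'static str),
}

/// Decode a protobuf enum tag. Tag 1 is kept as a legacy alias so that
/// plans serialized before the migration still decode to `acos`.
pub fn decode_scalar_function(tag: i32) -> Option<Decoded> {
    match tag {
        0 => Some(Decoded::Builtin("abs")),
        // 1 was Acos: route the old tag to the UDF instead of rejecting it.
        1 => Some(Decoded::Udf("acos")),
        // Unknown tags are reported to the caller rather than guessed at.
        _ => None,
    }
}
```

The cost is one extra match arm per migrated function that has to be maintained indefinitely, which is the complexity the thread chose to avoid.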
DataFusion CLI v36.0.0
❯ select acos();
thread 'main' panicked at datafusion/functions/src/math/acos.rs:57:25:
index out of bounds: the len is 0 but the index is 0
In the main branch, it just throws an error.
DataFusion CLI v36.0.0
❯ select acos();
Error during planning: No function matches the given name and argument types 'acos()'. You might need to add explicit type casts.
Candidate functions:
acos(Float64/Float32)
Huge thanks for catching it! Just realized
$ ./target/debug/datafusion-cli
DataFusion CLI v36.0.0
❯ select isnan();
thread 'main' panicked at /home/steve/Documents/workspace/GitHub/arrow-datafusion/datafusion/functions/src/math/nans.rs:67:39:
index out of bounds: the len is 0 but the index is 0
And cc @yyy1000,
Yeah, we should do it before invoking the function
Yeah, thanks for telling me @SteveLauC
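The fix discussed above, validating the argument count before the function body indexes into its arguments, can be sketched as follows. This is a minimal standalone illustration, not DataFusion's actual signature-checking code; `FnError`, `check_arg_count`, and `acos_checked` are hypothetical names, and plain `f64` slices stand in for the real argument type.

```rust
/// Hypothetical error type standing in for a planning error.
#[derive(Debug, PartialEq)]
pub enum FnError {
    WrongArgCount {
        name: String,
        expected: usize,
        got: usize,
    },
}

/// Reject a call whose argument count does not match, before any indexing.
pub fn check_arg_count(name: &str, args: &[f64], expected: usize) -> Result<(), FnError> {
    if args.len() != expected {
        return Err(FnError::WrongArgCount {
            name: name.to_string(),
            expected,
            got: args.len(),
        });
    }
    Ok(())
}

/// `acos` over a slice of arguments. `args[0]` is only reached after the
/// length check, so an empty argument list yields an error, not a panic.
pub fn acos_checked(args: &[f64]) -> Result<f64, FnError> {
    check_arg_count("acos", args, 1)?;
    Ok(args[0].acos())
}
```

With this shape, `acos_checked(&[])` returns `Err(FnError::WrongArgCount { .. })` where the unchecked version would panic with an index-out-of-bounds message like the one shown in the CLI output.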
Thank you @SteveLauC and @jonahgao -- I apologize for the delay in reviewing this PR.
I think the new dependency between the optimizer and core crates needs to be fixed prior to merging this PR (I left suggestions on how to do so).
Otherwise, I think this PR is pretty close to ready to go other than some test tweaks
Thank you again so much
        { f32::acos }
    ))
},
other => return internal_err!("Unsupported data type {other:?} for function {}", self.name()),
-other => return internal_err!("Unsupported data type {other:?} for function {}", self.name()),
+other => return exec_err!("Unsupported data type {other:?} for function {}", self.name()),
done
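The dispatch pattern in this hunk (match on the input type, apply `f32::acos` or `f64::acos`, and return an error for anything else) can be sketched without the DataFusion macros. Everything below is a simplified stand-in: `Column`, `DataType`, `ExecError`, and `acos_column` are hypothetical names, and plain `Vec`s replace Arrow arrays.

```rust
/// Simplified stand-in for a data type tag.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Float32,
    Float64,
    Utf8,
}

/// Simplified stand-in for a typed column of values.
#[derive(Debug, Clone, PartialEq)]
pub enum Column {
    Float32(Vec<f32>),
    Float64(Vec<f64>),
    Utf8(Vec<String>),
}

impl Column {
    /// Report the data type of this column.
    pub fn data_type(&self) -> DataType {
        match self {
            Column::Float32(_) => DataType::Float32,
            Column::Float64(_) => DataType::Float64,
            Column::Utf8(_) => DataType::Utf8,
        }
    }
}

/// Hypothetical execution error carrying a message.
#[derive(Debug, PartialEq)]
pub struct ExecError(pub String);

/// Apply acos element-wise, failing on unsupported input types.
pub fn acos_column(name: &str, col: &Column) -> Result<Column, ExecError> {
    match col {
        Column::Float32(v) => Ok(Column::Float32(v.iter().map(|x| x.acos()).collect())),
        Column::Float64(v) => Ok(Column::Float64(v.iter().map(|x| x.acos()).collect())),
        // The function name is passed in rather than hard-coded, mirroring
        // the use of `self.name()` in the hunk above.
        other => Err(ExecError(format!(
            "Unsupported data type {:?} for function {}",
            other.data_type(),
            name
        ))),
    }
}
```

Returning an error value keeps an unsupported type from becoming a panic; which error variant to use (internal versus execution) is the point of the suggestion above.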
Force-pushed from bed7322 to e455453
Thank you @SteveLauC -- this looks great to me 🙏
@@ -889,14 +889,14 @@ mod test {
 // test that automatic argument type coercion for scalar functions work
 let empty = empty();
 let lit_expr = lit(10i64);
-let fun: BuiltinScalarFunction = BuiltinScalarFunction::Acos;
+let fun: BuiltinScalarFunction = BuiltinScalarFunction::Floor;
👍
I took the liberty of merging up from main and resolving a conflict on this PR
Force-pushed from 29909d9 to b75f1cf
Thanks again @SteveLauC
Which issue does this PR close?
Part of #9285
Rationale for this change
What changes are included in this PR?
Move math function acos() to the function crate
Are these changes tested?
Let's try this in CI :<
Are there any user-facing changes?
I guess no