chore: [comet-parquet-exec] Unit test fixes, default scan impl to native_comet #1265
@parthchandra could you run |
and |
Fixed the build. Yes, we should get the CI to turn green with this PR before we attempt any more changes to the feature branch.
Updated the plans for Spark 3.5 and Spark 4.0. However, plan generation for the native_datafusion impl is failing. This will not affect the CI, but it needs to be addressed at some point.
LGTM. Thanks @parthchandra
Added the plans for Spark 4.0 for the NATIVE_DATAFUSION and NATIVE_ICEBERG_COMPAT scans.
Notable changes:
The scan implementation can be selected by setting the conf spark.comet.scan.impl or by setting the environment variable COMET_PARQUET_SCAN_IMPL.
Plan compatibility suites generate a different plan depending on the scan implementation. As a result, we now have three sets of expected plans, one per scan implementation.
We now use the Spark session timezone instead of UTC when reading timestamp fields, so that they can be compared with literal timestamps (Spark apparently applies the session timezone to those automatically).
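For illustration only, here is a rough Python sketch of how the two selection mechanisms could interact. The conf key, the environment variable, and the three implementation names come from this PR. The helper `resolve_scan_impl`, the precedence (explicit conf first, then environment variable, then the `native_comet` default), and the case-insensitive matching are assumptions made for the sketch, not a description of the actual Comet code.

```python
import os

# Names taken from the PR discussion.
CONF_KEY = "spark.comet.scan.impl"
ENV_VAR = "COMET_PARQUET_SCAN_IMPL"
KNOWN_IMPLS = ("native_comet", "native_datafusion", "native_iceberg_compat")
DEFAULT_IMPL = "native_comet"


def resolve_scan_impl(conf, env=None):
    """Hypothetical helper: return the scan implementation name to use.

    Assumed precedence: explicit conf value, then environment variable,
    then the default. Matching is assumed to be case-insensitive.
    """
    env = os.environ if env is None else env
    raw = conf.get(CONF_KEY) or env.get(ENV_VAR) or DEFAULT_IMPL
    value = raw.strip().lower()
    if value not in KNOWN_IMPLS:
        raise ValueError(f"unknown scan implementation: {raw!r}")
    return value


if __name__ == "__main__":
    # With nothing set, the sketch falls back to the default.
    print(resolve_scan_impl({}, {}))
    # The environment variable applies when the conf is absent.
    print(resolve_scan_impl({}, {ENV_VAR: "NATIVE_DATAFUSION"}))
```

Under these assumptions, a test run could set COMET_PARQUET_SCAN_IMPL once for the whole CI job, while an individual suite could still override it through spark.comet.scan.impl.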