-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrites PLUS and BITWISE_AND implementations using new modeling #1647
base: v1
Are you sure you want to change the base?
Conversation
Updates how function instances are chosen and carried
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## v1 #1647 +/- ##
=========================================
Coverage 80.03% 80.03%
Complexity 47 47
=========================================
Files 19 19
Lines 506 506
Branches 23 23
=========================================
Hits 405 405
Misses 88 88
Partials 13 13
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
@ParameterizedTest | ||
@MethodSource("plusTestCases") | ||
@Execution(ExecutionMode.CONCURRENT) | ||
fun plusTests(tc: SuccessTestCase) = tc.assert() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests are very telling. Could you possible factor them out of PartiQLEvaluatorTest so we can have something like funciton/operations tests. I'm primarily concerned with how large this one class is growing, and I think it would be nice to have things like ArithemeticTests and StringTests etc. for the various SQL operator sections.
e.g. a CharacterStringsTests corresponds to "operations involving character strings". I was hoping to factor implementations in a similar way but I couldn't get it done at the time.
* @param type specifies the output type of the dynamic dispatch. This may be specified if all candidate functions | ||
* return the same type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to enforce this in the method? Off the top of my head, I'd think not.
* coercion groups. | ||
* | ||
* TODO: [UNKNOWN] should likely be removed in the future. However, it is needed due to literal nulls and missings. | ||
* TODO: [DYNAMIC] should likely be removed in the future. This is currently only kept to map function signatures. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious if VARIANT should go to the DYNAMIC coercion family – as this family effectively says, "we don't know what it is, you have to ask the value itself", but I also don't know exactly how that might effect the plumbing.
No changes requested, just trying to brainstorm how coercions of variants work / not happen.
return matches.first().match | ||
val match = matches.first() | ||
val fn = match.match.getInstance(args.toTypedArray()) ?: return null | ||
return FnMatch.Static(fn, match.mapping) | ||
} | ||
|
||
// TODO: Do we care about preferred types? This is a PostgreSQL concept. | ||
// 5. Run through all candidates and keep those that accept preferred types (of the input data type's type category) at the most positions where type conversion will be required. | ||
|
||
// 6. Find the highest precedence one. NOTE: This is a remnant of the previous implementation. Whether we want |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No changes request, just saying that it makes sense to have a "tie"-breaker otherwise we'll have another dynamic dispatch in a place that didn't have dynamic values. I think it makes sense to follow PostgreSQL to break ties when the number of exact matches is the same.
val types = node.candidates | ||
.map { it.fn.signature.getReturnType(emptyArray()) } | ||
.mapNotNull { it.fn.signature.getInstance(argTypes.toTypedArray())?.returns } | ||
.toMutableSet() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this logic could be moved out of PlanTyper and to the getType()
implementation. Makes me think getType()
should be closed and all plan nodes are abstract classes rather than interfaces. This is the second time now I've seen the argument for abstract classes over interfaces on the plan nodes. This would allow user-defined implementations of plan nodes that cannot change typing logic. Also no need to specify types in the plan factory since they can all be computed. I really do think PartiQL has a "data evaluator" and a "type evaluator", but I'm curious where is best for the "type evaluator" to go..
Fn_PLUS__INT8_INT8__INT8, | ||
Fn_PLUS__INT16_INT16__INT16, | ||
Fn_PLUS__INT32_INT32__INT32, | ||
Fn_PLUS__INT64_INT64__INT64, | ||
Fn_PLUS__INT_INT__INT, | ||
Fn_PLUS__FLOAT32_FLOAT32__FLOAT32, | ||
Fn_PLUS__FLOAT64_FLOAT64__FLOAT64, | ||
Fn_PLUS__DECIMAL_ARBITRARY_DECIMAL_ARBITRARY__DECIMAL_ARBITRARY, | ||
FnPlus, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😍
private val lookupTable: Array<Array<(PType, PType) -> Function.Instance?>> = Array(PType.Kind.entries.size) { | ||
Array(PType.Kind.entries.size) { | ||
{ _, _ -> null } | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe some clarification on what this actually is? I know it's a lookupTable, but a lookup for what .. functions which ultimately return instances (deciphered from the return type)!
private val instances: ...
You can also typealias the two-dimensional array like from the cast lookup table implementation, and we can clarify the getInstance logic to.
return instances[lhs.kind.ordinal][rhs.kind.ordinal](newLhs, newRhs)
As-is, the table needs at least a different name, comment, or type alias to understand what it is mapping. Let's also try to reduce the use of "invoke, invocable, invocation" etc. since the invoke/invocation in getInstance is different than the one used in basic()
.
I couldn't think of a better name for the Function.Instance invoke(..)
but it was a play on Kotlin calling syntax. I'm not against "routine" but that may be too overloaded?
override fun getDecimalInstance(v1: PType, v2: PType): Function.Instance { | ||
return getNumericInstance(v1, v2) | ||
} | ||
|
||
override fun getRealInstance(realLhs: PType, realRhs: PType): Function.Instance { | ||
return getNumericInstance(realLhs, realRhs) | ||
} | ||
|
||
override fun getDoubleInstance(doubleLhs: PType, doubleRhs: PType): Function.Instance { | ||
return getNumericInstance(doubleLhs, doubleRhs) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to make decimalLhs: PType, decimalRhs: PType
like the others?
@@ -32,6 +32,7 @@ internal abstract class Fn_COLL_AGG__BAG__ANY( | |||
override fun getInstance(args: Array<PType>): Function.Instance = instance | |||
|
|||
private val instance = object : Function.Instance( | |||
name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do instances need names? Maybe its obvious, but I don't think these names should be user provided. Makes me think Function needs a builder with .addInstance(...) which controls adding the names so we can't have a scenario where a Function returns an instance with a different name.
/** | ||
* @return the precedence of the types for the PartiQL comparator. | ||
* @see .TYPE_PRECEDENCE | ||
*/ | ||
@Suppress("deprecation") | ||
internal val TYPE_PRECEDENCE: Map<Kind, Int> = listOf( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Top-level interval vals are ok afaik (functions were the leaked apis), but wouldn't hurt to put in a TypePrecedence object as a static field, kotlin is doing that anyways so let's at least control it.
Relevant Issues
Description
ArithmeticDiadicOperator
. It holds a table of pointers to functions given two argument types. The function is in charge of providing the closest function impl, but the engine is still in charge of coercions and final resolution.plusTests
inPartiQLEvaluatorTests
Performance Implications
This should make dynamic dispatches faster, since we now no longer need to loop over several candidates. They are consolidated. On top of that, the choosing of an implementation is internalized and uses a lookup table to find the most appropriate instance for PLUS and BITWISE_AND.
Edit: I ran some quick benchmarks to confirm this theory, and the assumption is correct. I've published these benchmarks via a tag on my fork. Comparing the new plus operator and the old minus operator:
The static execution time remains the same (makes sense), but the dynamic invocation sees performance gains. An approximate speedup of 1.85. This translates to almost twice as fast. And, looking at the implementation of
ExprCallDynamic
, I can see how we can make this even faster 👍 .Huge props to @RCHowell for this modelling change.
License Information
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.