-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP Validate lineage for all JexlNode rebuilding visitors #925
WIP Validate lineage for all JexlNode rebuilding visitors #925
Conversation
Multiple RebuildingVisitor implementations do not return a rebuilt query that is considered valid according to JexlASTHelper.validateLineage(). Ensure that existing RebuildingVisitors are tested, and when necessary, modified to return a JEXL tree with a valid lineage. Fixes #880 Fix AllTermsIndexedVisitor Verify DateIndexCleanupVisitor lineage Verify ExpandMultinormalizedTerms lineage Verify FixNegativeNumbersVisitor lineage Remove unnecessary query print Remove System.out.print Verify FixUnindexedNumericTerms lineage Verify QueryModelVisitor lineage Fix RangeCoalescingVisitor Verify RegexFunctionVisitor lineage
warehouse/query-core/src/main/java/datawave/query/jexl/visitors/AllTermsIndexedVisitor.java
Outdated
Show resolved
Hide resolved
warehouse/query-core/src/main/java/datawave/query/jexl/visitors/AllTermsIndexedVisitor.java
Show resolved
Hide resolved
|
||
for (int i = 0; i < toAttach.jjtGetNumChildren(); i++) { | ||
JexlNode node = copy(prunedNode); | ||
JexlNode attach = (JexlNode) toAttach.jjtGetChild(i).jjtAccept(this, data); | ||
attach.jjtSetParent(node); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see lots of refactoring here but this is the only line that actually fixes a parenting problem for this class. Refactoring for the sake of refactoring may do more harm than good if anybody else is working on these visitors for any other reason. It also makes it very difficult for anybody to actually accept the entire package and merge. I think you may need to pull back a little from the refactoring efforts. Better yet a separate pull request per visitor that is only refactoring would be considerably more palatable.
import datawave.query.exceptions.DatawaveFatalQueryException; | ||
import datawave.query.jexl.JexlASTHelper; | ||
import datawave.query.jexl.JexlNodeFactory; | ||
import datawave.query.jexl.JexlNodeFactory.ContainerType; | ||
import datawave.query.model.QueryModel; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I am stumped in this visitor as to where any lineage problems were fixed. Is this one purely refactoring?
andNode.jjtAddChild(newChild, i); | ||
newChild.jjtSetParent(andNode); | ||
} | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this visitor looks good.
if (functionMetadata.name().equals("excludeRegex")) { | ||
newParent = new ASTAndNode(ParserTreeConstants.JJTANDNODE); | ||
|
||
switch (functionMetadata.name()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much cleaner, thanks
First I must explain what a (query property) "marker" node is. A portion of a query (jexl subtree) can be marked as have some attribute by attaching an assignment node to that subtree. So if I want to mark some node A with a marker N, then I replace A with ((N = true) && (A)). These markers can be thought of as query execution hints that the planner puts in place to allow the execution to do certain things. Second I must explain what "push down" means. To "push down" some portion of a query (jexl subtree) is to use the field index instead of the global index, or pushing down farther to the query evaluation phase. Query execution basically started with global index lookup, then field index lookup (if required), then query evaluation meaning bouncing the boolean logic against all of the values for a document. Here is some description of these visitors:
|
@ivakegg I have removed any refactoring not related to fixes. I will create separate PRs for that work as we discussed so that this PR is focused on fixes only. |
Closing this PR to break it up into individual PRs for each visitor. |
WIP for #880 for verifying/fixing JexlNode rebuilding visitors to ensure that they preserve proper parentage.
I have fixed/verified parentage for the following visitors:
AllTermsIndexedVisitor
BooleanOptimizationVisitor
FacetCheck
DateIndexCleanupVisitor
ExpandMultinormalizedTerms
TreeFlatteningRebuildingVisitor
FixNegativeNumbersVisitor
GeowavePruningVisitor
QueryModelVisitor
RangeCoalescingVisitor
RegexFunctionVisitor
I am currently working on fixing parentage for the following visitors:
UniqueExpressionTermsVisitor
ExpandCompositeTerms
There are a number of rebuilding visitors that do not have any unit tests associated with them that I could find, nor was there enough documentation for me to be able to extrapolate test cases for them. I would appreciate any help with examples or desired test cases so that I can write tests for each of the following and verify parentage.
BoundedRangeDetectionVisitor
FunctionIndexQueryExpansionVisitor
FunctionNormalizationRebuildingVisitor
IsNotNullIntentVisitor
ParallelIndexExpansion
PruneLessSelectiveFieldsVisitor
PushdownLargeFieldedListsVisitor
PushdownMissingIndexRangeNodesVisitor
PushFunctionsIntoExceededValueRanges
RangeConjunctionRebuildingVisitor
RangeExpansionThresholdRebuildingVisitor