-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance short circuit handling in CommonSubexprEliminate
#11197
Enhance short circuit handling in CommonSubexprEliminate
#11197
Conversation
Thanks @peter-toth -- I plan to review this tomorrow morning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @peter-toth -- I spent some time this morning playing with this PR and I think it makes sense to me.
Maybe @haohuaijin (as the original author of this feature in #8928) might have some time to review this PR as well
} | ||
let is_tree = expr.short_circuits(); | ||
let tnr = if is_tree { | ||
TreeNodeRecursion::Jump |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the meaning of is_tree
here.
Maybe we could add a comment explaining that Jump
will skip children but continue with siblings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can rename is_tree
to is_short_circuits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I forgot to add a comment here. Basically I wanted to express that we handle the expression as a subtree (not just a node) in this case.
I added a comment in c02bae9.
|
||
let extracted_short_circuit = col("a").eq(lit(0)).or(col("b").eq(lit(0))); | ||
let not_extracted_short_circuit_leg = (col("a") + col("b")).eq(lit(0)); | ||
let plan = LogicalPlanBuilder::from(table_scan.clone()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this test covers the negative case too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @peter-toth and @alamb,
assert_optimized_plan_eq(expected, plan, None); | ||
|
||
Ok(()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a test case like the one below to check if (a or b) can be extracted as a common subexpr?
select ((a or b) or d) as f1, ((a or b) or c) as f2 from t;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR doesn't extract surely evaluated children of short circuiting expressions so I kept that TODO
yet ( https://github.com/apache/datafusion/pull/11197/files#diff-351499880963d6a383c92e156e75019cd9ce33107724a9635853d7d4cd1898d0R955).
I will address that in a separate follow-up PR...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to add the test (as a negative test perhaps)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've adjusted the test case to test both legs of an OR expression: 8b75d82. In this PR none of them are extraceted, but the after that follow-up PR the srurely evaluated first leg (called not_extracted_short_circuit_leg_1
now) will be extracted.
} | ||
let is_tree = expr.short_circuits(); | ||
let tnr = if is_tree { | ||
TreeNodeRecursion::Jump |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can rename is_tree
to is_short_circuits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @peter-toth and @haohuaijin -- I plan to merge this PR tomorrow unless there are other comments or other reviewers would like more time to review
Thanks again everyone |
Which issue does this PR close?
Part of #11194.
Rationale for this change
Curently
CommonSubexprEliminate
always skips expressions that can short circuit, but that's not quite right as only the children of those expressions should be skipped.I.e. from the plan:
the expression
a OR b
can be extracted as common expression, but from the plan:the expression
a + b
can't be extracted.What changes are included in this PR?
Changes the rule to handle short circuiting expressions better. As
VisitRecord::JumpMark
is no longer needed this PR removes it.Are these changes tested?
Yes, added new UT.
Are there any user-facing changes?
No.