Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-41630][SQL] Support implicit lateral column alias resolution on Project #38776
[SPARK-41630][SQL] Support implicit lateral column alias resolution on Project #38776
Changes from 16 commits
04959c2
6f44c85
725e5ac
660e1d2
fd06094
7d4f80f
b9704d5
97ee293
5785943
757cffb
29de892
72991c6
d45fe31
1f55f73
f753529
b9f706f
94d5c9e
8d20986
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we have a rule like
RemoveTempResolvedColumn
to restoreLateralColumnAliasReference
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't add it intentionally. This is because I don't want those attributes actually can be resolve as LCA but to show in the error msg as UnresolvedAttribute. Also note that unlike RemoveTempResolvedColumn, LCARef can't be directly resolved to the NamedExpression inside of it because the plan won't be right - there is no alias push down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this should not happen, we should throw an internal error
SparkThrowable.internalError
, so that it can include more debug information, instead ofUNRESOLVED_COLUMN
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have to keep a mutable state,
TreeNodeTag
is a better choice. Directly usingvar
in catalyst TreeNode is strongly discouraged.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
QQ: I didn't add it in the constructor of
OuterReference
due to binary compatibility. Is that concern valid?Actually, what is the risk to change the constructor, but also write anotherThis seems impossible without introducing a new object with another name, and still requires large portion of code change of pattern matching.unapply
function?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit tricky. Maybe we should invoke
WrapLateralColumnAliasReference
inResolveOuterReferences
, so that we don't need to re-resolve outer references and introduce this hack.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed before, I feel it is not safe to do so given the current solution in ResolveOuterReference that each rule is applied only once. I made up a query (it can't run, just for demonstration):
It is possible that
dept * 2.0
is not resolved because it needs type conversion, so the LCA rule doesn't apply. Then it just wraps theid
inid + 1 AS id2
as OuterReference.