[SPARK-41985][SQL] Centralize more column resolution rules
### What changes were proposed in this pull request?

This is a followup of apache#38888. While searching for all matches on `UnresolvedAttribute`, I found that a few rules still do column resolution:

1. ResolveAggAliasInGroupBy
2. ResolveGroupByAll
3. ResolveOrderByAll
4. ResolveDefaultColumns

This PR merges the first 3 into `ResolveReferences`. The last one will be done in a separate PR, as it's more complicated.

To avoid making the rule `ResolveReferences` bigger and bigger, this PR pulls out the resolution code for `Aggregate` into a separate virtual rule (used only by `ResolveReferences`). The same is done for `Sort`. We can refactor and add more virtual rules later.

### Why are the changes needed?

It's problematic not to centralize all the column resolution logic, as the execution order of the rules is not reliable. It actually led to a regression after apache#38888: in `select a from t where exists (select 1 as a group by a)`, the `group by a` should be resolved as `1 as a`, but now it's resolved as the outer reference `a`. This is because `ResolveReferences` runs before `ResolveAggAliasInGroupBy` and resolves outer references too early.

### Does this PR introduce _any_ user-facing change?

Fixes a bug, but the bug is not released yet.

### How was this patch tested?

New tests.

Closes apache#39508 from cloud-fan/column.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 40ca27c)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
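
For readers unfamiliar with the regression, the following toy model may help. It is only a sketch of the ordering problem, not Spark's analyzer code: the function `resolve_group_by` and its parameters are hypothetical, and the precedence shown (child columns, then select-list aliases, then outer references) is an assumption made for illustration.

```python
# Toy model of GROUP BY name resolution (hypothetical, not Spark code).
def resolve_group_by(name, child_columns, select_aliases, outer_columns,
                     centralized=True):
    """Return a (kind, value) pair describing how `name` is resolved."""
    # Columns produced by the subquery's own FROM clause are tried first.
    if name in child_columns:
        return ("column", name)
    if centralized:
        # One rule owns the whole order: aliases before outer references.
        if name in select_aliases:
            return ("alias", select_aliases[name])
        if name in outer_columns:
            return ("outer", name)
    else:
        # Separate rules: the outer-reference step happens to run first,
        # so the alias step never gets a chance to see the name.
        if name in outer_columns:
            return ("outer", name)
        if name in select_aliases:
            return ("alias", select_aliases[name])
    return ("unresolved", name)


# select a from t where exists (select 1 as a group by a)
# The subquery has no FROM clause, so it has no child columns.
fixed = resolve_group_by("a", set(), {"a": "1"}, {"a"})
regressed = resolve_group_by("a", set(), {"a": "1"}, {"a"}, centralized=False)
```

In this model, `fixed` picks the select-list alias `1 as a`, while `regressed` binds to the outer column `a`, which mirrors the behavior described under "Why are the changes needed?".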