chore: add GROUP BY support for any key names #4899
Merged
Description
fixes: #4898
With this commit, the result of a GROUP BY on a single column reference has a schema whose key column matches the name of that column.
If the GROUP BY is on anything other than a single column reference, the key column is given a unique generated column name.
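The naming rule described above can be modelled in a few lines. This is only an illustrative sketch, not the code in this PR: the `ColumnRef` and `Expression` types, the `key_column_name` function and the `GENERATED_KEY_` prefix are all hypothetical names chosen for the example, and the real generated names may look different.

```python
from dataclasses import dataclass
from itertools import count
from typing import Iterator, Sequence, Union


@dataclass(frozen=True)
class ColumnRef:
    """A bare reference to a source column (hypothetical model type)."""
    name: str


@dataclass(frozen=True)
class Expression:
    """Any other GROUP BY expression, kept as opaque text (hypothetical)."""
    text: str


GroupByTerm = Union[ColumnRef, Expression]


def generated_names(prefix: str = "GENERATED_KEY_") -> Iterator[str]:
    # Assumption: the prefix is made up for illustration only.
    return (f"{prefix}{i}" for i in count())


def key_column_name(group_by: Sequence[GroupByTerm], generated: Iterator[str]) -> str:
    """Return the key column name of the result schema.

    A single column reference keeps its own name; anything else
    (an expression, or more than one term) takes the next generated name.
    """
    if len(group_by) == 1 and isinstance(group_by[0], ColumnRef):
        return group_by[0].name
    return next(generated)
```

Passing the name generator in, rather than hard-coding it, keeps the sketch independent of whatever scheme is actually used to make generated names unique.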
BREAKING CHANGE: Existing queries that reference a single GROUP BY column in the projection will fail if they are resubmitted, because the result would contain a duplicate column. Queries that are already running will continue to run with the old query semantics; the change only affects newly submitted queries.
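To see why such a query now clashes, here is a rough model of the failure. It assumes the result schema is formed by placing the key column alongside the projected columns; the `result_columns` function, the `OLD_KEY` placeholder and the error text are hypothetical and are not taken from the codebase.

```python
from typing import List, Sequence


def result_columns(key_column: str, projection: Sequence[str]) -> List[str]:
    """Combine the key column with the projected columns.

    Raises ValueError if any column name appears more than once,
    mimicking the duplicate-column failure described above.
    """
    columns = [key_column, *projection]
    duplicates = sorted({c for c in columns if columns.count(c) > 1})
    if duplicates:
        raise ValueError(f"Duplicate column name(s): {', '.join(duplicates)}")
    return columns


# Old semantics (assumed): the key has a name that does not clash.
old_schema = result_columns("OLD_KEY", ["ID", "COUNT"])

# New semantics: the key takes the GROUP BY column's name, so projecting
# that same column again produces a duplicate.
try:
    result_columns("ID", ["ID", "COUNT"])
    clash_detected = False
except ValueError:
    clash_detected = True
```

Under this model, dropping the GROUP BY column from the projection is enough to avoid the clash, since the column is already present as the key.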
This PR also includes a change to the internals of how GROUP BY is handled. Previously, the columns from the source had names like KSQL_INTERNAL_COL_0, etc. in the repartition and changelog topics. With this change, source columns retain their original names within the internal topics. Names such as KSQL_INTERNAL_COL_0 are used only for additional columns used to track UDAF parameters. This change is fully backwards compatible, as the plan stores the column names used in the internal topics. However, it has meant that a large number of new historical plans needed to be generated.
Testing done
usual
Reviewer checklist