Partial support for arbitrary key column names. #4701
Partial fix for: confluentinc#3536

First part of supporting key column names other than `ROWKEY`.

With this initial pass you can now name your key columns anything you want in your `CREATE TABLE` and `CREATE STREAM` statements, e.g.

```sql
CREATE STREAM S (ID INT KEY, NAME STRING) WITH (...);
```

Any GROUP BY, PARTITION BY or JOIN on the key column results in any created data source having a key column with a matching name, e.g.

```sql
-- schema of T: ID INT KEY, COUNT BIGINT
CREATE TABLE T AS SELECT COUNT(*) AS COUNT FROM S GROUP BY ID;
```

Pull and push queries work as expected and quoted identifiers work too.

However, this functionality is not complete yet. Hence it is guarded by the `ksql.any.key.name.enabled` feature flag, which defaults to off. The following big ticket items are remaining:

* PARTITION BY a single value column should result in a stream with the key column that matches the value column name.
* GROUP BY a single value column should result in a table with the key column that matches the value column name.
* JOIN on a single value column should result in a stream/table with the key column that matches the value column name.

This additional work will be tracked under the same ticket, i.e. confluentinc#3536
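For reviewers who want the naming rule in one place, here is a minimal sketch of how the result key column name could be derived at this stage of the work. It is a hypothetical model, not code from this PR: the class `KeyNameSketch` and the method `resultKeyName` are invented for illustration. It also assumes two things the description only implies: with the flag off, the key column must still be called `ROWKEY`, and any GROUP BY, PARTITION BY or JOIN on something other than the key column still yields `ROWKEY` until the remaining items are done.

```java
import java.util.List;

/** Hypothetical model of the key-naming rule described above; not ksqlDB code. */
final class KeyNameSketch {

  static final String DEFAULT_KEY_NAME = "ROWKEY";

  private KeyNameSketch() {
  }

  /**
   * @param sourceKeyName name of the source's key column, e.g. "ID"
   * @param byColumns columns used in the GROUP BY / PARTITION BY / JOIN criteria
   * @param anyKeyNameEnabled value of the ksql.any.key.name.enabled flag
   * @return the assumed name of the key column of the created data source
   */
  static String resultKeyName(
      final String sourceKeyName,
      final List<String> byColumns,
      final boolean anyKeyNameEnabled
  ) {
    // Assumption: with the flag off, only the legacy ROWKEY name is accepted.
    if (!anyKeyNameEnabled && !DEFAULT_KEY_NAME.equals(sourceKeyName)) {
      throw new IllegalArgumentException(
          "Key column must be named " + DEFAULT_KEY_NAME + " while the feature flag is off");
    }

    // Operating on the key column itself keeps the key column's name.
    final boolean onKeyColumn =
        byColumns.size() == 1 && byColumns.get(0).equals(sourceKeyName);

    // Assumption: every other case still falls back to ROWKEY for now.
    return onKeyColumn ? sourceKeyName : DEFAULT_KEY_NAME;
  }
}
```

Under these assumptions, `resultKeyName("ID", List.of("ID"), true)` gives `ID`, matching the `GROUP BY ID` example, while grouping by `NAME` would still give `ROWKEY` until the remaining items land.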
LGTM.
I went through the main changes in more detail, then just did a quick skim of the rest.
LGTM
@@ -123,4 +129,19 @@ public KeyField getKeyField() {
getTimestampColumn()
);
}

private void validate() {
if we have this, do we need #4697?
Maybe not. But #4697 makes things explicit. I'm not 100% convinced it's an improvement rather than just noise, though...
ksql-functional-tests/src/test/resources/query-validation-tests/elements.json
Merging to avoid merge hell. @agavra: reach out if you still think we need any changes and I'll pick them up in the next PR. Thanks for the reviews!
Description
Reviewing notes:
Documentation changes will be done in a separate PR: #4686 (WIP)
Some changes are just renaming vars/funcs so that they no longer contain 'rowkey' in their names. This may seem unnecessary. However, use of `ROWKEY` is deeply ingrained in the code base. I'm having to search for 'rowkey' as part of this work. So these renames are mainly just so that they don't show up in my next search.
PR is broken down into the following commits to help with reviewing:
[...] `ROWKEY` and a new version using a custom key column name. The old `ROWKEY` version will be removed once this feature is complete.
Testing done
Extensive (R)QTT testing