feat: support PARTITION BY on multiple expressions #6803

vcrfxia · 2020-12-18T07:27:15Z

Description

This PR adds support for PARTITION BY with multiple expressions, resulting in multiple key columns. There are no backwards compatibility concerns as the ksqlDB syntax did not support this prior to this PR.

Docs will come in a separate PR.

Testing done

QTT.

Reviewer checklist

Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
Ensure relevant issues are linked (description should include text like "Fixes #")

agavra

LGTM! I went commit-by-commit so some of my comments may already be out-of-date, in which case you can just ignore them :)

ksqldb-engine/src/main/java/io/confluent/ksql/planner/LogicalPlanner.java

agavra · 2020-12-18T18:18:16Z

ksqldb-engine/src/main/java/io/confluent/ksql/planner/LogicalPlanner.java

@@ -405,7 +407,7 @@ private PreJoinRepartitionNode buildInternalRepartitionNode(
        ExpressionTreeRewriter.rewriteWith(plugin, joinExpression);

    final LogicalSchema schema =
-        buildRepartitionedSchema(source, rewrittenPartitionBy);
+        buildRepartitionedSchema(source, Collections.singletonList(rewrittenPartitionBy));


nit: since we already import ImmutableList let's just use ImmutableList.of here

You mean we import ImmutableList in this file or in the module? I don't see it in the file but I can add it.

oh huh, I must've been looking at a separate file. In PreJoinRepartitionNode, for example, it was already there before this PR. It's nice to just standardize on one (and most of the code uses ImmutableList#of, though there isn't really any good reason to use that over Collections.singletonList except for maybe that ImmutableList has more general usages)
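For context on the nit above, here is a minimal sketch of what Collections.singletonList gives you, using only the JDK. The class and method names (SingletonListDemo, wrap, rejectsAdd) are made up for illustration and are not part of ksqlDB. Guava's ImmutableList.of also rejects mutation, so for this one-element use the two behave alike; one known difference is that ImmutableList.of throws on a null element, while singletonList accepts it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; strings stand in for ksqlDB Expression objects.
final class SingletonListDemo {

  // Wraps a single PARTITION BY expression in a one-element list,
  // the same way the diff above wraps rewrittenPartitionBy.
  static List<String> wrap(final String partitionBy) {
    return Collections.singletonList(partitionBy);
  }

  // Returns true if the list refuses to grow, i.e. add() is unsupported.
  static boolean rejectsAdd(final List<String> list) {
    try {
      list.add("extra");
      return false;
    } catch (UnsupportedOperationException e) {
      return true;
    }
  }

  public static void main(final String[] args) {
    System.out.println(rejectsAdd(wrap("ID")));          // true: singleton lists are immutable
    System.out.println(rejectsAdd(new ArrayList<>()));   // false: ArrayList is mutable
    System.out.println(wrap(null).get(0) == null);       // true: singletonList permits null
  }
}
```

So the choice between the two is mostly about codebase consistency, as noted in the thread.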

agavra · 2020-12-18T18:18:38Z

ksqldb-engine/src/main/java/io/confluent/ksql/planner/plan/PreJoinRepartitionNode.java

@@ -111,7 +112,7 @@ public void setKeyFormat(final KeyFormat format) {
    return getSource().buildStream(builder)
        .selectKey(
            valueFormat.getFormatInfo(),
-            partitionBy,
+            Collections.singletonList(partitionBy),


nit: same thing about ImmutableList

Sure. For my edification, why is ImmutableList preferred?

ksqldb-engine/src/main/java/io/confluent/ksql/planner/plan/UserRepartitionNode.java

ksqldb-engine/src/main/java/io/confluent/ksql/structured/SchemaKStream.java

ksqldb-functional-tests/src/test/resources/query-validation-tests/partition-by.json

agavra · 2020-12-18T18:32:22Z

ksqldb-parser/src/main/java/io/confluent/ksql/parser/tree/PartitionBy.java

+    final HashSet<Object> partitionBys = new HashSet<>(partitionByExpressions.size());
+    partitionByExpressions.forEach(exp -> {
+      if (!partitionBys.add(exp)) {
+        throw new KsqlException("Duplicate PARTITION BY expression: " + exp);


what's wrong with a duplicate partition by expression? I don't see any reason why a user might want it, but I don't see why not either (e.g. maybe their output data expects userId, userSpecialId in the key and this stream always has the same value for both)

I get that the key name conflicts might be a little weird, so we can do this in a follow-up PR, but I don't think we should prohibit it

This is consistent with what we do for multi-column GROUP BY (which has been around for a long time):

ksql/ksqldb-parser/src/main/java/io/confluent/ksql/parser/tree/GroupBy.java

Lines 49 to 53 in e7f1f47

    groupingExpressions.forEach(exp -> {
      if (!groupBys.add(exp)) {
        throw new KsqlException("Duplicate GROUP BY expression: " + exp);
      }
    });

I believe the reason indeed has to do with naming.

sounds good, we can keep it like that for now
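A standalone sketch of the duplicate check discussed in this thread, for anyone who wants to try the pattern outside the parser. It assumes plain strings in place of ksqlDB Expression nodes and IllegalArgumentException in place of KsqlException; DuplicateExpressionCheck and throwOnDuplicates are hypothetical names, not ksqlDB classes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper mirroring the GROUP BY / PARTITION BY duplicate check.
final class DuplicateExpressionCheck {

  // Throws if any expression appears more than once in the clause.
  static void throwOnDuplicates(final String clause, final List<String> expressions) {
    final Set<String> seen = new HashSet<>(expressions.size());
    for (final String exp : expressions) {
      // Set.add returns false when the element is already present.
      if (!seen.add(exp)) {
        // Stand-in for KsqlException in the real code.
        throw new IllegalArgumentException("Duplicate " + clause + " expression: " + exp);
      }
    }
  }

  public static void main(final String[] args) {
    // Distinct expressions pass silently.
    throwOnDuplicates("PARTITION BY", Arrays.asList("USER_ID", "REGION"));
    try {
      throwOnDuplicates("PARTITION BY", Arrays.asList("USER_ID", "REGION", "USER_ID"));
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage()); // Duplicate PARTITION BY expression: USER_ID
    }
  }
}
```

Note that the check relies on equals/hashCode of the expression type, so two expressions count as duplicates only if they compare equal.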

ksqldb-parser/src/main/java/io/confluent/ksql/parser/tree/PartitionBy.java

agavra · 2020-12-18T18:51:45Z

ksqldb-streams/src/main/java/io/confluent/ksql/execution/streams/PartitionByParamsFactory.java

+
+      if (row != null) {
+        for (int i = 0; i < partitionByCol.size(); i++) {
+          if (partitionByCol.get(i).shouldAppend) {


instead of relying on the implicit ordering of the partition by, it might make sense to look up the partitionByCol.name in the resultSchema at the cost of a bit of performance. thoughts?

Do you mean replacing shouldAppend with a check to see whether the partitionByCol.name is present in the resultSchema as a value column, or do you mean iterating through resultSchema rather than partitionByCols? The former doesn't seem like an improvement to me, if we're still relying on the ordering of partitionByCols in the iteration. To remove reliance on ordering, we can do the latter and replace the lists (of columns and evaluators) with maps keyed on column name instead, but it's not clear to me that's better. It feels slightly harder to reason about code-wise but I wouldn't mind making the change.

What's your concern regarding relying on the ordering? Are you worried it's brittle, or something else?

When I was reviewing the code I thought we could just set it at the index (e.g. row.set(resultSchema.get(partitionCol.name).index())). So we still keep the iteration on partitionByCol but we set it in the row based on its index in the schema.

What's your concern regarding relying on the ordering? Are you worried it's brittle, or something else?

yeah - I'm worried that it's brittle.

Ah, interesting. (We'd have to first re-size the row before calling .set() but that's an implementation detail.) The advantage of your proposal is that we can change the details of the result schema without needing to update the logic here. If we extended that further, really we should also be setting the existing value fields in the new row based on the result schema, rather than leaving those intact and appending key columns.

I guess I'm not convinced this change is necessary since our test coverage in this area is quite good -- lots of tests would break if someone modified the result schema without corresponding updates here. OTOH, it's very possible I'm biased towards thinking this code is understandable as is since I've been working on it for a while. If your assessment differs as someone who hasn't worked with this code as much, I'm inclined to go with your judgment.

As for performance, LogicalSchema stores a list of columns so finding a particular column might be slow. If we implemented this we'd want to build the index mapping outside the creation of the actual mapper, and have the mapper use the index mapping directly. I'm not opposed to this. If you think it's preferable I can open a follow-up PR.

I'm happy to leave it as is, though I suspect we'll run into this discussion again if/when we stop copying things from the key into the value
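To make the alternative from this thread concrete, here is a rough sketch of the index-mapping idea: resolve each key column's position in the result schema once, outside the per-record mapper, then resize the row and set values by index. This is not the merged implementation (the PR keeps the append-in-order approach). Everything here is an assumption for illustration: column names are plain strings, the schema is a list of names rather than a LogicalSchema, and IndexedKeyCopy, buildIndexMapping, and apply are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; a list of column names stands in for the result schema.
final class IndexedKeyCopy {

  // Computed once per query: the target index of each key column in the result schema.
  static int[] buildIndexMapping(final List<String> resultColumns, final List<String> keyColumns) {
    final int[] mapping = new int[keyColumns.size()];
    for (int i = 0; i < keyColumns.size(); i++) {
      final int idx = resultColumns.indexOf(keyColumns.get(i)); // linear scan, done up front
      if (idx < 0) {
        throw new IllegalArgumentException("Column not in result schema: " + keyColumns.get(i));
      }
      mapping[i] = idx;
    }
    return mapping;
  }

  // Per record: pad the row to the schema width, then place each key value by index.
  static List<Object> apply(
      final List<Object> row, final List<Object> keyValues, final int[] mapping, final int width) {
    final List<Object> out = new ArrayList<>(row);
    while (out.size() < width) {
      out.add(null); // resize before calling set(), as noted in the thread
    }
    for (int i = 0; i < mapping.length; i++) {
      out.set(mapping[i], keyValues.get(i));
    }
    return out;
  }

  public static void main(final String[] args) {
    final List<String> schema = Arrays.asList("V0", "V1", "K_A", "K_B");
    // Key columns deliberately listed in a different order than the schema.
    final int[] mapping = buildIndexMapping(schema, Arrays.asList("K_B", "K_A"));
    final List<Object> row = Arrays.<Object>asList("x", 1);
    final List<Object> keys = Arrays.<Object>asList("b", "a");
    System.out.println(apply(row, keys, mapping, schema.size())); // [x, 1, a, b]
  }
}
```

The trade-off matches the discussion: the per-record work stays index-based and cheap, and the placement no longer depends on the iteration order of the partition-by columns, at the cost of one more structure to keep in sync with the schema.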

agavra · 2020-12-18T18:54:35Z

ksqldb-functional-tests/src/test/resources/query-validation-tests/partition-by.json

+        "example. To fix this, we'd have to add special handling to detect when a key expression",
+        "depends only on ROWTIME, similar to how today we have special handling to detect when a key",
+        "expression depends only on key columns."


or... we can finally swap over to using PAPI :)

vcrfxia added 4 commits December 17, 2020 17:01

chore: syntax changes

09db344

chore: null handling

27844a8

chore: cleanup

7af9b8e

chore: historic plans

aa4ce58

vcrfxia requested a review from a team as a code owner December 18, 2020 07:27

vcrfxia added 6 commits December 18, 2020 00:20

test: fix test

52533b6

chore: switch List to ImmutableList

73bc73e

test: fix tests

b8030fe

chore: fix required fields

bf93e15

test: remove obsolete test

6d4302d

Merge branch 'master' into partition-by-multi

043f12d

agavra approved these changes Dec 18, 2020

chore: feedback

5e81662

vcrfxia merged commit 5a6b48e into confluentinc:master Dec 18, 2020

vcrfxia deleted the partition-by-multi branch December 18, 2020 21:59

vcrfxia mentioned this pull request Dec 21, 2020

docs: docs for multi-column PARTITION BY (MINOR) #6810

Merged

2 tasks
