fix: support unwrapped struct value inference #6446

vcrfxia · 2020-10-16T04:42:32Z

Description

The cause of the bug is that the DefaultSchemaInjector always unwraps struct schemas (assuming that they're wrapped), ignoring any serde features that should cause it to not do so (specifically, wrap_single_value=false). This PR fixes the bug by passing serde features into the schema translator, which is admittedly hacky and ugly but I couldn't think of a way around this.

Testing done

QTT

Reviewer checklist

Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
Ensure relevant issues are linked (description should include text like "Fixes #")

vcrfxia · 2020-10-16T04:46:23Z

ksqldb-engine/src/main/java/io/confluent/ksql/schema/ksql/inference/DefaultSchemaInjector.java

@@ -122,6 +123,7 @@ public DefaultSchemaInjector(final TopicSchemaSupplier schemaSupplier) {
        props.getKafkaTopic(),
        props.getKeySchemaId(),
        keyFormat,
+        SerdeFeatures.of(SerdeFeature.UNWRAP_SINGLES),


I don't like that this is hard-coded here. If it's to be hard-coded somewhere, I'd prefer it to be further down in the call stack, e.g., in the schema supplier or even in the schema translator. The drawback to that would be that we'd no longer be able to share code between the key and value inference pathways as easily.

First off, you should only be setting this flag if the format supports unwrapping! Or more correctly, if the format supports both unwrapping and wrapping.

As to hard coding it... humm, yes, so far it's been hard-coded in one place: SerdeFeaturesFactory.

We could do something like:

Suggested change

SerdeFeatures.of(SerdeFeature.UNWRAP_SINGLES),

SerdeFeaturesFactory.buildInternal(FormatFactory.of(keyFormat),

But, to be honest, this is only a temporary thing. At the moment, we only support unwrapped keys. Once we support something else we'll wire it up and this hardcoding will go.

Good call on only setting the feature if it's supported.

I worry the naming of buildInternal() will be confusing here since it's not clear what internal topics have to do with schema registry. Is it worth introducing another alias for the method such as setUnwrappingIfSupported() or would that only make things worse?

big-andy-coates

LGTM

big-andy-coates · 2020-10-16T13:05:25Z

...src/main/java/io/confluent/ksql/schema/ksql/inference/SchemaRegistryTopicSchemaSupplier.java

+      columns = translator.toColumns(
+          parsedSchema,
+          isKey,
+          serdeFeatures.enabled(SerdeFeature.UNWRAP_SINGLES)


I'd probably just pass all the features down. There could be other features later that also affect the schema.

big-andy-coates · 2020-10-16T13:13:37Z

ksqldb-serde/src/main/java/io/confluent/ksql/serde/connect/ConnectFormatSchemaTranslator.java

    Schema connectSchema = connectSrTranslator.toConnectSchema(schema);

-    if (connectSchema.type() != Type.STRUCT) {
+    if (connectSchema.type() != Type.STRUCT || unwrapSingle) {


I don't think we need to check both. Just checking the serde features is enough. So I think this should become:

@Override public List<SimpleColumn> toColumns( final ParsedSchema schema, final SerdeFeatures serdeFeatures, final boolean isKey ) { SerdeUtils.throwOnUnsupportedFeatures(serdeFeatures, format.supportedFeatures()); Schema connectSchema = connectSrTranslator.toConnectSchema(schema); if (serdeFeatures.enabled(SerdeFeature.UNWRAP_SINGLES)) { connectSchema = SerdeUtils.wrapSingle(connectSchema, isKey); } if (connectSchema.type() != Type.STRUCT) { if (isKey) { throw new IllegalStateException("Key schemas are always unwrapped."); } throw new KsqlException("Schema returned from schema registry is anonymous type. " + "To use this schema with ksqlDB, set '" + CommonCreateConfigs.WRAP_SINGLE_VALUE + "=false' in the WITH clause properties."); } final Schema rowSchema = connectKsqlTranslator.apply(connectSchema); return rowSchema.fields().stream() .map(ConnectFormatSchemaTranslator::toColumn) .collect(Collectors.toList()); }

Much cleaner! Thanks for the suggestion.

Wanted to check that we're OK removing the exception that's thrown if an unwrapped schema is encountered for a format that does not support unwrapping (link) -- it makes sense to me that we're removing this since the only way we'd end up in this situation is if ksqlDB designated a format the supports unwrapping as not supporting it, but maybe it's better to be defensive and leave the check in?

big-andy-coates · 2020-10-16T13:29:04Z

ksqldb-engine/src/main/java/io/confluent/ksql/schema/ksql/inference/DefaultSchemaInjector.java

@@ -122,6 +123,7 @@ public DefaultSchemaInjector(final TopicSchemaSupplier schemaSupplier) {
        props.getKafkaTopic(),
        props.getKeySchemaId(),
        keyFormat,
+        SerdeFeatures.of(SerdeFeature.UNWRAP_SINGLES),


First off, you should only be setting this flag if the format supports unwrapping! Or more correctly, if the format supports both unwrapping and wrapping.

As to hard coding it... humm, yes, so far it's been hard-coded in one place: SerdeFeaturesFactory.

We could do something like:

Suggested change

SerdeFeatures.of(SerdeFeature.UNWRAP_SINGLES),

SerdeFeaturesFactory.buildInternal(FormatFactory.of(keyFormat),

But, to be honest, this is only a temporary thing. At the moment, we only support unwrapped keys. Once we support something else we'll wire it up and this hardcoding will go.

vcrfxia · 2020-10-16T17:43:33Z

Merging in order to unblock the other PR behind this one -- will address anything that comes out of the open discussions in a follow-up PR.

vcrfxia requested a review from a team as a code owner October 16, 2020 04:42

vcrfxia added 2 commits October 15, 2020 21:43

fix: support unwrapped struct value inference

ce0ed79

chore: historic plans

d3a3ffc

vcrfxia force-pushed the unwrap-struct branch from 57780f4 to d3a3ffc Compare October 16, 2020 04:44

vcrfxia commented Oct 16, 2020

View reviewed changes

vcrfxia requested a review from big-andy-coates October 16, 2020 04:48

big-andy-coates approved these changes Oct 16, 2020

View reviewed changes

chore: feedback

2bbfbe9

vcrfxia mentioned this pull request Oct 16, 2020

feat: support non-primitive types as keys #6448

Merged

2 tasks

vcrfxia merged commit ed3ca5e into confluentinc:master Oct 16, 2020

vcrfxia deleted the unwrap-struct branch October 16, 2020 17:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: support unwrapped struct value inference #6446

fix: support unwrapped struct value inference #6446

vcrfxia commented Oct 16, 2020 •

edited

Loading

vcrfxia Oct 16, 2020

big-andy-coates Oct 16, 2020

vcrfxia Oct 16, 2020

big-andy-coates left a comment

big-andy-coates Oct 16, 2020

big-andy-coates Oct 16, 2020

vcrfxia Oct 16, 2020

big-andy-coates Oct 16, 2020

vcrfxia commented Oct 16, 2020

	SerdeFeatures.of(SerdeFeature.UNWRAP_SINGLES),
	SerdeFeaturesFactory.buildInternal(FormatFactory.of(keyFormat),

fix: support unwrapped struct value inference #6446

fix: support unwrapped struct value inference #6446

Conversation

vcrfxia commented Oct 16, 2020 • edited Loading

Description

Testing done

Reviewer checklist

vcrfxia Oct 16, 2020

Choose a reason for hiding this comment

big-andy-coates Oct 16, 2020

Choose a reason for hiding this comment

vcrfxia Oct 16, 2020

Choose a reason for hiding this comment

big-andy-coates left a comment

Choose a reason for hiding this comment

big-andy-coates Oct 16, 2020

Choose a reason for hiding this comment

big-andy-coates Oct 16, 2020

Choose a reason for hiding this comment

vcrfxia Oct 16, 2020

Choose a reason for hiding this comment

big-andy-coates Oct 16, 2020

Choose a reason for hiding this comment

vcrfxia commented Oct 16, 2020

vcrfxia commented Oct 16, 2020 •

edited

Loading