Kafka streams binder creates a -repartition topic even though no key modifying operations are done by client code #357

danishgarg · 2018-04-11T12:50:15Z

The Kafka Streams binder documentation states that the framework does not handle serialization/deserialization of Kafka message keys and leaves it to Kafka:

"It is worth to mention that Kafka Streams binder does not serialize the keys on outbound - it simply relies on Kafka itself"

"It is worth to mention that Kafka Streams binder does not deserialize the keys on inbound - it simply relies on Kafka itself"

However, when the Kafka Streams binder is used, a -repartition topic is created automatically, because the binder's serialization/deserialization code calls map on the stream. Calling map marks the stream for re-partitioning, so any subsequent group-by operation in client code results in the creation of a repartition topic.
The use of Spring Cloud Stream should ideally be transparent and free of side effects. Since the framework does no serialization/deserialization of keys, there is a case for using mapValues instead of map for message value serialization/deserialization. This would prevent the creation of any extra topics.
The following Stack Overflow question has the details:
(https://stackoverflow.com/questions/49704688/groupbykey-creates-repartition-topic-even-though-there-is-no-key-change?answertab=active#tab-top)
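
To make the reasoning above concrete, here is a minimal toy model in plain Java. It is not the Kafka Streams API and not the binder's code: `ToyStream`, `fromTopic`, and `groupByKeyCreatesRepartitionTopic` are hypothetical names invented for this sketch. It only mirrors the behaviour described in this issue, assuming that map flags the stream for re-partitioning (it could change the key) while mapValues leaves the flag untouched.

```java
import java.util.function.Function;

// Toy model only: NOT the Kafka Streams API. It tracks a single flag in the
// way this issue describes: an operation that could change the key marks the
// stream as needing a repartition before any key-based grouping.
final class ToyStream<K, V> {
    private final boolean repartitionRequired;

    private ToyStream(boolean repartitionRequired) {
        this.repartitionRequired = repartitionRequired;
    }

    // A stream read straight from a topic is already partitioned by key.
    static <K, V> ToyStream<K, V> fromTopic() {
        return new ToyStream<>(false);
    }

    // map may change the key, so the result is flagged conservatively,
    // even when the key function is the identity.
    <K2, V2> ToyStream<K2, V2> map(Function<K, K2> keyFn, Function<V, V2> valueFn) {
        return new ToyStream<>(true);
    }

    // mapValues cannot touch the key, so the flag is simply carried over.
    <V2> ToyStream<K, V2> mapValues(Function<V, V2> valueFn) {
        return new ToyStream<>(repartitionRequired);
    }

    // In this model, grouping by key needs a repartition topic only if flagged.
    boolean groupByKeyCreatesRepartitionTopic() {
        return repartitionRequired;
    }
}

final class RepartitionSketch {
    public static void main(String[] args) {
        ToyStream<String, byte[]> source = ToyStream.fromTopic();

        // Value deserialization via map: key is unchanged, yet the stream is flagged.
        boolean viaMap = source
                .map(k -> k, v -> new String(v))
                .groupByKeyCreatesRepartitionTopic();

        // Same deserialization via mapValues: no flag, no extra topic.
        boolean viaMapValues = source
                .mapValues(v -> new String(v))
                .groupByKeyCreatesRepartitionTopic();

        System.out.println("map -> repartition topic: " + viaMap);
        System.out.println("mapValues -> repartition topic: " + viaMapValues);
    }
}
```

In this model the key function passed to map is the identity, yet the stream is still flagged, which matches the situation reported here: the client code never changes the key, but a repartition topic appears anyway.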

sobychacko · 2018-04-11T13:26:58Z

@danishgarg Thank you for reporting the issue. We will look into it soon. In the meantime, contributions are welcome!

sabbyanandan · 2018-04-11T14:08:54Z

Hi, @danishgarg. Good write-up! Thanks for bringing it to our attention. Let us know if you have any other suggestions for improvements.


sobychacko added this to the 2.1.0.M1 milestone Apr 11, 2018

sabbyanandan added the ready label Apr 11, 2018

sabbyanandan assigned sobychacko Apr 11, 2018

danishgarg mentioned this issue Apr 11, 2018

Changed occurrences of map calls on kafka streams to mapValues - Issue #357 #358

Closed

sabbyanandan added in pr and removed ready labels Apr 11, 2018

sobychacko closed this as completed in a259269 Apr 11, 2018

sobychacko removed the in pr label Apr 11, 2018

sobychacko pushed a commit to sobychacko/spring-cloud-stream-binder-kafka that referenced this issue Apr 11, 2018

Changed occurrences of map calls on kafka streams to mapValues

d141ad3

Resolves spring-attic#357

TimWillard mentioned this issue Jul 19, 2018

Streams get flagged for re-partitioning even when unnecessary. #412

Closed
