Fix slow kafka sink when queue.buffering.max.ms is set to > 0. #585
Pull request
Description
Our kafka sink was not only enqueueing every kafka record coming from an event, but also waiting for its delivery before returning. The setting `queue.buffering.max.ms` determines how long the rdkafka broker thread waits after receiving a message in its queue before it sends it off to the broker. The goal is to batch messages for more efficient transport. But we handle each sink in its own task, and this one task is blocked until message N is delivered, so no other message can arrive before `queue.buffering.max.ms` expires. Ergo, we not only always waited for `queue.buffering.max.ms` (if set to > 0), we also never applied any kafka-internal batching and always sent out 1 message at a time to the broker. Yes, indeed!

Now we only enqueue all messages coming from an event and then, in another spawned task, wait for the delivery to happen to send out acks/fails.
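The difference between the two behaviours can be sketched with a std-only simulation. This is not the actual sink code and does not use the rdkafka API: `spawn_fake_producer`, `send_blocking` and `send_enqueue_first` are hypothetical names, and the `linger` parameter only plays the role of `queue.buffering.max.ms`.

```rust
use std::sync::mpsc::{channel, Receiver, RecvTimeoutError, Sender};
use std::thread;
use std::time::{Duration, Instant};

/// A record id plus the channel on which its delivery report is sent back.
type Enqueued = (u64, Sender<u64>);

/// Hypothetical stand-in for the producer's queue thread: after the first record
/// arrives it lingers for `linger` (the role of `queue.buffering.max.ms`), then
/// "sends" everything collected so far as one batch and reports each delivery.
/// The join handle yields the size of every batch once the queue is closed.
fn spawn_fake_producer(linger: Duration) -> (Sender<Enqueued>, thread::JoinHandle<Vec<usize>>) {
    let (tx, rx): (Sender<Enqueued>, Receiver<Enqueued>) = channel();
    let handle = thread::spawn(move || {
        let mut batch_sizes = Vec::new();
        // Block until the first record of the next batch arrives.
        while let Ok(first) = rx.recv() {
            let mut batch = vec![first];
            let deadline = Instant::now() + linger;
            loop {
                let remaining = deadline.saturating_duration_since(Instant::now());
                match rx.recv_timeout(remaining) {
                    Ok(next) => batch.push(next),
                    // Linger expired: flush what we have.
                    Err(RecvTimeoutError::Timeout) => break,
                    // Queue closed: flush what we have; the outer recv then ends the loop.
                    Err(RecvTimeoutError::Disconnected) => break,
                }
            }
            batch_sizes.push(batch.len());
            for (id, report) in batch {
                // The receiver may already be gone; that is fine for this sketch.
                let _ = report.send(id);
            }
        }
        batch_sizes
    });
    (tx, handle)
}

/// Old behaviour: enqueue one record, then block until it is delivered.
fn send_blocking(n: u64, linger: Duration) -> Vec<usize> {
    let (queue, producer) = spawn_fake_producer(linger);
    for id in 0..n {
        let (report_tx, report_rx) = channel();
        queue.send((id, report_tx)).expect("producer thread is running");
        // The sink task is stuck here, so no further record can join this batch.
        let delivered = report_rx.recv().expect("delivery report");
        assert_eq!(delivered, id);
    }
    drop(queue);
    producer.join().expect("producer thread panicked")
}

/// New behaviour: enqueue everything first, wait for delivery in a separate thread.
fn send_enqueue_first(n: u64, linger: Duration) -> (Vec<usize>, Vec<u64>) {
    let (queue, producer) = spawn_fake_producer(linger);
    let mut reports = Vec::new();
    for id in 0..n {
        let (report_tx, report_rx) = channel();
        queue.send((id, report_tx)).expect("producer thread is running");
        reports.push(report_rx);
    }
    // Delivery reports are awaited off the enqueueing path (acks/fails would go out here).
    let waiter = thread::spawn(move || {
        reports
            .into_iter()
            .map(|r| r.recv().expect("delivery report"))
            .collect::<Vec<u64>>()
    });
    drop(queue);
    let acked = waiter.join().expect("waiter thread panicked");
    let batches = producer.join().expect("producer thread panicked");
    (batches, acked)
}
```

In this model, `send_blocking(4, Duration::from_millis(50))` pays the full linger once per record and only ever produces batches of size 1, while `send_enqueue_first` lets the records that are already queued go out together.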
Related
Checklist