Fixes output.kafka bulks sizes #12254
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?
@jsoriano Can you take a look?
…k_max_frequency config options
49ccc89 to f5c6288
@marqc Thanks for this contribution, it looks quite good to me. I only wonder if we should set the hard limit instead of the "best effort" limit, to be consistent with the current documentation.
Could you please also add a changelog entry for this?
…efininig max_bulk_size
@jsoriano changelog entry added
jenkins, test this please
jenkins, test this again please
@jsoriano how about merging this? all is green
@jsoriano will this be backported to 7.x?
@marqc yes, this will be released in principle in 7.3.0.
Set sarama `Producer.Flush` params based on the `bulk_max_size` and `bulk_flush_frequency` config options to control the bulk size when the Kafka output is used.
Currently the `bulk_max_size` property is not used.
output.kafka does not set any sarama `Producer.Flush` config options, which results in bulks being sent as fast as possible. Example bulk sizes I observed when starting filebeat against a file with about 9599 log entries look like this:
count:1 count:1506 count:1712 count:877 count:171 count:1034 count:1709 count:1182 count:2 count:70 count:206 count:984 count:145
This change sets sarama's `Producer.Flush.Messages` to the configured `bulk_max_size` property. To avoid waiting forever when no new entries are collected, a new property `bulk_flush_frequency` of type Duration is introduced.
When these options are set, the bulks observed on the Kafka server look like this:
count:1711 count:1711 count:1709 count:1709 count:1709 count:1050
The default value for `bulk_flush_frequency` is 0, which keeps the current "bulking" behavior.