
Increase Fluentd Buffer Queue Size #1877

Merged 1 commit on Mar 5, 2019
Conversation

christianberg (Collaborator)

This increases the number of chunks that can be queued to be sent to
S3. The [documentation][1] claims that this number is unlimited when
not set, but the default was in fact [recently set][2] to `1`, which
causes a backlog of chunks to build up when there is a larger number
of log files.

[1]: https://docs.fluentd.org/v1.0/articles/buffer-section#buffering-parameters
[2]: fluent/fluentd#2173

Signed-off-by: Christian Berg <berg.christian@gmail.com>
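
The diff itself isn't shown in this conversation, but based on the description the change presumably touches the `<buffer>` section of the Fluentd S3 output. A minimal sketch of what such a setting looks like (the bucket settings, path, chunk keys and timekey are placeholders, not taken from this PR):

```
<match **>
  @type s3
  # s3_bucket, s3_region, credentials, etc. omitted for brevity
  <buffer tag,time>
    @type file
    path /var/log/fluentd-buffers/s3.buffer   # placeholder path
    timekey 300                               # placeholder time slicing
    queued_chunks_limit_size 100              # raised from the effective default of 1
  </buffer>
</match>
```
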
christianberg force-pushed the increase-fluentd-queue branch from 3c59f2c to cd297c1 on March 5, 2019 08:06
@christianberg (Collaborator, Author)

👍

@mikkeloscar (Contributor)

👍

@hjacobs (Contributor) commented Mar 5, 2019

How do we know that 100 is a good number?

@christianberg (Collaborator, Author)

I admit that 100 is somewhat arbitrary.

After looking into the Fluentd source code and running a few tests (which I'm documenting in more detail and will link here when done), I believe it is safe to set this relatively high. Both the buffered chunks and the queued chunks are stored on disk, and the only difference between the two (that I can see) is that the former have a `b` in the filename, while the latter have a `q`.

This means that no additional resources are consumed by putting more chunks into the queue. But chunks are only moved into the queue at regular intervals (~ every 5 seconds with our time-slicing settings, if I followed the calculations correctly), and if the queue is too small (i.e. the default length of 1), the S3 upload thread is starved and a backlog builds up.

In my tests, 100 was sufficient to keep the upload thread from being starved, but a lower number would probably also be fine. When the log throughput gets high enough for the queue to fill up with 100 chunks, other factors are actually more limiting (specifically the CPU consumption of Fluentd) and would need to be addressed first. With the current settings, 100,000 events/minute can be handled.
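
As an illustration of the on-disk layout described above, a file buffer directory might look roughly like this (the path and chunk IDs are made up); staged chunks carry a `b` and enqueued chunks a `q` in the filename:

```
$ ls /var/log/fluentd-buffers/s3.buffer/
buffer.b5838f6e0276d2a2956914713a7d3e17.log         # staged: still accepting events
buffer.b5838f6e0276d2a2956914713a7d3e17.log.meta
buffer.q5838f6a6a1dcf508f2b52f7f8b8f1d6.log         # queued: waiting for the S3 upload thread
buffer.q5838f6a6a1dcf508f2b52f7f8b8f1d6.log.meta
```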

@aermakov-zalando (Contributor)

👍
