
No fluentbit_input_records_total in metrics after upgrade to 2.2.0 #8287

Closed

hoooli opened this issue Dec 15, 2023 · 5 comments
Labels
status: waiting-for-triage, waiting-for-release (This has been fixed/merged but it's waiting to be included in a release.)

Comments


hoooli commented Dec 15, 2023

Yesterday I upgraded Fluent Bit to version 2.2.0, and since then the metric fluentbit_input_records_total has not been working. It consistently returns 0, even though the other metrics are functioning properly.

(screenshots attached)

In Grafana/Prometheus I'm using the query:
rate(fluentbit_input_records_total[1m])

Configuration:

   [SERVICE]
       Flush         5
       Log_Level     info
       Parsers_File  parsers.conf
       Daemon        off
       HTTP_Server   On
       HTTP_Listen   0.0.0.0
       HTTP_PORT     2020

   [INPUT]
       Name              tail
       Tag               kube.*
       Path              /var/log/containers/*.log
       Exclude_Path      /var/log/containers/*test*.log
       Path_Key          path_filename
       Parser            docker
       DB                /var/log/flb_kube.db
       Mem_Buf_Limit     100MB
       Buffer_Max_Size   8MB
       Refresh_Interval  8
       Rotate_Wait       8
   [INPUT]
       Name              tail
       Tag               kube.test.*
       Path              /var/log/containers/*test*.log
       Path_Key          path_filename
       Parser            docker
       DB                /var/log/flb_kube_2.db
       Mem_Buf_Limit     100MB
       Buffer_Max_Size   8MB
       Refresh_Interval  8

  [FILTER]
       Name                kubernetes
       Match               kube.var.*
       Kube_Tag_Prefix     kube.var.log.containers.
       Merge_Log           On
       K8S-Logging.Parser  On
       K8S-Logging.Exclude On
       Annotations         Off
   [FILTER]
       Name                kubernetes
       Match               kube.test.*
       Kube_Tag_Prefix     kube.test.var.log.containers.
       Merge_Log           On
       K8S-Logging.Parser  On
       K8S-Logging.Exclude On
       Annotations         Off

   [OUTPUT]
       Name         forward
       Match        kube.var.*
       Host         fluentd_host
       Port         9880
       Retry_Limit  5
   [OUTPUT]
       Name         loki
       Match        *
       Host         loki_url
       Port         80
       Retry_Limit  1

   [PARSER]
       Name        docker
       Format      json
       Time_Key    time
       Time_Format %Y-%m-%dT%H:%M:%S.%L
       Time_Keep   On
       Decode_Field_As   escaped_utf8    log    do_next
       Decode_Field_As   json       log
       Reserve_Data On
       Preserve_Key On
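
A quick way to check whether the counter itself is stuck at zero (rather than the Prometheus scrape or the Grafana query) is to read Fluent Bit's built-in Prometheus endpoint directly. This is only a sketch, assuming the HTTP server settings above (HTTP_PORT 2020) and that curl is available on the host:

   # Dump the internal metrics in Prometheus text format and filter
   # for the per-input record counters (one series per input plugin)
   curl -s http://127.0.0.1:2020/api/v1/metrics/prometheus | grep fluentbit_input_records_total

If the series for each input (e.g. name="tail.0") are 0 here as well, the problem is inside Fluent Bit itself rather than in the scrape or the dashboard query.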
@patrick-stephens (Contributor) commented:

This should be resolved by #8223, but it has not been released yet - credit to @braydonk.
There is a potential issue with that fix, though: #8284.

Unstable nightly builds are available if you want to test them to confirm the fix (not for production, obviously): ghcr.io/fluent/fluent-bit/unstable:master
There is also an AMD64-only build of master published on every commit: ghcr.io/fluent/fluent-bit/master
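
As a sketch only, pulling and running the nightly image locally to verify looks something like this (assuming the default config path used by the official images; adjust the mount for your setup):

   # Pull the unstable nightly build mentioned above
   docker pull ghcr.io/fluent/fluent-bit/unstable:master

   # Run it with a local config and expose the monitoring port
   docker run --rm -p 2020:2020 \
     -v "$(pwd)/fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf" \
     ghcr.io/fluent/fluent-bit/unstable:master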

patrick-stephens added the waiting-for-release label (This has been fixed/merged but it's waiting to be included in a release.) on Dec 15, 2023

hoooli commented Dec 15, 2023

Thank you, I can confirm that there is an issue with the request payload in the nightly version. The metrics were working for a few seconds, though :)

[2023/12/15 12:40:42] [error] [output:loki:loki.1] cannot compose request payload
[2023/12/15 12:40:42] [error] [engine] chunk '1-1702640427.501846521.flb' cannot be retried: task_id=4, input=tail.0 > output=loki.1
[2023/12/15 12:40:42] [error] [output:loki:loki.1] cannot compose request payload
[2023/12/15 12:40:42] [error] [engine] chunk '1-1702640428.418691142.flb' cannot be retried: task_id=9, input=tail.0 > output=loki.1
[2023/12/15 12:40:42] [error] [engine] chunk '1-1702640430.874134875.flb' cannot be retried: task_id=10, input=tail.0 > output=loki.1
[2023/12/15 12:40:43] [error] [output:loki:loki.1] cannot compose request payload
[2023/12/15 12:40:43] [error] [engine] chunk '1-1702640434.394118994.flb' cannot be retried: task_id=2, input=tail.0 > output=loki.1

(screenshot attached)

@patrick-stephens (Contributor) commented:

Yeah, so it sounds like this is resolved in that regard and is now a duplicate of #8284 :)


braydonk commented Dec 15, 2023

Thanks for the test. Yeah, there was an unexpected side effect in my first fix. Hopefully someone is able to take a look at the follow-up (#8229) soon.


edsiper commented Dec 27, 2023

Closed as fixed.
