
Metrics: Counter expiring too soon #2333

Closed
divanikus opened this issue Jul 10, 2020 · 6 comments

Comments

@divanikus

divanikus commented Jul 10, 2020

I'm using a simple stage that increments a counter on specific log lines (WARN, ERROR, INFO, etc.). Recently I've noticed that some counters just disappear after a while, which is a problem for rare lines. ERROR lines, for example, are rare, maybe one every few hours, so if I try to look at them in Grafana I see just a couple of points during the day, not even connected into a line.

```yaml
pipeline_stages:
  - regex:
      expression: '^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (?P<level>\w+) '
  - labels:
      level:
  - metrics:
      lines_total:
        type: Counter
        prefix: mc_log_
        source: level
        config:
          action: inc
  - match:
      selector: '{level="DEBUG"}'
      action: drop
```


I'm scraping promtail metrics with prometheus-server. But even if I curl promtail's /metrics endpoint directly, those counters are gone after a while.

Did I miss an expiry option here? I don't believe they should expire this fast.

@cyriltovena
Contributor

I think this is mostly how Explore renders metrics. Have you tried using this query in a dashboard? You can even enable the "connect null values" option if you look around.
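For reference, a dashboard query over the counter from the pipeline above might look something like this (a sketch: the metric name `mc_log_lines_total` follows from the `prefix` and metric key in the posted config, and the 5m rate window is an assumption):

```promql
# Per-level log rate from the promtail-exported counter
sum by (level) (rate(mc_log_lines_total[5m]))
```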

@divanikus
Author

divanikus commented Jul 10, 2020

I'm mostly surprised by why it expires so fast. As you can see, it expires in under 30 minutes. I thought of a counter as something long-lived, and I don't think keeping it would eat too many resources. Could it at least be tunable?

@cyriltovena
Contributor

Sorry I didn't realize this was a counter.

@cyriltovena
Contributor

I realize this is not well documented.

```yaml
# Label values on metrics are dynamic which can cause exported metrics
# to go stale (for example when a stream stops receiving logs).
# To prevent unbounded growth of the /metrics endpoint any metrics which
# have not been updated within this time will be removed.
# Must be greater than or equal to '1s', if undefined default is '5m'
[max_idle_duration: <string>]
```

see https://github.com/grafana/loki/blob/master/docs/clients/promtail/stages/metrics.md

It applies to all metrics.

@cyriltovena
Contributor

You could try 6h here, maybe? It depends how variable your stream is. From what I can see in the labels, even 30d would work.
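Applied to the metrics stage from the original report, that would look something like this (a sketch; 6h is just the suggested starting point, tune it to how sparse your rarest level is):

```yaml
- metrics:
    lines_total:
      type: Counter
      prefix: mc_log_
      source: level
      # Keep exporting the counter even if no matching
      # line arrives for up to 6 hours (default is 5m).
      max_idle_duration: 6h
      config:
        action: inc
```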

@divanikus
Author

Yeah, it helped. Thanks
