Labels vanishing after 30 minutes of inactivity #453
OK, it does seem to be something to do with the start time; I am able to get all the logs I saw previously by using:
where that start time is
But I still don't understand why the labels vanished. If I query even an empty time range, with start=end, I now get labels:
But I definitely wasn't getting any earlier, when writing up this ticket.
I was able to reproduce, just leaving the system alone.
Suggests that maybe labels are only shown for log entries in the past half hour? Is this documented?
The last query covers the time up to 4 hours ago, but still no labels are seen. It's now pretty clear that the labels are visible for 30 minutes after HH:17, which is the time that cron.hourly runs.
I'm using curl, not logcli, and I'm following the API documentation. It says that the
This could be a question of fixing the documentation (if this is intentional behaviour), or it could be a question of fixing the implementation. One thing is for sure: I would be surprised to go into Grafana, query for available labels, and not find any. (This is what actually happened when I first tried Grafana+Loki; I switched to using the API directly to try to narrow down why logs were randomly disappearing.)
No - Loki was left running continuously for the whole test.
https://github.com/grafana/loki/blob/master/pkg/querier/http.go#L20
I'm not sure that's the same.
I did find this 30-minute constant:
Could it be something here? When chunks are flushed, are the labels forgotten?
I will turn this into a reproducer which doesn't require promtail and doesn't depend on Ubuntu's cron.hourly timing.
(Aside: for consistency I would have expected …)
But the main thing is, labels have vanished, even if you give an explicit time range. The time range specified above is from 24 hours before to 24 hours after the given ts value.
Based on what I learned from the source code, /api/prom/label only queries labels from the ingester instances, which means you only get labels which are currently "active" (or have just recently been); see https://github.com/grafana/loki/blob/master/pkg/querier/querier.go#L130. When you query actual log data, the query is sent to both the ingesters and the index backend, but for this you need to specify the "start" option, as the label indices are spread along the time axis. I just created PR #461 which describes the second behaviour, but not the first (active labels). Does this explain what you are getting? Whether this should be improved is another question.
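The fix that was eventually discussed is to query both sides and merge. A sketch of what combining ingester labels (active streams) with store labels (historical index) could look like; `mergeLabels` is a hypothetical helper, not Loki's actual code:

```go
package main

import (
	"fmt"
	"sort"
)

// mergeLabels combines label names from the ingesters (recently
// active streams) with those from the store index, deduplicating and
// sorting the result.
func mergeLabels(ingester, store []string) []string {
	seen := map[string]struct{}{}
	var out []string
	for _, src := range [][]string{ingester, store} {
		for _, l := range src {
			if _, ok := seen[l]; !ok {
				seen[l] = struct{}{}
				out = append(out, l)
			}
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// The ingester only knows the active stream; the store also has
	// labels from streams that went idle more than 30 minutes ago.
	fmt.Println(mergeLabels([]string{"job"}, []string{"job", "filename"}))
	// prints "[filename job]"
}
```

With only the ingester side queried, `filename` would be invisible here.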
Sounds like it could. The 30-minute ingester idle time is not documented anywhere. This behaviour isn't good: in the Grafana UI, I cannot see any logs (including historical logs) until I've selected at least one label, but if that particular label hasn't been active for more than 30 minutes, it isn't shown. This makes Loki pretty useless.
This issue is being worked on over the next couple of days. Thanks for the thorough reporting.
We are hitting the same issue. Any ETA on when the fix will be available?
Fix is in review! Thank you.
Any more updates on this?
Is there a fix yet? I'm having the same problem with the latest docker image.
This particular issue was fixed with #521, where labels are now queried from the store. If you are missing labels in Explore, you should expand the timeframe of your query and reload the page. (The reload is only required to work around a temporary change made in Explore to accommodate slow label query performance, which is fixed in the v11 schema update; that will be made the default as soon as we figure out how best to adapt it in the Helm chart.)
Describe the bug
A test Loki instance seemed to lose its labels after a while (30+ minutes of inactivity).
To Reproduce
Steps initially used to reproduce the behavior:
EDIT: A better way to reproduce this is shown below. The above behaviour depends on an Ubuntu container running cron.hourly at 17 minutes past the hour, and otherwise not logging anything. From 47 minutes past the hour until 17 minutes past the next hour, labels had vanished.
Expected behavior
At startup I was able to read labels, and query logs via the API:
But after going away and coming back again, no labels were shown:
And yet the chunks are still there:
After a few minutes, some logs reappeared:
It's as if the API doesn't always return all available labels and log records. However the API documentation doesn't mention this.
Actually, the API documentation doesn't give any defaults for the query parameters. I would expect 'end' to default to the current time. It's unclear whether there's some default for 'limit', and/or whether 'start' defaults to some fixed offset before the end time.
Environment:
Additional context
During all this time, no messages were logged to the screen where loki was running, except for the messages which were output when it started up.