[loki-distributed] too many unhealthy instances in the ring #2154

Open
patsevanton opened this issue Jan 26, 2023 · 2 comments

Comments

Contributor

patsevanton commented Jan 26, 2023

Hello! Thanks for Loki!

I set maxUnavailable to 0 and changed the resource requests and limits.
Now Loki doesn't work.

The Grafana datasource reports: "Unable to fetch labels from Loki (Failed to call resource), please check the server logs for more details"

The Loki logs show:

querier level=error caller=series_index_store.go:583 org_id=fake msg="error querying storage" err="query index: rpc error: code = Canceled desc = context canceled"
querier level=error caller=series_index_store.go:583 org_id=fake msg="error querying storage" err="query index: rpc error: code = Canceled desc = context canceled"
querier level=error caller=series_index_store.go:583 org_id=fake msg="error querying storage" err="query index: rpc error: code = Canceled desc = context canceled"
querier level=error caller=series_index_store.go:583 org_id=fake msg="error querying storage" err="query index: rpc error: code = Canceled desc = context canceled"
querier level=error caller=series_index_store.go:583 org_id=fake msg="error querying storage" err="query index: rpc error: code = Canceled desc = context canceled"
query-frontend level=error caller=retry.go:73 org_id=fake msg="error processing request" try=0 err="rpc error: code = Code(500) desc = too many unhealthy instances in the ring\n"
query-frontend level=error caller=retry.go:73 org_id=fake msg="error processing request" try=1 err="rpc error: code = Code(500) desc = too many unhealthy instances in the ring\n"
query-frontend level=error caller=retry.go:73 org_id=fake msg="error processing request" try=2 err="rpc error: code = Code(500) desc = too many unhealthy instances in the ring\n"
query-frontend level=error caller=retry.go:73 org_id=fake msg="error processing request" try=3 err="rpc error: code = Code(500) desc = too many unhealthy instances in the ring\n"
query-frontend level=error caller=retry.go:73 org_id=fake msg="error processing request" try=4 err="rpc error: code = Code(500) desc = too many unhealthy instances in the ring\n"

How can I delete the unhealthy instances from the ring?

@patsevanton patsevanton changed the title too many unhealthy instances in the ring [loki-distributed] too many unhealthy instances in the ring Jan 26, 2023
Contributor Author

patsevanton commented Jan 26, 2023

Fixed by hand:
curl -H "Accept: application/json" http://loki-loki-distributed-distributor:3100/ring | jq -r '.shards[] | select(.state=="UNHEALTHY") | .id' | xargs -I{} curl -d "forget={}" -d 'csrf_token=$__CSRF_TOKEN_PLACEHOLDER__' -H "Accept: application/json" http://loki-loki-distributed-distributor:3100/ring
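
For anyone who prefers to see what would be forgotten before changing the ring, the one-liner above can be split into small functions. This is only a sketch: the function names and the RING_URL variable are my own, the JSON shape (.shards[].id, .shards[].state) and the form fields are copied from the command above, and I have not checked them against every Loki version.

```shell
# Sketch only. RING_URL defaults to the distributor address used in the
# one-liner above; override it if your service name or port differs.
RING_URL="${RING_URL:-http://loki-loki-distributed-distributor:3100/ring}"

# Read ring JSON on stdin and print the id of each UNHEALTHY instance,
# one per line. The field names are assumed from the one-liner above.
unhealthy_ids() {
  jq -r '.shards[] | select(.state == "UNHEALTHY") | .id'
}

# Dry run: fetch the ring and list the ids that would be forgotten.
list_unhealthy() {
  curl -fsS -H "Accept: application/json" "$RING_URL" | unhealthy_ids
}

# Send one "forget" request per unhealthy id. The csrf_token field is
# copied verbatim from the one-liner above.
forget_unhealthy() {
  list_unhealthy | while IFS= read -r id; do
    echo "forgetting $id" >&2
    curl -fsS -o /dev/null -H "Accept: application/json" \
      --data-urlencode "forget=$id" \
      -d 'csrf_token=$__CSRF_TOKEN_PLACEHOLDER__' \
      "$RING_URL"
  done
}
```

Source the file, run list_unhealthy first to check the ids, then run forget_unhealthy. Both need curl and jq on the machine or pod you run them from.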

Still need to try cortexproject/cortex#1521:

ingester:
  autoforget_unhealthy: true

mrr4cc00n commented Sep 4, 2024

You can also port-forward your distributor service (port 3100) and open it in the browser, e.g. localhost:/ring. There you get a UI listing all the ingesters, and you can click forget to remove the unhealthy ones.
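
A rough sketch of the port-forward step, in case it helps. The service name is taken from the curl command earlier in this thread; NAMESPACE, LOCAL_PORT and the function names are placeholders of mine, so adjust them to your install.

```shell
# Assumptions: SERVICE is the distributor service name from the curl
# command earlier in this thread; NAMESPACE and LOCAL_PORT are placeholders.
NAMESPACE="${NAMESPACE:-default}"
LOCAL_PORT="${LOCAL_PORT:-3100}"
SERVICE="${SERVICE:-loki-loki-distributed-distributor}"

# Print the local URL of the ring page for the chosen local port.
ring_ui_url() {
  printf 'http://localhost:%s/ring\n' "$LOCAL_PORT"
}

# Forward LOCAL_PORT on this machine to port 3100 of the distributor
# service. This blocks until interrupted with Ctrl-C.
port_forward_distributor() {
  kubectl --namespace "$NAMESPACE" port-forward "service/$SERVICE" "$LOCAL_PORT:3100"
}
```

Run port_forward_distributor in one terminal, then open the address printed by ring_ui_url in the browser.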
