
The ring never removes the old ingester even if the ingester pod is evicted #1521

Open
wuyafang opened this issue Jul 19, 2019 · 46 comments

@wuyafang

wuyafang commented Jul 19, 2019

I have a similar problem to #1502.
When my ingester pod was evicted, a new ingester pod was created.
Now the ring has two ingesters, but only one (the new one) is healthy. The old one is never removed from the ring, even if I delete the evicted pod manually.
The ring information is as follows:

Instance ID                 State      Address           Last heartbeat                 Tokens  Ownership
ingester-7fc8759d7f-nzb6g   ACTIVE     172.16.0.62:9095  2019-07-19 03:33:32 +0000 UTC  128     45.739077787319914%
ingester-7fc8759d7f-wmnms   Unhealthy  172.16.0.93:9095  2019-07-18 14:46:18 +0000 UTC  128     54.260922212680086%
The ingester's status is always unready, and the distributor logs this error:

level=warn ts=2019-07-19T03:41:45.413839063Z caller=server.go:1995 traceID=daf4028f530860f msg="POST /api/prom/push (500) 727.847µs Response: \"at least 1 live ingesters required, could only find 0\\n\" ws: false; Connection: close; Content-Encoding: snappy; Content-Length: 3742; Content-Type: application/x-protobuf; User-Agent: Prometheus/2.11.0; X-Forwarded-For: 172.16.0.17; X-Forwarded-Host: perf.monitorefk.huawei.com; X-Forwarded-Port: 443; X-Forwarded-Proto: https; X-Original-Uri: /api/prom/push; X-Prometheus-Remote-Write-Version: 0.1.0; X-Real-Ip: 172.16.0.17; X-Request-Id: 62a470dc6de7a83c8974e3411fa63e40; X-Scheme: https; X-Scope-Orgid: custom; "

I wonder if there is any way to deal with this situation automatically?
Maybe check the replication factor and remove the excess unhealthy ingesters from the ring?

@bboreham
Contributor

If the ingester shut down cleanly, even on eviction, then it would not be in the ring. So, the first task is to find out why it did not shut down cleanly, and if possible fix that.

Everything else you report is deliberate. We return not-ready to halt a rolling update.

@bboreham
Contributor

Actually I don’t understand could only find 0, since your ring shows 1 active.

@wuyafang

This comment has been minimized.

@bboreham
Contributor

I mean the ingester went through its exit sequence, rather than being abruptly terminated from outside.

There are two main cases: hand-over to another ingester, and flush to store. In both cases the time required is a function of how much data is in memory.

When using an explicitly provisioned store (eg DynamoDB) it would be nice to scale up specifically for a “save everything” operation. There’s no code to do that currently.

@wuyafang
Author

I tried to reproduce the problem with kubectl delete pod --force, and a new ingester pod was created by the Deployment controller immediately.
Now I get a ring with two ingesters (one active, the other unhealthy), an unready state for my new ingester (because there is an unhealthy entry, the readiness check returns 503), and a distributor log like this:
level=warn ts=2019-07-19T08:18:25.511613228Z caller=logging.go:49 traceID=5eabc2f0f6b837e9 msg="POST /api/prom/push (500) 266.759µs Response: "at least 2 live ingesters required, could only find 1\n" ws: false; Connection: close; Content-Encoding: snappy; Content-Length: 5162; Content-Type: application/x-protobuf; User-Agent: Prometheus/2.11.1; X-Forwarded-For: 100.95.185.106; X-Forwarded-Host: 100.95.137.223; X-Forwarded-Port: 443; X-Forwarded-Proto: https; X-Original-Uri: /api/prom/push; X-Prometheus-Remote-Write-Version: 0.1.0; X-Real-Ip: 100.95.185.106; X-Request-Id: 89b05193a60c7935ac6a7bcd090b9a16; X-Scheme: https; X-Scope-Orgid: primary

I'm confused, because my -distributor.replication-factor is 1, so by minSuccess := (replicationFactor / 2) + 1 my distributor should only need 1 live ingester, but the log tells me it needs two.

So is there anything I misunderstood?

I also wonder when the ring adds an ingester and when it removes one. Does Consul do it by itself, or does the ingester tell it what to do? I notice that when an ingester starts up and shuts down, it updates the ring. But what if the ingester shuts down uncleanly? Is there any way to automatically clean the unhealthy entry out of the ring?

By the way, after I restart Consul, the ring only contains the active ingester and everything works well.

@wuyafang
Author

wuyafang commented Jul 19, 2019

I see now...
you do this:

	replicationFactor := r.cfg.ReplicationFactor
	if len(ingesters) > replicationFactor {
		replicationFactor = len(ingesters)
	}

so the replicationFactor is now 2, instead of what I set with -distributor.replication-factor.
This is presumably there to handle nodes joining/leaving, but it causes the write failure in my case above.
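
(For illustration, here is a minimal, hedged sketch of that arithmetic; the helper name is hypothetical, not the actual Cortex code. With one live ingester and one stale ring entry, the effective replication factor becomes 2 and the write quorum becomes 2, so every write fails.)

```go
package main

import "fmt"

// effectiveQuorum mirrors the behaviour quoted above: the configured
// replication factor is raised to the number of ring entries, and the
// write quorum is a simple majority of that effective factor.
// Hypothetical helper, for illustration only.
func effectiveQuorum(configuredRF, ringEntries int) (rf, minSuccess int) {
	rf = configuredRF
	if ringEntries > rf {
		rf = ringEntries // stale entries inflate the effective RF
	}
	return rf, rf/2 + 1
}

func main() {
	// -distributor.replication-factor=1, but the ring still contains the
	// dead ingester's entry alongside the new one.
	rf, minSuccess := effectiveQuorum(1, 2)
	live := 1
	fmt.Printf("effective RF=%d minSuccess=%d live=%d -> write fails: %v\n",
		rf, minSuccess, live, live < minSuccess)
	// effective RF=2 minSuccess=2 live=1 -> write fails: true
}
```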

@bboreham
Contributor

That sounds like the same problem as #1290
@tomwilkie can you remember what that check is for?

if len(ingesters) > replicationFactor {

@YaoZengzeng

Actually, if I deploy one ingester with replicationFactor 1, and the ingester pod is evicted because of low memory, the kubelet starts another ingester pod.

However, the previous ingester didn't exit cleanly, so its entry in the Consul ring will never be cleaned up. So at this moment:

replicationFactor == len(ingesters) == 2 (1 evicted ingester and 1 running ingester)

minSuccess = (replicationFactor / 2) + 1 = 2

However, len(liveIngesters) = 1 < minSuccess ---> there is a deadlock: the unhealthy ingester is never cleaned from the ring, so we will never reach minSuccess.

Two problems here:

  1. Why, when len(ingesters) > replicationFactor, is replicationFactor raised to len(ingesters)? ---> In practice, having more ingesters than the original replicationFactor is usually not normal.

  2. If an ingester is unhealthy for a long time, why not clean it out of the ring? ---> The residual ingester affects the replication strategy.

@bboreham @tomwilkie @csmarchbanks Any ideas ?

@bboreham
Contributor

The current design requires that you set terminationGracePeriodSeconds long enough to shut down the first ingester cleanly.

Your point 1 seems the same as #1290

Point 2 is because we don't have enough experience of situations that need this. We would probably add it as an option if someone were to submit a PR.

@YaoZengzeng

@bboreham If the ingester is killed because of OOM (ingesters consume a lot of memory, and this is very common in k8s, at least in my k8s environment), then it never gets the terminationGracePeriodSeconds to shut down gracefully 😂

For point 1, I think that since replicationFactor is configured by the user, keeping it constant may be more reasonable.

For point 2, I may need to read more code to better understand the design intent. If it's necessary, I'd like to make a PR to fix it.

@bboreham
Contributor

"killed because of OOM" is not the same thing as "evicted". A pod that is OOM-killed will restart with the same identity on the same node, hence pick up the same entry in the Cortex ring.
Unless you have evidence to the contrary?

@YaoZengzeng

@bboreham You are right.

In our environment, the kubelet was configured with hard eviction, so the ingester pod was evicted without a grace period.

However, even with the kubelet configured for soft eviction, I have no idea how to set eviction-max-pod-grace-period, because the grace period the ingester needs depends on the amount of data it holds.

If the ingester still can't exit cleanly within the configured grace period, the problem remains unsolved.

@miklosbarabasForm3

miklosbarabasForm3 commented Jul 30, 2019

Hi,
Apparently I ran into the same issue, although I also saw msg="error removing stale clients" err="too many failed ingesters" in the logs.

Having a look at the code, my assumption is the following (please correct me if I'm wrong):

  • pkg/ingester/client/pool.go#removeStaleClients() calls p.ring.GetAll()
  • pkg/ring/ring.go#GetAll() returns an error with too many failed ingesters because of the way the logic is implemented using maxErrors.

Some questions:

  • why does GetAll need to check against the non-healthy instances at all?
  • if that check is needed, what is the purpose behind calculating the maxErrors from the Unhealthy instances instead of calculating the ACTIVE ones and using the ReplicationFactor?

    cortex/pkg/ring/ring.go

    Lines 258 to 286 in 1ca4ad0

    // GetAll returns all available ingesters in the ring.
    func (r *Ring) GetAll() (ReplicationSet, error) {
    	r.mtx.RLock()
    	defer r.mtx.RUnlock()

    	if r.ringDesc == nil || len(r.ringDesc.Tokens) == 0 {
    		return ReplicationSet{}, ErrEmptyRing
    	}

    	ingesters := make([]IngesterDesc, 0, len(r.ringDesc.Ingesters))
    	maxErrors := r.cfg.ReplicationFactor / 2

    	for _, ingester := range r.ringDesc.Ingesters {
    		if !r.IsHealthy(&ingester, Read) {
    			maxErrors--
    			continue
    		}
    		ingesters = append(ingesters, ingester)
    	}

    	if maxErrors < 0 {
    		return ReplicationSet{}, fmt.Errorf("too many failed ingesters")
    	}

    	return ReplicationSet{
    		Ingesters: ingesters,
    		MaxErrors: maxErrors,
    	}, nil
    }

After changing the aforementioned code (lines 278-280) to the following, I stopped receiving "error removing stale clients":

if len(ingesters) < r.cfg.ReplicationFactor / 2 + 1 {
    return ReplicationSet{}, fmt.Errorf("not enough healthy ingesters (ingesters: %d, replicationFactor: %d)", len(ingesters), r.cfg.ReplicationFactor)
}

Also, when I enabled distributor.health-check-ingesters, the Unhealthy ingesters got cleaned up properly (so #1264 might be related as well), EXCEPT when it was an OOM, or when using ECS and your ingester comes back with the same identity (as you mentioned) on the same IP/host. Meaning there's no logic to transition from LEAVING back to ACTIVE. Shouldn't there be some logic created around that?

Note that distributor.client-cleanup-period (default 15s) and distributor.health-check-ingesters (default false) control the ingester cleanup, which will clean up Unhealthy ingesters if you have health-check-ingesters enabled.

func (p *Pool) loop() {
	defer p.done.Done()

	cleanupClients := time.NewTicker(p.cfg.ClientCleanupPeriod)
	defer cleanupClients.Stop()

	for {
		select {
		case <-cleanupClients.C:
			p.removeStaleClients()
			if p.cfg.HealthCheckIngesters {
				p.cleanUnhealthy()
			}
		case <-p.quit:
			return
		}
	}
}

@wuyafang
Author

Note that distributor.client-cleanup-period (default 15s) and distributor.health-check-ingesters (default false) control the ingester cleanup, which will clean up Unhealthy ingesters if you have health-check-ingesters enabled.

I tried this, but it just removes unhealthy ingesters from the distributor pool (which holds the ingester clients) instead of removing them from the Consul ring. It doesn't work for me.

@bboreham
Contributor

Don't read too much into the words - that's removing them from one data structure in memory. There is no code to remove ingesters from the ring when they are suspected to be dead, and this was deliberate.

@miklosbarabasForm3

@bboreham
what was the idea behind not removing ingesters from the ring when they are suspected to be dead?

And then what is the purpose of removeStaleClients and cleanUnhealthy? (which only remove the unhealthy ingesters from the distributor pool)

@bboreham
Contributor

bboreham commented Jul 31, 2019

what was the idea behind not removing ingesters from the ring when they are suspected to be dead?

Risky, easy to get wrong, not necessary day one.

and then what is the purpose of removeStaleClients

that was to fix #217

@bboreham
Contributor

Here's an example scenario we want to avoid: Cortex is running under Kubernetes, and a rolling update begins:

  • Kubernetes sends SIGTERM to one old ingester and starts one new ingester.
  • A bug in the new code means hand-over from old to new fails.
  • Terminating ingester starts to flush all data, but can't flush quickly enough and runs out of time.
  • Kubernetes terminates the first ingester and removes its pod entry.

Now, if we allow the rolling update to proceed, the same thing will happen in each case and we will lose the unflushed data from all ingesters, which could be a significant proportion of all data in the last 12 hours.

With the current code the rolling update is halted because there will be an "unhealthy" entry for the old ingester in the ring, and this means the new ingester will never show "ready" to Kubernetes.
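
(To make the mechanism concrete, here is a hedged, illustrative sketch of the readiness gate described above; the types and function are hypothetical, not the actual lifecycler code. The new ingester keeps reporting not-ready while any ring member is past its heartbeat timeout, which is what halts the rollout.)

```go
package main

import (
	"fmt"
	"time"
)

// instance is a hypothetical stand-in for a ring entry.
type instance struct {
	ID            string
	LastHeartbeat time.Time
}

// checkReady reports an error while any ring member is past its heartbeat
// timeout, matching the "instance ... past heartbeat timeout" warnings seen
// later in this thread. Illustrative only.
func checkReady(ring []instance, heartbeatTimeout time.Duration, now time.Time) error {
	for _, inst := range ring {
		if now.Sub(inst.LastHeartbeat) > heartbeatTimeout {
			return fmt.Errorf("instance %s past heartbeat timeout", inst.ID)
		}
	}
	return nil
}

func main() {
	now := time.Now()
	ring := []instance{
		{ID: "ingester-new", LastHeartbeat: now},
		{ID: "ingester-old", LastHeartbeat: now.Add(-10 * time.Minute)}, // stale entry left by the dead pod
	}
	// With a 1-minute heartbeat timeout, the new ingester never becomes
	// ready, so the rolling update stops instead of losing unflushed data.
	fmt.Println(checkReady(ring, time.Minute, now))
}
```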

@YaoZengzeng

@bboreham Yes, it's exactly the scenario we encountered and it's annoying.

@bboreham
Contributor

I think you would find losing half the data more annoying than having to operate the system manually when there is a fault.

@siennathesane

I also hit this issue. I was able to work around it by completely wiping the slate clean, but it's not ideal.

@bboreham
Contributor

If you indeed hit the same issue please follow the steps in #1521 (comment)

If your issue is different please file it separately.

@stale

stale bot commented Feb 3, 2020

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Feb 3, 2020
@bboreham bboreham added the keepalive Skipped by stale bot label Feb 3, 2020
@stale stale bot removed the stale label Feb 3, 2020
@jgraettinger

Hi -- FYI I've found this /ready behavior plays badly with StatefulSets using an "Ordered" Pod Management Policy (the default). I believe the fix is easy -- use a "Parallel" policy -- but documenting the problematic scenario:

Suppose you have 3 SS replicas with "Ordered" policy:

  • pod-0, pod-1, pod-2 are all running
  • pod-1 & pod-2 have the power yanked at (approximately) the same time
  • pod-1 is re-started by the SS replica controller
  • pod-0 is marked as unhealthy, because it can't talk to pod-2
  • pod-1 becomes healthy
  • The replica controller is wedged because pod-0 is still unhealthy. pod-2 is never started

I experienced this running with preemptible nodes (I know, I know) and confirmed with manual testing. If the "Parallel" policy is used instead then pod-1 & pod-2 start in parallel and pick up their former places in the ring.

@pracucci
Contributor

pracucci commented Jun 8, 2020

pod-0 is marked as unhealthy, because it can't talk to pod-2

Why is pod-0 marked as unhealthy? I can't understand this.

@bboreham
Contributor

Now that chunks storage is deprecated and we use blocks storage, we no longer "hand-over" from one ingester to another.
So one justification for this behaviour has disappeared.

Happy to hear experience reports from people who did automate it.

@ctorrisi

ctorrisi commented Oct 20, 2021

The ingester.autoforget_unhealthy configuration item has existed in Loki since grafana/loki#3919 was merged.

Would it be possible to add the same functionality into Cortex?

Or is there another way to facilitate the same behaviour as Loki's ingester.autoforget_unhealthy?
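
(For reference, a minimal sketch of what such an auto-forget loop could look like; all names here are hypothetical, and this is not the actual Loki or Cortex implementation, just the general idea of forgetting ring members whose heartbeat is older than some threshold.)

```go
// Package autoforget sketches the idea only; RingStore and Member are
// hypothetical abstractions over the KV-backed ring (Consul, etcd, ...).
package autoforget

import (
	"context"
	"time"
)

// Member is a minimal view of a ring entry.
type Member struct {
	ID            string
	LastHeartbeat time.Time
}

// RingStore lists ring members and removes one, i.e. the programmatic
// equivalent of pressing "Forget" on the /ring page.
type RingStore interface {
	Members(ctx context.Context) ([]Member, error)
	Forget(ctx context.Context, id string) error
}

// Run periodically forgets members whose heartbeat is older than forgetAfter.
func Run(ctx context.Context, store RingStore, interval, forgetAfter time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		select {
		case <-ticker.C:
			members, err := store.Members(ctx)
			if err != nil {
				continue // transient KV error; retry on the next tick
			}
			for _, m := range members {
				if time.Since(m.LastHeartbeat) > forgetAfter {
					_ = store.Forget(ctx, m.ID)
				}
			}
		case <-ctx.Done():
			return
		}
	}
}
```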

@ebr

ebr commented Nov 11, 2021

I've read through this issue and the linked issues, and it's still unclear to me whether there is a way to have the ingester ring self-heal in case of unclean shutdowns. Not needing human operator intervention would be extremely valuable to us, as we are losing much more data due to ingesters being down compared to what we would lose by auto-forgetting unhealthy ingesters from the ring.

@rafilkmp3

+1

@rafilkmp3

ingester.autoforget_unhealthy would surely fix restarted pods, or the Cortex pods could always register themselves with the same name to avoid this scenario:

cortex-ingester-84c7969655-kdbqq level=warn ts=2021-11-12T14:47:55.531248888Z caller=lifecycler.go:237 msg="found an existing instance(s) with a problem in the ring, this instance cannot become ready until this problem is resolved. The /ring http endpoint on the distributor (or single binary) provides visibility into the ring." ring=ingester err="instance cortex-ingester-84c7969655-kzvr9 past heartbeat timeout"
cortex-ingester-84c7969655-kdbqq level=warn ts=2021-11-12T14:48:25.533870058Z caller=lifecycler.go:237 msg="found an existing instance(s) with a problem in the ring, this instance cannot become ready until this problem is resolved. The /ring http endpoint on the distributor (or single binary) provides visibility into the ring." ring=ingester err="instance cortex-ingester-84c7969655-kzvr9 past heartbeat timeout"
cortex-ingester-84c7969655-m9tnz level=warn ts=2021-11-12T14:47:56.207233612Z caller=lifecycler.go:237 msg="found an existing instance(s) with a problem in the ring, this instance cannot become ready until this problem is resolved. The /ring http endpoint on the distributor (or single binary) provides visibility into the ring." ring=ingester err="instance cortex-ingester-84c7969655-9p7pj past heartbeat timeout"
cortex-ingester-84c7969655-m9tnz level=warn ts=2021-11-12T14:48:26.207033877Z caller=lifecycler.go:237 msg="found an existing instance(s) with a problem in the ring, this instance cannot become ready until this problem is resolved. The /ring http endpoint on the distributor (or single binary) provides visibility into the ring." ring=ingester err="instance cortex-ingester-84c7969655-9p7pj past heartbeat timeout"
cortex-ingester-84c7969655-kdbqq level=warn ts=2021-11-12T14:48:55.528262784Z caller=lifecycler.go:237 msg="found an existing instance(s) with a problem in the ring, this instance cannot become ready until this problem is resolved. The /ring http endpoint on the distributor (or single binary) provides visibility into the ring." ring=ingester err="instance cortex-ingester-84c7969655-kzvr9 past heartbeat timeout"
cortex-ingester-84c7969655-m9tnz level=warn ts=2021-11-12T14:48:56.205825525Z caller=lifecycler.go:237 msg="found an existing instance(s) with a problem in the ring, this instance cannot become ready until this problem is resolved. The /ring http endpoint on the distributor (or single binary) provides visibility into the ring." ring=ingester err="instance cortex-ingester-84c7969655-9p7pj past heartbeat timeout"
cortex-ingester-84c7969655-clfgm level=warn ts=2021-11-12T14:48:56.240787216Z caller=lifecycler.go:237 msg="found an existing instance(s) with a problem in the ring, this instance cannot become ready until this problem is resolved. The /ring http endpoint on the distributor (or single binary) provides visibility into the ring." ring=ingester err="instance cortex-ingester-84c7969655-kzvr9 past heartbeat timeout"

@jpikoulas

ingester.autoforget_unhealthy would be amazing when deploying to AWS with spot instances, where ingesters get destroyed and spun up again. Exposing the Cortex ring status web interface to manually remove unhealthy ingesters is not practical, and it is a security concern.

@stewartshea

@rafilkmp3 Thanks for your input on that... I'm using k8s for this and will switch the ingesters to a StatefulSet, which should fix this issue (forcing the pods into consistent names). The other approach was going to be a quick job that would query the endpoint and remove the unhealthy ingesters, but the StatefulSet approach feels much cleaner.

@bboreham
Contributor

whether there is a way to have the ingester ring self-heal in case of unclean shutdowns.

Nobody has coded one for Cortex, to my knowledge.

deploying to AWS with spot instances

We tell you not to do this in the docs.

@jmcarp
Contributor

jmcarp commented Dec 23, 2021

I would be happy to take a stab at writing ingester.autoforget_unhealthy based on the loki implementation if the maintainers think it makes sense.

@alanprot
Member

alanprot commented Jan 6, 2022

+1 for this feature.

This is useful especially for the distributor ring - a distributor is totally safe to forget if it has been unhealthy for a long time (2 days). In that case it is safe to assume it was an unclean shutdown and it will never come back.

Another thing: in the newest Cortex release we introduced the cortex_ring_member_ownership_percent metric for distributors (before, this metric existed only for ingesters), and it is exposed even for unhealthy distributors - creating unnecessary timeseries and causing distributors to use more CPU when being scraped.

jmcarp added a commit to jmcarp/cortex that referenced this issue Feb 4, 2022
Implementation adapted from grafana/loki#3919.

Related to cortexproject#1521.

Signed-off-by: Josh Carp <jm.carp@gmail.com>
@rafilkmp3

#1521 (comment)

How did you do this? Can you share your config?

@rafilkmp3

I would be happy to take a stab at writing ingester.autoforget_unhealthy based on the loki implementation if the maintainers think it makes sense.

Would be nice

@Rahuly360

whether there is a way to have the ingester ring self-heal in case of unclean shutdowns.

Nobody has coded one for Cortex, to my knowledge.

deploying to AWS with spot instances

We tell you not to do this in the docs.

Is there any way to auto-forget unhealthy ring entries in Cortex?

@sspreitzer

In a Kubernetes & Helm based scenario, these Helm values could be a workaround:

ingester:
  initContainers:
    - name: cleanup-unhealthy-ingesters
      image: alpine
      command:
        - sh
        - -c
        - 'apk add curl jq && curl -H "Accept: application/json" http://cortex-distributor:8080/ingester/ring | jq ".shards[] | select(.state==\"UNHEALTHY\") | .id" | xargs -I{} curl -d "forget={}" -H "Accept: application/json" http://cortex-distributor:8080/ingester/ring'

Please be aware that you need to change the two URLs to match your Helm release name. Here it is cortex, so the URL is http://{{ .Release.Name }}-distributor:8080/ingester/ring.
Please test thoroughly and contribute your enhancements.

@sspreitzer

We ended up adding these Kubernetes resources for an automatic cleanup of unhealthy ingesters:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cortex-ingester-cleanup-script
  namespace: cortex
data:
  script: |
    while true; do
      which curl > /dev/null 2>&1
      if [ $? -eq 1 ]; then
        apk add curl
      fi
      which jq > /dev/null 2>&1
      if [ $? -eq 1 ]; then
        apk add jq
      fi

      curl -H "Accept: application/json" http://cortex-distributor:8080/ingester/ring | 
        jq ".shards[] | select(.state==\"Unhealthy\") | .id" |
        sed 's|"||g' |
        xargs -I{} curl -d "forget={}" -d 'csrf_token=$__CSRF_TOKEN_PLACEHOLDER__' -H "Accept: application/json" http://cortex-distributor:8080/ingester/ring
      
      sleep 3
    done
    true
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cortex-ingester-cleanup
  namespace: cortex
  labels:
    app: cortex-ingester-cleanup
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cortex-ingester-cleanup
  template:
    metadata:
      labels:
        app: cortex-ingester-cleanup
        revision: '1'
    spec:
      containers:
        - name: cortex-ingester-cleanup
          image: alpine
          resources:
            limits:
              cpu: 500m
              memory: 512Mi
          command:
            - sh
            - -c
            - "apk add bash && exec bash /cortex-ingester-cleanup.sh"
          volumeMounts:
            - name: cortex-ingester-cleanup-script
              mountPath: /cortex-ingester-cleanup.sh
              subPath: script
      volumes:
        - name: cortex-ingester-cleanup-script
          configMap:
            name: cortex-ingester-cleanup-script

@kingtran2112

kingtran2112 commented Nov 29, 2022

Why is pod-0 marked as unhealthy? I can't understand this.

I'm not sure. Looking at the code, the /ready state is supposed to latch. My observations in a couple runs of the above were that it didn't, or could become unlatched somehow.

I'm asking because we also run ingesters as statefulsets (in several clusters) and we've never experienced the issue you're describing, so I'm trying to understand how we could reproduce such scenario. Once an ingester switches to ready it should never get back to not-ready, unless a critical issue occurs. Do you see any valuable information in the logs of the ingester switching from ready to not-ready?

I think the reason is that when 2 pods are terminated at the same time, then with the ordered policy one pod starts first. That pod is shown as ACTIVE in the ring, but on the k8s side it is not ready. I checked the log of that pod and it showed:
level=warn ts=2022-11-29T06:27:48.304675108Z caller=lifecycler.go:239 msg="found an existing instance(s) with a problem in the ring, this instance cannot become ready until this problem is resolved. The /ring http endpoint on the distributor (or single binary) provides visibility into the ring." ring=ingester err="instance my-cortex-ingester-1 past heartbeat timeout"
Then, because the newly started pod is not ready, k8s will not start the second pod. After a while, the state of the second pod changes to Unhealthy. And we have a deadlock: the first pod is not ready because the second pod is down, and the second pod does not restart because the first pod is not ready.

@alex-berger

Got bitten by this terribly several times now, and lost a lot of time and data :-(. Would really love to see ingester.autoforget_unhealthy support in Cortex.

@fhperuchi

Where do I find the value for __CSRF_TOKEN_PLACEHOLDER__?
