Unnecessary PodDisruptionBudget update on operator restart #5812
Comments
You can't avoid upgrading your version; it behaves the same as a reboot. So I think this is an issue.
I am not sure I understand this. Can you explain a bit more? In general there is no strong reason to have the observation async: we already do a few other synchronous requests to the Elasticsearch API that are affected if the cluster does not respond. Having said that, I think it is probably not worth the extra complexity to move the initial observation outside of the observer now. Also, I am not sure that making the first observation synchronous will fix the issue: the PDB code takes the health from the status subresource, which will only be updated after the first reconciliation, and the Elasticsearch cluster will probably not be available at this point (it takes ~60-70s depending on hardware). So we would still see multiple PDB updates (unless I am missing something). The current behaviour of disallowing disruptions in the presence of (yet) unknown health sounds reasonable to me.
Thank you for your reply. I think it is because our E2E test did not reproduce our usage scenario. I submitted a new commit and withdrew the previous asynchronous approach. I modified the E2E usage scenario to make it closer to the environment ECK actually runs in. In Kubernetes the receiver is asynchronous rather than synchronous, but in the E2E test it is synchronous. This causes the E2E run to get stuck even though ECK runs normally. Please review again.
The issue occurs when the operator is restarted and an Elasticsearch cluster is already running, so your comment about the time to start up isn't relevant here?

```
-- Reconciliation 1 --
d.ES.Status.Health=green observedState=unknown
Reconcile PDB, es.Status.Health=green
Update ES status previous=green current=unknown   <-- we set the state to unknown

-- Reconciliation 2 --
d.ES.Status.Health=unknown observedState=green
Reconcile PDB, es.Status.Health=unknown           <-- we update the PDB
Update ES status previous=unknown current=green

-- Reconciliation 3 --
d.ES.Status.Health=green observedState=green
Reconcile PDB, es.Status.Health=green             <-- back to green, we update the PDB again
Update ES status previous=green current=green
```

I don't find this behavior very satisfying, so I'm in favor of trying to improve it. Since we already have a synchronous call to check the license, I think it's acceptable to make the first observation synchronous. This only lengthens the first reconciliation by a few ms. We can see that the first HTTP call has a cold start (+2s); then the connection is reused.
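For illustration, a minimal sketch of what a synchronous first observation could look like. All names here (`Observer`, `HealthState`, `Start`) are hypothetical, not the actual ECK API:

```go
package observer

import (
	"sync"
	"time"
)

// HealthState is a simplified stand-in for the observed cluster health.
type HealthState string

const Unknown HealthState = "unknown"

// Observer is a hypothetical sketch, not the real ECK type.
type Observer struct {
	mu       sync.Mutex
	last     HealthState
	observe  func() HealthState // one HTTP health call against Elasticsearch
	interval time.Duration
	stop     chan struct{}
}

// Start performs the first observation synchronously, so the reconciliation
// that starts the observer already gets a real health value (at the cost of
// one blocking HTTP round trip), then keeps observing in the background.
func (o *Observer) Start() HealthState {
	first := o.observe()
	o.mu.Lock()
	o.last = first
	o.mu.Unlock()

	go func() {
		ticker := time.NewTicker(o.interval)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				h := o.observe()
				o.mu.Lock()
				o.last = h
				o.mu.Unlock()
			case <-o.stop:
				return
			}
		}
	}()
	return first
}
```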
Opening this issue following PR #5783 to discuss the potential issue.
Context: the PDB reconciliation derives the allowed disruptions from the observed cluster health (see `allowedDisruptions()`).

Issue:
The issue occurs when the operator restarts. At the first ES reconciliation, the ES controller starts the Observer and retrieves the observed cluster health, which is still equal to `unknown`. The reconciliation is performed with the last stored status in the ES resource, so all is good, except that the `unknown` value is used to update the ES status when the reconciliation is complete. At the second ES reconciliation, the reconciliation is performed with an `unknown` status, which triggers an update of the PDB. The ES status is updated again with the last observed health when the reconciliation is complete. At the third reconciliation, the reconciliation is performed with a known status and the PDB is updated again to return to normal.

I'm not sure that's a real issue. From the operator's point of view it translates to: until I see a green cluster, I don't allow pod disruptions. It sounds OK to me.
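To make this concrete, here is a hedged sketch of the decision that drives the PDB update (simplified; the real `allowedDisruptions()` likely accounts for more than just health):

```go
package pdb

// Simplified sketch: only a green cluster tolerates losing a pod, so the
// PDB flips between allowing and blocking disruptions as the health read by
// the reconciliation flips between "green" and "unknown" on operator restart.
func allowedDisruptions(health string) int32 {
	if health == "green" {
		return 1 // one pod may be voluntarily disrupted at a time
	}
	return 0 // yellow, red, and "unknown" all block disruptions
}
```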
Cause:

When an Observer is created, the cluster health is initialized to `unknown`. Then the Observer is started with a first observation and a timer to trigger the next observations periodically. The whole start operation is executed asynchronously in a separate goroutine. This means there is no simple way to start the observer and get the first observed health.
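A minimal sketch of the behaviour just described (hypothetical names; the point is only that `lastHealth` stays `"unknown"` until the goroutine has run):

```go
package observer

// Hypothetical sketch of the current start sequence: the health is
// initialized to "unknown" and the first real observation only happens
// inside the goroutine, after NewObserver has already returned.
// (A real implementation would guard lastHealth with a mutex.)
type Observer struct {
	lastHealth string
	observe    func() string // one health call against Elasticsearch
}

func NewObserver(observe func() string) *Observer {
	o := &Observer{lastHealth: "unknown", observe: observe}
	go func() {
		o.lastHealth = o.observe() // first observation, asynchronous
		// ... periodic observations on a ticker follow (omitted)
	}()
	return o // a reconciliation reading lastHealth now still sees "unknown"
}
```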
Solution proposed in #5783:

Move the first observation out of the goroutine to make it synchronous. Interestingly, this fails the `TestManager_AddObservationListener` unit test, which tests the notification listener mechanism. The test uses blocking channels to exercise the listeners, like the ES controller does (`WatchClusterHealthChange()`). With a synchronous first observation that uses a listener writing to a blocking channel, the test gets stuck because the code that consumes the listener's channel has not started yet. We could use buffered (asynchronous) channels to fix this, but... I need a confirmation: I think the first observation is intentionally asynchronous because we don't want to block the reconciliation if an ES cluster takes time to respond.
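The test failure can be reproduced with a tiny standalone program (hypothetical, but it mirrors the blocking-channel listener pattern described above): with an unbuffered channel the synchronous notification deadlocks, while a buffered channel lets the send complete before the consumer starts reading.

```go
package main

import "fmt"

func main() {
	// With make(chan string) — unbuffered, as the test uses today — the
	// send inside notify would block forever, because the receiver only
	// starts reading after start() has returned: the observed deadlock.
	ch := make(chan string, 1) // buffered: the send completes immediately

	notify := func(health string) { ch <- health } // the listener

	start := func() {
		notify("green") // synchronous first observation
	}

	start()           // does not block thanks to the buffer
	fmt.Println(<-ch) // the consumer reads the notification afterwards
}
```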
Let's take a step back.

Is it a real issue?

If yes, could we consider reconciling the PDB only when the health is not `unknown`? This would imply that, at creation time, the PDB is only created once the pods are ready.
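If we went that way, a hedged sketch of the guard (hypothetical shape, not the actual reconciler signature):

```go
package pdb

// Sketch of the alternative: leave the existing PDB untouched while the
// health is unknown, instead of tightening it and then loosening it again.
func reconcilePDB(health string) error {
	if health == "unknown" {
		return nil // skip: keep whatever PDB already exists
	}
	// ... reconcile the PDB from the known health (omitted)
	return nil
}
```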