Fleet Server goes to a permanent offline state when the Fleet Server agent is rebooted after a Kibana restart. #357
Comments
@manishgupta-qasource Please review.
Reviewed & assigned to @EricDavisX.
@ph @ruflin @michalpristas @blakerouse any thoughts? Is this a blocker? It feels like it would prevent successful usage of the Beta. I don't know whether it is actually the same as the other reboot-host tests we've seen; I'm thinking of this: https://github.com/elastic/obs-dc-team/issues/528
@ph @mostlyjason I wonder if, for 7.13, we should just state that a local fleet-server is not supported. It adds a lot of variability and complexity. There is a chance that this is related to the fix we did; we should test again when the next SNAPSHOT / BC is out. What would also be nice is if we could reproduce these things in a local, non-Cloud setup. @amolnater-qasource I assume you updated the settings page for the local fleet-server? Are those values by chance reset after the reboot?
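(For anyone retesting the settings-reset question: one way to check is to read the Fleet settings back through Kibana's Fleet API before and after the reboot and compare. A minimal sketch, assuming placeholder Kibana host and credentials; the exact response shape can differ between versions:)

```sh
# Read the current Fleet settings from Kibana (placeholder host and credentials).
# Run once before the reboot and once after, then diff the fleet_server_hosts values.
curl -s -u elastic:<password> \
  "https://<kibana-host>:5601/api/fleet/settings"
```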
@ruflin I think a local fleet-server needs to be supported, because about 60% of beta clusters are self-managed. We also have a GA release coming up, and feedback from real users during the beta will be critical for us to achieve confidence in its reliability. I'd treat it as a bug to fix.
Not sure if we're talking about the same thing. I'm talking about a local fleet-server with Elastic Cloud. Everything on-prem must work.
I expect this is still a valid bug for a full on-prem environment. @amolnater-qasource BC7 is in progress and we expect it to be available sometime during your work day; can you set up a full on-prem environment, retest, and report back please?
Hi @ruflin
Yes, after that we were able to install secondary agents with that Fleet Server. It was initially working fine:
Thanks
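(For reference, enrolling a secondary agent against a local Fleet Server looks roughly like the sketch below; the URL and token are placeholders, and `--insecure` only applies to test setups with self-signed certificates.)

```sh
# Enroll a secondary Elastic Agent against the local Fleet Server (placeholders).
sudo ./elastic-agent install \
  --url=https://<fleet-server-host>:8220 \
  --enrollment-token=<enrollment-token> \
  --insecure  # only for test setups with self-signed certificates
```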
Hi @EricDavisX
Steps followed:
Hence, this issue is not observed on self-managed 7.13 Kibana. Please let us know if we are missing anything.
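(For anyone repeating the self-managed test: standing up the Fleet Server agent on-prem is roughly the following sketch, with placeholder values; flags can differ slightly between builds.)

```sh
# Install Elastic Agent as a Fleet Server against a local Elasticsearch (placeholders).
sudo ./elastic-agent install -f \
  --fleet-server-es=http://<elasticsearch-host>:9200 \
  --fleet-server-service-token=<service-token> \
  --fleet-server-policy=<fleet-server-policy-id>
```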
Knowing this now, it is less urgent. Removing it from the urgent-issues list and removing the 7.13 label.
Can we retest this once 7.14.0 is available?
FYI: earlier, I had downgraded the urgency because the full cloud-stack setup works well and the full on-prem setup works well. Issues were only seen with the hybrid setup (cloud stack plus local Fleet Server), and that scenario may not yet be cited as fully supported (I need to dig up the ticket). Regardless, and for now:
... in this last test, we *could* update the Fleet Settings to render the cloud Fleet Server unused, but it is a better test to leave it in place, so we can do that. Thank you.
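(Updating the Fleet Settings that way would amount to pointing the Fleet Server hosts list at the local server only. A sketch with placeholder values; note that the settings endpoint and field names have moved around between versions:)

```sh
# Replace the Fleet Server hosts list so only the local server is used (placeholders).
curl -s -X PUT -u elastic:<password> \
  -H 'kbn-xsrf: true' -H 'Content-Type: application/json' \
  "https://<kibana-host>:5601/api/fleet/settings" \
  -d '{"fleet_server_hosts": ["https://<local-fleet-server>:8220"]}'
```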
Hi @EricDavisX
We observed that issue #376 is reproducible after following this scenario.
We haven't observed any errors while following this scenario:
On attempting this scenario we haven't observed agents going to an offline state. We weren't able to reproduce this issue on 7.14.0 BC-3; hence we are closing it. Please let us know if anything else is required from our end.
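(The offline check here boils down to listing the enrolled agents and confirming the reported status. A sketch with placeholder host and credentials:)

```sh
# List enrolled agents and grep their reported status (placeholders).
curl -s -u elastic:<password> \
  "https://<kibana-host>:5601/api/fleet/agents" | grep '"status"'
```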
Kibana version: 7.13.0 Snapshot Kibana cloud environment
Host OS and Browser version: Ubuntu 20, All
Build Details:
Preconditions:
Steps to reproduce:
Reboot the host using the `sudo reboot` command.
Expected Result:
Fleet Server should come back "Healthy" when the Fleet Server agent is rebooted after a Kibana restart.
Fleet-Server Logs:
Logs.zip
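(Alongside the attached logs, the agent's state can be checked directly on the rebooted host. A sketch assuming a default systemd-based Linux install:)

```sh
# Report the local agent and Fleet Server health after the reboot.
sudo elastic-agent status

# Tail the elastic-agent service logs on a systemd host.
sudo journalctl -u elastic-agent --since "10 minutes ago"
```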
Note:
Screenshot:
