[Elastic Agent] Monitoring filebeat and metricbeat not connecting to Agent over GRPC #23833
Comments
Pinging @elastic/agent (Team:Agent)
The failure was found both in the e2e-testing automated tests and in the Demo/Test environment deploy, which confirmed it on Windows as well. e2e-testing job link:
@mdelapenya and @michalpristas, I see that the e2e-testing run for the potentially related PR passed, so I'm curious how the tests missed it, or whether it isn't related to that change after all.
Here are my notes on the root cause of the error we see. The possible culprit commits are:
This table, which lists the commits in reverse order (older first), describes what happened on the CI side of the PRs: which job triggered what, and with what result.
I bisected this change set, building the elastic-agent artifacts for each commit and running the tests against the local binaries, and the results are exactly the same as on CI. After checking that #23779, #22170, #23724 and #23736 were bundled in the same packaging job and triggered the same E2E job, I'd say that the culprit is in that set of commits. Given the changes, I'd say that either #23724 or #23736 is the root cause.
I dug deep into the Beat-to-Agent communication and, logically, nothing seemed wrong: the TLS config is fed in a way that should work correctly. I tried:
I need to go to sleep now. @blakerouse, if you have time in the meantime, it would be great if you could take a look at it. I will pick up where you left off in the morning.
We can test this with the next 8.0 snapshot, which will hopefully be available to us on Friday, Feb 5.
Overview
Elastic Agent spawns a filebeat and a metricbeat to collect logs and metrics about Elastic Agent itself. These are separate from the filebeat and metricbeat that are spawned for the system integration.
It seems that the monitoring filebeat and metricbeat are not connecting back to Elastic Agent, so they never receive their configuration, and they also time out because they never check in.
I believe this is related to #23776 and the certificate work.
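To make the symptom easier to picture, here is a minimal sketch of the check-in behaviour described above, assuming the Agent treats a spawned process that has not checked in within some timeout as failed and restarts it. The class, the process names and the 30-second timeout are all hypothetical and are not taken from the Elastic Agent code.

```python
from dataclasses import dataclass, field


@dataclass
class CheckinWatchdog:
    """Hypothetical model: remembers the last check-in time of each process."""

    timeout_s: float
    last_seen: dict = field(default_factory=dict)

    def register(self, name: str, now: float) -> None:
        # A freshly spawned process starts its timer at spawn time.
        self.last_seen[name] = now

    def checkin(self, name: str, now: float) -> None:
        # Called whenever the process connects back and checks in.
        self.last_seen[name] = now

    def expired(self, now: float) -> list:
        # Processes whose last check-in is older than the timeout.
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


# Illustrative run: the monitoring filebeat never checks in, the system one does.
watchdog = CheckinWatchdog(timeout_s=30.0)  # timeout value is an assumption
watchdog.register("filebeat-monitoring", now=0.0)
watchdog.register("filebeat-system", now=0.0)
watchdog.checkin("filebeat-system", now=10.0)
restart_candidates = watchdog.expired(now=35.0)
```

In this model only `filebeat-monitoring` ends up in `restart_candidates`, which mirrors the restarts seen for the monitoring Beats while the system integration Beats keep running.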
Logs
Below are the Elastic Agent logs, which show that filebeat and metricbeat are restarted because they never connect.
Below is the netstat output. It should show 2 filebeat and 2 metricbeat processes; as you can see, it only has 1 of each.
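For anyone who wants to repeat the netstat check, a rough helper along these lines could count how many distinct processes of each name hold an established connection. It assumes Linux-style `netstat -tnp` output where the last column is `PID/Program name`. The sample lines, PIDs and port numbers below are made up for illustration and are not the output from this issue.

```python
import re
from collections import defaultdict

# Matches the trailing "PID/Program name" column of `netstat -tnp` output.
PID_PROG = re.compile(r"(\d+)/(\S+)\s*$")


def count_processes(netstat_text: str) -> dict:
    """Return {program name: number of distinct PIDs with an ESTABLISHED socket}."""
    pids = defaultdict(set)
    for line in netstat_text.splitlines():
        if "ESTABLISHED" not in line:
            continue
        match = PID_PROG.search(line)
        if match:
            pid, program = match.groups()
            pids[program].add(pid)
    return {program: len(found) for program, found in pids.items()}


# Invented sample, shaped like the failing case: one filebeat, one metricbeat.
SAMPLE = """\
tcp 0 0 127.0.0.1:50001 127.0.0.1:6789 ESTABLISHED 101/filebeat
tcp 0 0 127.0.0.1:50002 127.0.0.1:6789 ESTABLISHED 102/metricbeat
tcp 0 0 127.0.0.1:6789 127.0.0.1:50001 ESTABLISHED 100/elastic-agent
tcp 0 0 127.0.0.1:6789 127.0.0.1:50002 ESTABLISHED 100/elastic-agent
"""

counts = count_processes(SAMPLE)
```

With a healthy Agent, `counts["filebeat"]` and `counts["metricbeat"]` would each be expected to be 2 (system integration plus monitoring); a value of 1 matches the problem reported here.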