[Win-10-2016-ltsb specific]: Agent's Metricbeat process not running (so no Activity logs under Logs tab for Metricbeat), may relate to Endpoint or Linux / Windows Integrations #24180
Comments
@manishgupta-qasource Please review. |
Pinging @elastic/agent (Team:Agent) |
Reviewed & assigned to @EricDavisX |
As always, it would be helpful to have confirmation of what the Agent logs look like on the host. @amolnater-qasource, are there any? Also, whenever there is a problem with Endpoint configured, we will ask whether it works without Endpoint included, so we can triage faster. Please let us know, thank you! |
it would be worth a try to verify whether logs are discoverable using Discover |
Hi @EricDavisX Below are the required log files: We haven't observed any data under Discover tab. Please let us know if anything else is required. |
I see this repeated in the logs provided, over 2000 times it seems:
I wonder if this is human error in using the x64 artifact on an x86 environment? @amolnater-qasource can you double check for us? It could be a build side problem, too |
@ph fyi. also the Security Engg prod group has offered to give some second opinions as to using the latest snapshot. Thank you @charlie-pichette and @andrew-garfield101 for any info you can post about if you are seeing the same things on any Windows environments or not |
Hi @EricDavisX No Eric, we are using required x64 artifact with Windows 10 x64 machine. Please refer below screenshot for machine details: Please let us know if anything else is required. |
Hi @EricDavisX, While testing on the 7.12 snapshot build, we observed that the Windows agent still goes unhealthy with no activity logs.
Build details: Hence, we are blocked from continuing testing on Windows package 4.1. Agent logs: |
Looks like today's build is the same and does not include anything from yesterday, so if yesterday's artifact contained an invalid metricbeat binary, this one would too. Can you check the hash of metricbeat.exe?
|
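For readers who want to repeat the hash check requested above, here is a minimal sketch in Python. It assumes SHA-256 is an acceptable digest for the comparison (the thread does not say which hash was used), and the helper names `sha256_of` and `same_binary` are hypothetical, not part of Elastic Agent.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks so a ~100 MB binary is never loaded whole.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_binary(installed: Path, reference: Path) -> bool:
    """True when both files have identical SHA-256 digests."""
    return sha256_of(installed) == sha256_of(reference)
```

Calling `same_binary` with the installed `metricbeat.exe` and a copy extracted by hand from the downloaded artifact would show whether the two differ; the actual paths depend on the install and are left to the reader.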
Hi @michalpristas Elastic-agent logs: Build details:
Please let us know if anything else is required. |
Seems like you have the same version/hash of agent as I was testing, but the hash of metricbeat does not match. Can we check that
|
As per your feedback, we checked the hash from both locations and found them to be different. Please find screenshot below: We then attempted to start metricbeat from:
Observed "Access is denied" even when run from an admin cmd.
Observed no actions after running metricbeat.exe. cc: @EricDavisX |
Thank you Amol and Michal for working this. Here is a summary of where we are today, after a quick chat with Michal:
|
ok, we have reports of other Win 10 versions working, and I used the same template (I believe) as is reported originally, and I DO see the problem. I've cloned one for Michal to take a look at, requires Endgame VPN access. notes sent in slack for access. |
We had a nice chat with @EricDavisX and the QAS team. What we found out is that they ran into an issue with a race between enroll and install, which was fixed a few days ago. After a restart we ran into the issue described. What we observed was that the agent unzips metricbeat, but only partially. This results in a metricbeat of about 2/3 size (77 MB instead of 118 MB), which also affects the rest of the files following it. I will check the unzip code and, as this was not changed, I will also check upstream for changes. |
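The partial unzip described above (77 MB on disk instead of 118 MB) can be detected by comparing each extracted file against the uncompressed size recorded in the archive. The following is a minimal sketch of that check, assuming the beat is shipped as a `.zip` and that the archive is still available next to the install directory; `find_incomplete_files` is a hypothetical helper, not the agent's own code.

```python
import zipfile
from pathlib import Path


def find_incomplete_files(archive: Path, install_dir: Path) -> list[str]:
    """List archive members that are missing or have the wrong size on disk."""
    problems = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directories carry no size to compare
            target = install_dir / info.filename
            if not target.is_file():
                problems.append(f"{info.filename}: missing")
            elif target.stat().st_size != info.file_size:
                # file_size is the uncompressed size stored in the zip header
                problems.append(
                    f"{info.filename}: {target.stat().st_size} bytes on disk, "
                    f"{info.file_size} expected"
                )
    return problems
```

An empty result means every file has the expected size. A size match does not prove the content is intact, so this complements a hash comparison rather than replacing it.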
I have a thought about the root cause: my suspicion is that this is related to the change of order in the install/enrollment process. What I think is going on is that the agent is restarted while installing filebeat/metricbeat, and then the beat is not fully copied/unpacked. |
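One common way to keep an interrupted unpack from leaving a half-written install is to extract into a staging directory and only rename it into place once extraction has finished. The sketch below illustrates that general technique in Python; it is not the fix that was actually made in the agent, and `extract_via_staging` and the `.tmp-` prefix are hypothetical names chosen for this example.

```python
import os
import shutil
import tempfile
import zipfile
from pathlib import Path


def extract_via_staging(archive: Path, final_dir: Path) -> None:
    """Unpack into a temporary sibling directory, then rename it into place."""
    final_dir.parent.mkdir(parents=True, exist_ok=True)
    # Stage next to the target so the final rename stays on one filesystem.
    staging = Path(
        tempfile.mkdtemp(prefix=final_dir.name + ".tmp-", dir=final_dir.parent)
    )
    try:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(staging)
        if final_dir.exists():
            shutil.rmtree(final_dir)  # discard an earlier, possibly partial, install
        os.rename(staging, final_dir)
    except BaseException:
        # Extraction or rename failed: drop the staging copy and re-raise.
        shutil.rmtree(staging, ignore_errors=True)
        raise
```

If the process is killed during extraction, only the staging directory is affected and the target is never populated file by file. A kill during the removal of an old directory could still leave a damaged tree, so a completeness check before starting the beat remains worthwhile.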
@michalpristas To ensure that Elastic Agent is always running with the correct (unmodified) version of a beat, would it be better to always verify/extract before starting it? That would really reduce the window of a beat being modified and then executed by an attacker, and always ensure that Elastic Agent is running with what it expects (not something that is corrupt or modified)? I think the overhead of re-extraction is worth the benefit. |
I have asked @dikshachauhan-qasource to help test out the PR opened above to see how it works, just to help us with a data point. Request being sent in daily email to QAS team. |
Hi @EricDavisX, Today we are unable to deploy a build from staging cloud. Hence, we are blocked from verifying the PR merged above. We will validate it once an 8.0 build is available for deployment. Thanks |
Hi @EricDavisX We have validated this issue on 8.0 snapshot build and found it working fine. Build details are as follows:
Observations:
Further, although the agent was installed successfully, a few new error logs were displayed on the Agent logs tab. Please refer below: Agent logs from UI: Thanks |
@michalpristas , now or later (or skip it, if not valuable enough) we could change the log level to 'Warn' maybe here? ... That is since it is being recovered by our code logic and not a totally unexpected problem. Diksha can log a ticket to track it... @dikshachauhan-qasource that is great. I'd like to add a 'logs message' ticket into the system and link it to our Logs Meta issue to track that those error citations are acceptable / accepted and not indicative of new problems we need to track. Depending on what Michal may say above, we can adjust our follow through |
This seems fixed indeed - though we are finding new problems with Metricbeat, but not as frequently. : / It isn't really very clear - but we can track the above in reference to subsequent problems. |
Kibana version: 7.12.0 Snapshot Kibana Cloud environment
Host OS and Browser version: Windows 10, All
Preconditions:
Build Details:
Steps to reproduce:
Expected Result:
Activity logs should stream on enrolling agent with policy having System and Endpoint Security.
Screenshots:

Note:
It is working fine with MAC and Linux .tar agents.