some agents go offline, some agents report startup issues #1772

Open
pjbertels opened this issue Nov 22, 2022 · 12 comments
Labels
bug, Project:FleetScaling

Comments

@pjbertels
Contributor

Issues encountered during Fleet Scaling testing with drones and a subset of real VMs.

  • Version: 8.5.0

  • Operating System: Linux Ubuntu VM (e2-standard-8)

  • Steps to Reproduce:
    We used tooling to bring up 200 VMs and 9,800 Horde drones. Some VMs report errors in their logs during startup and take longer to come up. Once testing begins, some VMs go unhealthy (18/199).

[14:07:29] Change config in swarm_0-pp1-437887                                                                                                          perf_lib.py:312
           Update policy(4aa7bdf0-6a92-11ed-ae4b-ef0dc45b4219) config for "system" package policy                                                       perf_lib.py:146
           Found "system" package policy, updating from fixtures/package_policies/inputs-system1.json                                                   perf_lib.py:153
[14:07:32] "system" package policy updated from fixtures/package_policies/inputs-system1.json                                                           perf_lib.py:159

Here is an example of the errors on startup:


[elastic_agent][info] APM instrumentation disabled
13:52:39.949
elastic_agent
[elastic_agent][info] Detecting execution mode
13:52:39.950
elastic_agent
[elastic_agent][info] Agent is managed locally
13:52:39.950
elastic_agent
[elastic_agent][info] capabilities file not found in /opt/Elastic/Agent/capabilities.yml
13:52:40.653
elastic_agent
[elastic_agent][info] Docker provider skipped, unable to connect: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
13:52:40.654
elastic_agent
[elastic_agent][info] Starting stats endpoint
13:52:40.654
elastic_agent
[elastic_agent][info] Agent is starting
13:52:40.654
elastic_agent
[elastic_agent][info] Metrics endpoint listening on: /opt/Elastic/Agent/data/tmp/elastic-agent.sock (configured: unix:///opt/Elastic/Agent/data/tmp/elastic-agent.sock)
13:52:40.655
elastic_agent
[elastic_agent][info] Agent is stopped
13:52:40.655
elastic_agent
[elastic_agent][info] Configuration changes detected
13:52:40.660
elastic_agent
[elastic_agent][info] Source URI reset from "https://artifacts.elastic.co/downloads/" to "https://artifacts.elastic.co/downloads/"
13:52:40.660
elastic_agent
[elastic_agent][info] New State ID is qYySf7Z0
13:52:40.660
elastic_agent
[elastic_agent][info] Converging state requires execution of 2 step(s)
13:52:41.635
elastic_agent
[elastic_agent][info] waiting for installer of pipeline 'default' to finish
13:52:41.641
elastic_agent
[elastic_agent][info] Signaling application to stop because of shutdown: metricbeat--8.5.0
13:52:41.926
elastic_agent
[elastic_agent][info] APM instrumentation disabled
13:52:41.927
elastic_agent
[elastic_agent][info] Detecting execution mode
13:52:41.944
elastic_agent
[elastic_agent][info] Agent is managed by Fleet
13:52:41.945
elastic_agent
[elastic_agent][info] capabilities file not found in /opt/Elastic/Agent/capabilities.yml
13:52:42.436
elastic_agent
[elastic_agent][info] Docker provider skipped, unable to connect: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
13:52:42.437
elastic_agent
[elastic_agent][info] Starting stats endpoint
13:52:42.437
elastic_agent
[elastic_agent][info] Agent is starting
13:52:42.437
elastic_agent
[elastic_agent][info] Metrics endpoint listening on: /opt/Elastic/Agent/data/tmp/elastic-agent.sock (configured: unix:///opt/Elastic/Agent/data/tmp/elastic-agent.sock)
13:52:44.435
elastic_agent
[elastic_agent][info] Source URI changed from "https://artifacts.elastic.co/downloads/" to "https://artifacts.elastic.co/downloads/"
13:52:44.436
elastic_agent
[elastic_agent][info] New State ID is Wir_OOK2
13:52:44.436
elastic_agent
[elastic_agent][info] Converging state requires execution of 4 step(s)
13:52:48.536
elastic_agent
[elastic_agent][info] 2022-11-22T18:52:48Z - message: Application: endpoint-security--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to STARTING: Starting - type: 'STATE' - sub_type: 'STARTING'
13:52:57.018
elastic_agent
[elastic_agent][info] 2022-11-22T18:52:57Z - message: Application: filebeat--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to STARTING: Starting - type: 'STATE' - sub_type: 'STARTING'
13:52:58.236
elastic_agent
[elastic_agent][info] 2022-11-22T18:52:58Z - message: Application: filebeat--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to RUNNING: Running - type: 'STATE' - sub_type: 'RUNNING'
13:53:08.062
elastic_agent
[elastic_agent][info] 2022-11-22T18:53:08Z - message: Application: metricbeat--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to STARTING: Starting - type: 'STATE' - sub_type: 'STARTING'
13:53:08.385
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for filebeat.8.5.0
13:53:08.514
elastic_agent
[elastic_agent][info] 2022-11-22T18:53:08Z - message: Application: filebeat--8.5.0--36643631373035623733363936343635[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to STARTING: Starting - type: 'STATE' - sub_type: 'STARTING'
13:53:08.908
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for metricbeat.8.5.0
13:53:08.994
elastic_agent
[elastic_agent][info] 2022-11-22T18:53:08Z - message: Application: metricbeat--8.5.0--36643631373035623733363936343635[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to STARTING: Starting - type: 'STATE' - sub_type: 'STARTING'
13:53:08.998
elastic_agent
[elastic_agent][info] Updating internal state
13:53:09.350
elastic_agent
[elastic_agent][info] 2022-11-22T18:53:09Z - message: Application: metricbeat--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to RUNNING: Running - type: 'STATE' - sub_type: 'RUNNING'
13:53:09.721
elastic_agent
[elastic_agent][info] 2022-11-22T18:53:09Z - message: Application: filebeat--8.5.0--36643631373035623733363936343635[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to RUNNING: Running - type: 'STATE' - sub_type: 'RUNNING'
13:53:10.263
elastic_agent
[elastic_agent][info] 2022-11-22T18:53:10Z - message: Application: metricbeat--8.5.0--36643631373035623733363936343635[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to RUNNING: Running - type: 'STATE' - sub_type: 'RUNNING'
13:53:22.445
elastic_agent
[elastic_agent][info] 2022-11-22T18:53:22Z - message: Application: endpoint-security--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to RUNNING: Protecting with policy {ec404f8a-69f1-48b9-a571-0eecc4178789} - type: 'STATE' - sub_type: 'RUNNING'
14:07:35.569
elastic_agent
[elastic_agent][info] Source URI changed from "https://artifacts.elastic.co/downloads/" to "https://artifacts.elastic.co/downloads/"
14:07:35.569
elastic_agent
[elastic_agent][info] Source URI in operator changed to "https://artifacts.elastic.co/downloads/"
14:07:35.571
elastic_agent
[elastic_agent][info] New State ID is 5OsWg7W0
14:07:35.571
elastic_agent
[elastic_agent][info] Converging state requires execution of 4 step(s)
14:07:36.736
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for endpoint-security.8.5.0
14:07:36.736
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for endpoint-security.8.5.0
14:07:37.028
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for filebeat.8.5.0
14:07:37.028
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for filebeat.8.5.0
14:07:37.387
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for metricbeat.8.5.0
14:07:37.387
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for metricbeat.8.5.0
14:07:37.408
elastic_agent
[elastic_agent][error] Elastic Agent status changed to "error": "app filebeat--8.5.0-3b4067e0: 1 error occurred:\n\t* 1 error: Error creating runner from config: Can only start an input when all related states are finished: {Id: native::65699-2049, Finished: false, Fileinfo: &{syslog 185460 416 {432873338 63804740487 0x556ddcf6ba40} {2049 65699 1 33184 104 4 0 0 185460 4096 368 {1669143225 900284892} {1669143687 432873338} {1669143687 432873338} [0 0 0]}}, Source: /var/log/syslog, Offset: 187704, Timestamp: 2022-11-22 19:07:32.996155066 +0000 UTC m=+875.928725159, TTL: -1ns, Type: log, Meta: map[], FileStateOS: 65699-2049}\n\n"
14:07:37.408
elastic_agent
[elastic_agent][error] 2022-11-22T19:07:37Z - message: Application: filebeat--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to FAILED: 1 error occurred:
	* 1 error: Error creating runner from config: Can only start an input when all related states are finished: {Id: native::65699-2049, Finished: false, Fileinfo: &{syslog 185460 416 {432873338 63804740487 0x556ddcf6ba40} {2049 65699 1 33184 104 4 0 0 185460 4096 368 {1669143225 900284892} {1669143687 432873338} {1669143687 432873338} [0 0 0]}}, Source: /var/log/syslog, Offset: 187704, Timestamp: 2022-11-22 19:07:32.996155066 +0000 UTC m=+875.928725159, TTL: -1ns, Type: log, Meta: map[], FileStateOS: 65699-2049}

 - type: 'ERROR' - sub_type: 'FAILED'
14:07:37.688
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for filebeat.8.5.0
14:07:37.688
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for filebeat.8.5.0
14:07:38.050
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for metricbeat.8.5.0
14:07:38.050
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for metricbeat.8.5.0
14:07:38.052
elastic_agent
[elastic_agent][info] Updating internal state
14:07:38.117
elastic_agent
[elastic_agent][error] lazy acker: failed ack batch, enqueue for retry: []fleetapi.Action{(*fleetapi.ActionPolicyChange)(0xc0004aa150)}
14:07:40.869
elastic_agent
[elastic_agent][info] Source URI changed from "https://artifacts.elastic.co/downloads/" to "https://artifacts.elastic.co/downloads/"
14:07:40.869
elastic_agent
[elastic_agent][info] Source URI in operator changed to "https://artifacts.elastic.co/downloads/"
14:07:40.871
elastic_agent
[elastic_agent][info] New State ID is 5OsWg7W0
14:07:40.871
elastic_agent
[elastic_agent][info] Converging state requires execution of 3 step(s)
14:07:41.161
elastic_agent
[elastic_agent][warn] Elastic Agent status changed to "online": ""
14:07:41.161
elastic_agent
[elastic_agent][info] 2022-11-22T19:07:41Z - message: Application: filebeat--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to STARTING: Starting - type: 'STATE' - sub_type: 'STARTING'
14:07:41.161
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for filebeat.8.5.0
14:07:41.162
elastic_agent
[elastic_agent][info] 2022-11-22T19:07:41Z - message: Application: filebeat--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to RESTARTING: Restarting - type: 'STATE' - sub_type: 'STARTING'
14:07:43.646
elastic_agent
[elastic_agent][error] filebeat stderr: "Exiting: data path already locked by another beat. Please make sure that multiple beats are not sharing the same data path (path.data).\n"
14:07:43.653
elastic_agent
[elastic_agent][info] 2022-11-22T19:07:43Z - message: Application: filebeat--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to RESTARTING: exited with code: 1 - type: 'STATE' - sub_type: 'STARTING'
14:07:43.653
elastic_agent
[elastic_agent][info] 2022-11-22T19:07:43Z - message: Application: filebeat--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to STARTING: Starting - type: 'STATE' - sub_type: 'STARTING'
14:07:43.653
elastic_agent
[elastic_agent][info] 2022-11-22T19:07:43Z - message: Application: filebeat--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to RESTARTING: Restarting - type: 'STATE' - sub_type: 'STARTING'
14:07:43.829
elastic_agent
[elastic_agent][error] filebeat stderr: "Exiting: data path already locked by another beat. Please make sure that multiple beats are not sharing the same data path (path.data).\n"
14:07:43.866
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for metricbeat.8.5.0
14:07:43.866
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for metricbeat.8.5.0
14:07:44.161
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for filebeat.8.5.0
14:07:44.161
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for filebeat.8.5.0
14:07:44.518
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for metricbeat.8.5.0
14:07:44.518
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for metricbeat.8.5.0
14:07:44.520
elastic_agent
[elastic_agent][info] Updating internal state
14:07:44.567
elastic_agent
[elastic_agent][error] lazy acker: failed ack batch, enqueue for retry: []fleetapi.Action{(*fleetapi.ActionPolicyChange)(0xc000af3470)}
14:07:46.505
elastic_agent
[elastic_agent][info] Source URI changed from "https://artifacts.elastic.co/downloads/" to "https://artifacts.elastic.co/downloads/"
14:07:46.505
elastic_agent
[elastic_agent][info] Source URI in operator changed to "https://artifacts.elastic.co/downloads/"
14:07:46.507
elastic_agent
[elastic_agent][info] New State ID is 5OsWg7W0
14:07:46.507
elastic_agent
[elastic_agent][info] Converging state requires execution of 3 step(s)
14:07:46.797
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for filebeat.8.5.0
14:07:46.797
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for filebeat.8.5.0
14:07:47.158
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for metricbeat.8.5.0
14:07:47.158
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for metricbeat.8.5.0
14:07:47.456
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for filebeat.8.5.0
14:07:47.456
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for filebeat.8.5.0
14:07:47.817
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for metricbeat.8.5.0
14:07:47.817
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for metricbeat.8.5.0
14:07:47.820
elastic_agent
[elastic_agent][info] Updating internal state
14:07:47.867
elastic_agent
[elastic_agent][error] lazy acker: failed ack batch, enqueue for retry: []fleetapi.Action{(*fleetapi.ActionPolicyChange)(0xc00063f6e0)}
14:07:49.824
elastic_agent
[elastic_agent][info] Source URI changed from "https://artifacts.elastic.co/downloads/" to "https://artifacts.elastic.co/downloads/"
14:07:49.824
elastic_agent
[elastic_agent][info] Source URI in operator changed to "https://artifacts.elastic.co/downloads/"
14:07:49.826
elastic_agent
[elastic_agent][info] New State ID is 5OsWg7W0
14:07:49.826
elastic_agent
[elastic_agent][info] Converging state requires execution of 3 step(s)
14:07:50.115
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for filebeat.8.5.0
14:07:50.115
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for filebeat.8.5.0
14:07:50.472
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for metricbeat.8.5.0
14:07:50.472
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for metricbeat.8.5.0
14:07:50.766
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for filebeat.8.5.0
14:07:50.766
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for filebeat.8.5.0
14:07:51.122
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for metricbeat.8.5.0
14:07:51.122
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for metricbeat.8.5.0
14:07:51.124
elastic_agent
[elastic_agent][info] Updating internal state
14:07:51.171
elastic_agent
[elastic_agent][error] lazy acker: failed ack batch, enqueue for retry: []fleetapi.Action{(*fleetapi.ActionPolicyChange)(0xc000dbc150)}
14:07:53.613
elastic_agent
[elastic_agent][info] Source URI changed from "https://artifacts.elastic.co/downloads/" to "https://artifacts.elastic.co/downloads/"
14:07:53.614
elastic_agent
[elastic_agent][info] Source URI in operator changed to "https://artifacts.elastic.co/downloads/"
14:07:53.615
elastic_agent
[elastic_agent][info] New State ID is 5OsWg7W0
14:07:53.615
elastic_agent
[elastic_agent][info] Converging state requires execution of 2 step(s)
14:07:53.906
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for filebeat.8.5.0
14:07:53.906
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for filebeat.8.5.0
14:07:54.197
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for filebeat.8.5.0
14:07:54.197
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for filebeat.8.5.0
14:07:54.554
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for metricbeat.8.5.0
14:07:54.554
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for metricbeat.8.5.0
14:07:54.559
elastic_agent
[elastic_agent][info] Updating internal state
14:07:54.606
elastic_agent
[elastic_agent][error] lazy acker: failed ack batch, enqueue for retry: []fleetapi.Action{(*fleetapi.ActionPolicyChange)(0xc000b74690)}
14:07:56.806
elastic_agent
[elastic_agent][info] Source URI changed from "https://artifacts.elastic.co/downloads/" to "https://artifacts.elastic.co/downloads/"
14:07:56.806
elastic_agent
[elastic_agent][info] Source URI in operator changed to "https://artifacts.elastic.co/downloads/"
14:07:56.808
elastic_agent
[elastic_agent][info] New State ID is 5OsWg7W0
14:07:56.808
elastic_agent
[elastic_agent][info] Converging state requires execution of 3 step(s)
14:07:57.097
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for filebeat.8.5.0
14:07:57.097
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for filebeat.8.5.0
14:07:57.455
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for metricbeat.8.5.0
14:07:57.455
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for metricbeat.8.5.0
14:07:57.749
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for filebeat.8.5.0
14:07:57.749
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for filebeat.8.5.0
14:07:58.106
elastic_agent
[elastic_agent][info] operation 'operation-install' skipped for metricbeat.8.5.0
14:07:58.106
elastic_agent
[elastic_agent][info] operation 'operation-start' skipped for metricbeat.8.5.0
14:07:58.112
elastic_agent
[elastic_agent][info] Updating internal state
14:08:42.334
elastic_agent
[elastic_agent][warn] Elastic Agent status changed to "degraded": "app filebeat--8.5.0-3b4067e0: Missed last check-in"
14:08:42.335
elastic_agent
[elastic_agent][info] 2022-11-22T19:08:42Z - message: Application: filebeat--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to DEGRADED: Missed last check-in - type: 'STATE' - sub_type: 'RUNNING'
14:09:42.344
elastic_agent
[elastic_agent][error] Elastic Agent status changed to "error": "app filebeat--8.5.0-3b4067e0: Missed two check-ins"
14:09:42.344
elastic_agent
[elastic_agent][error] 2022-11-22T19:09:42Z - message: Application: filebeat--8.5.0[96b3e99c-3480-4395-b7cb-7650be23f3e3]: State changed to FAILED: Missed two check-ins - type: 'ERROR' - sub_type: 'FAILED'
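
To pick the failures out of an excerpt like the one above, a small filter can help. This is a minimal sketch, assuming the copied layout in which each entry is a timestamp line, a dataset line, and a `[elastic_agent][level] message` line. The function name and regular expressions are illustrative and not part of any Elastic tooling; continuation lines of multi-line messages are ignored.

```python
import re

# Matches lines such as "[elastic_agent][error] lazy acker: ..."
# (layout assumed from the excerpt in this issue).
LEVEL_RE = re.compile(r"^\[(?P<dataset>[^\]]+)\]\[(?P<level>[a-z]+)\]\s*(?P<message>.*)$")
# Matches timestamp-only lines such as "14:07:43.646".
TIME_RE = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3}$")


def entries_at_level(text, levels=("error",)):
    """Return (timestamp, level, message) tuples for the requested log levels."""
    results = []
    last_time = None  # most recent timestamp line seen, if any
    for line in text.splitlines():
        line = line.strip()
        if TIME_RE.match(line):
            last_time = line
            continue
        match = LEVEL_RE.match(line)
        if match and match.group("level") in levels:
            results.append((last_time, match.group("level"), match.group("message")))
    return results


# Three entries copied from the log in this issue.
sample = """13:52:41.641
elastic_agent
[elastic_agent][info] Signaling application to stop because of shutdown: metricbeat--8.5.0
14:08:42.334
elastic_agent
[elastic_agent][warn] Elastic Agent status changed to "degraded": "app filebeat--8.5.0-3b4067e0: Missed last check-in"
14:09:42.344
elastic_agent
[elastic_agent][error] Elastic Agent status changed to "error": "app filebeat--8.5.0-3b4067e0: Missed two check-ins"
"""

for timestamp, level, message in entries_at_level(sample, levels=("error", "warn")):
    print(timestamp, level, message)
```

Passing `levels=("error",)` narrows the result to the hard failures only.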
@pjbertels added the bug and Project:FleetScaling labels on Nov 22, 2022
@AndersonQ
Member

This caught my attention. Is there any other Beat running? Perhaps a Filebeat collecting logs from the VM?

[elastic_agent][error] filebeat stderr: "Exiting: data path already locked by another beat. Please make sure that multiple beats are not sharing the same data path (path.data).\n"

@pjbertels
Contributor Author

I don't think so, but maybe it is happening because the three integrations we have running are bumping into each other ...
image

@pjbertels
Contributor Author

I'm going to run the test again. We added code to tag the IP of the VM, so I will log in and look if there are issues.

@AndersonQ
Member

I'm going to run the test again. We added code to tag the IP of the VM, so I will log in and look if there are issues.

that's good :)

Check the status (elastic-agent status) to see which applications the agent is running. It can run up to two Filebeat and two Metricbeat instances, one of each being the monitoring instance.
Also collect diagnostics (elastic-agent diagnostics collect) if the problem persists.

@AndersonQ
Member

By the way, the integrations should not conflict: the agent should correctly merge them into a single config for each Beat the integrations need.

@pjbertels
Contributor Author

image

ubuntu@ogc-b65fa1a2-elastic-agent-ubuntu:~$ sudo elastic-agent status
Status: FAILED
Message: app filebeat--8.5.0-e251e592: Missed two check-ins
Applications:
  * endpoint-security      (HEALTHY)
                           Protecting with policy {5a693db9-e121-4e50-98f9-8dc017471b7b}
  * filebeat               (FAILED)
                           Missed two check-ins
  * metricbeat             (HEALTHY)
                           Running
  * filebeat_monitoring    (HEALTHY)
                           Running
  * metricbeat_monitoring  (HEALTHY)
                           Running
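
When many VMs need checking, the output above can be reduced to just the unhealthy applications. This is a minimal sketch, assuming the human-readable layout that `elastic-agent status` printed here on 8.5.0; the function name is hypothetical and other versions may format the output differently.

```python
import re

# Matches application lines such as "  * filebeat               (FAILED)".
APP_RE = re.compile(r"^\s*\*\s+(?P<name>\S+)\s+\((?P<state>[A-Z]+)\)\s*$")


def unhealthy_apps(status_text):
    """Return {application: state} for every application not reported HEALTHY."""
    apps = {}
    for line in status_text.splitlines():
        match = APP_RE.match(line)
        if match and match.group("state") != "HEALTHY":
            apps[match.group("name")] = match.group("state")
    return apps


# Output copied from the VM in this comment.
status_output = """Status: FAILED
Message: app filebeat--8.5.0-e251e592: Missed two check-ins
Applications:
  * endpoint-security      (HEALTHY)
                           Protecting with policy {5a693db9-e121-4e50-98f9-8dc017471b7b}
  * filebeat               (FAILED)
                           Missed two check-ins
  * metricbeat             (HEALTHY)
                           Running
  * filebeat_monitoring    (HEALTHY)
                           Running
  * metricbeat_monitoring  (HEALTHY)
                           Running
"""

print(unhealthy_apps(status_output))  # {'filebeat': 'FAILED'}
```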

@pjbertels
Contributor Author

elastic-agent-diagnostics-2022-11-23T17-02-03Z-00.zip was emailed to Anderson.

@AndersonQ
Member

How is the agent installed on the VMs?

This error is due to an inconsistent state: it looks as if filebeat was terminated abruptly and did not clean up its lock file.

To fix that, you would need to delete the lock file. It is located at:

/opt/Elastic/Agent/data/elastic-agent-*/run/default/filebeat--8.5.0--SOME_UUID/filebeat.lock
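
To see which lock files actually exist on a VM before deleting anything, the path above can be expanded with a glob. This is a minimal sketch: the wildcard standing in for `SOME_UUID` is an assumption made for illustration, the function name is hypothetical, and the script only lists files. Reading the agent's data directory may require root.

```python
import glob
import os

# Pattern derived from the path quoted above; the trailing wildcard that
# stands in for SOME_UUID is an assumption for illustration.
DEFAULT_PATTERN = (
    "/opt/Elastic/Agent/data/elastic-agent-*/run/default/"
    "filebeat--8.5.0*/filebeat.lock"
)


def find_lock_files(pattern=DEFAULT_PATTERN):
    """Return the sorted list of existing files matching the lock-file pattern."""
    return sorted(path for path in glob.glob(pattern) if os.path.isfile(path))


if __name__ == "__main__":
    # Listing only: removing a lock file is left as a deliberate manual step.
    for path in find_lock_files():
        print(path)
```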

@cmacknz
Member

cmacknz commented Nov 23, 2022

The error "Exiting: data path already locked by another beat. Please make sure that multiple beats are not sharing the same data path (path.data)." is a known bug.

This should be fixed in 8.5.1. Note that the original bug has been reopened (elastic/beats#31670) with reports that we haven't fixed it on containers where the agent or beats can get recycled PIDs, but that shouldn't apply to VMs.

@cmacknz
Member

cmacknz commented Nov 23, 2022

You can recover from this by deleting the lock file, as suggested above.

@pjbertels
Contributor Author

I was able to correct the issue by doing ...

ubuntu@ogc-b65fa1a2-elastic-agent-ubuntu:~$ sudo elastic-agent enroll
This will replace your current settings. Do you want to continue? [Y/n]:Y
{"log.level":"info","@timestamp":"2022-11-23T17:55:04.775Z","log.origin":{"file.name":"cmd/enroll_cmd.go","file.line":471},"message":"Starting enrollment to URL: :///","ecs.version":"1.6.0"}
Error: fail to enroll: fail to execute request to fleet-server: 1 error occurred:
	* missing enrollment api key


For help, please see our troubleshooting guide at https://www.elastic.co/guide/en/fleet/8.5/fleet-troubleshooting.html
ubuntu@ogc-b65fa1a2-elastic-agent-ubuntu:~$

@pjbertels
Contributor Author

Forgot to add this info yesterday ... this is how the agent is installed on the VMs.

#!/bin/bash
# Variables required to be exported
# OGC_FLEET_ENROLLMENT_TOKEN
# OGC_FLEET_URL

<%namespace name="utils" file="/functions.mako"/>

<%
env['OGC_ELASTIC_AGENT_VERSION'] = "8.5.0"
url = "https://artifacts.elastic.co/downloads/beats/elastic-agent/elastic-agent-{version}-linux-x86_64.tar.gz".format(version=env['OGC_ELASTIC_AGENT_VERSION'])
%>

${utils.setup_env()}

# Download elastic-agent tarball
wget -O elastic-agent.tar.gz ${url}
${utils.extract('elastic-agent.tar.gz')}

mv elastic-agent-${env['OGC_ELASTIC_AGENT_VERSION']}-linux-x86_64 elastic-agent
cd elastic-agent && sudo ./elastic-agent install -f --url=${env["OGC_FLEET_URL"]} --enrollment-token=${env["OGC_FLEET_ENROLLMENT_TOKEN"]} --tag=$(date +%Y-%m-%d),${env['OGC_ELASTIC_AGENT_VERSION']},$(hostname),${node.public_ip},real-vm
