Add start of elastic-agent diagnostics command #28265

Merged: michel-laterman merged 14 commits into elastic:master from the diagnostics branch on Oct 15, 2021

@michel-laterman michel-laterman commented Oct 6, 2021

What does this PR do?

This PR starts the elastic-agent diagnostics command.
The beats info ("/") HTTP endpoint has been changed to add more data about the running beat including git commit and ephemeral ID.
Currently, the diagnostics command gathers beats metadata from the endpoint and displays it along with agent version information.
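
As a rough illustration of the flow described above, the following sketch fetches metadata from an info ("/") endpoint and decodes it into a struct. It is not the PR's code: the `beatInfo` type, the `fetchBeatInfo` and `newFakeBeat` helpers, and the JSON key names are all assumptions made for this example, and a local test server stands in for a running beat.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// beatInfo holds a few of the fields the description mentions.
// The JSON key names are assumptions for illustration only.
type beatInfo struct {
	Beat        string `json:"beat"`
	Name        string `json:"name"`
	Version     string `json:"version"`
	BuildCommit string `json:"build_commit"`
	EphemeralID string `json:"ephemeral_id"`
}

// fetchBeatInfo issues a GET against the info ("/") endpoint and decodes the body.
func fetchBeatInfo(client *http.Client, baseURL string) (beatInfo, error) {
	var info beatInfo
	resp, err := client.Get(baseURL + "/")
	if err != nil {
		return info, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return info, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	err = json.NewDecoder(resp.Body).Decode(&info)
	return info, err
}

// newFakeBeat starts a stand-in server so the sketch runs without a real beat.
func newFakeBeat() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"beat":"filebeat","name":"filebeat","version":"8.0.0","build_commit":"ba9eebf","ephemeral_id":"08b22db1"}`)
	}))
}

func main() {
	srv := newFakeBeat()
	defer srv.Close()

	info, err := fetchBeatInfo(srv.Client(), srv.URL)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s %s (commit %s)\n", info.Beat, info.Version, info.BuildCommit)
}
```

A real agent would talk to each beat over its own monitoring socket or pipe rather than a TCP test server; only the decode step is meant to carry over.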

Why is it important?

The diagnostics command is meant to be used to gather diagnostics information from a running elastic-agent (and beats) that can be used for debugging.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation PR here
  • [] I have made corresponding changes to the default configuration files
  • [] I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

Run the agent normally then use: elastic-agent diagnostics

Related issues

Logs

bash-4.2$ elastic-agent diagnostics
{BeatMeta:[{Beat:filebeat Name:filebeat_monitoring Hostname:docker-fleet-agent ID:f5e55217-16fc-4b34-be14-2e60096807b5 EphemeralID:08b22db1-4716-42ce-b1e6-8bcdafcc6776 Version:8.0.0 BuildCommit:ba9eebf2f62f3a827008dbfaf0f814464b0a63f9 BuildTime:2021-09-15 21:13:30 +0000 UTC Username:elastic-agent UserID:1000 UserGID:1000 BinaryArchitecture:amd64 RouteKey:default ElasticLicensed:true Error:} {Beat:metricbeat Name:metricbeat_monitoring Hostname:docker-fleet-agent ID:46a16e22-533d-461d-9916-6dba8fb9ed05 EphemeralID:d090f30f-7662-4abb-9b89-59e95b7b3cd8 Version:8.0.0 BuildCommit:ba9eebf2f62f3a827008dbfaf0f814464b0a63f9 BuildTime:2021-09-15 21:17:17 +0000 UTC Username:elastic-agent UserID:1000 UserGID:1000 BinaryArchitecture:amd64 RouteKey:default ElasticLicensed:true Error:} {Beat:filebeat Name:filebeat Hostname:docker-fleet-agent ID:f5e55217-16fc-4b34-be14-2e60096807b5 EphemeralID:08b22db1-4716-42ce-b1e6-8bcdafcc6776 Version:8.0.0 BuildCommit:ba9eebf2f62f3a827008dbfaf0f814464b0a63f9 BuildTime:2021-09-15 21:13:30 +0000 UTC Username:elastic-agent UserID:1000 UserGID:1000 BinaryArchitecture:amd64 RouteKey:default ElasticLicensed:true Error:} {Beat:metricbeat Name:metricbeat Hostname:docker-fleet-agent ID:46a16e22-533d-461d-9916-6dba8fb9ed05 EphemeralID:d090f30f-7662-4abb-9b89-59e95b7b3cd8 Version:8.0.0 BuildCommit:ba9eebf2f62f3a827008dbfaf0f814464b0a63f9 BuildTime:2021-09-15 21:17:17 +0000 UTC Username:elastic-agent UserID:1000 UserGID:1000 BinaryArchitecture:amd64 RouteKey:default ElasticLicensed:true Error:}] AgentVersion:{Version:8.0.0 Commit:473cb596588bc7da634e21e6615a0a6c90180be8 BuildTime:2021-10-06 15:31:20 +0000 UTC Snapshot:true}}

Add initial command that will gather meta-data from running beats HTTP
endpoints and elastic-agent version information.
botelastic bot added the needs_team label Oct 6, 2021
michel-laterman added the Team:Elastic-Agent-Control-Plane label Oct 6, 2021
botelastic bot removed the needs_team label Oct 6, 2021

mergify bot commented Oct 6, 2021

This pull request does not have a backport label. Could you fix it @michel-laterman? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v./d./d./d is the label to automatically backport to the 7./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

mergify bot added the backport-skip label Oct 6, 2021

elasticmachine commented Oct 6, 2021

💚 Build Succeeded

Build stats

  • Start Time: 2021-10-14T21:38:22.115+0000

  • Duration: 155 min 1 sec

  • Commit: a72a7fb

Test stats 🧪

Test Results
Failed 0
Passed 53802
Skipped 5346
Total 59148

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

@michalpristas

Could you include a diff of the output change in the description? It would make it easier for people with strict format requirements on this endpoint to find out what has been added.

@michalpristas michalpristas left a comment

Looks good. I haven't tested it so far; I will today or tomorrow.

libbeat/cmd/instance/beat.go
@@ -20,6 +21,10 @@ type stater interface {
State() map[string]state.State
}

type specer interface {
Contributor

Probably you meant specser. I don't know if it is a real word, but it sounds funny :-)
No more naming comments from me.

Contributor Author

I think specser is a bit too strange; it does more to hide what the interface is doing than specer does.

michel-laterman added the backport-v7.16.0 label and removed the backport-skip label Oct 6, 2021
@michel-laterman

/test

@ruflin ruflin left a comment

What is the format of the output? Can we make it human readable or at least add an option for it?


diag, err := getDiagnostics(innerCtx)
if err == context.DeadlineExceeded {
return errors.New("timed out after 30 seconds trying to connect to Elastic Agent daemon")
Contributor

30s sounds like a long timeout for a local diagnostic command. I would expect the diagnostic command to be pretty quick, or is there some "remote" dependency?

Contributor Author

I copied the timeout from the status command.

@@ -53,6 +53,25 @@ type ApplicationStatus struct {
Payload map[string]interface{}
}

// BeatMeta is the running version and ID information for a running beat.
type BeatMeta struct {
Contributor

What about endpoint and other processes we run? If possible we should keep the naming not specific to Beats only.

Contributor Author

Good point. Do we know what kind of metadata endpoint or another process would return? Or should I change this to a generic map[string]interface{} to capture everything?
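
To make the generic-map option concrete, this sketch decodes two differently shaped payloads into `map[string]interface{}` without a fixed struct. The payloads, key names, and the `decodeMeta` and `sortedKeys` helpers are invented for illustration; what endpoint or other processes actually return is not known from this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// decodeMeta keeps whatever keys a process reports, with no fixed schema.
func decodeMeta(raw []byte) (map[string]interface{}, error) {
	var meta map[string]interface{}
	if err := json.Unmarshal(raw, &meta); err != nil {
		return nil, err
	}
	return meta, nil
}

// sortedKeys returns the map keys in a stable order for printing.
func sortedKeys(m map[string]interface{}) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Hypothetical payloads: one beat-like, one from some other process.
	payloads := [][]byte{
		[]byte(`{"beat":"filebeat","version":"8.0.0"}`),
		[]byte(`{"service":"other","version":"8.0.0","extra":{"k":1}}`),
	}
	for _, raw := range payloads {
		meta, err := decodeMeta(raw)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		for _, k := range sortedKeys(meta) {
			fmt.Printf("%s=%v\n", k, meta[k])
		}
		fmt.Println("---")
	}
}
```

The trade-off is the usual one: the map captures everything but gives up compile-time field names and types (JSON numbers arrive as `float64`, for instance), while a struct such as `BeatMeta` documents the expected shape.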

@elasticmachine

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@blakerouse blakerouse left a comment

Overall looks really good. Just got one question.

// DialContext returns a function that can be used to dial a local unix-domain socket.
func DialContext(socket string) func(context.Context, string, string) (net.Conn, error) {
return func(_ context.Context, _, _ string) (net.Conn, error) {
return net.Dial("unix", socket)
Contributor

Is it not possible to get net.Dial to take a context? Is there no net.DialContext?

Contributor Author

good catch

@michel-laterman michel-laterman merged commit 887e40a into elastic:master Oct 15, 2021
@michel-laterman michel-laterman deleted the diagnostics branch October 15, 2021 02:26
mergify bot pushed a commit that referenced this pull request Oct 15, 2021
(cherry picked from commit 887e40a)
michel-laterman added a commit that referenced this pull request Oct 15, 2021
(cherry picked from commit 887e40a)

Co-authored-by: Michel Laterman <82832767+michel-laterman@users.noreply.github.com>
Icedroid pushed a commit to Icedroid/beats that referenced this pull request Nov 1, 2021
michel-laterman added the backport-v8.0.0 label Nov 10, 2021
mergify bot pushed a commit that referenced this pull request Nov 10, 2021
(cherry picked from commit 887e40a)

# Conflicts:
#	libbeat/cmd/instance/beat.go
#	x-pack/elastic-agent/pkg/agent/cmd/diagnostics.go
mergify bot pushed a commit that referenced this pull request Nov 10, 2021
(cherry picked from commit 887e40a)

# Conflicts:
#	libbeat/cmd/instance/beat.go
#	x-pack/elastic-agent/pkg/agent/cmd/diagnostics.go
Labels
backport-v7.16.0, backport-v8.0.0, enhancement, Team:Elastic-Agent-Control-Plane
5 participants