Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[metricbeat] Add service metricset #14206

Merged

Conversation

fearful-symmetry
Copy link
Contributor

@fearful-symmetry fearful-symmetry commented Oct 23, 2019

This is part of #14122 and #11846

What is this?

This PR adds system/services metricset that reports on the state of services from systemd. Right now this only supports systemd, but I tried to make it somewhat generic in the hope that we could support other service managers from other OSes as time goes on.

How do I test this?

This is fairly simple. Any linux distro that uses systemd (which is nearly all of them at this point) will work. Just pull down, enable the metricset and run. You may need to separately run go get github.com/coreos/go-systemd/dbus.

TODO

  • Commit our coreos/go-systemd dependency.
  • Make dashboard(s)
  • add a line to the example metricbeat config

Some other ideas I had

  • Would we want a services_summary metricset like how we have a socket_summary metricset?
  • Do we want to grab additional metrics from cgroups?

We now have a dashboard!

Screen Shot 2019-10-31 at 7 31 15 AM

Test Plan

  • Start metricbeat and load the dashboards. This metricset should be run on a linux distro using systemd.

  • Check for output:

   "system": {
        "service": {
            "load_state": "loaded",
            "name": "NetworkManager.service",
            "resources": {
                "memory": {
                    "usage": {
                        "bytes": 15646720
                    }
                },
                "tasks": {
                    "count": 4
                }
            },
            "state": "active",
            "state_since": "2019-10-18T21:24:57.581561-07:00",
            "sub_state": "running"
        }
  • Restart the metricset with the state_filter config value set to something like [active] and insure that the metricset is filtering out events with that state.

  • Also check the dashboard to make sure events are showing up.

@fearful-symmetry fearful-symmetry added enhancement Metricbeat Metricbeat Team:Integrations Label for the Integrations team labels Oct 23, 2019
@fearful-symmetry fearful-symmetry requested review from a team October 23, 2019 04:11
@fearful-symmetry fearful-symmetry self-assigned this Oct 23, 2019
@fearful-symmetry fearful-symmetry changed the title Add services metricset [metricbeat] Add services metricset Oct 23, 2019
@fearful-symmetry
Copy link
Contributor Author

Commit'ed the new dependencies. Not sure if we want to take the time to use the lower-level godbus library so we can trim down the dependencies a bit.

@fearful-symmetry
Copy link
Contributor Author

Still not sure how to map the nsec CPU usage, since it's an oddball metric. Everything else should be a little more standard now, I hope.

Copy link
Contributor

@exekias exekias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for starting this! I left a few comments

metricbeat/module/system/services/_meta/docs.asciidoc Outdated Show resolved Hide resolved
metricbeat/module/system/services/_meta/docs.asciidoc Outdated Show resolved Hide resolved
metricbeat/module/system/services/_meta/fields.yml Outdated Show resolved Hide resolved
metricbeat/module/system/services/data.go Outdated Show resolved Hide resolved
@fearful-symmetry
Copy link
Contributor Author

Cleaned this up a lot, and ended up using the mapstructure library, since it seemed like a better fit considering the conditional logic being thrown around everywhere.

@webmat
Copy link
Contributor

webmat commented Nov 11, 2019

Yes, dupe the data until 8.0, and mark the currently existing field as deprecated in favour of the new ECS one.

Of course this should only be done once the new field has been released in an official ECS release :-)

@fearful-symmetry
Copy link
Contributor Author

That sounds like a plan. Do we have some kind of process for deprecated/removing metricsets?

@philippkahr
Copy link
Contributor

The issue is that migrating the windows.service stuff to system.service would be a breaking change, which would have to wait until 8.0.

Sounds good to me. Mind if I take a shot at this? Is there any due date for 8.0?

I'd be open to expanding the ECS fields, since what we have now is pretty limited.

I will go through your fields.yml and look at this and create an issue at ECS to discuss that further?

@fearful-symmetry
Copy link
Contributor Author

I will go through your fields.yml and look at this and create an issue at ECS to discuss that further?

@webmat might know more, I don't have much contact with ECS.

Sounds good to me. Mind if I take a shot at this? Is there any due date for 8.0?

Right now I'd like to get this PR merged, as any cross-platform stuff falls outside the scope of this PR. I don't have much windows knowledge, so I'm more than happy to have other folks work on it.

@exekias
Copy link
Contributor

exekias commented Nov 11, 2019

Just checking in here, the scope of this PR is still the same, isn't it? Getting systemd services in first.

That sounds like a plan. Do we have some kind of process for deprecated/removing metricsets?

You flag it as deprecated both in docs and code, pointing users to the new replacement metricset. Then in 8.0 we remove it altogether. Just a note, the new services metricset needs to be in GA before deprecating the other.

@philippkahr
Copy link
Contributor

philippkahr commented Nov 11, 2019

Of course! I always thought that the merge between windows and system should be is own PR and not be blocking yours. I am sorry if I caused confusion

@fearful-symmetry
Copy link
Contributor Author

Just checking in here, the scope of this PR is still the same, isn't it? Getting systemd services in first.

Yep, the conversation has just expanded somewhat. We may want to open a separate issue for dealing with non-systemd integrations.

@webmat
Copy link
Contributor

webmat commented Nov 11, 2019

@philippkahr @fearful-symmetry Simplest way to get this going is probably to drop in #ecs and have a discussion to get aligned. Then we'll see if/which fields are needed. If new fields are needed, I can guide you on creating that PR :-)

@fearful-symmetry
Copy link
Contributor Author

jenkins, test this

1 similar comment
@fearful-symmetry
Copy link
Contributor Author

jenkins, test this

@fearful-symmetry fearful-symmetry merged commit ed74a3f into elastic:master Nov 14, 2019
fearful-symmetry added a commit to fearful-symmetry/beats that referenced this pull request Dec 16, 2019
* add system/services metricset and dashboard

(cherry picked from commit ed74a3f)
fearful-symmetry added a commit that referenced this pull request Dec 17, 2019
* add system/services metricset and dashboard

(cherry picked from commit ed74a3f)
@fearful-symmetry fearful-symmetry added the test-plan Add this PR to be manual test plan label Jan 15, 2020
@kvch kvch assigned kvch and unassigned fearful-symmetry Jan 17, 2020
@kvch
Copy link
Contributor

kvch commented Jan 17, 2020

@fearful-symmetry I have tested the metricset. Everything works well. One minor note is the new metricset is missing from the reference configuration.

@kvch kvch added the test-plan-regression Manually testing this PR found a regression label Jan 17, 2020
@kvch
Copy link
Contributor

kvch commented Jan 17, 2020

Created an issue for the missing config: #15635

@kvch kvch added test-plan-ok This PR passed manual testing and removed test-plan-regression Manually testing this PR found a regression labels Jan 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Metricbeat Metricbeat Team:Integrations Label for the Integrations team test-plan Add this PR to be manual test plan test-plan-ok This PR passed manual testing v7.6.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants