Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move to a modules.d layout #4537

Merged
merged 14 commits into from
Jul 7, 2017
Merged

Move to a modules.d layout #4537

merged 14 commits into from
Jul 7, 2017

Conversation

exekias
Copy link
Contributor

@exekias exekias commented Jun 20, 2017

This PR moves module settings into a modules.dfolder, and adds a helper command to list/enable/disable them both in filebeat and metricbeat. Final layout for metricbeat looks like this:

» ls modules.d 
apache.yml.disabled  couchbase.yml.disabled   elasticsearch.yml.disabled  http.yml.disabled     kibana.yml.disabled      mysql.yml.disabled    postgresql.yml.disabled  redis.yml.disabled    windows.yml.disabled
audit.yml.disabled   docker.yml.disabled      golang.yml.disabled         jolokia.yml.disabled  kubernetes.yml.disabled  nginx.yml.disabled    prometheus.yml.disabled  system.yml            zookeeper.yml.disabled
ceph.yml.disabled    dropwizard.yml.disabled  haproxy.yml.disabled        kafka.yml.disabled    memcached.yml.disabled   php_fpm.yml.disabled  rabbitmq.yml.disabled    vsphere.yml.disabled

And it's managed like this:

X1 :: elastic/beats/metricbeat ‹modules.d*› » ./metricbeat modules list             
Enabled:
system

Disabled:
apache
audit
ceph
couchbase
docker
dropwizard
elasticsearch
golang
haproxy
http
jolokia
kafka
kibana
kubernetes
memcached
mysql
nginx
php_fpm
postgresql
prometheus
rabbitmq
redis
vsphere
windows
zookeeper

X1 :: elastic/beats/metricbeat ‹modules.d*› » ./metricbeat modules enable redis     
Enabled redis

X1 :: elastic/beats/metricbeat ‹modules.d*› » ./metricbeat modules list        
Enabled:
redis
system

Disabled:
...

Some caveats:

  • modules subcommand only works with default layout, user can still break that by manually adding modules to the yaml, or disabling metricbeat.config.modules.path
  • I haven't enabled conf reloading by default, but this should work fine with it

TODO:

  • Make this work for metricbeat
  • Make this work for filebeat (I think it requires more work as config reload is only supported for prospectors but not modules.d, I will add support for it)
  • System tests
  • Packaging

reload.enabled: false

# Period on which files under path should be checked for changes
reload.period: 10s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps comment this out so it's clear no reloading happens.

@@ -6,37 +6,17 @@
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/metricbeat/index.html

#========================== Modules configuration ============================
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume adding config options to this file still works?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it should work as usual


metricbeat.config.modules:
# Glob pattern for configuration loading
path: ${path.config}/modules.d/*.yml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The part I worry here is that it becomes easier to break the getting started. Before getting started was 2 files: Binary + Basic config. Now a directory with the correct path + a one more config file are needed. Don't have a better solution here, just want to raise that point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess most people use downloaded packages or tar.gz, everything will be on those.

For the scripted case, old config formats should still work. I don't see this becoming an issue but you never know, it's a good point

@@ -0,0 +1,3 @@
- module: audit
metricsets: ["kernel"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have enabled in all except this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems it didn't have the enabled: false before, this makes me think I can remove enabled from all of them, so we don't have 2 visible ways to disable (rename file vs changing field)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can remove it. The problem is that the same file is also used for the docs where it is probably nice to have it.

# This file is a full configuration example documenting all non-deprecated
# options in comments. For a shorter configuration example, that contains only
# the most common options, please see metricbeat.yml in the same directory.
# This file is an example configuration file highlighting only the most common
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this has now the short config header?

@@ -1,470 +1,28 @@
########################## Metricbeat Configuration ###########################
###################### Metricbeat Configuration Example #######################
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens to the full config? As now each module is separate, should the config there directly contain the full config and only have 1 version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I broke something here, will fix it :)

@exekias exekias force-pushed the modules.d branch 4 times, most recently from 2130c5c to 374dc2c Compare June 21, 2017 13:01
@ruflin
Copy link
Contributor

ruflin commented Jun 22, 2017

Thinking out loud: Looking at apache an alternative approach would be to have 2 directories, one with the enabled configs as symlinks and the other with all inside. So if you enable a module, it writes a symlink to the enabled directory. Not sure if this one is better, but the advantage is that "only" a symlink to a file has to be created an no files renamed which feels more intrusive to me. But also symlinks have it's problem cross platform.

I like that in the current current implementation the pattern for enabled / disabled can be configured as far as I understand. I was first thinking if we could also play around with the -M flag here and have it in an internal state which modules are enabled and not. But this would not work across restart. It would require metricbeat to write a state about enabled / disabled modules. Could this be an alternative so no config files have to be touched / modified?

What happens in the current implementation if there is a config file with a module inside that has enabled: false? Will it overwrite the enabled flag?

@tsg
Copy link
Contributor

tsg commented Jun 22, 2017

Yeah, symlink-ing is problematic on Windows. Playing with the current approach, I think it's intuitive enough.

Some thoughts and concerns:

  • What does this mean for metricbeat.full.yml? Perhaps a solution would be to have the modules.d approach for the short config, but have metricbeat.full.yml still as a huge file?
  • For Filebeat there might be a confusion created between filebeat --modules=nginx and filebeat modules enable nginx. I do think it makes sense to have both, but there's going to be confusion.
  • Another source of confusion for Filebeat can be between /usr/share/filebeat/modules and the new /etc/filebeat/modules.d.
  • After using the installed DEB/RPM, one gets a script file lie /usr/bin/filebeat.sh. I wonder if this will interfere with the commands in any way.
  • I think we shouldn't enable auto-reloading for now (but it's really cool that it can be enabled with one line in the config). That means I think auto-reloading is not mandatory for the first FB version.
  • What about the other Beats? Do we keep this FB & MB only for now?

I think we should discuss this over zoom :)

@exekias exekias force-pushed the modules.d branch 4 times, most recently from 40840a0 to 5601f0b Compare June 26, 2017 19:00
@exekias
Copy link
Contributor Author

exekias commented Jun 27, 2017

Working on filebeat support, once #4566 it should be trivial to add it here

@exekias exekias force-pushed the modules.d branch 6 times, most recently from fcd3b90 to 4c9330e Compare July 4, 2017 17:05
@exekias
Copy link
Contributor Author

exekias commented Jul 4, 2017

jenkins package it

@exekias exekias force-pushed the modules.d branch 2 times, most recently from 6cf46a3 to 1fadb98 Compare July 4, 2017 17:52
@exekias
Copy link
Contributor Author

exekias commented Jul 4, 2017

jenkins package it

@exekias
Copy link
Contributor Author

exekias commented Jul 5, 2017

jenkins retest it

@exekias
Copy link
Contributor Author

exekias commented Jul 5, 2017

Packaging job passed: http://build-eu-00.elastic.co/job/beats-package-PR/112/, will need to do manual testing on them

@exekias exekias removed the in progress Pull request is currently in progress. label Jul 5, 2017
@exekias exekias changed the title [WIP] Move to a modules.d layout Move to a modules.d layout Jul 5, 2017
@exekias
Copy link
Contributor Author

exekias commented Jul 5, 2017

jenkins package it please

@tsg
Copy link
Contributor

tsg commented Jul 5, 2017

make update complains that:

cp: /Users/tsg/src/github.com/elastic/beats/filebeat/module/redis/_meta/config.yml: No such file or directory

I think the redis config file is missing?

@exekias
Copy link
Contributor Author

exekias commented Jul 5, 2017

@exekias
Copy link
Contributor Author

exekias commented Jul 5, 2017

jenkins package it

@@ -0,0 +1,16 @@
#- module: nginx
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all these should be un-commented? Otherwise they are not really enabled.

@tsg
Copy link
Contributor

tsg commented Jul 5, 2017

I tried starting filebeat after enabling the nginx module, and I think there are two potential issues:

  • Everything in nginx.yml is commented out, so it's not really enabled
  • Despite the module not being enabled, I didn't get an error saying that no modules and prospectors are on. That would be normal if configuration reloading is enabled, but it was not.

As we moved them into modules.d, they won't be enabled until the file is
renamed
@exekias
Copy link
Contributor Author

exekias commented Jul 5, 2017

Despite the module not being enabled, I didn't get an error saying that no modules and prospectors are on. That would be normal if configuration reloading is enabled, but it was not.

This was the behavior already for prospectors, the issue is that dynamic config loading (from *.yml files) is done by the cfgfile Reloader, so it happens after startup. I can try to add code to do that check before if reload is disabled.

Perhaps that can go in a follow up PR, WDYT?

@tsg
Copy link
Contributor

tsg commented Jul 5, 2017

Yeah, absolutely, it can come in a different PR.

@tsg
Copy link
Contributor

tsg commented Jul 5, 2017

@exekias I get an error when running without Elasticsearch available. Filebeat should start even if ES is not available, and just wait for it. This regression might have been introduced in #4566, but we should fix it for beta1.

$ sudo filebeat.sh -e
2017/07/05 12:58:25.561142 beat.go:470: INFO Home path: [/usr/share/filebeat] Config path: [/etc/filebeat] Data path: [/var/lib/filebeat] Logs path: [/var/log/filebeat]
2017/07/05 12:58:25.561173 beat.go:495: INFO Beat metadata path: /var/lib/filebeat/meta.json
2017/07/05 12:58:25.561224 beat.go:477: INFO Beat UUID: 503a0697-3aff-4aa6-9b9a-b5a82b52ca97
2017/07/05 12:58:25.561233 beat.go:239: INFO Setup Beat: filebeat; Version: 6.0.0-alpha3
2017/07/05 12:58:25.561352 client.go:130: INFO Elasticsearch url: http://localhost:9200
2017/07/05 12:58:25.562946 metrics.go:23: INFO Metrics logging every 30s
2017/07/05 12:58:25.563015 pipeline.go:71: INFO Publisher name: localhost.localdomain
2017/07/05 12:58:25.563024 publish.go:108: INFO Publisher name: localhost.localdomain
2017/07/05 12:58:25.563738 beat.go:315: INFO filebeat start running.
2017/07/05 12:58:25.563842 registrar.go:83: INFO Registry file set to: /var/lib/filebeat/registry
2017/07/05 12:58:25.563891 registrar.go:104: INFO Loading registrar data from /var/lib/filebeat/registry
2017/07/05 12:58:25.565343 registrar.go:115: INFO States Loaded from registrar: 1
2017/07/05 12:58:25.565578 client.go:130: INFO Elasticsearch url: http://localhost:9200
2017/07/05 12:58:25.571231 registrar.go:148: INFO Starting Registrar
2017/07/05 12:58:25.571269 sync.go:41: INFO Start sending events to output
2017/07/05 12:58:25.571290 spooler.go:63: INFO Starting spooler: spool_size: 2048; idle_timeout: 5s
2017/07/05 12:58:25.571485 elasticsearch.go:188: ERR Error connecting to Elasticsearch: http://localhost:9200
2017/07/05 12:58:25.571508 spooler.go:100: INFO Stopping spooler
2017/07/05 12:58:25.571529 registrar.go:202: INFO Stopping Registrar
2017/07/05 12:58:25.571538 registrar.go:160: INFO Ending Registrar
2017/07/05 12:58:25.573787 metrics.go:51: INFO Total non-zero values:  beat.memstats.gc_next=4473924 beat.memstats.memory_alloc=2946528 beat.memstats.memory_total=2946528 registrar.states.current=1 registrar.writes=1
2017/07/05 12:58:25.573812 metrics.go:52: INFO Uptime: 20.086641ms
2017/07/05 12:58:25.573817 beat.go:323: INFO filebeat stopped.
2017/07/05 12:58:25.573829 beat.go:650: CRIT Exiting: Error creating Elasticsearch client: Couldn't connect to any of the configured Elasticsearch hosts
Exiting: Error creating Elasticsearch client: Couldn't connect to any of the configured Elasticsearch hosts

@@ -93,6 +95,29 @@ def test_reload_writes_pipeline(self):
for key in self.es.transport.perform_request("GET", "/_ingest/pipeline/").keys()))
proc.check_kill_and_wait()

def test_no_es_connection(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to see a test on this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to avoid tripping twice over the same stone 😸

@cp ${ES_BEATS}/filebeat/_meta/common.reference.p1.yml _meta/beat.reference.yml
@${PYTHON_ENV}/bin/python ${ES_BEATS}/script/config_collector.py --beat ${BEAT_NAME} --full $(PWD) >> _meta/beat.reference.yml
@cat ${ES_BEATS}/filebeat/_meta/common.reference.p1.yml > _meta/beat.reference.yml
@. ${PYTHON_ENV}/bin/activate; python ${ES_BEATS}/script/config_collector.py --beat ${BEAT_NAME} --full $(PWD) >> _meta/beat.reference.yml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to activate the virtual environment? I thought you could invoke ${PYTHON_ENV}/bin/python because it's directly running a script (and it's slightly faster)? [virtualenv activate docs]

}

if !strings.HasSuffix(glob, ".yml") {
return nil, errors.Errorf("wrong settings for config.modules.path, it is expected to end with *.yml. Got: %s", glob)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm taking the error too literally, but the check is for .yml and the error says it must end with *.yml which is a disconnect.

}

func (g *GlobManager) load() error {
g.files = []*cfgfile{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not initialize this to an empty slice. I've found it's best to not distinguish between empty and nil slices. https://github.com/golang/go/wiki/CodeReviewComments#declaring-empty-slices


// ListEnabled conf files
func (g *GlobManager) ListEnabled() []string {
names := []string{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use var names []string (occurs in a few places).

@@ -34,7 +34,6 @@ in <<configuration-metricbeat>>. Here is an example configuration:
metricbeat.modules:
- module: apache
metricsets: ["status"]
enabled: false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call 👍

@andrewkroh
Copy link
Member

LGTM.

@exekias
Copy link
Contributor Author

exekias commented Jul 7, 2017

Thank you for the review @andrewkroh!, I think I addressed all comments

@exekias
Copy link
Contributor Author

exekias commented Jul 7, 2017

jenkins retest this

@dedemorton
Copy link
Contributor

Removed needs_docs label because we should be covered with the CLI changes plus the additions for modules.d in #4973

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants