Add k8s metadata autodetection #13473

Merged: 10 commits, Sep 12, 2019

Member

@ChrsMark ChrsMark commented Sep 3, 2019

This PR is a proposal for enabling autodetection mode for add_kubernetes_metadata.

Currently we configure add_cloud_metadata and add_host_metadata by default in all Beats.

add_cloud_metadata does autodetection, and it will only enrich events if the beat is running on a known public cloud.

It would be useful to have the same for Kubernetes, so if Beats is started in a scenario where Kubernetes is reachable with a default config, it automatically enables metadata for it. If Kubernetes is not reachable with a default config nothing will happen.
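
The behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `versionProber`, `fakeAPI`, `annotator` and `errUnreachable` are hypothetical names, and the probe stands in for whatever call is used to reach the Kubernetes API server.

```go
package main

import (
	"errors"
	"fmt"
)

// versionProber is a hypothetical stand-in for the part of a Kubernetes
// client that can tell us whether the API server is reachable.
type versionProber interface {
	ServerVersion() (string, error)
}

// fakeAPI simulates a reachable or an unreachable API server.
type fakeAPI struct {
	version string
	err     error
}

func (f fakeAPI) ServerVersion() (string, error) { return f.version, f.err }

// errUnreachable is an illustrative error for the "no Kubernetes" case.
var errUnreachable = errors.New("connection refused")

// annotator enriches events only when Kubernetes was detected at startup.
type annotator struct {
	enabled bool
}

// newAnnotator probes the API server once. A failed probe is not treated as
// a startup error: the processor is still returned, but it stays disabled.
func newAnnotator(p versionProber) *annotator {
	a := &annotator{}
	if p == nil {
		return a
	}
	if _, err := p.ServerVersion(); err == nil {
		a.enabled = true
	}
	return a
}

// Run adds a kubernetes field when the processor is enabled and returns the
// event unchanged otherwise.
func (a *annotator) Run(event map[string]interface{}) map[string]interface{} {
	if !a.enabled {
		return event
	}
	event["kubernetes"] = map[string]interface{}{"detected": true}
	return event
}

func main() {
	inside := newAnnotator(fakeAPI{version: "v1.15.0"})
	outside := newAnnotator(fakeAPI{err: errUnreachable})

	fmt.Println(inside.Run(map[string]interface{}{"message": "hello"}))
	fmt.Println(outside.Run(map[string]interface{}{"message": "hello"}))
}
```

The point of the sketch is that an unreachable cluster results in a disabled processor rather than a failure, which mirrors how add_cloud_metadata is described above.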

Part of: #13068
Depends on #13374

How to test it

  1. With k8s available
    a. Run metricbeat or filebeat inside a k8s cluster without setting kube_config.
    b. Verify that the processor is automatically enabled. (Note that it will create a client in in_cluster mode.)
  2. With k8s unavailable
    a. Run metricbeat or filebeat outside of a k8s cluster without setting kube_config.
    b. Verify that the processor is not automatically enabled.

docs: https://www.elastic.co/guide/en/beats/filebeat/master/add-kubernetes-metadata.html#add-kubernetes-metadata

@ChrsMark ChrsMark requested review from a team as code owners September 3, 2019 14:25
@ChrsMark ChrsMark self-assigned this Sep 3, 2019
@ChrsMark ChrsMark added the Team:Integrations, containers, [zube]: In Review, review, and test-plan labels Sep 3, 2019
@ChrsMark ChrsMark force-pushed the add_k8s_metadata_autodetection branch from b11d4ec to fe6f4ca Compare September 4, 2019 09:28
@ChrsMark ChrsMark requested a review from a team September 4, 2019 15:17
@ChrsMark ChrsMark force-pushed the add_k8s_metadata_autodetection branch 2 times, most recently from 92f46a5 to 25e0373 Compare September 6, 2019 07:08
Contributor

@odacremolbap odacremolbap left a comment

Hey @ChrsMark I haven't executed it (for now), so consider my comments code only.

It would be nice to have an update to the docs explaining how the autodetection works, maybe in this PR or in a follow-up.

libbeat/processors/add_kubernetes_metadata/config.go (outdated)
libbeat/processors/add_kubernetes_metadata/kubernetes.go (outdated)
libbeat/processors/add_kubernetes_metadata/kubernetes.go (outdated)
libbeat/processors/add_kubernetes_metadata/kubernetes.go (outdated)
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
@andresrc andresrc added the in progress label and removed the review label Sep 9, 2019
@ChrsMark ChrsMark force-pushed the add_k8s_metadata_autodetection branch from 25e0373 to 7b89d95 Compare September 10, 2019 07:15
@ChrsMark
Member Author

> Hey @ChrsMark I haven't executed it (for now), so consider my comments code only.
>
> It would be nice to have an update to the docs explaining how the autodetection works, maybe in this PR or in a follow-up.

Hey @odacremolbap thank you for reviewing. I think all your comments have been addressed. I would love to hear back if you notice anything else.

As far as the docs update is concerned, if it is needed I would go with a follow-up PR. @jsoriano WDYT about this?

Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark ChrsMark force-pushed the add_k8s_metadata_autodetection branch from 7b89d95 to 2a9577f Compare September 10, 2019 07:54
@odacremolbap odacremolbap self-requested a review September 10, 2019 08:13
Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark ChrsMark force-pushed the add_k8s_metadata_autodetection branch from 4c4898e to cbd16eb Compare September 10, 2019 10:43
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Contributor

@odacremolbap odacremolbap left a comment

LGTM
Let's wait for CI outcome

@jsoriano do you want to take a look?

Member

@jsoriano jsoriano left a comment

I see that you have added some code to handle in-cluster configuration vs. an explicitly specified kubeconfig. I am worried that this can cause behavioural inconsistencies with other Kubernetes features in Beats.
I think this code shouldn't be needed here. If it is needed, then it is probably also required in other places, and it should be added to a common place so that all Kubernetes clients are initialized consistently.

		return true
	}
	return false
}
Member

We should be consistent in the logic used to configure Kubernetes clients across all our Kubernetes features.

I think we shouldn't need to add special handling here to get the config path from environment variables, or to get in-cluster configuration. If we need to add this logic for some reason, we should consider adding it in a common place, so that the logic is kept consistent across all Kubernetes features.

Why did you need to add this logic?

We may need to revisit this conversation.

Member Author

Actually, the field InCluster can be removed.

Regarding the search for environment variables or a kubeconfig in the $HOME directory:
If we are not in "incluster" mode, autodetection will simply fail when users have not set kube_config in their configuration, because it will not be able to create a client. So autodetection will effectively be available only in "incluster" deployments, or we should require the user to set the kubeconfig.

In addition, this code has to do only with the add_kubernetes_metadata processor, so is it possible for it to affect other features too? (Really not sure about that, just asking.)

So what should we do? I'm open to any proposal :)

Member

> Actually, the field InCluster can be removed.

👍

> Regarding the search for environment variables or a kubeconfig in the $HOME directory:
> If we are not in "incluster" mode, autodetection will simply fail when users have not set kube_config in their configuration, because it will not be able to create a client. So autodetection will effectively be available only in "incluster" deployments, or we should require the user to set the kubeconfig.

I think that failing if in-cluster configuration is not available and kubeconfig is not set would be perfectly fine, and it would be consistent with the rest of kubernetes features.

That said, I see that needing to set kubeconfig in every Kubernetes feature can be cumbersome. We could explore using the KUBECONFIG env variable as an alternative to setting it everywhere. But if something needs to be done for that, I would do it in the common helpers, and in a separate PR.

> In addition, this code has to do only with the add_kubernetes_metadata processor, so is it possible for it to affect other features too? (Really not sure about that, just asking.)
>
> So what should we do? I'm open to any proposal :)

I'd say to:

  • Keep it failing if in-cluster configuration is not available and kubeconfig is not set. In any case, the most common scenario for Kubernetes is to have in-cluster configuration working.
  • Consider leveraging KUBECONFIG for all our Kubernetes features, but as a separate issue/PR.
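
To make the two points above concrete, here is a minimal sketch of how the resolution order could look. It is an assumption-laden illustration, not the Beats helpers: `resolveClientSource`, `clientSource`, `kubeconfigFromEnv` and `errNoClusterConfig` are hypothetical names, and the `inClusterAvailable` callback stands in for whatever check decides that in-cluster credentials are usable.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNoClusterConfig is a hypothetical error returned when neither an explicit
// kubeconfig nor in-cluster configuration can be used.
var errNoClusterConfig = errors.New("kubeconfig not set and in-cluster configuration not available")

// clientSource describes where the client configuration would come from.
type clientSource struct {
	InCluster  bool
	Kubeconfig string
}

// resolveClientSource applies the order discussed above: an explicit
// kubeconfig wins, an empty one means in-cluster, and if in-cluster
// configuration is not available either, the call fails.
func resolveClientSource(kubeconfig string, inClusterAvailable func() bool) (clientSource, error) {
	if kubeconfig != "" {
		return clientSource{Kubeconfig: kubeconfig}, nil
	}
	if inClusterAvailable() {
		return clientSource{InCluster: true}, nil
	}
	return clientSource{}, errNoClusterConfig
}

// kubeconfigFromEnv sketches the possible follow-up: fall back to the
// KUBECONFIG environment variable when nothing is set explicitly. This sketch
// treats the value as a single path.
func kubeconfigFromEnv(explicit string) string {
	if explicit != "" {
		return explicit
	}
	return os.Getenv("KUBECONFIG")
}

func main() {
	notInCluster := func() bool { return false }

	_, err := resolveClientSource("", notInCluster)
	fmt.Println("no kubeconfig, outside a cluster:", err)

	src, _ := resolveClientSource(kubeconfigFromEnv("/path/to/kubeconfig"), notInCluster)
	fmt.Printf("explicit kubeconfig: %+v\n", src)
}
```

Keeping the fallback in one shared helper, rather than in a single processor, is what would keep the behaviour the same across features.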

Member Author

👍

func isKubernetesAvailable(client k8s.Interface) bool {
	server, err := client.Discovery().ServerVersion()
	if err != nil {
		logp.Err("%v: could not detect kubernetes env: %v", "add_kubernetes_metadata", err)
Member

This shouldn't be logged as error, as this is going to happen on any non-kubernetes deployment using the default configuration.
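
As a rough illustration of this point, the sketch below uses a tiny stand-in logger rather than the real logp package. `leveledLogger`, `detect` and `errProbeFailed` are hypothetical, and the info message on success is invented for the example. With a typical minimum level of info, a failed detection produces no output at all.

```go
package main

import (
	"errors"
	"fmt"
)

type level int

const (
	debugLevel level = iota
	infoLevel
	errorLevel
)

// leveledLogger is a minimal stand-in for a real logging package: it drops
// messages below its minimum level and records the rest.
type leveledLogger struct {
	min   level
	lines []string
}

func (l *leveledLogger) logf(lv level, format string, args ...interface{}) {
	if lv < l.min {
		return
	}
	l.lines = append(l.lines, fmt.Sprintf(format, args...))
}

// errProbeFailed is an illustrative error for an unreachable API server.
var errProbeFailed = errors.New("dial tcp: connection refused")

// detect reports whether the probe succeeded. A failure is expected on every
// non-Kubernetes deployment, so it is logged at debug level, not as an error.
func detect(logger *leveledLogger, probe func() error) bool {
	if err := probe(); err != nil {
		logger.logf(debugLevel, "add_kubernetes_metadata: could not detect kubernetes env: %v", err)
		return false
	}
	logger.logf(infoLevel, "add_kubernetes_metadata: kubernetes env detected")
	return true
}

func main() {
	logger := &leveledLogger{min: infoLevel} // debug output hidden
	ok := detect(logger, func() error { return errProbeFailed })
	fmt.Println("detected:", ok, "lines logged:", len(logger.lines))
}
```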

func defaultKubernetesAnnotatorConfig() kubeAnnotatorConfig {
	return kubeAnnotatorConfig{
		KubeConfig: getSystemKubeConfig(),
Member

I think that in other places we assume that if no kubeconfig is specified, then in-cluster configuration is wanted. We should keep this behaviour here as well.

Member Author

I think this falls under the same doubts as #13473 (comment)

} else {
	logp.Err("%v: could not create kubernetes client using config: %v", "add_kubernetes_metadata", config.KubeConfig)
}
processor.kubernetesAvailable = false
Member

Nit. Set kubernetesAvailable to false when initializing the processor (or just don't set it, since false is the zero value), and only set it to true once all checks have passed.
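
A small sketch of the pattern being suggested, with hypothetical names (`processor`, `check`, `newProcessor`, `errCheckFailed`) that are not taken from the PR: the flag relies on Go's zero value, failure paths return early without touching it, and it is assigned exactly once at the end.

```go
package main

import (
	"errors"
	"fmt"
)

// processor carries the availability flag. A bool's zero value is false, so
// a freshly created processor is "not available" without any assignment.
type processor struct {
	kubernetesAvailable bool
}

// check is a hypothetical startup check, e.g. "client could be created" or
// "API server answered".
type check func() error

// errCheckFailed is an illustrative failure.
var errCheckFailed = errors.New("check failed")

// newProcessor runs all checks in order. Any failure returns early and leaves
// the flag at its zero value; the flag is set only after every check passed.
func newProcessor(checks ...check) *processor {
	p := &processor{}
	for _, c := range checks {
		if err := c(); err != nil {
			return p
		}
	}
	p.kubernetesAvailable = true
	return p
}

func main() {
	pass := func() error { return nil }
	fail := func() error { return errCheckFailed }

	fmt.Println(newProcessor(pass, pass).kubernetesAvailable)
	fmt.Println(newProcessor(pass, fail).kubernetesAvailable)
}
```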

Member

Also here it is not needed to set processor.kubernetesAvailable = false.

Suggested change
processor.kubernetesAvailable = false

"original": "fields",
},
},
})
Member

Why initialize the cache if Kubernetes is not available?

Member Author

Maybe it is not required. It is just a left-over from copying the above test. :)

Member

Ok, try to remove it then 🙂

Member Author

👍

libbeat/processors/add_kubernetes_metadata/config.go (outdated)
client, err := kubernetes.GetKubernetesClient(config.KubeConfig)
if err != nil {
return nil, err
if config.InCluster {
Member

config.InCluster could be replaced by kubernetes.IsInCluster(config.KubeConfig) as used in other places. Or at least it should be consistent with the result of this call.
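
One way to read this suggestion is that the in-cluster decision should be derived from the kubeconfig value instead of being stored separately, so the two can never disagree. The sketch below assumes that an empty kubeconfig means in-cluster mode; `isInCluster`, `annotatorConfig` and `describeClient` are hypothetical stand-ins, not the actual Beats helpers.

```go
package main

import "fmt"

// annotatorConfig is a hypothetical config that carries only the kubeconfig
// path and no separate InCluster field.
type annotatorConfig struct {
	KubeConfig string
}

// isInCluster is an assumed equivalent of the shared helper mentioned above:
// an empty kubeconfig is taken to mean in-cluster mode.
func isInCluster(kubeconfig string) bool {
	return kubeconfig == ""
}

// describeClient builds the text used when reporting which kind of client
// was attempted, deriving the mode from the config on each call.
func describeClient(c annotatorConfig) string {
	if isInCluster(c.KubeConfig) {
		return "in_cluster config"
	}
	return fmt.Sprintf("config: %v", c.KubeConfig)
}

func main() {
	fmt.Println(describeClient(annotatorConfig{}))
	fmt.Println(describeClient(annotatorConfig{KubeConfig: "/path/to/kubeconfig"}))
}
```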

@jsoriano
Member

> As far as the docs update is concerned, if it is needed I would go with a follow-up PR. @jsoriano WDYT about this?

OK to do the docs changes in a follow-up.
We may also need to review the docs for the in_cluster setting; I think it is being ignored after #13051.

@ChrsMark
Member Author

So @jsoriano, the main question is whether the auto-enablement of add_kubernetes_metadata should be able to search for a kubeconfig file on its own (in the $HOME path or in the environment), or whether we should require the user to specify it. WDYT?

cc: @odacremolbap

Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark ChrsMark requested a review from jsoriano September 11, 2019 13:16
Member

@jsoriano jsoriano left a comment

LGTM, I have added some extra minor comments, please address them before merging.

@@ -248,6 +248,8 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
- Add support for RFC3339 time zone offsets in JSON output. {pull}13227[13227]
- Add autodetection mode for add_docker_metadata and enable it by default in included configuration files{pull}13374[13374]
- Added `monitoring.cluster_uuid` setting to associate Beat data with specified ES cluster in Stack Monitoring UI. {pull}13182[13182]
- Add autodetection mode for add_kubernetes_metadata and enable it by default in included configuration files{pull}13473[13473]
Member

Nit.

Suggested change
- Add autodetection mode for add_kubernetes_metadata and enable it by default in included configuration files{pull}13473[13473]
- Add autodetection mode for add_kubernetes_metadata and enable it by default in included configuration files. {pull}13473[13473]

@@ -36,6 +36,7 @@ const defaultNode = "localhost"
// in cluster configuration based on the secrets mounted in the Pod. If kubeConfig is passed,
// it parses the config file to get the config required to build a client.
func GetKubernetesClient(kubeconfig string) (kubernetes.Interface, error) {

Member

Nit. Remove this added line.

Suggested change

if kubernetes.IsInCluster(config.KubeConfig) {
	logp.Err("%v: could not create kubernetes client using in_cluster config", "add_kubernetes_metadata")
} else {
	logp.Err("%v: could not create kubernetes client using config: %v", "add_kubernetes_metadata", config.KubeConfig)
Member

Also these ones should be logged at debug level, because they are going to be logged in all non-kubernetes deployments.

@@ -26,17 +26,20 @@ import (
	"github.com/elastic/beats/libbeat/common/kubernetes"
	"github.com/elastic/beats/libbeat/logp"
	"github.com/elastic/beats/libbeat/processors"

	k8s "k8s.io/client-go/kubernetes"
Member

This import should be placed before beats imports.

	"fmt"
	"time"

	k8s "k8s.io/client-go/kubernetes"

	"github.com/elastic/beats/libbeat/beat"
	"github.com/elastic/beats/libbeat/common"
	"github.com/elastic/beats/libbeat/common/kubernetes"
	"github.com/elastic/beats/libbeat/logp"
	"github.com/elastic/beats/libbeat/processors"

And I wonder if we should use another alias, k8s is basically the same as kubernetes, maybe k8sclient... Not sure.

Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark ChrsMark requested a review from jsoriano September 12, 2019 07:20
Labels
containers, release-highlight, review, Team:Integrations, test-plan, v7.5.0
6 participants