Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate the copy of autodiscovery templates inside Elastic Agent image #5661

Closed
gizas opened this issue Oct 2, 2024 · 10 comments · Fixed by #6381
Closed

Investigate the copy of autodiscovery templates inside Elastic Agent image #5661

gizas opened this issue Oct 2, 2024 · 10 comments · Fixed by #6381
Assignees

Comments

@gizas
Copy link
Contributor

gizas commented Oct 2, 2024

Currently we have a CI job that is responsible for creating all of the hints autdiscovery templates under https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone/templates.d

The goal of this story is to make a copy of them inside the elastic agent image and keep them inside it for the users.

Pros: This will eliminate the need of an init container for our users
Neg: We need to figure out what we will do with the CI update cycle of the templates. The CI update is not affected by the agents release cycle but we need to see if want to keep both (the ci and the local copy inside image) and if yes to think of issues in case of differences

Initiator of the problem: #5643 (comment)

@pkoutsovasilis
Copy link
Contributor

pkoutsovasilis commented Dec 4, 2024

@cmacknz the ask here is; to copy the templates inside agent container images or not to copy?! I believe that, given the size of these templates 920KB, this makes sense as it will allow us to discard the need of an init container that downloads the respective agent release archive and extracts the templates from there. As a result, when hints autodiscovery feature is enabled, we get more robust and faster pod start sequence, as well as we are not prone to external network errors (we can also support airtight environments). Now on the downsides, the templates can't be updated without a new elastic-agent container image. To this end, this isn't a deal breaker for me as I don't believe there will be an urgent case that demands it and the frequency of stack releases is enough to cover them. Any thoughts?

@gizas
Copy link
Contributor Author

gizas commented Dec 4, 2024

To this end, this isn't a deal breaker for me as I don't believe there will be an urgent case that demands it and the frequency of stack releases is enough to cover them.

If it helps, we have run only 2 times the CI to update the templates so far

@cmacknz
Copy link
Member

cmacknz commented Dec 4, 2024

I believe that, given the size of these templates 920KB, this makes sense as it will allow us to discard the need of an init container that downloads the respective agent release archive and extracts the templates from there

Agreed

Now on the downsides, the templates can't be updated without a new elastic-agent container image. To this end, this isn't a deal breaker for me as I don't believe there will be an urgent case that demands it and the frequency of stack releases is enough to cover them.

Most of the time this will be fine. If we do this can we make it possible to override the set of embedded integrations? Maybe just changing the location agent expects the inputs.d directory to exist at to something else so a new volume mount can provide the configurations? This doesn't have to be the default way but gives us an escape hatch to delivery fixes without requiring releases.

@pkoutsovasilis
Copy link
Contributor

Most of the time this will be fine. If we do this can we make it possible to override the set of embedded integrations? Maybe just changing the location agent expects the inputs.d directory to exist at to something else so a new volume mount can provide the configurations? This doesn't have to be the default way but gives us an escape hatch to delivery fixes without requiring releases.

Yes that's a nice proposal @cmacknz and it is actually feasible exactly the way you describe it. The only "tricky" part is to make agent detect these from another path outside it's state one as the latter most of the times is a mount of a host path. Maybe /usr/share/elastic-agent/input.d is suitable here?

@gizas please correct me if I am wrong but the job that keeps these templates up to date is this one?

I believe the solution to this issue is relatively straight forward and comprises of two steps:

  1. Change the location agent expects the inputs.d directory
    1. Either code new env var that allows us to point to a different inputs.d path (code ref) (I am leaning towards this approach)
    2. Tap into the container cmd and extend setPaths (code ref to actually set the external inputs path only for containerised agents.
  2. copy the templates inside the agent images.

@gizas
Copy link
Contributor Author

gizas commented Dec 11, 2024

@gizas please correct me if I am wrong but the job that keeps these templates up to date is this one?

Indeed this is it.

And this is currently the buildkite job and a reference here

@blakerouse
Copy link
Contributor

We need to understand the performance impact this will have an users that are not using hints with Kubernetes. Being that all inputs are placed into inputs.d that means that the Elastic Agent will parse these configuration and determine that kubernetes.* will always be present in the configuration. The output of pre-config.yml in diagnostics will now include all of these inputs.

This change #6169 will also no longer have an affect as the kubernetes.* always be present now and always try to run.

I don't think with this information, we just want these inputs to be read and parsed all the time because they are in inputs.d. I think doing this better when only in the case that they want to enable hints do we want the Elastic Agent to run them.

@pkoutsovasilis
Copy link
Contributor

pkoutsovasilis commented Dec 11, 2024

thanks for chiming in @blakerouse, that is quite helpful 🙂 How about the following adjustment to the plan; embed the hints in the agent image under /etc/elastic-agent/hints.inputs.d and extend the agent container cmd to copy them over to the usual inputs.d dir only when a flag/env_var is set? would that allow us to maintain the benefits of #6169 but also get away with init containers and downloading them on every pod restart? But then removing them from there is the tricky part 🤔

@blakerouse
Copy link
Contributor

@pkoutsovasilis I think that is definitely a valid option, would be nice to do it based on the setting in the configuration though. I think having to have both a flag/env as well as the config setting seems complicated and easy to get wrong. It is also possible that we could remove the need to copy and just have agent read from that path in the case that the setting is enabled.

See:

discover := config.Discoverer(pathConfigFile, cfg.Settings.Path, paths.ExternalInputs())

That could be changed to add another discover path when that setting is enabled.

@pkoutsovasilis
Copy link
Contributor

Ok I see what you mean and I agree /etc/elastic-agent/hints.input.d should go inside the config.Discoverer. Now about having that there based on the given configuration, seems ideal, but the Providers config structure/unpacking feels like miles apart from the config.Discoverer code, seems almost intentional. Thus not so sure how minimal of a change that would be if we do it this way.

@blakerouse
Copy link
Contributor

@pkoutsovasilis I agree it is miles apart, but we don't allow live reloading of the providers configuration. I think just a direct check in this code path for it, is the simplest way to make it work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants