Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[k8s] Add elastic-agent helm-chart #5331

Merged

Conversation

pkoutsovasilis
Copy link
Contributor

@pkoutsovasilis pkoutsovasilis commented Aug 21, 2024

What does this PR do?

This PR introduces a Helm Chart for deploying Elastic-Agent in Kubernetes

Why is it important?

By having an Elastic-Agent Helm Chart we get a simplified deployment on k8s

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • [ ] I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

Disruptive User Impact

N/A

How to test this PR locally

Go to deploy/helm/elastic-agent/examples

Related issues

@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Aug 21, 2024
@pkoutsovasilis pkoutsovasilis force-pushed the pkoutsovasilis/agent_helm_chart branch 2 times, most recently from 6f3d557 to 50f534b Compare August 26, 2024 14:26
@pkoutsovasilis pkoutsovasilis changed the title [WIP] Add elastic-agent helm-chart [k8s] Add elastic-agent helm-chart Aug 26, 2024
@pkoutsovasilis pkoutsovasilis marked this pull request as ready for review August 26, 2024 14:33
@pkoutsovasilis pkoutsovasilis requested a review from a team as a code owner August 26, 2024 14:33
@elasticmachine
Copy link
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@pkoutsovasilis pkoutsovasilis force-pushed the pkoutsovasilis/agent_helm_chart branch from 50f534b to 4e7b5b7 Compare August 26, 2024 21:28
@pkoutsovasilis pkoutsovasilis force-pushed the pkoutsovasilis/agent_helm_chart branch from 4e7b5b7 to dc88b1d Compare August 27, 2024 15:16
@elastic elastic deleted a comment from mergify bot Aug 27, 2024
@elastic elastic deleted a comment from mergify bot Aug 27, 2024
@elastic elastic deleted a comment from mergify bot Aug 27, 2024
@jlind23
Copy link
Contributor

jlind23 commented Aug 28, 2024

buildkite test it

@pkoutsovasilis pkoutsovasilis force-pushed the pkoutsovasilis/agent_helm_chart branch from dc88b1d to e27504e Compare August 28, 2024 06:30
@ycombinator ycombinator requested review from swiatekm and removed request for andrzej-stencel August 28, 2024 22:23
@ycombinator
Copy link
Contributor

@pkoutsovasilis Just a general comment: we plan to use this Helm chart to create an AWS Marketplace listing for Elastic Agent. To this end, I believe AWS has some requirements that the Helm chart must fulfill. Just wanted to bring that up in this PR in case it means some things in the chart need to change to meet these requirements. Thanks!

@pkoutsovasilis
Copy link
Contributor Author

@pkoutsovasilis Just a general comment: we plan to use this Helm chart to create an AWS Marketplace listing for Elastic Agent. To this end, I believe AWS has some requirements that the Helm chart must fulfill. Just wanted to bring that up in this PR in case it means some things in the chart need to change to meet these requirements. Thanks!

Hi @ycombinator 👋 I have read these requirements before opening this PR and I think that all of them are satisfied:

  1. it support both rootful and rootless deployments (we need the latter for AWS Marketplace)
  2. the Helm chart does not utilise anything from Helm that isn't supported by the EKS addon framework, such Helm hooks, lookup function, etc.
  3. although this is up for configuration but for the AWS Marketplace configuration we can offer deployment that have a Daemonset and thus meet the requirement of Traceable Deployments or Daemonsets

I don't see something that it isn't covered but maybe another extra pair of eyes on this PR would be helpful 🙂

Copy link
Contributor

@swiatekm swiatekm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like all the examples! I was wondering if we could also render the final manifests using the example configurations and keep these in version control. The Otel Helm Charts do something like this and it acts partially as a test, and partially as code review aid, where it lets you visualize how a change to the templates impacts the rendered manifests.

@pkoutsovasilis
Copy link
Contributor Author

pkoutsovasilis commented Aug 29, 2024

I like all the examples! I was wondering if we could also render the final manifests using the example configurations and keep these in version control. The Otel Helm Charts do something like this and it acts partially as a test, and partially as code review aid, where it lets you visualize how a change to the templates impacts the rendered manifests.

oh that's a great idea @swiatekm , ty for bringing it up! 🙂 I say definitely let's add them; Initially I am gonna push them for each example and maybe afterwards we could investigate how to utilise them in a more meaningful way in CI

UPDATE: added two mage targets, one that renders the k8s manifest for each example inside the Helm chart and one that lints the Helm chart. I have also added these two in the check-ci makefile recipe, thus, a change that causes either the Helm chart to fail linting or the rendered examples to be changed without any commit will be captured

Copy link
Contributor

@blakerouse blakerouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I have a few questions.

I did not review the templates. They are so large and complex that it didn't seem like I should sit there and go through each and every line. I think it would be better to just merge this and start using it and make incremental fixes as issues are identified.

I am good with giving a +1 after and review some of the answers to my questions.

deploy/helm/elastic-agent/Chart.yaml Show resolved Hide resolved
deploy/helm/elastic-agent/README.md.gotmpl Show resolved Hide resolved
deploy/helm/elastic-agent/values.yaml Outdated Show resolved Hide resolved
enabled: true
# -- elastic-agent version
# @section -- 3 - Elastic-Agent Configuration
version: 8.15.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this needs to get bump every version. We should ensure this is added to the automation for this repo so that it gets bumped when we do new releases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pkoutsovasilis are we intentionally not putting this under appVersion in Chart.yaml? Normally that indicates the Chart works with a wide array of application versions.

Copy link
Contributor Author

@pkoutsovasilis pkoutsovasilis Sep 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I do that intentionally, because this Helm chart should be able to work with multiple elastic-agent versions. This was at least part of the old requirements, however maybe it is non-relevant any more?! @strawgate any opinion on that?

# -- image configuration
# @section -- 3 - Elastic-Agent Configuration
image:
repository: docker.elastic.co/beats/elastic-agent-complete
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is -complete being used? synthetics? Seems like it would be better to not use that one unless they are wanting to do synthetics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we want to support that?! if we don't this can be default to something else or maybe switch to wolfi-based images?!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not default to -complete its much larger, just for synthetics. I would expect them to adjust the chart to use -complete if they want to run synthetics. Wolfi image could also be the default, no issues there from me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I am in a pickle here, for unprivileged agent we need version 8.16.0 and the same goes for wolfi image. Maybe I we should switch to 8.16.0-SNAPSHOT by default? PS: I did fallback to an image without -complete

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the open question here is how is this Helm chart getting released? Does it get updated out of sync of the Elastic Agent release or does it get released at the same time of the full stack release? If its stack release based then this should really sync with the version of the repository here and when its the main repository it should really be -SNAPSHOT as that is the only released version for running code in actual main. When 8.16 is created, then it should not use -SNAPSHOT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a quick thinking about that; I would like having this chart "following" the agent release to always have the appropriate agent image in it, but also be able to bump helm chart version to fix any agent-unrelated issue, e.g. issues with templating?!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we defer the release process question for now? It sounds like it could be its own major discussion, but whatever we decide is not going to require major changes to this Chart, which we're not publishing yet anyway. I think it might be better to merge this PR without answering this question, and do that in a separate issue. WDYT @blakerouse @pkoutsovasilis ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep I agree with deferring the release process question for now. That said, I added a helm:updateAgentVersion mage target that I deem it would be useful in this discussion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should defer. A follow up PR to change this is simple.

deploy/helm/elastic-agent/values.yaml Show resolved Hide resolved
Copy link
Contributor

@swiatekm swiatekm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one final comment about names, but otherwise I think we're good to go.

I would still like to move away from named templates as much as possible, and definitely add a schema file for validation, but that can happen in a follow-up.

@pkoutsovasilis pkoutsovasilis force-pushed the pkoutsovasilis/agent_helm_chart branch from 673385b to 882b6cf Compare September 5, 2024 03:19
@elastic elastic deleted a comment from mergify bot Sep 5, 2024
Copy link
Contributor

@swiatekm swiatekm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@blakerouse blakerouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I am ready for this to be merged.

@pkoutsovasilis
Copy link
Contributor Author

Looks good, I am ready for this to be merged.

haha!! this phrase sounded like "release the Kraken"! 😄 ty both @swiatekm and @blakerouse for the comments and the reviews. That said, the agent-extended-testing CI step, which is required, is failing but it is not something caused by this PR?!

Copy link
Contributor

mergify bot commented Sep 6, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b pkoutsovasilis/agent_helm_chart upstream/pkoutsovasilis/agent_helm_chart
git merge upstream/main
git push upstream pkoutsovasilis/agent_helm_chart

@blakerouse
Copy link
Contributor

@pkoutsovasilis It is very much that!

The failures are flaky tests that we are working on, also seems there is a conflict now. I would merge main into the PR and fix the conflict and have it re-run to get a green CI run.

Copy link

@pkoutsovasilis pkoutsovasilis merged commit 189ec2b into elastic:main Sep 6, 2024
13 checks passed
@ycombinator
Copy link
Contributor

@pkoutsovasilis If we want to release the Helm chart as a beta initially, is there any way we can mention this as part of the chart itself, such that users would clearly see it before they use it or right as they start to use it?

@pkoutsovasilis
Copy link
Contributor Author

@ycombinator just thinking out loud the available options here:

  1. I assume that we can push the helm chart with a beta suffix at the version and update the NOTES.txt accordingly here https://helm.elastic.co/
  2. Somebody that wants to try this helm chart can clone the repo and install the helm chart from the folder deploy/helm/elastic-agent
  3. We could utilise github pages on this repo and commit directly the helm chart package.tgz (that points to the SNAPSHOT agent image) [my least favourite option]

that's why a discussion about the Helm chart release process is wise to have before any next step 🙂

@swiatekm
Copy link
Contributor

swiatekm commented Sep 9, 2024

Helm Charts use semver 2.0 for versioning, so it's simple enough to just designate our releases as betas by setting it to va.b.c-beta.d or something similar. This makes Helm automatically treat it as a prerelease. This will cause them not to show up in searches without the --devel flag, amongst other consequences.

@ycombinator
Copy link
Contributor

Thanks @pkoutsovasilis and @swiatekm. I've created #5485 to continue the discussion on the beta marking and to track the implementation as well.

Likewise, I have created #5486 to resume the discussion about the release process that we deferred from this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-skip enhancement New feature or request skip-changelog Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Introduce a Helm chart for Kubernetes monitoring with Elastic Agent
7 participants