-
Notifications
You must be signed in to change notification settings - Fork 154
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[k8s] Add elastic-agent helm-chart #5331
[k8s] Add elastic-agent helm-chart #5331
Conversation
6f3d557
to
50f534b
Compare
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
50f534b
to
4e7b5b7
Compare
4e7b5b7
to
dc88b1d
Compare
buildkite test it |
dc88b1d
to
e27504e
Compare
@pkoutsovasilis Just a general comment: we plan to use this Helm chart to create an AWS Marketplace listing for Elastic Agent. To this end, I believe AWS has some requirements that the Helm chart must fulfill. Just wanted to bring that up in this PR in case it means some things in the chart need to change to meet these requirements. Thanks! |
Hi @ycombinator 👋 I have read these requirements before opening this PR and I think that all of them are satisfied:
I don't see something that it isn't covered but maybe another extra pair of eyes on this PR would be helpful 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like all the examples! I was wondering if we could also render the final manifests using the example configurations and keep these in version control. The Otel Helm Charts do something like this and it acts partially as a test, and partially as code review aid, where it lets you visualize how a change to the templates impacts the rendered manifests.
oh that's a great idea @swiatekm , ty for bringing it up! 🙂 I say definitely let's add them; Initially I am gonna push them for each example and maybe afterwards we could investigate how to utilise them in a more meaningful way in CI UPDATE: added two mage targets, one that renders the k8s manifest for each example inside the Helm chart and one that lints the Helm chart. I have also added these two in the check-ci makefile recipe, thus, a change that causes either the Helm chart to fail linting or the rendered examples to be changed without any commit will be captured |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I have a few questions.
I did not review the templates. They are so large and complex that it didn't seem like I should sit there and go through each and every line. I think it would be better to just merge this and start using it and make incremental fixes as issues are identified.
I am good with giving a +1 after and review some of the answers to my questions.
deploy/helm/elastic-agent/examples/kubernetes-only-logs/agent-kubernetes.yaml
Outdated
Show resolved
Hide resolved
enabled: true | ||
# -- elastic-agent version | ||
# @section -- 3 - Elastic-Agent Configuration | ||
version: 8.15.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this needs to get bump every version. We should ensure this is added to the automation for this repo so that it gets bumped when we do new releases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pkoutsovasilis are we intentionally not putting this under appVersion
in Chart.yaml? Normally that indicates the Chart works with a wide array of application versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I do that intentionally, because this Helm chart should be able to work with multiple elastic-agent versions. This was at least part of the old requirements, however maybe it is non-relevant any more?! @strawgate any opinion on that?
# -- image configuration | ||
# @section -- 3 - Elastic-Agent Configuration | ||
image: | ||
repository: docker.elastic.co/beats/elastic-agent-complete |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is -complete
being used? synthetics
? Seems like it would be better to not use that one unless they are wanting to do synthetics
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we want to support that?! if we don't this can be default to something else or maybe switch to wolfi-based images?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not default to -complete
its much larger, just for synthetics. I would expect them to adjust the chart to use -complete
if they want to run synthetics. Wolfi image could also be the default, no issues there from me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I am in a pickle here, for unprivileged agent we need version 8.16.0 and the same goes for wolfi image. Maybe I we should switch to 8.16.0-SNAPSHOT by default? PS: I did fallback to an image without -complete
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the open question here is how is this Helm chart getting released? Does it get updated out of sync of the Elastic Agent release or does it get released at the same time of the full stack release? If its stack release based then this should really sync with the version of the repository here and when its the main
repository it should really be -SNAPSHOT
as that is the only released version for running code in actual main
. When 8.16
is created, then it should not use -SNAPSHOT
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a quick thinking about that; I would like having this chart "following" the agent release to always have the appropriate agent image in it, but also be able to bump helm chart version to fix any agent-unrelated issue, e.g. issues with templating?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we defer the release process question for now? It sounds like it could be its own major discussion, but whatever we decide is not going to require major changes to this Chart, which we're not publishing yet anyway. I think it might be better to merge this PR without answering this question, and do that in a separate issue. WDYT @blakerouse @pkoutsovasilis ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep I agree with deferring the release process question for now. That said, I added a helm:updateAgentVersion mage target that I deem it would be useful in this discussion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we should defer. A follow up PR to change this is simple.
…asilis/agent_helm_chart
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one final comment about names, but otherwise I think we're good to go.
I would still like to move away from named templates as much as possible, and definitely add a schema file for validation, but that can happen in a follow-up.
…helm_chart # Conflicts: # NOTICE.txt # go.mod # go.sum
673385b
to
882b6cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I am ready for this to be merged.
haha!! this phrase sounded like "release the Kraken"! 😄 ty both @swiatekm and @blakerouse for the comments and the reviews. That said, the agent-extended-testing CI step, which is required, is failing but it is not something caused by this PR?! |
This pull request is now in conflicts. Could you fix it? 🙏
|
@pkoutsovasilis It is very much that! The failures are flaky tests that we are working on, also seems there is a conflict now. I would merge main into the PR and fix the conflict and have it re-run to get a green CI run. |
…helm_chart # Conflicts: # go.sum
|
@pkoutsovasilis If we want to release the Helm chart as a beta initially, is there any way we can mention this as part of the chart itself, such that users would clearly see it before they use it or right as they start to use it? |
@ycombinator just thinking out loud the available options here:
that's why a discussion about the Helm chart release process is wise to have before any next step 🙂 |
Helm Charts use semver 2.0 for versioning, so it's simple enough to just designate our releases as betas by setting it to |
Thanks @pkoutsovasilis and @swiatekm. I've created #5485 to continue the discussion on the beta marking and to track the implementation as well. Likewise, I have created #5486 to resume the discussion about the release process that we deferred from this PR. |
What does this PR do?
This PR introduces a Helm Chart for deploying Elastic-Agent in Kubernetes
Why is it important?
By having an Elastic-Agent Helm Chart we get a simplified deployment on k8s
Checklist
[ ] I have made corresponding change to the default configuration files[ ] I have added an entry in./changelog/fragments
using the changelog toolDisruptive User Impact
N/A
How to test this PR locally
Go to
deploy/helm/elastic-agent/examples
Related issues