feat: Add use_batch_format for HTTP output plugin #8184
Conversation
If a change doesn't benefit many users, it doesn't make sense for us to merge it. Why do you need to send records non-batched? It must be bad for performance. I saw that you asked us not to ask, but we won't merge unless there's a good reason that is generally useful. Does it help you debug or log something?
We deliver the metrics to a business unit that uses logstash/elasticsearch. They do not want to process the metrics from a JSON array, as this would be an exception to their input processing. (They, and we, accept the performance penalty, as we are not going to send a significant amount of data.) The default (batched format) isn't touched, and the code change is small; actually, it's one additional condition that shouldn't sacrifice much performance for users who do not use it. This change also makes the HTTP output plugin more compatible with the file output plugin. And yes, it even helped me in understanding and debugging HTTP-related issues.
Why not move the data from Telegraf into a dedicated pipeline and split one event (the incoming JSON array) into separate events (one per metric) there? Logstash seems to be a better place to do this. Moreover, this is a fairly simple task - https://www.elastic.co/guide/en/logstash/current/plugins-filters-split.html.
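For reference, the split-filter approach suggested above might look roughly like the following sketch. The field name `message` and the target field `metrics` are assumptions for illustration, not taken from this thread:

```text
filter {
  json {
    source => "message"   # parse the incoming JSON array sent by telegraf
    target => "metrics"
  }
  split {
    field => "metrics"    # emit one Logstash event per array element
  }
}
```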
From a technical point of view I fully agree, and I would never have had the idea to use a single HTTP request per event (metric). But integration into an existing infrastructure requires us to do so. For this reason we implemented (copied from the file output plugin) the unbatched output. I believe the comment in the README and the example config make clear that this unbatched output should not be used without good reason. The impact of this change on existing use cases (batched output) should be minimal (one additional conditional). Merging this change would greatly improve the chance of integrating telegraf into an existing large-scale infrastructure.
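For context, the resulting option in the HTTP output configuration might look roughly like this sketch. The option name comes from the PR title; the endpoint URL is hypothetical, and the surrounding fields follow the usual Telegraf HTTP output layout:

```toml
[[outputs.http]]
  url = "http://logstash.example.com:8080/telegraf"  # hypothetical endpoint
  method = "POST"
  data_format = "json"

  ## When false, send one HTTP request per metric instead of one request
  ## per batch (matching the file output plugin's behaviour).
  use_batch_format = false
```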
Is there any chance of getting this PR merged into the upstream sources? It would greatly improve the chance of integrating telegraf into a legacy environment.
I think Heiko is only trying to have a similar behaviour like here:
The unbatched JSON output seems to be a valid use case that is already implemented in telegraf via the file plugin. I would argue for merging it, as it works quite well for its intended purpose.
I added a unit test for the HTTP batched/unbatched output. Hopefully this increases the chance of getting this enhancement merged upstream.
Rebased to the latest master. |
Please review and merge into upstream master. We still use this code in production and would be more than happy to have it included in your upstream source, as this would greatly simplify our build pipeline. Besides, we still think this is a valuable option for debugging and for integrating into other environments (which can't handle batched HTTP output).
Hi @sspaink, I'd like to bring this PR to your attention; it is rebased onto the current master now. Having it integrated would help the deployment of Telegraf in a large environment where, for several (legacy) reasons, the HTTP data has to be sent unbatched.
Rebased onto the current master and did a successful test run.
Hi, is there anything I can do? |
Hi Upstream, |
@HeikoSchlittermann sorry for the delay in response, I've marked the PR as `ready for final review` and we will review it soon.
Cool, thank you.
Greetings from Dresden/Germany
--
Heiko
One comment about naming, but @reimda needs to re-review as well
@HeikoSchlittermann - I realize this has taken a year to get reviewed, but I am also still questioning who would actually use this, and where, outside of your single use case. Given that, I would suggest you carry this as an external plugin rather than merging it in. I want to avoid creating confusion and potential performance issues for users. What do you think?
@powersj Thank you for your response.
Creating an external plugin for a small enhancement to an existing plugin seems like wasted effort, at least to me. Of course, if you reject this extension we either need to maintain our own fork of telegraf, or we can spend the time to turn it into an external plugin, depending on what our management decides. In either case, the enhancement won't be available to the public then (for internal management/licensing reasons). I'm not predicting the future, but I can imagine that one day somebody else will do about the same job, and IMHO that's not how open source is supposed to work.
(On management/licensing: I can easily convince management that we're contributing to open source and therefore have to make the change public. If we build this as our "own piece of software", I can't force them to release it to the public.)
Co-Authored-By: Heiko Schlittermann <hs@schlittermann.de>
Co-authored-by: reimda <reimda@users.noreply.github.com>
Thanks @reimda, I updated the docs as required, rebased onto master, and ran the test suite. Looks good from my POV.
Thank you for merging and thank you for your great work @telegraf. |
Code shamelessly stolen from the file output plugin.
Co-authored-by: Heiko Schlittermann hs@schlittermann.de
Required for all PRs:
On unit tests: there is no existing unit test for the default (batched) output format anywhere, but I did manual testing using a simple "echo" HTTP server.
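The "echo" HTTP server used for manual testing is not shown in the thread; a minimal sketch of that idea might look like this. Everything here (handler name, the `received` list, the example metric bodies) is an assumption for illustration; it simply records and echoes each POST body, so batched (one JSON array per request) vs. unbatched (one request per metric) behaviour is easy to observe:

```python
# Minimal echo HTTP server sketch (hypothetical, not the one from the thread):
# records each POST body and echoes it back, making it easy to see how many
# requests an HTTP output sends and what each body contains.
import http.server
import threading
import urllib.request

received = []  # bodies of all POST requests, in arrival order

class EchoHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode()
        received.append(body)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(body.encode())  # echo the body back

    def log_message(self, *args):  # keep the sketch quiet
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/"

# Unbatched behaviour: one request per metric. A batched sender would
# instead POST a single JSON array in one request.
for metric in ('{"name":"cpu"}', '{"name":"mem"}'):
    urllib.request.urlopen(
        urllib.request.Request(url, data=metric.encode(), method="POST")
    )

server.shutdown()
print(len(received))  # prints 2: one body per unbatched request
```

With a batched sender the same two metrics would arrive as a single request containing a JSON array, which is exactly the difference the option toggles.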
It would be great if you could include this patch, as we need it in production, and you would make our lives a lot easier. Please do not ask why we need to send the records non-batched; this is a requirement from the side processing the data.