[META][Discuss] add support for `input_type` in "dual modes" codecs #11885
Comments
If we take a bit of a step back, I think this is an issue about whether ownership of the record-separator belongs with the inputs or with the codecs. We currently have a mixed-ownership model, with some subset inputs claiming ownership (e.g., using You're right. This is complex. Perhaps instead of adding a config option, which would require users to track the complexity of our plugin-compatibility matrix, the codec plugin class could advertise its expectations, e.g.:
Which inputs could then use to properly prepare the payload that the codec requires. I also wonder what the effect would be of making our |
This is the return of the milling idea and the whole codec inconsistency discussion; there is some relevant information in #4858 and #5124. We have to separate a redesign discussion from an improvement discussion. I am trying to improve on what we have without breaking backward compatibility or having to wait for a major version upgrade to factor in these improvements. This does not exclude reopening the redesign discussion, but I think we should not mix the two too much because that would block possible short-term improvements.
I think that the selectable behaviour is a good starting point to rationalize the codecs. For the long-term discussion, however, I think we should keep the customization of the codec behaviour open. I don't know whether a declaration of expectations between the codec and the input is sufficient to automatically spare the user from choosing what to do.
When an input consumes a delimiter, it is lossy: a codec can usually assume the presence of a delimiter between calls to If we were to push this configurable behaviour to the input (e.g., whether or not to consume the delimiter), then I believe we would be on a better track for a future feature to make it automatic by enabling codecs to advertise their requirements and for inputs to provide what the codecs need. |
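The idea of a codec advertising that it needs a delimiter, and an input restoring the delimiter it consumed, could be sketched roughly as follows. This is only an illustration: `LineCodecSketch`, `expects_delimiter?` and `prepare_payload` are hypothetical names and are not part of Logstash or of any existing plugin.

```ruby
# Hypothetical sketch: none of these classes or methods exist in Logstash.
class LineCodecSketch
  # Hypothetical class-level declaration that an input could inspect.
  def self.expects_delimiter?
    true
  end

  def initialize(delimiter = "\n")
    @delimiter = delimiter
    @buffer = +""
  end

  # Buffers incoming data and yields every complete record seen so far.
  def decode(data)
    @buffer << data
    while (index = @buffer.index(@delimiter))
      record = @buffer.slice!(0, index + @delimiter.length)
      yield record.delete_suffix(@delimiter)
    end
  end
end

# Hypothetical input-side helper: the input has already split the records and
# dropped the delimiter, so it restores it only when the codec asks for it.
def prepare_payload(record, codec_class, delimiter = "\n")
  codec_class.expects_delimiter? ? record + delimiter : record
end

codec = LineCodecSketch.new
events = []
["first", "second"].each do |record|
  codec.decode(prepare_payload(record, LineCodecSketch)) { |line| events << line }
end
```

Without the restored delimiter, this buffering codec would hold on to `"first"` and `"second"` and never emit them, which is the lossy situation described in the comment above.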
I surveyed all input plugins to classify how they manipulate the data they hand to the codecs. Note that I did not verify whether any of these are broken, unsupported, etc. I just looked at all
Inputs
Observations:
Thoughts
Recommendations for BWC and minimal changes to the current design
I amended my thoughts on the |
TL;DR Recap
Goal
First Step
Second and Optional Step
Third and also Optional Step
Fourth and also Optional Step
As seen in logstash-plugins/logstash-codec-csv#8 and logstash-plugins/logstash-codec-multiline#63 some codecs would benefit from supporting two modes of operation for the 2 types of data that our input plugins can provide to codecs.
Our input plugins can provide two types of data for decoding:
- `file` input or the `http` input.
- `stdin` input or the `tcp` input.

The way we have been dealing with this situation has been to have two versions of the same codec, for example `json` and `json_lines`, or `plain` and `line`. Furthermore, to help deal with this confusion, we introduced the `fix_streaming_codecs` method to automagically swap these codecs depending on the input used.

logstash/logstash-core/lib/logstash/inputs/base.rb
Lines 145 to 159 in 196ec20
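As a rough illustration of the swap described above (not the referenced `base.rb` code, which is not reproduced here), the codec pairs named in this issue can be expressed as a simple lookup. `STREAMING_COUNTERPART` and `streaming_codec_for` are hypothetical names used only for this sketch.

```ruby
# Illustrative mapping only; the codec names come from the issue text,
# the constant and method names are assumptions made for this sketch.
STREAMING_COUNTERPART = {
  "plain" => "line",
  "json"  => "json_lines"
}.freeze

# Returns the codec name a streaming input would end up using.
# Codecs without a known streaming counterpart are returned unchanged.
def streaming_codec_for(configured_codec)
  STREAMING_COUNTERPART.fetch(configured_codec, configured_codec)
end

swapped = streaming_codec_for("json")
untouched = streaming_codec_for("csv")
```

The limitation this makes visible is that a codec with no separate streaming twin has nothing to be swapped to, which is the motivation for the proposal below.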
Until we figure out a whole new/better input/codec architecture, my proposal is to iteratively improve the current design with:
- `input_type` config option in codecs that can support both input types (similar to [WIP] support line delimited data logstash-plugins/logstash-codec-csv#8)

I am looking for comments/suggestions about this plan, and if we agree on the idea I will detail the steps for each iteration.
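To make the proposed `input_type` option more concrete, here is a minimal sketch of a codec that supports both kinds of input in one class. Everything in it is an assumption for illustration: the class name `DualModeCodecSketch`, the value names `"message"` and `"stream"`, and the comma-splitting `parse` stand-in are not taken from any existing Logstash codec.

```ruby
# Hypothetical sketch; not the API of any existing Logstash codec.
class DualModeCodecSketch
  INPUT_TYPES = %w[message stream].freeze # assumed value names

  def initialize(input_type: "message", delimiter: "\n")
    unless INPUT_TYPES.include?(input_type)
      raise ArgumentError, "unknown input_type: #{input_type}"
    end
    @input_type = input_type
    @delimiter = delimiter
    @buffer = +""
  end

  def decode(data, &block)
    if @input_type == "message"
      # The input already delivers one complete record per call.
      block.call(parse(data))
    else
      # The input delivers arbitrary chunks: buffer until a delimiter shows up.
      @buffer << data
      while (index = @buffer.index(@delimiter))
        record = @buffer.slice!(0, index + @delimiter.length)
        block.call(parse(record.delete_suffix(@delimiter)))
      end
    end
  end

  private

  # Stand-in for real decoding: splits one record into comma-separated fields.
  def parse(record)
    record.split(",")
  end
end

message_events = []
DualModeCodecSketch.new(input_type: "message").decode("a,b") { |e| message_events << e }

stream_events = []
stream_codec = DualModeCodecSketch.new(input_type: "stream")
["a,b\nc", ",d\n"].each { |chunk| stream_codec.decode(chunk) { |e| stream_events << e } }
```

In this sketch the same decoding logic is shared by both modes, and only the framing step differs, so a single codec could replace a pair such as `json` and `json_lines`.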