From 6673976f6240b70ab7a4c2487cb93e2780ff46d5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?No=C3=A9mi=20V=C3=A1nyi?= Date: Fri, 23 Apr 2021 13:52:32 +0200 Subject: [PATCH 01/11] Update documentation of `filestream` input * How is it different from log input? * Section about parsers * Add `ignore_inactive` option * Add `resend_on_touch` option --- .../input-filestream-file-options.asciidoc | 56 +++++++++++- .../input-filestream-reader-options.asciidoc | 89 +++++++++++++++++++ .../docs/inputs/input-filestream.asciidoc | 23 +++-- 3 files changed, 161 insertions(+), 7 deletions(-) diff --git a/filebeat/docs/inputs/input-filestream-file-options.asciidoc b/filebeat/docs/inputs/input-filestream-file-options.asciidoc index b0ced1eab5bd..4bf2ea5e34c1 100644 --- a/filebeat/docs/inputs/input-filestream-file-options.asciidoc +++ b/filebeat/docs/inputs/input-filestream-file-options.asciidoc @@ -37,6 +37,24 @@ a `gz` extension: See <> for a list of supported regexp patterns. +===== `prospector.scanner.include_files` + +A list of regular expressions to match the files that you want {beatname_uc} to +include. By default no files are excluded. This option is the counterpart of +`prospector.scanner.exclude_files`. + +The following example configures {beatname_uc} to include files under `/var/log`: + +["source","yaml",subs="attributes"] +---- +{beatname_lc}.inputs: +- type: {type} + ... + prospector.scanner.include_files: ['/var/log/.*'] +---- + +See <> for a list of supported regexp patterns. + ===== `prospector.scanner.symlinks` The `symlinks` option allows {beatname_uc} to harvest symlinks in addition to @@ -57,6 +75,12 @@ This is, for example, the case for Kubernetes log files. Because this option may lead to data loss, it is disabled by default. +===== `prospector.scanner.resend_on_touch` + +If this option is enabled a file is resent if its size has not changed +but its modification time has changed to a later time than before. 
+It is disabled by default to avoid accidentally resending files. + [float] [id="{beatname_lc}-input-{type}-scan-frequency"] @@ -117,6 +141,36 @@ If a file that's currently being harvested falls under `ignore_older`, the harvester will first finish reading the file and close it after `close.on_state_change.inactive` is reached. Then, after that, the file will be ignored. +[float] +[id="{beatname_lc}-input-{type}-ignore-inactive"] +===== `ignore_inactive` + +If this option is enabled, {beatname_uc} ignores every file that has not been +updated since the selected time. Possible options are `since_first_start` and +`since_last_start`. The first option ignores every files that has not been updated since +the first start of {beatname_uc}. It is useful when the Beat might be restarted +due to configuration changes or due to a failure. The second option is the same +as the `tail_files` option of `log` input, it reads from files that has been updated +since the start of {beatname_uc}. + +The files affected by this setting fall into two categories: + +* Files that were never harvested +* Files that were harvested but weren't updated since `ignore_inactive`. + +For files which were never seen before, the offset state is set to the end of +the file. If a state already exist, the offset is not changed. In case a file is +updated again later, reading continues at the set offset position. + +The setting relies on the modification time of the file to +determine if a file is ignored. If the modification time of the file is not +updated when lines are written to a file (which can happen on Windows), the +setting may cause {beatname_uc} to ignore files even though content was added +at a later time. + +To remove the state of previously harvested files from the registry file, use +the `clean_inactive` configuration option. + [float] [id="{beatname_lc}-input-{type}-close-options"] ===== `close.*` @@ -218,7 +272,7 @@ single log event to a new file. This option is disabled by default. 
[float] [id="{beatname_lc}-input-{type}-close-timeout"] -===== `close.reader.timeout` +===== `close.reader.after_interval` WARNING: Only use this option if you understand that data loss is a potential side effect. Another side effect is that multiline events might not be diff --git a/filebeat/docs/inputs/input-filestream-reader-options.asciidoc b/filebeat/docs/inputs/input-filestream-reader-options.asciidoc index 8b365f1ede25..edf819d9a934 100644 --- a/filebeat/docs/inputs/input-filestream-reader-options.asciidoc +++ b/filebeat/docs/inputs/input-filestream-reader-options.asciidoc @@ -141,3 +141,92 @@ The default is 16384. The maximum number of bytes that a single log message can have. All bytes after `mesage_max_bytes` are discarded and not sent. The default is 10MB (10485760). + +[float] +===== `parsers` + +This option expects a list of parsers the log line has to go through. + +Avaliable parsers: +- `multiline` +- `ndjson` + +In this example a JSON object that spans 3 lines are aggregated into a single +event and then parsed by `ndjson`. + +["source","yaml",subs="attributes"] +---- +{beatname_lc}.inputs: +- type: {type} + ... + parsers: + - multiline: + type: counter + lines_count: 3 + keep_newline: false + - ndjson: + keys_under_root: true +---- + +See the available parser settings in detail below. + +[float] +===== `multiline` + +Options that control how {beatname_uc} deals with log messages that span +multiple lines. See <> for more information about +configuring multiline options. + +[float] +===== `ndjson` + +These options make it possible for {beatname_uc} to decode logs structured as +JSON messages. {beatname_uc} processes the logs line by line, so the JSON +decoding only works if there is one JSON object per message. + +The decoding happens before line filtering. You can combine JSON +decoding with filtering if you set the `message_key` option. 
This +can be helpful in situations where the application logs are wrapped in JSON +objects, as happens, for example, with Docker. + +Example configuration: + +[source,yaml] +---- +- ndjson: + keys_under_root: true + add_error_key: true + message_key: log +---- + +*`keys_under_root`*:: By default, the decoded JSON is placed under a "json" key +in the output document. If you enable this setting, the keys are copied to the top +level of the output document. The default is false. + +*`overwrite_keys`*:: If `keys_under_root` and this setting are enabled, then the +values from the decoded JSON object overwrite the fields that {beatname_uc} +normally adds (type, source, offset, etc.) in case of conflicts. + +*`expand_keys`*:: If this setting is enabled, {beatname_uc} will recursively +de-dot keys in the decoded JSON, and expand them into a hierarchical object +structure. For example, `{"a.b.c": 123}` would be expanded into `{"a":{"b":{"c":123}}}`. +This setting should be enabled when the input is produced by an +https://github.com/elastic/ecs-logging[ECS logger]. + +*`add_error_key`*:: If this setting is enabled, {beatname_uc} adds +"error.message" and "error.type: json" keys in case of JSON unmarshalling errors +or when a `message_key` is defined in the configuration but cannot be used. + +*`message_key`*:: An optional configuration setting that specifies a JSON key on +which to apply the line filtering and multiline settings. If specified, the key +must be at the top level in the JSON object and the value associated with the +key must be a string, otherwise no filtering or multiline aggregation will +occur. + +*`document_id`*:: An optional configuration setting that specifies the JSON key to +set the document ID. If configured, the field will be removed from the original +JSON document and stored in `@metadata._id`. + +*`ignore_decoding_error`*:: An optional configuration setting that specifies if +JSON decoding errors should be logged or not. 
If set to true, errors will not +be logged. The default is false. diff --git a/filebeat/docs/inputs/input-filestream.asciidoc b/filebeat/docs/inputs/input-filestream.asciidoc index be121a4fd7eb..9c04421cf3e8 100644 --- a/filebeat/docs/inputs/input-filestream.asciidoc +++ b/filebeat/docs/inputs/input-filestream.asciidoc @@ -10,10 +10,21 @@ experimental[] ++++ Use the `filestream` input to read lines from active log files. It is the -new, improved alternative to the `log` input. However, a few feature are -missing from it, e.g. `multiline` or other special parsing capabilities. -These missing options are probably going to be added again. We strive to -achieve feature parity, if possible. +new, improved alternative to the `log` input. It comes with various improvements +to the existing input: + +1. Checking of `close_*` options happens out of band. Thus, if an output is blocked +{beatname_uc} is able to close the reader and it avoids keeping too many files open. + +2. Detailed metrics are available for all files that match the `paths` configuration +regardless of the `harvester_limit`. This way, you can keep track of all files +even ones that are not actively read. + +3. The order of `parsers` is configurable. So it is possible to parse JSON lines and than +aggragate the contents into a multiline event. + +4. Some position updates and metadata changes no longer depend on the publishing pipeline. +If a the pipeline is blocked some changes are still applied to the registry. To configure this input, specify a list of glob-based <> that must be crawled to locate and fetch the log lines. @@ -158,10 +169,10 @@ on. If enabled it expands a single `**` into a 8-level deep `*` pattern. This feature is enabled by default. Set `prospector.scanner.recursive_glob` to false to disable it. 
-include::../inputs/input-filestream-reader-options.asciidoc[] - include::../inputs/input-filestream-file-options.asciidoc[] +include::../inputs/input-filestream-reader-options.asciidoc[] + [id="{beatname_lc}-input-{type}-common-options"] include::../inputs/input-common-options.asciidoc[] From 78965d8f0ea48ff182918230d4d6e6af8bc1e186 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?No=C3=A9mi=20V=C3=A1nyi?= Date: Mon, 26 Apr 2021 16:49:17 +0200 Subject: [PATCH 02/11] address review notes --- .../input-filestream-file-options.asciidoc | 17 +++++++++++------ .../input-filestream-reader-options.asciidoc | 11 ++++++----- filebeat/docs/inputs/input-filestream.asciidoc | 9 +++++++++ 3 files changed, 26 insertions(+), 11 deletions(-) diff --git a/filebeat/docs/inputs/input-filestream-file-options.asciidoc b/filebeat/docs/inputs/input-filestream-file-options.asciidoc index 4bf2ea5e34c1..c665462c1755 100644 --- a/filebeat/docs/inputs/input-filestream-file-options.asciidoc +++ b/filebeat/docs/inputs/input-filestream-file-options.asciidoc @@ -40,19 +40,25 @@ See <> for a list of supported regexp patterns. ===== `prospector.scanner.include_files` A list of regular expressions to match the files that you want {beatname_uc} to -include. By default no files are excluded. This option is the counterpart of +include. If a list of regexes is provided, only the files that are allowed by +the patterns are harvested. + +By default no files are excluded. This option is the counterpart of `prospector.scanner.exclude_files`. -The following example configures {beatname_uc} to include files under `/var/log`: +The following example configures {beatname_uc} to exlude files that +are not under `/var/log`: ["source","yaml",subs="attributes"] ---- {beatname_lc}.inputs: - type: {type} ... - prospector.scanner.include_files: ['/var/log/.*'] + prospector.scanner.include_files: ['^/var/log/.*'] ---- +NOTE: Patterns should start with `^` in case of absolute paths. 
+ See <> for a list of supported regexp patterns. ===== `prospector.scanner.symlinks` @@ -149,9 +155,8 @@ If this option is enabled, {beatname_uc} ignores every file that has not been updated since the selected time. Possible options are `since_first_start` and `since_last_start`. The first option ignores every files that has not been updated since the first start of {beatname_uc}. It is useful when the Beat might be restarted -due to configuration changes or due to a failure. The second option is the same -as the `tail_files` option of `log` input, it reads from files that has been updated -since the start of {beatname_uc}. +due to configuration changes or due to a failure. The second option tells +the Beat to read from files that has been updated since the start. The files affected by this setting fall into two categories: diff --git a/filebeat/docs/inputs/input-filestream-reader-options.asciidoc b/filebeat/docs/inputs/input-filestream-reader-options.asciidoc index edf819d9a934..9e3a124c2956 100644 --- a/filebeat/docs/inputs/input-filestream-reader-options.asciidoc +++ b/filebeat/docs/inputs/input-filestream-reader-options.asciidoc @@ -151,8 +151,9 @@ Avaliable parsers: - `multiline` - `ndjson` -In this example a JSON object that spans 3 lines are aggregated into a single -event and then parsed by `ndjson`. +In this example, {beatname_uc} is reading multiline messages that consist of 3 lines +and encapsulated in single-line JSON objects. +The multiline message is stored under the key `msg`. ["source","yaml",subs="attributes"] ---- @@ -160,12 +161,12 @@ event and then parsed by `ndjson`. - type: {type} ... parsers: + - ndjson: + keys_under_root: true + message_key: msg - multiline: type: counter lines_count: 3 - keep_newline: false - - ndjson: - keys_under_root: true ---- See the available parser settings in detail below. 
diff --git a/filebeat/docs/inputs/input-filestream.asciidoc b/filebeat/docs/inputs/input-filestream.asciidoc index 9c04421cf3e8..c49b93ba88f0 100644 --- a/filebeat/docs/inputs/input-filestream.asciidoc +++ b/filebeat/docs/inputs/input-filestream.asciidoc @@ -26,6 +26,15 @@ aggragate the contents into a multiline event. 4. Some position updates and metadata changes no longer depend on the publishing pipeline. If a the pipeline is blocked some changes are still applied to the registry. +5. Only the most recent updates are serialized to the registry, in contrast the `log` input +has to serialize the complete registry on each ACK from the outputs. Making the registry updates +much quicker. + +6. The input ensures that only offsets updates are written to the registry append only log. +The `log` writes the complete file state. + +7. Stale entries can be removed stale from registry, even if there is no active input. + To configure this input, specify a list of glob-based <> that must be crawled to locate and fetch the log lines. From 569a6c55dc80c3bb2c7ca868e9ca0e3787cc7d98 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?No=C3=A9mi=20V=C3=A1nyi?= Date: Tue, 4 May 2021 11:28:44 +0200 Subject: [PATCH 03/11] Update filebeat/docs/inputs/input-filestream-file-options.asciidoc Co-authored-by: Brandon Morelli --- filebeat/docs/inputs/input-filestream-file-options.asciidoc | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/filebeat/docs/inputs/input-filestream-file-options.asciidoc b/filebeat/docs/inputs/input-filestream-file-options.asciidoc index c665462c1755..1eaf7d1fb8f2 100644 --- a/filebeat/docs/inputs/input-filestream-file-options.asciidoc +++ b/filebeat/docs/inputs/input-filestream-file-options.asciidoc @@ -46,7 +46,7 @@ the patterns are harvested. By default no files are excluded. This option is the counterpart of `prospector.scanner.exclude_files`. 
-The following example configures {beatname_uc} to exlude files that +The following example configures {beatname_uc} to exclude files that are not under `/var/log`: ["source","yaml",subs="attributes"] @@ -452,4 +452,3 @@ Set the location of the marker file the following way: ---- file_identity.inode_marker.path: /logs/.filebeat-marker ---- - From 4a00ccb72d355158f4e738e918a7e41a257ed473 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?No=C3=A9mi=20V=C3=A1nyi?= Date: Tue, 4 May 2021 11:28:52 +0200 Subject: [PATCH 04/11] Update filebeat/docs/inputs/input-filestream-file-options.asciidoc Co-authored-by: Brandon Morelli --- filebeat/docs/inputs/input-filestream-file-options.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/filebeat/docs/inputs/input-filestream-file-options.asciidoc b/filebeat/docs/inputs/input-filestream-file-options.asciidoc index 1eaf7d1fb8f2..ff26cdbd7347 100644 --- a/filebeat/docs/inputs/input-filestream-file-options.asciidoc +++ b/filebeat/docs/inputs/input-filestream-file-options.asciidoc @@ -153,7 +153,7 @@ harvester will first finish reading the file and close it after If this option is enabled, {beatname_uc} ignores every file that has not been updated since the selected time. Possible options are `since_first_start` and -`since_last_start`. The first option ignores every files that has not been updated since +`since_last_start`. The first option ignores every file that has not been updated since the first start of {beatname_uc}. It is useful when the Beat might be restarted due to configuration changes or due to a failure. The second option tells the Beat to read from files that has been updated since the start. 
From c1f433b6da1c1e265e2c3a778c798e9228aa3080 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?No=C3=A9mi=20V=C3=A1nyi?= Date: Tue, 4 May 2021 11:29:01 +0200 Subject: [PATCH 05/11] Update filebeat/docs/inputs/input-filestream-file-options.asciidoc Co-authored-by: Brandon Morelli --- filebeat/docs/inputs/input-filestream-file-options.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/filebeat/docs/inputs/input-filestream-file-options.asciidoc b/filebeat/docs/inputs/input-filestream-file-options.asciidoc index ff26cdbd7347..6c6eec6ee081 100644 --- a/filebeat/docs/inputs/input-filestream-file-options.asciidoc +++ b/filebeat/docs/inputs/input-filestream-file-options.asciidoc @@ -156,7 +156,7 @@ updated since the selected time. Possible options are `since_first_start` and `since_last_start`. The first option ignores every file that has not been updated since the first start of {beatname_uc}. It is useful when the Beat might be restarted due to configuration changes or due to a failure. The second option tells -the Beat to read from files that has been updated since the start. +the Beat to read from files that have been updated since its start. 
The files affected by this setting fall into two categories: From 793c585264237c3347bcb017fe3b08ac571b85be Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?No=C3=A9mi=20V=C3=A1nyi?= Date: Tue, 4 May 2021 11:29:10 +0200 Subject: [PATCH 06/11] Update filebeat/docs/inputs/input-filestream-file-options.asciidoc Co-authored-by: Brandon Morelli --- filebeat/docs/inputs/input-filestream-file-options.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/filebeat/docs/inputs/input-filestream-file-options.asciidoc b/filebeat/docs/inputs/input-filestream-file-options.asciidoc index 6c6eec6ee081..9f584b9ba83a 100644 --- a/filebeat/docs/inputs/input-filestream-file-options.asciidoc +++ b/filebeat/docs/inputs/input-filestream-file-options.asciidoc @@ -155,7 +155,7 @@ If this option is enabled, {beatname_uc} ignores every file that has not been updated since the selected time. Possible options are `since_first_start` and `since_last_start`. The first option ignores every file that has not been updated since the first start of {beatname_uc}. It is useful when the Beat might be restarted -due to configuration changes or due to a failure. The second option tells +due to configuration changes or a failure. The second option tells the Beat to read from files that have been updated since its start. 
The files affected by this setting fall into two categories: From 8259a1af7ac7164579327bff31223a073597a830 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?No=C3=A9mi=20V=C3=A1nyi?= Date: Tue, 4 May 2021 11:29:19 +0200 Subject: [PATCH 07/11] Update filebeat/docs/inputs/input-filestream-file-options.asciidoc Co-authored-by: Brandon Morelli --- filebeat/docs/inputs/input-filestream-file-options.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/filebeat/docs/inputs/input-filestream-file-options.asciidoc b/filebeat/docs/inputs/input-filestream-file-options.asciidoc index 9f584b9ba83a..0cb482a83009 100644 --- a/filebeat/docs/inputs/input-filestream-file-options.asciidoc +++ b/filebeat/docs/inputs/input-filestream-file-options.asciidoc @@ -163,7 +163,7 @@ The files affected by this setting fall into two categories: * Files that were never harvested * Files that were harvested but weren't updated since `ignore_inactive`. -For files which were never seen before, the offset state is set to the end of +For files that were never seen before, the offset state is set to the end of the file. If a state already exist, the offset is not changed. In case a file is updated again later, reading continues at the set offset position. From 1491adbfaa480792a7937d1ad4a1dcbe3504d5db Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?No=C3=A9mi=20V=C3=A1nyi?= Date: Tue, 4 May 2021 11:29:27 +0200 Subject: [PATCH 08/11] Update filebeat/docs/inputs/input-filestream.asciidoc Co-authored-by: Brandon Morelli --- filebeat/docs/inputs/input-filestream.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/filebeat/docs/inputs/input-filestream.asciidoc b/filebeat/docs/inputs/input-filestream.asciidoc index c49b93ba88f0..a7a604646f25 100644 --- a/filebeat/docs/inputs/input-filestream.asciidoc +++ b/filebeat/docs/inputs/input-filestream.asciidoc @@ -17,7 +17,7 @@ to the existing input: {beatname_uc} is able to close the reader and it avoids keeping too many files open. 2. 
Detailed metrics are available for all files that match the `paths` configuration -regardless of the `harvester_limit`. This way, you can keep track of all files +regardless of the `harvester_limit`. This way, you can keep track of all files, even ones that are not actively read. 3. The order of `parsers` is configurable. So it is possible to parse JSON lines and than From 7ba1f7b94fc48577f3a7cb1aede52db32292e09c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?No=C3=A9mi=20V=C3=A1nyi?= Date: Tue, 4 May 2021 11:29:33 +0200 Subject: [PATCH 09/11] Update filebeat/docs/inputs/input-filestream.asciidoc Co-authored-by: Brandon Morelli --- filebeat/docs/inputs/input-filestream.asciidoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/filebeat/docs/inputs/input-filestream.asciidoc b/filebeat/docs/inputs/input-filestream.asciidoc index a7a604646f25..2b9b8529d471 100644 --- a/filebeat/docs/inputs/input-filestream.asciidoc +++ b/filebeat/docs/inputs/input-filestream.asciidoc @@ -20,8 +20,8 @@ to the existing input: regardless of the `harvester_limit`. This way, you can keep track of all files, even ones that are not actively read. -3. The order of `parsers` is configurable. So it is possible to parse JSON lines and than -aggragate the contents into a multiline event. +3. The order of `parsers` is configurable. So it is possible to parse JSON lines and then +aggregate the contents into a multiline event. 4. Some position updates and metadata changes no longer depend on the publishing pipeline. If a the pipeline is blocked some changes are still applied to the registry. 
From 0c4ab84fa711af96e61b4c0a050eee9eef1109c0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?No=C3=A9mi=20V=C3=A1nyi?= Date: Tue, 4 May 2021 11:30:32 +0200 Subject: [PATCH 10/11] Update filebeat/docs/inputs/input-filestream.asciidoc Co-authored-by: Brandon Morelli --- filebeat/docs/inputs/input-filestream.asciidoc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/filebeat/docs/inputs/input-filestream.asciidoc b/filebeat/docs/inputs/input-filestream.asciidoc index 2b9b8529d471..8c910174aed5 100644 --- a/filebeat/docs/inputs/input-filestream.asciidoc +++ b/filebeat/docs/inputs/input-filestream.asciidoc @@ -26,9 +26,9 @@ aggregate the contents into a multiline event. 4. Some position updates and metadata changes no longer depend on the publishing pipeline. If a the pipeline is blocked some changes are still applied to the registry. -5. Only the most recent updates are serialized to the registry, in contrast the `log` input -has to serialize the complete registry on each ACK from the outputs. Making the registry updates -much quicker. +5. Only the most recent updates are serialized to the registry. In contrast, the `log` input +has to serialize the complete registry on each ACK from the outputs. This makes the registry updates +much quicker with this input. 6. The input ensures that only offsets updates are written to the registry append only log. The `log` writes the complete file state. 
From 868471459b362e9b974004e4e3fc3df96a779dcf Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?No=C3=A9mi=20V=C3=A1nyi?= Date: Tue, 4 May 2021 11:30:49 +0200 Subject: [PATCH 11/11] Update filebeat/docs/inputs/input-filestream.asciidoc Co-authored-by: Brandon Morelli --- filebeat/docs/inputs/input-filestream.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/filebeat/docs/inputs/input-filestream.asciidoc b/filebeat/docs/inputs/input-filestream.asciidoc index 8c910174aed5..219a1e50d236 100644 --- a/filebeat/docs/inputs/input-filestream.asciidoc +++ b/filebeat/docs/inputs/input-filestream.asciidoc @@ -33,7 +33,7 @@ much quicker with this input. 6. The input ensures that only offsets updates are written to the registry append only log. The `log` writes the complete file state. -7. Stale entries can be removed stale from registry, even if there is no active input. +7. Stale entries can be removed from the registry, even if there is no active input. To configure this input, specify a list of glob-based <> that must be crawled to locate and fetch the log lines.
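As a rough illustration of the input this series documents, the following sketch combines a `paths` entry with the `parsers` example from patch 01/11 as revised in 02/11. The path `/var/log/app/*.log` is a hypothetical placeholder. The parser values are copied from the patch and are not recommended settings. The attribute placeholders used in the docs (`{beatname_lc}`, `{type}`) are written out as `filebeat` and `filestream` on the assumption that this is how they resolve for this input.

[source,yaml]
----
filebeat.inputs:
- type: filestream
  paths:
    # Hypothetical location; replace with the files you want to harvest.
    - /var/log/app/*.log
  parsers:
    # Parsers run in the order listed: each line is first decoded as JSON,
    # and the multiline aggregation is applied to the value under `msg`.
    - ndjson:
        keys_under_root: true
        message_key: msg
    # Every 3 consecutive lines are then combined into one event.
    - multiline:
        type: counter
        lines_count: 3
----

Swapping the two list entries would aggregate raw lines first and decode JSON afterwards, which is the ordering shown in the original version of the example in patch 01/11.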