Cursor input skeleton #19378

Merged
urso merged 2 commits into elastic:master from cursor-input-skeleton
Jun 29, 2020

@urso urso commented Jun 24, 2020

  • Enhancement

What does this PR do?

This PR is part of introducing a new input architecture in Filebeat. The current state of the full implementation can be seen [here](https://github.com/urso/beats/tree/fb-input-v2-combined/filebeat/input/v2), along with [sample inputs based on the new API](https://github.com/urso/beats/tree/fb-input-v2-combined/filebeat/features/input).

The full list of changes will include:
- Introduce v2 API interfaces
- Introduce a [compatibility layer](https://github.com/urso/beats/tree/fb-input-v2-combined/filebeat/input/v2/compat) to integrate the API with existing functionality
- Introduce helpers for writing [stateless](https://github.com/urso/beats/blob/fb-input-v2-combined/filebeat/input/v2/input-stateless/stateless.go) inputs
- Introduce helpers for writing [inputs that store a state](https://github.com/urso/beats/tree/fb-input-v2-combined/filebeat/input/v2/input-cursor) between restarts
- Integrate the new API with [existing inputs and modules](https://github.com/urso/beats/blob/fb-input-v2-combined/filebeat/beater/filebeat.go#L301) in Filebeat

This change introduces the skeleton and documentation with details for cursor-based inputs. Future updates will add the actual implementation and tests.

Why is it important?

It is part of the Filebeat input v2 API.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding changes to the default configuration files
  • [ ] I have added tests that prove my fix is effective or that my feature works
  • [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jun 24, 2020
@urso urso added Filebeat Filebeat Project:Filebeat-Input-v2 review Team:Services (Deprecated) Label for the former Integrations-Services team v7.9.0 labels Jun 24, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-services (Team:Services)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jun 24, 2020
@elasticmachine
Collaborator

elasticmachine commented Jun 24, 2020

💚 Build Succeeded

Build stats

  • Build Cause: [urso commented: jenkins run the tests please]

  • Start Time: 2020-06-29T18:39:07.862+0000

  • Duration: 80 min 36 sec

Test stats 🧪

Test Results
Failed 0
Passed 543
Skipped 127
Total 670

@urso urso requested a review from kvch June 25, 2020 11:40
// sources ([]Source) that it has read from the configuration object, and the
// actual Input that will be used to collect events from each configured
// source.
// When Run a go-routine will be started per configured source. If two inputs have
Contributor

🎉

"github.com/elastic/go-concert/unison"
)

// cleaner removes finished entries from the registry file.
Contributor

@kvch kvch Jun 25, 2020

If I understand correctly, the finished states are not claimed by any input and there is no pending state update. So this means that as long as Filebeat is running, it must claim the resource to persist its state. Does this mean that Filebeat cannot close an input source temporarily, e.g. a journal to wait until new entries show up otherwise its state is lost? Or is this going to implement functionality similar to those of the options clean_* for log input?

Author

> finished states are not claimed by any input and there is no pending state update.

Correct.

> So this means that as long as Filebeat is running, it must claim the resource to persist its state.

The resource is claimed by an active input. If there is no active input (e.g. autodiscovery closed the input after the container has been deleted, or the configuration file has changed between restarts), it is 'free' and might be removed by the cleaning process.

> Does this mean that Filebeat cannot close an input source temporarily,

More or less correct. With the implementation of this input manager we add a TTL to each entry in the store. Unclaimed resources are removed if latest update + TTL < now.

This input manager requires sources to be configured statically and upfront based on the configuration. Internally resources can't be released and reclaimed without an external signal.
But:

  • the input.Run method is allowed to close connections, wait, and reopen. The resource is claimed for as long as the go-routine is alive (Run did not return).
  • We might consider adding a 'flag' to a resource saying that it cannot be cleaned yet. This flag would be active in memory only and would be reset when the Beat restarts.

Filebeat can have different input manager instances and even different input manager implementations that can all coexist and provide different coordination behavior. For the log input we cannot configure []Source upfront, as we need a discovery mechanism (file watcher) that adds and removes sources dynamically. Here we have two options: enhance this input manager, or develop a second input manager that can handle dynamic sources. For dynamic sources we should also consider some signaling like 'source removed' or 'source renamed', so that we can implement close_removed and clean_removed on top of these signals.

Almost all stateful Filebeat inputs we have configure the source statically at init time. This includes the Windows event logs and the journald input (configure system_logs, path, or a filename). In the latter case the journald libraries do the heavy lifting for us (based on our configured source).
AFAICT the only special input is the log input.

> Or is this going to implement functionality similar to those of the options clean_* for log input?

Yes. The TTL and cleaner are similar to clean_inactive. This input manager does not have a clean_removed, as the sources are 'static' and we have no way to tell it that a resource is gone for good.

For an input manager that supports dynamic sources and has a file watcher, we could emit a remove immediately, even cancelling pending update operations (or set the TTL to 0 and wait for the resource to be 'finished').

For an example usage check out the journald input implementation: https://github.com/urso/beats/blob/fb-input-v2-combined/filebeat/features/input/journald/input.go
The journald input creates its own InputManager that coordinates journald input instances only.

@urso urso added the needs_backport PR is waiting to be backported to other branches. label Jun 25, 2020
@urso
Author

urso commented Jun 29, 2020

jenkins run the tests please

@urso urso merged commit 13633ce into elastic:master Jun 29, 2020
@urso urso deleted the cursor-input-skeleton branch June 29, 2020 21:34
v1v added a commit to v1v/beats that referenced this pull request Jul 2, 2020
…ne-beats

* upstream/master: (105 commits)
  ci: enable packaging job (elastic#19536)
  ci: disable upstream trigger on PRs for the packaging job (elastic#19490)
  Implement memlog on-disk handling (elastic#19408)
  fix go.mod for PR elastic#19423 (elastic#19521)
  [MetricBeat] add param `aws_partition` to support aws-cn, aws-us-gov regions (elastic#19423)
  Input v2 stateless manager (elastic#19406)
  Input v2 compatibility layer (elastic#19401)
  [Elastic Agent] Fix artifact downloading to allow endpoint-security to be downloaded (elastic#19503)
  fix: ignore target changes on scans (elastic#19510)
  Add more helpers to pipeline/testing package (elastic#19405)
  Report dependencies in CSV format (elastic#19506)
  [Filebeat] Fix reference leak in TCP and Unix socket inputs (elastic#19459)
  Cursor input skeleton (elastic#19378)
  Add changelog. (elastic#19495)
  [DOC] Typo in Kerberos (elastic#19265)
  Remove accidentally commited unused NOTICE template (elastic#19485)
  [Elastic Agent] Support the install, control, and uninstall of Endpoint (elastic#19248)
  [Filebeat][httpjson] Add split_events_by config setting (elastic#19246)
  ci: disabling packaging job until we fix it (elastic#19481)
  Fix golang.org/x/tools to release1.13 (elastic#19478)
  ...
urso pushed a commit to urso/beats that referenced this pull request Jul 7, 2020
(cherry picked from commit 13633ce)
@urso urso removed the needs_backport PR is waiting to be backported to other branches. label Jul 7, 2020
urso pushed a commit that referenced this pull request Jul 8, 2020
melchiormoulin pushed a commit to melchiormoulin/beats that referenced this pull request Oct 14, 2020
Labels
Filebeat Filebeat Project:Filebeat-Input-v2 review skip-test-plan Team:Services (Deprecated) Label for the former Integrations-Services team v7.9.0
4 participants