libbeat: add support for defining analyzers in-line in fields.yml files #28926

Merged (2 commits) on Nov 15, 2021

efd6 (Contributor) commented Nov 11, 2021

What does this PR do?

This adds support for defining custom text analyzers in fields.yml files. For example:

- key: powershell
  title: PowerShell module
  description: >
    These are the event fields specific to the module for the Microsoft-Windows-PowerShell/Operational and Windows PowerShell logs.
  release: beta
  analyzer:
    powershell_script_analyzer:
      type: pattern
      pattern: "[\\W&&[^-]]+"
  fields:
  ...

Why is it important?

Not being able to define custom analyzers is a blocker for processing some documents that contain syntactically meaningful, non-standard token structures (for example, captured script text as shown above).

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
    - [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

No specific recommendations.

How to test this PR locally

Running go test in the relevant packages tests this change.

Related issues

Use cases

Use case is shown above.

Screenshots

N/A

Logs

N/A

efd6 force-pushed the libbeat/inlineanalyzer branch from 2022b2b to 2e12248 on November 11, 2021 05:41
efd6 requested review from andrewkroh and marc-gr on November 11, 2021 05:42
efd6 marked this pull request as ready for review on November 11, 2021 05:43

elasticmachine (Collaborator) commented

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

elasticmachine (Collaborator) commented Nov 11, 2021

💚 Build Succeeded

Build stats

  • Start Time: 2021-11-15T21:39:25.895+0000

  • Duration: 103 min 35 sec

  • Commit: 69a09cb

Test stats 🧪

Test Results
Failed 0
Passed 54249
Skipped 5345
Total 59594

💚 Flaky test report

Tests succeeded.

efd6 (Contributor, Author) commented Nov 11, 2021

It's not obvious to me where documentation for this should go. So suggestions welcomed.

andrewkroh (Member) commented

> It's not obvious to me where documentation for this should go. So suggestions welcomed.

Personally if I were looking for info on how to use it I would be reading the Field struct or the Analyzer struct so I suggest some godocs in one of those places.

Now I also see we have docs in https://www.elastic.co/guide/en/beats/devguide/current/event-fields-yml.html. That's probably what most contributors read rather than reading Go source.

andrewkroh (Member) left a comment

LGTM

efd6 (Contributor, Author) commented Nov 15, 2021

> Now I also see we have docs in https://www.elastic.co/guide/en/beats/devguide/current/event-fields-yml.html.

That's perfect. Thanks.

> That's probably what most contributors read rather than reading Go source.

:sadpanda:

mergify bot (Contributor) commented Nov 15, 2021

This pull request now has conflicts. Could you fix them? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b libbeat/inlineanalyzer upstream/libbeat/inlineanalyzer
git merge upstream/master
git push upstream libbeat/inlineanalyzer

efd6 force-pushed the libbeat/inlineanalyzer branch from 2e12248 to 69a09cb on November 15, 2021 21:39

efd6 (Contributor, Author) commented Nov 15, 2021

@andrewkroh PTAL for docs.

andrewkroh (Member) left a comment

LGTM

efd6 merged commit 62ec678 into elastic:master on Nov 15, 2021
mergify bot pushed a commit that referenced this pull request Nov 15, 2021
efd6 deleted the libbeat/inlineanalyzer branch on November 15, 2021 23:37
efd6 added a commit that referenced this pull request Nov 16, 2021
…es (#28926) (#28981)

(cherry picked from commit 62ec678)

Co-authored-by: Dan Kortschak <90160302+efd6@users.noreply.github.com>
Co-authored-by: Dan Kortschak <dan.kortschak@elastic.co>
Successfully merging this pull request may close these issues.

[libbeat] Support custom analyzers in fields.yml
3 participants