Label regex persistence #399

Merged
jsirianni merged 1 commit into file-input-label-regex from file-input-label-regex-persist
Aug 21, 2021

@jsirianni jsirianni commented Aug 20, 2021

Description of Changes

This PR merges into the file-input-label-regex branch, which is already approved.

  • Move fingerprint.Labels to Reader.HeaderLabels
    • This seems like a more appropriate place considering the Reader type is what is written to the database (which includes the Fingerprint type)
  • Add logic to ensure Reader.HeaderLabels are persisted to the database by adding a deep copy to the Reader.Copy method
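
The deep copy described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the field types (HeaderLabels as a map[string]string, a Fingerprint holding FirstBytes) and the nil handling are assumptions made for the example.

```go
package main

import "fmt"

// Fingerprint is a simplified stand-in for the real type (assumed shape).
type Fingerprint struct {
	FirstBytes []byte
}

// Copy returns a Fingerprint that does not share its byte slice with f.
func (f *Fingerprint) Copy() *Fingerprint {
	if f == nil {
		return nil
	}
	buf := make([]byte, len(f.FirstBytes))
	copy(buf, f.FirstBytes)
	return &Fingerprint{FirstBytes: buf}
}

// Reader is a simplified stand-in for the type that gets persisted.
// HeaderLabels being a map[string]string is an assumption.
type Reader struct {
	Fingerprint  *Fingerprint
	Offset       int64
	HeaderLabels map[string]string
}

// Copy duplicates the Reader, including a deep copy of HeaderLabels, so
// later changes to one Reader's labels never leak into the other.
func (r *Reader) Copy() *Reader {
	// A nil source map becomes an empty, non-nil map in the copy.
	labels := make(map[string]string, len(r.HeaderLabels))
	for k, v := range r.HeaderLabels {
		labels[k] = v
	}
	return &Reader{
		Fingerprint:  r.Fingerprint.Copy(),
		Offset:       r.Offset,
		HeaderLabels: labels,
	}
}

func main() {
	orig := &Reader{
		Fingerprint:  &Fingerprint{FirstBytes: []byte("#Software: example")},
		Offset:       42,
		HeaderLabels: map[string]string{"Software": "example"},
	}
	dup := orig.Copy()
	dup.HeaderLabels["Software"] = "changed"
	// The original keeps its label because the map was copied, not shared.
	fmt.Println(orig.HeaderLabels["Software"], dup.HeaderLabels["Software"])
}
```

Assigning the map directly instead of copying it would make both Readers share one map, so the copy would not be independent of the original.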

Motivation

On branch file-input-label-regex, everything is working except label persistence. When the agent is restarted, all new log entries are missing their HeaderLabels. This PR ensures header labels are persisted with the Reader.

Results

  1. Start the agent with a new --database file. The agent reads to the end of the input file, and all entries have the correct header labels.
  2. Add a new entry to the input file. The agent reads the new entry with the correct header labels.
  3. Stop the agent.
  4. Start the agent. It does not read any entries because it picks up where it left off.
  5. Add a new entry to the file while the agent is running. The new entry has the correct header labels.
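
The stop/start steps above depend on the labels surviving a round trip through the database. A rough sketch of that round trip follows, assuming reader state is serialized as JSON and using an in-memory byte slice in place of the real database. The PR text only says the Reader "is written to the database", so the encoding, the helper names encodeReaders and decodeReaders, and the trimmed-down Reader type are all hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Reader is a trimmed-down stand-in holding only the state relevant here.
type Reader struct {
	Offset       int64
	HeaderLabels map[string]string
}

// encodeReaders serializes the known readers, as would happen before the
// agent stops (JSON is an assumption for this sketch).
func encodeReaders(readers []*Reader) ([]byte, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(readers); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeReaders restores the readers, as would happen when the agent starts.
func decodeReaders(data []byte) ([]*Reader, error) {
	var readers []*Reader
	if err := json.NewDecoder(bytes.NewReader(data)).Decode(&readers); err != nil {
		return nil, err
	}
	return readers, nil
}

func main() {
	saved, err := encodeReaders([]*Reader{
		{Offset: 128, HeaderLabels: map[string]string{"Software": "example"}},
	})
	if err != nil {
		panic(err)
	}
	restored, err := decodeReaders(saved)
	if err != nil {
		panic(err)
	}
	// Both the offset and the header labels are available after the "restart".
	fmt.Println(restored[0].Offset, restored[0].HeaderLabels)
}
```

Because HeaderLabels is an exported field on the persisted type, it travels with the offset, so entries read after a restart can still be labeled without re-reading the file header.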

Please check that the PR fulfills these requirements

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • Add a changelog entry (for non-trivial bug fixes / features)
  • CI passes

@jsirianni jsirianni requested a review from djaglowski August 20, 2021 21:15
@djaglowski

Log Files  Logs / Second  CPU Avg (%)  CPU Avg Δ (%)  Memory Avg (MB)  Memory Avg Δ (MB)
1          1000           1.482744     -0.05186057    128.27425        -5.725357
1          5000           5.08621      +0.08618593    137.42955        -0.8580322
1          10000          10.551906    +0.051709175   146.72656        +1.3983002
1          50000          50.74157     +0.5167923     178.33041        -0.536499
1          100000         98.43236     +4.1196136     225.88618        -4.9846344
10         100            2.0345273    +0.10346496    133.47899        +0.9594574
10         500            6.6726856    +0.68980694    138.269          -3.6584015
10         1000           11.896595    +1.0689716     148.64911        +2.094818
10         5000           56.691612    +3.81213       180.90517        +1.036499
10         10000          111.15871    +1.1993561     229.78381        -1.8988342

codecov bot commented Aug 20, 2021

Codecov Report

Merging #399 (33e6a8f) into file-input-label-regex (a251b3c) will decrease coverage by 0.02%.
The diff coverage is 14.29%.

@@                    Coverage Diff                     @@
##           file-input-label-regex     #399      +/-   ##
==========================================================
- Coverage                   73.06%   73.04%   -0.02%     
==========================================================
  Files                         124      124              
  Lines                        8041     8043       +2     
==========================================================
  Hits                         5875     5875              
- Misses                       1662     1663       +1     
- Partials                      504      505       +1     
Impacted Files                              Coverage Δ
operator/builtin/input/file/fingerprint.go  90.48% <ø> (-0.43%) ⬇️
operator/builtin/input/file/reader.go       60.87% <14.29%> (-0.61%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a251b3c...33e6a8f.

@djaglowski djaglowski left a comment

This makes a lot of sense. Tinkering with the fingerprint was clearly not ideal. I'm glad you identified a better way.

@jsirianni jsirianni merged commit 05515ba into file-input-label-regex Aug 21, 2021
@jsirianni jsirianni deleted the file-input-label-regex-persist branch August 21, 2021 03:15
jsirianni pushed a commit that referenced this pull request Aug 21, 2021
* add label regex param, for parsing file headers

* make readHeaders private. do not use reader's fingerprint, instead open the file directly and parse headers

* add label_regex.yaml to test configs

* combine label_regex length checks

* call readHeaders from readHeaders, to avoid calling readHeaders on every poll cycle

* add type consumerFunc func(context.Context, []byte) error so we can share the same reader for reading the headers and file

* do not modify offset as it was never set

* enhance regex check to allow reversed capture groups

* file_input: Added optional LabelRegex parameter

* move fingerprint header labels to Reader type. Add logic to ensure HeaderLabels are persisted to the database (#399)