Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SIEM][Detection Engine] Speeds up value list imports by enabling streaming of files. #70685

Merged
merged 14 commits into from
Jul 9, 2020

Conversation

FrankHassanabad
Copy link
Contributor

@FrankHassanabad FrankHassanabad commented Jul 3, 2020

Summary

  • Changes the value list imports to use a streaming in model
  • Adds a custom light hand spun multi-part parser for the incoming text
  • Adds a buffer pause and resume which continues to buffer the incoming data if an async event such as creating a list from the attachment file needs to happen but does not emit the lines until the resume continues.
  • Adds a data slicing if the buffer becomes larger than the maximum so that if we begin buffering too quickly within memory we don't blow up the limit of Elastic Search.
  • Adds unit tests

Checklist

@FrankHassanabad
Copy link
Contributor Author

@elasticmachine merge upstream

@FrankHassanabad
Copy link
Contributor Author

@elasticmachine merge upstream

@FrankHassanabad
Copy link
Contributor Author

@elasticmachine merge upstream

Copy link
Contributor

@rylnd rylnd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got a walkthrough of the streaming logic from @FrankHassanabad, and the code looks good!

I tested this against #67068, and it decreased import time of a 1.2kB file from 52s to 444ms. So yeah, this is an improvement to say the least 😉 .

});

readline.on('close', () => {
this.push(null);
});
}

public _read(): void {
// No operation but this is required to be implemented
public _read(): void {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was gonna say we should leave the comment, but the typescript error generated if one removes this method should be straightforward 👍

Copy link
Contributor

@yctercero yctercero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Saw Ryland pulled down to test, so was just looking at the code. This is really awesome!

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Build metrics

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@FrankHassanabad FrankHassanabad merged commit 3863921 into elastic:master Jul 9, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/siem (Team:SIEM)

@FrankHassanabad FrankHassanabad deleted the stream-imports branch July 9, 2020 02:15
@MindyRS MindyRS added the Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. label Sep 23, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release_note:enhancement Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:SIEM v7.9.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants