[Security Solution][Endpoint] Policy creation callback fixes + Improved error handling in user manifest loop #71269

madirey · 2020-07-09T15:52:28Z

Summary

Fixes manifest dispatch on policy creation
Refactors ManifestManager code to handle errors and promises properly.
Improves types for matcher functions.
Ensures that callback returns new policy even in the event of errors
Ensures that callback returns manifest even if new data cannot be retrieved
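For readers skimming the summary, the last two points can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `handleCreate`, `ManifestLoader`, and the simplified `NewPackageConfig` shape are hypothetical names, and the real callback does more work.

```typescript
// Hypothetical, simplified shapes; the real Kibana types carry many more fields.
interface PackageConfigInput {
  type: string;
  config: Record<string, unknown>;
}

interface NewPackageConfig {
  name: string;
  inputs: PackageConfigInput[];
}

type ManifestLoader = () => Promise<Record<string, unknown>>;

// Returns a copy of the incoming config with the freshly loaded manifest attached,
// or with the fallback manifest when loading throws or rejects.
async function handleCreate(
  newPackageConfig: NewPackageConfig,
  loadManifest: ManifestLoader,
  fallbackManifest: Record<string, unknown>
): Promise<NewPackageConfig> {
  let manifest = fallbackManifest;
  try {
    manifest = await loadManifest();
  } catch (err) {
    // Log and continue: policy creation should not be blocked by manifest errors.
    console.warn(`manifest refresh failed: ${String(err)}`);
  }
  return {
    ...newPackageConfig,
    inputs: [{ type: 'endpoint', config: { artifact_manifest: manifest } }],
  };
}
```

The error path and the success path share a single return statement, so the caller always gets a policy back.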

Checklist

Documentation was added for features that require explanation or tutorials
Unit or functional tests were updated or added to match the most common scenarios

For maintainers

This was checked for breaking API changes and was labeled appropriately

elasticmachine · 2020-07-09T15:52:31Z

Pinging @elastic/endpoint-app-team (Feature:Endpoint)

elasticmachine · 2020-07-09T15:52:32Z

Pinging @elastic/endpoint-response (Team:Endpoint Response)

nnamdifrankie · 2020-07-09T15:56:11Z

x-pack/plugins/security_solution/server/endpoint/ingest_integration.ts

+    try {
+      if (snapshot && snapshot.diffs.length > 0) {
+        // create new artifacts
+        const errors = await manifestManager.syncArtifacts(snapshot, 'add');


~~How does syncArtifacts relate to getPackageConfigCreateCallback? That appears to update something, perhaps a side effect?~~

Sorry, I see the method is handlePackageConfigCreate. There are a few conditional blocks that may need to be tested to make sure all branches function as expected. Is it possible to break this up so it is more testable and probably more maintainable?

@nnamdifrankie If new exception list entries were created, we may need to create new artifacts before sending the updated manifest. That's what syncArtifacts does.
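As a rough illustration of the "collect errors instead of throwing" shape that the diff above relies on (`const errors = await manifestManager.syncArtifacts(snapshot, 'add')`), here is a minimal sketch. The `ManifestDiff` fields and the `apply` parameter are assumptions made for the example; the real ManifestManager method has a different signature and talks to saved objects.

```typescript
// Hypothetical, simplified types for illustration only.
interface ManifestDiff {
  type: 'add' | 'delete';
  id: string;
}

interface ManifestSnapshot {
  diffs: ManifestDiff[];
}

// Applies every diff of the requested type and returns the errors that occurred,
// rather than rejecting on the first failure.
async function syncArtifacts(
  snapshot: ManifestSnapshot,
  diffType: 'add' | 'delete',
  apply: (id: string) => Promise<void>
): Promise<Error[]> {
  const errors: Error[] = [];
  for (const diff of snapshot.diffs) {
    if (diff.type !== diffType) continue;
    try {
      await apply(diff.id);
    } catch (err) {
      errors.push(err instanceof Error ? err : new Error(String(err)));
    }
  }
  return errors;
}
```

With this shape the caller can log the returned errors and still continue to build and return the policy.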

@XavierM and I have been talking about how to improve the way this callback happens... I think we will need some changes to make this more robust. Happy to discuss it @nnamdifrankie

I think a test around which branches are supposed to exist together will suffice for now. I'm not sure what the relationship is between the snapshot and the new package config.

@nnamdifrankie Just pushed some of the tests, working on more.

@nnamdifrankie @paul-tavares I've added several unit tests and have run through some manual tests. I think this fix will ensure that policy creation is not interrupted by any of the manifest processing code.

elasticmachine · 2020-07-09T20:41:57Z

Pinging @elastic/ingest-management (Team:Ingest Management)

madirey · 2020-07-10T02:26:53Z

@elasticmachine merge upstream

kibanamachine · 2020-07-10T04:43:17Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: 127178a

Build metrics

✅ unchanged

History

💔 Build #60565 failed 76939ee
💔 Build #60542 failed cbe39ec
💔 Build #60473 failed 62b1a2c
💔 Build #60448 failed d856cf9
💔 Build #60298 failed 53303ba

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

paul-tavares

Looks good 👍
All of my comments are minor in nature and optional. Also, I assume you (during dev) added a few throws in the code for testing purposes and that the callback always returned successfully 😁

My only concern overall is that we seem to be doing a lot of synchronous calls (await), which can extend the overall time it takes for a Package Config to be created in Ingest (as it is, it already takes a little bit of time due to the work it does with Packages).

FYI:

One of the things I thought about implementing in the Ingest code around these callbacks for Create Package Config was a way to "timeout" the returned Promise after x amount of time, so that no one callback can hold up the entire operation. (cc @jfsiii, @jen-huang, @nchaulet)
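The timeout idea could look something like the sketch below. `withTimeout` is a hypothetical helper, not an existing Ingest API, and falling back to the unmodified value on timeout or rejection is one possible policy among several.

```typescript
// Resolves with the callback's result, or with `fallback` if the callback takes
// longer than `ms` milliseconds or rejects. Hypothetical helper for illustration.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve) => {
    const timer = setTimeout(() => resolve(fallback), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      () => {
        clearTimeout(timer);
        resolve(fallback);
      }
    );
  });
}
```

A caller would wrap each registered callback, for example `withTimeout(callback(newPackageConfig), 5000, newPackageConfig)`, so that a slow callback yields the unmodified config instead of stalling creation. Note that the slow work keeps running in the background; this only stops waiting for it.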

paul-tavares · 2020-07-10T17:50:51Z

x-pack/plugins/ingest_manager/common/mocks.ts

+      version: '0.9.0',
+    },
+    inputs: [],
+  } as NewPackageConfig;


Suggestion: add the return type to the function definition instead of casting. Same below
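A small sketch of what this suggestion means in practice. Only `version: '0.9.0'` and `inputs: []` come from the diff above; the function name and the other fields are assumptions made so the example compiles on its own.

```typescript
// Hypothetical, trimmed-down version of the real type.
interface NewPackageConfig {
  name: string;
  enabled: boolean;
  package: { name: string; title: string; version: string };
  inputs: unknown[];
}

// Annotated return type: omitting a required field (for example `enabled`)
// becomes a compile error inside the function body.
const createNewPackageConfigMock = (): NewPackageConfig => {
  return {
    name: 'endpoint-1',
    enabled: true,
    package: { name: 'endpoint', title: 'Elastic Endpoint', version: '0.9.0' },
    inputs: [],
  };
};

// By contrast, `{ name: 'endpoint-1' } as NewPackageConfig` compiles even though
// required fields are missing, because a type assertion skips that check.
```

The annotation keeps the mock honest when the type gains new required fields, which a cast would silently hide.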

paul-tavares · 2020-07-10T17:56:53Z

x-pack/plugins/security_solution/server/endpoint/ingest_integration.ts

-      return updatedPackageConfig;
+    // Until we get the Default Policy Configuration in the Endpoint package,
+    // we will add it here manually at creation time.
+    if (newPackageConfig.inputs.length === 0) {


Personally - I would remove this if() and always insert our input into the Package Config.

History:

The if() is only here because at one point we thought that we would get data included in the Package Config at creation time by defining it in the Endpoint Package, which is why we were doing a conditional check.

@paul-tavares Will inputs ever have length > 0 here?

@madirey - For endpoint Package Configs, no. At the moment we only expect 1 input at most that includes all of our data.

paul-tavares · 2020-07-10T18:01:11Z

x-pack/plugins/security_solution/server/endpoint/ingest_integration.ts

+    let snapshot: ManifestSnapshot | null = null;
+    let success = true;
+    try {
+      // Try to get most up-to-date manifest data.


Is what's returned here different from what you retrieved above in manifestManager.getLastDispatchedManifest?

@paul-tavares Possibly. Perhaps we could look at renaming this function. The above builds a Manifest from the SO record of the last manifest dispatched. This function builds a manifest dynamically from the latest exception lists... there could be new items, so we try to get the most up-to-date data possible to avoid having to immediately re-dispatch.

@madirey thanks for explaining that. I'm not too familiar with the design around Lists, thus the questions 😃 .

I think there is a process in place that periodically (on a timer) goes through and picks up new versions of lists and re-generates the Manifest (which pushes updates to the Package Configs), right? If that's the case, I would suggest that we avoid the additional delay of explicitly triggering that process here, so that the overall Package Config creation does not incur a longer delay.

paul-tavares · 2020-07-10T18:13:49Z

x-pack/plugins/security_solution/server/endpoint/ingest_integration.ts


-    try {
+      return updatedPackageConfig;


This may just be a personal choice 😄
I find this return here, as well as the one inside of catch(), confusing because someone (a future "us") could add another return to finally {}, and that would be the value that is returned out of the function.

I would propose (for clarity) that the return be consolidated to one under the finally {} block.
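To make the concern concrete, here is a standalone sketch (not code from this PR) of how a return inside finally {} replaces the value returned from try {}, next to a single-return variant that avoids the ambiguity. Both function names are made up for the example.

```typescript
// A return inside `finally` overrides any earlier return from `try` or `catch`.
function returnsFromFinally(): string {
  try {
    return 'from try';
  } finally {
    // eslint-disable-next-line no-unsafe-finally
    return 'from finally'; // this is the value the caller receives
  }
}

// Single-return variant: each branch only assigns, and the function returns once.
function singleReturn(shouldThrow: boolean): string {
  let result = 'default';
  try {
    if (shouldThrow) throw new Error('boom');
    result = 'from try';
  } catch {
    result = 'from catch';
  }
  return result;
}
```

ESLint's `no-unsafe-finally` rule flags the first pattern, which is one way to guard against a future edit introducing it by accident.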

@paul-tavares It was an attempt to be sure that we return the data before doing other things... but we still don't have a good way to verify that the policy was actually created. We probably need to work together to figure out a better way to do this. Can we address in a follow-up to this PR? There are some important fixes here that will help us in the meantime.

@XavierM ^^

Agreed @madirey
My point really was that the finally {} block could actually change what the function returns (e.g. it could overwrite the return from either the catch() or the regular try {}).

Also - I'm tempted to suggest that if we refactor this, we split up the concerns into 2 callbacks: one for handling only adding the policy data and a second to handle only adding the manifest information, and we register both callbacks with Ingest. We can work on this post 7.9.
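A rough sketch of what the two-callback split might look like. `CallbackRegistry` is a hypothetical stand-in for the Ingest registration mechanism, and the input shapes are simplified assumptions, so treat this as an outline of the idea rather than the eventual design.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface PackageConfigInput {
  type: string;
  config: Record<string, unknown>;
}

interface NewPackageConfig {
  name: string;
  inputs: PackageConfigInput[];
}

type CreateCallback = (config: NewPackageConfig) => Promise<NewPackageConfig>;

// Stand-in for the Ingest side: runs callbacks in registration order,
// feeding each one the result of the previous callback.
class CallbackRegistry {
  private readonly callbacks: CreateCallback[] = [];

  register(callback: CreateCallback): void {
    this.callbacks.push(callback);
  }

  async run(config: NewPackageConfig): Promise<NewPackageConfig> {
    let current = config;
    for (const callback of this.callbacks) {
      current = await callback(current);
    }
    return current;
  }
}

// Concern 1: add only the default policy data.
const addPolicy: CreateCallback = async (config) => ({
  ...config,
  inputs: [{ type: 'endpoint', config: { policy: { value: 'default' } } }],
});

// Concern 2: add only the manifest information to the existing inputs.
const addManifest: CreateCallback = async (config) => ({
  ...config,
  inputs: config.inputs.map((input) => ({
    ...input,
    config: { ...input.config, artifact_manifest: { value: {} } },
  })),
});
```

Each callback can then be unit tested on its own, and a failure in the manifest callback can be isolated without touching the policy defaults.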

@paul-tavares As written, that should not happen, but I agree with the concerns. It's ugly.

I can remove the section that pulls the snapshot and tries to get the most up-to-date information. That comes with a few caveats though:

There's no guarantee that we will have already created the manifest when the callback runs, so we may have to send an empty artifact list. The policy returned will not be deterministic on the first policy creation (artifacts may or may not exist), so that will have unit testing implications as well.

If there have been new exception items added within the last minute, we won't pick them up.

In both of the cases above, we'll have to re-send the policy a minute later when the updates are detected. This is probably fine for now, but it may not be ideal when we have large deployments (could result in 2 back-to-back large rollouts if I'm not mistaken).

paul-tavares · 2020-07-10T18:20:16Z

x-pack/plugins/security_solution/server/endpoint/mocks.ts

@@ -6,6 +6,8 @@

 import { ILegacyScopedClusterClient, SavedObjectsClientContract } from 'kibana/server';
 import { loggingSystemMock, savedObjectsServiceMock } from 'src/core/server/mocks';
+// eslint-disable-next-line @kbn/eslint/no-restricted-paths


We may want to ask the Kibana Platform team to include this in src/core/server/mocks

@paul-tavares I see you approved, thank you! But let's follow up and figure out a better way to do this. +++

…ed error handling in user manifest loop (#71269) (#71376)

* Clean up matcher types
* Rework promise and error-handling in ManifestManager
* Write tests for ingest callback and ensure policy is returned when errors occur
* More tests for ingest callback
* Update tests
* Fix tests

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

* master: (78 commits) Bump lodash package version (elastic#71392) refactor: 💡 use allow-list in AppArch codebase (elastic#71400) improve bugfix 7198 test stability (elastic#71250) [Security Solution][Ingest Manager][Endpoint] Optional ingest manager (elastic#71198) [Metrics UI] Round metric threshold time buckets to nearest unit (elastic#71172) [Security Solution][Endpoint] Policy creation callback fixes + Improved error handling in user manifest loop (elastic#71269) [Security Solution] Allow to configure Event Renderers settings (elastic#69693) Fix a11y keyboard overlay (elastic#71214) [APM] UI text updates (elastic#71333) [Logs UI] Limit `extendDatemath` to valid ranges (elastic#71113) [SIEM] fix tooltip of notes (elastic#71342) address index templates feedback (elastic#71353) Upgrade EUI to v26.3.1 (elastic#70243) [build] Creates Linux aarch64 archive (elastic#69165) [SIEM][Detection Engine] Fixes skipped tests (elastic#71347) [SIEM][Detection Engine][Lists] Adds read_privileges route for lists and list items [kbn/optimizer] implement "requiredBundles" property of KP plugins (elastic#70911) [Security Solution][Exceptions] - Exceptions modal pt 2 (elastic#70886) [ML] DF Analytics: stop status polling when job stopped (elastic#71159) [SIEM][CASE] IBM Resilient Connector (elastic#66385) ...

madirey added 2 commits July 9, 2020 09:52

Clean up matcher types

6764d2c

Rework promise and error-handling in ManifestManager

53303ba

madirey added v8.0.0 release_note:skip Skip the PR/issue when compiling release notes Team:Endpoint Response Endpoint Response Team Feature:Endpoint Elastic Endpoint feature v7.9.0 labels Jul 9, 2020

madirey requested review from a team as code owners July 9, 2020 15:52

nnamdifrankie reviewed Jul 9, 2020

Write tests for ingest callback and ensure policy is returned when er…

d856cf9

…rors occur

madirey requested a review from a team July 9, 2020 20:41

botelastic bot added the Team:Fleet Team label for Observability Data Collection Fleet team label Jul 9, 2020

madirey changed the title ~~[Security Solution][Endpoint] Improved error handling in user manifest loop~~ [Security Solution][Endpoint] Ingest callback fixes + Improved error handling in user manifest loop Jul 9, 2020

madirey added 3 commits July 9, 2020 17:34

More tests for ingest callback

7519ee7

Merge master

62b1a2c

Update tests

cbe39ec

elasticmachine and others added 4 commits July 9, 2020 20:27

Merge branch 'master' into user-manifest-cleanup

76939ee

Fix tests

a399e5e

Merge branch 'master' of github.com:elastic/kibana into user-manifest…

c55ec69

…-cleanup

Merge branch 'user-manifest-cleanup' of github.com:madirey/kibana int…

127178a

…o user-manifest-cleanup

madirey changed the title ~~[Security Solution][Endpoint] Ingest callback fixes + Improved error handling in user manifest loop~~ [Security Solution][Endpoint] Policy creation callback fixes + Improved error handling in user manifest loop Jul 10, 2020

paul-tavares approved these changes Jul 10, 2020

madirey merged commit 3fc54e7 into elastic:master Jul 10, 2020

madirey mentioned this pull request Jul 10, 2020

[7.x] [Security Solution][Endpoint] Policy creation callback fixes + Improved error handling in user manifest loop (#71269) #71376

Merged
