-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Security Solution] Change query builder so N exception items don't nest N levels deep #72224
[Security Solution] Change query builder so N exception items don't nest N levels deep #72224
Conversation
}); | ||
|
||
describe('when "exclude" is false', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed because exclude
is no longer used. 👍
dateFormatTZ: 'Zulu', | ||
}; | ||
|
||
const enabledFilters = ((filters as unknown) as Filter[]).filter((f) => !isFilterDisabled(f)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be good to have some extra validation here. We could be explicit about which fields we need Filter
to contain, since it's so loosely defined atm. At this point, enabledFilters
could really be anything and we won't have a good idea of why things break, if they do. I know this PR is just modifying existing code, but maybe could look at this as a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that would be better. I know getQueryFilter is used in a few different places, I haven't looked into why it needs to pass in Partial<Filter> instead of a complete
Filter`, but it seems like something to fix.
|
||
const enabledFilters = ((filters as unknown) as Filter[]).filter((f) => !isFilterDisabled(f)); | ||
return buildEsQuery(indexPattern, queries, enabledFilters, config); | ||
return buildEsQuery(indexPattern, initialQuery, enabledFilters, config); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic seems sound 👍
@elasticmachine merge upstream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (But I'm not an expert in this area or been that involved so caveat emptor with my 👍 )
@@ -48,27 +49,76 @@ export const getQueryFilter = ( | |||
*/ | |||
const exceptionQueries = buildExceptionListQueries({ language: 'kuery', lists }); | |||
if (exceptionQueries.length > 0) { | |||
// Assume that `indices.query.bool.max_clause_count` is at least 1024 (the default value), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice comment 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Focused mainly on the changes to the exceptions build query file. Was gonna say to remove the language
param used to build the exceptions since we should just always build them in KQL (since it supports nested), but you already have that in a TODO. Thanks so much for the work here!
language: Language; | ||
}): string => { | ||
const { field, entries } = item; | ||
const { field, entries: subentries } = entry; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the subentries
will look to use that in future PRs for clarity on what it refers to.
for (let index = 0; index < exceptionQueries.length; index += chunkSize) { | ||
const exceptionQueriesChunk = exceptionQueries.slice(index, index + chunkSize); | ||
const esQueryChunk = buildEsQuery(indexPattern, exceptionQueriesChunk, [], config); | ||
const filterChunk: Filter = { | ||
meta: { | ||
alias: null, | ||
negate: false, | ||
disabled: false, | ||
}, | ||
query: { | ||
bool: { | ||
should: esQueryChunk.bool.filter, | ||
}, | ||
}, | ||
}; | ||
chunkedFilters.push(filterChunk); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to rewrite this with a map
? Tendency to prefer a functional style for situations like this.
}); | ||
}); | ||
|
||
test('it should work with a list with multiple items', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think this test case could have a more descriptive title. Something like, "exception list with multiple exception items with the same entry".
}); | ||
}); | ||
|
||
test('it should work with a list with multiple items', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think this test case could have a more descriptive title. Something like, "exception list with multiple exception items with the same entry".
{ | ||
query: formattedQuery, | ||
return exceptionItems.map((exceptionItem) => { | ||
return { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we can probably remove this return statement.
const exceptionItemEntries = entries.map((entry) => { | ||
return buildEntry({ entry, language }); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think we can remove the return statement here and rewrite as
const exceptionItemEntries = entries.map((entry) => buildEntry({ entry, language }));
…est N levels deep (elastic#72224) * Change query builder so N exceptions don't nest N levels deep * Fix tests and clarify function naming * Rename evaluateEntry to buildEntry for consistency * Remove duplicate tests * more test fixes * Add tests with multiple exception list items * Chunk exception list items in query to support up to 1000000 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
…est N levels deep (elastic#72224) * Change query builder so N exceptions don't nest N levels deep * Fix tests and clarify function naming * Rename evaluateEntry to buildEntry for consistency * Remove duplicate tests * more test fixes * Add tests with multiple exception list items * Chunk exception list items in query to support up to 1000000 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
…est N levels deep (#72224) (#72867) * Change query builder so N exceptions don't nest N levels deep * Fix tests and clarify function naming * Rename evaluateEntry to buildEntry for consistency * Remove duplicate tests * more test fixes * Add tests with multiple exception list items * Chunk exception list items in query to support up to 1000000 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
…est N levels deep (#72224) (#72868) * Change query builder so N exceptions don't nest N levels deep * Fix tests and clarify function naming * Rename evaluateEntry to buildEntry for consistency * Remove duplicate tests * more test fixes * Add tests with multiple exception list items * Chunk exception list items in query to support up to 1000000 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
💚 Build SucceededBuild metricsasync chunks size
History
To update your PR or re-run it, just comment with: |
Summary
Currently, the usage of buildEsQuery with a single large KQL string causes a deeply nested Elasticsearch query to be generated. The builder generates an AST from a boolean expression of the form
A OR B OR C OR D ...
, and rather than detecting that A, B, C, and D can be put in an array in a singleshould
block, it creates a set of nestedshould
blocks as deep as the number of expressions being OR'd together. In our case, the number of expressions being OR'd is the number of exception items. This eventually causes Elasticsearch to fail on the search when the nesting becomes too deep.This PR changes the logic so that every exception item being OR'd together will be in the same array without unnecessary nesting. A further improvement would be using similar logic on the entries within an exception, which can suffer from the same problem with the AST, however the number of entries per exception has a much lower practical limit.
Additional refactoring for next PR:
language
option frombuildExceptionListQueries
exceptionFilter
to separate functionManual test process:
For maintainers