-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[eem] remove history transforms #193999
[eem] remove history transforms #193999
Conversation
🤖 GitHub commentsExpand to view the GitHub comments
Just comment with:
|
… src/core/server/integration_tests/ci_checks'
@@ -57,6 +52,18 @@ export const entityDefinitionSchema = z.object({ | |||
]) | |||
), | |||
installStartedAt: z.optional(z.string()), | |||
installedComponents: z.optional( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this allows cleaning up versions of definitions that contain history components
export const ENTITY_DEFAULT_METADATA_LIMIT = 1000; | ||
export const ENTITY_DEFAULT_LATEST_FREQUENCY = '1m'; | ||
export const ENTITY_DEFAULT_LATEST_SYNC_DELAY = '60s'; | ||
export const ENTITY_DEFAULT_METADATA_LIMIT = 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
took the opportunity to reduce the default limit
💛 Build succeeded, but was flaky
Failed CI StepsMetrics [docs]Async chunks
History
To update your PR or re-run it, just comment with: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
|
||
export const SO_ENTITY_DEFINITION_TYPE = 'entity-definition'; | ||
|
||
export const backfillInstalledComponents: SavedObjectModelDataBackfillFn< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat!
@jaredburgettelastic @hop-dev I'll be merging that change soon, could you please verify that it doesn’t break anything that would not be covered by tests ? |
@klacabane please go ahead and merge when you're ready, we have a big PR incoming and will fix any issues in that |
💛 Build succeeded, but was flaky
Failed CI StepsMetrics [docs]Async chunks
History
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some comments about changing the entity.id
generation with this change. Since this PR makes a significant change, I'd advocate for changing that too.
{ | ||
script: { | ||
description: 'Generated the entity.id field', | ||
source: cleanScript(` | ||
// This function will recursively collect all the values of a HashMap of HashMaps | ||
Collection collectValues(HashMap subject) { | ||
Collection values = new ArrayList(); | ||
// Iterate through the values | ||
for(Object value: subject.values()) { | ||
// If the value is a HashMap, recurse | ||
if (value instanceof HashMap) { | ||
values.addAll(collectValues((HashMap) value)); | ||
} else { | ||
values.add(String.valueOf(value)); | ||
} | ||
} | ||
return values; | ||
} | ||
|
||
// Create the string builder | ||
StringBuilder entityId = new StringBuilder(); | ||
|
||
if (ctx["entity"]["identity"] != null) { | ||
// Get the values as a collection | ||
Collection values = collectValues(ctx["entity"]["identity"]); | ||
|
||
// Convert to a list and sort | ||
List sortedValues = new ArrayList(values); | ||
Collections.sort(sortedValues); | ||
|
||
// Create comma delimited string | ||
for(String instanceValue: sortedValues) { | ||
entityId.append(instanceValue); | ||
entityId.append(":"); | ||
} | ||
|
||
// Assign the entity.id | ||
ctx["entity"]["id"] = entityId.length() > 0 ? entityId.substring(0, entityId.length() - 1) : "unknown"; | ||
} | ||
`), | ||
}, | ||
}, | ||
{ | ||
fingerprint: { | ||
fields: ['entity.id'], | ||
target_field: 'entity.id', | ||
method: 'MurmurHash3', | ||
}, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tommyers-elastic I kind of wonder if we shouldn't take this opportunity to simplify the entity.id
generation? I purpose we remove the hashing AND change the entity.id
to be:
{
set: {
field: 'entity.id',
value: definition.identityFields.map((identityField) => `{{entity.identity.${identityField.field)}}`).join(':'),
}
}
For something like ['service.name', 'service.environment']
it would look like {{entity.identity.service.name}}:{{entity.identity.service.environment}}
Thoughts @miltonhultgren @klacabane
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense and it is implemented in #193652. I'd love to keep these separate as this PR has been opened for a while now and all the lights are finally green
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
➕ ➕ lets get this merged!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Starting backport for target branches: 8.x https://github.com/elastic/kibana/actions/runs/11263190736 |
### Summary Remove history and backfill transforms, leaving latest transform in place. Notable changes to latest transform: - it does not read from history output anymore but source indices defined on the definition - it defines a `latest.lookbackPeriod` to limit the amount of data ingested, which defaults to 24h - each metadata aggregation now accepts a `metadata.aggregation.lookbackPeriod` which defaults to the `latest.lookbackPeriod` - `entity.firstSeenTimestamp` is removed. this should be temporary until we have a solution for elastic/elastic-entity-model#174 - latest metrics used to get the latest pre-computed value from history data, but is it now aggregating over the `lookbackPeriod` in the source indices (which can be filtered down with `metrics.filter`) - `latest` block on the entity definition is now mandatory --------- Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com> Co-authored-by: Mark Hopkin <mark.hopkin@elastic.co> (cherry picked from commit 8f8e988)
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions ?Please refer to the Backport tool documentation |
# Backport This will backport the following commits from `main` to `8.x`: - [[eem] remove history transforms (#193999)](#193999) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Kevin Lacabane","email":"kevin.lacabane@elastic.co"},"sourceCommit":{"committedDate":"2024-10-09T21:15:33Z","message":"[eem] remove history transforms (#193999)\n\n### Summary\r\n\r\nRemove history and backfill transforms, leaving latest transform in\r\nplace.\r\n\r\nNotable changes to latest transform:\r\n- it does not read from history output anymore but source indices\r\ndefined on the definition\r\n- it defines a `latest.lookbackPeriod` to limit the amount of data\r\ningested, which defaults to 24h\r\n- each metadata aggregation now accepts a\r\n`metadata.aggregation.lookbackPeriod` which defaults to the\r\n`latest.lookbackPeriod`\r\n- `entity.firstSeenTimestamp` is removed. this should be temporary until\r\nwe have a solution for\r\nhttps://github.com/elastic/elastic-entity-model/issues/174\r\n- latest metrics used to get the latest pre-computed value from history\r\ndata, but is it now aggregating over the `lookbackPeriod` in the source\r\nindices (which can be filtered down with `metrics.filter`)\r\n- `latest` block on the entity definition is now mandatory\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>\r\nCo-authored-by: Mark Hopkin <mark.hopkin@elastic.co>","sha":"8f8e9883e0a8e78a632418a0677980f758450351","branchLabelMapping":{"^v9.0.0$":"main","^v8.16.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","backport:prev-minor","ci:project-deploy-observability","Feature:EEM","team:obs-entities"],"title":"[eem] remove history transforms","number":193999,"url":"https://github.com/elastic/kibana/pull/193999","mergeCommit":{"message":"[eem] remove history transforms (#193999)\n\n### Summary\r\n\r\nRemove history and backfill transforms, leaving latest transform in\r\nplace.\r\n\r\nNotable changes to latest transform:\r\n- it does not read from history output anymore but source indices\r\ndefined on the definition\r\n- it defines a `latest.lookbackPeriod` to limit the amount of data\r\ningested, which defaults to 24h\r\n- each metadata aggregation now accepts a\r\n`metadata.aggregation.lookbackPeriod` which defaults to the\r\n`latest.lookbackPeriod`\r\n- `entity.firstSeenTimestamp` is removed. this should be temporary until\r\nwe have a solution for\r\nhttps://github.com/elastic/elastic-entity-model/issues/174\r\n- latest metrics used to get the latest pre-computed value from history\r\ndata, but is it now aggregating over the `lookbackPeriod` in the source\r\nindices (which can be filtered down with `metrics.filter`)\r\n- `latest` block on the entity definition is now mandatory\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>\r\nCo-authored-by: Mark Hopkin <mark.hopkin@elastic.co>","sha":"8f8e9883e0a8e78a632418a0677980f758450351"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/193999","number":193999,"mergeCommit":{"message":"[eem] remove history transforms (#193999)\n\n### Summary\r\n\r\nRemove history and backfill transforms, leaving latest transform in\r\nplace.\r\n\r\nNotable changes to latest transform:\r\n- it does not read from history output anymore but source indices\r\ndefined on the definition\r\n- it defines a `latest.lookbackPeriod` to limit the amount of data\r\ningested, which defaults to 24h\r\n- each metadata aggregation now accepts a\r\n`metadata.aggregation.lookbackPeriod` which defaults to the\r\n`latest.lookbackPeriod`\r\n- `entity.firstSeenTimestamp` is removed. this should be temporary until\r\nwe have a solution for\r\nhttps://github.com/elastic/elastic-entity-model/issues/174\r\n- latest metrics used to get the latest pre-computed value from history\r\ndata, but is it now aggregating over the `lookbackPeriod` in the source\r\nindices (which can be filtered down with `metrics.filter`)\r\n- `latest` block on the entity definition is now mandatory\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>\r\nCo-authored-by: Mark Hopkin <mark.hopkin@elastic.co>","sha":"8f8e9883e0a8e78a632418a0677980f758450351"}}]}] BACKPORT--> Co-authored-by: Kevin Lacabane <kevin.lacabane@elastic.co>
Summary
Remove history and backfill transforms, leaving latest transform in place.
Notable changes to latest transform:
latest.lookbackPeriod
to limit the amount of data ingested, which defaults to 24hmetadata.aggregation.lookbackPeriod
which defaults to thelatest.lookbackPeriod
entity.firstSeenTimestamp
is removed. this should be temporary until we have a solution for https://github.com/elastic/elastic-entity-model/issues/174lookbackPeriod
in the source indices (which can be filtered down withmetrics.filter
)latest
block on the entity definition is now mandatory