Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[eem] remove history transforms #193999

Merged
merged 34 commits into from
Oct 9, 2024

Conversation

klacabane
Copy link
Contributor

@klacabane klacabane commented Sep 25, 2024

Summary

Remove history and backfill transforms, leaving latest transform in place.

Notable changes to latest transform:

  • it does not read from history output anymore but source indices defined on the definition
  • it defines a latest.lookbackPeriod to limit the amount of data ingested, which defaults to 24h
  • each metadata aggregation now accepts a metadata.aggregation.lookbackPeriod which defaults to the latest.lookbackPeriod
  • entity.firstSeenTimestamp is removed. this should be temporary until we have a solution for https://github.com/elastic/elastic-entity-model/issues/174
  • latest metrics used to get the latest pre-computed value from history data, but is it now aggregating over the lookbackPeriod in the source indices (which can be filtered down with metrics.filter)
  • latest block on the entity definition is now mandatory

@obltmachine
Copy link

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@@ -57,6 +52,18 @@ export const entityDefinitionSchema = z.object({
])
),
installStartedAt: z.optional(z.string()),
installedComponents: z.optional(
Copy link
Contributor Author

@klacabane klacabane Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this allows cleaning up versions of definitions that contain history components

export const ENTITY_DEFAULT_METADATA_LIMIT = 1000;
export const ENTITY_DEFAULT_LATEST_FREQUENCY = '1m';
export const ENTITY_DEFAULT_LATEST_SYNC_DELAY = '60s';
export const ENTITY_DEFAULT_METADATA_LIMIT = 10;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

took the opportunity to reduce the default limit

@klacabane klacabane marked this pull request as ready for review September 30, 2024 12:23
@klacabane klacabane requested review from a team as code owners September 30, 2024 12:23
@klacabane klacabane requested a review from machadoum September 30, 2024 12:23
@klacabane klacabane added release_note:skip Skip the PR/issue when compiling release notes Feature:EEM Elastic Entity Model labels Sep 30, 2024
@klacabane klacabane requested a review from a team as a code owner October 7, 2024 09:54
@botelastic botelastic bot added the ci:project-deploy-observability Create an Observability project label Oct 7, 2024
@kibana-ci
Copy link
Collaborator

kibana-ci commented Oct 7, 2024

💛 Build succeeded, but was flaky

Failed CI Steps

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
inventory 222.8KB 222.7KB -30.0B

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Contributor

@miltonhultgren miltonhultgren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!


export const SO_ENTITY_DEFINITION_TYPE = 'entity-definition';

export const backfillInstalledComponents: SavedObjectModelDataBackfillFn<
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat!

@klacabane
Copy link
Contributor Author

@jaredburgettelastic @hop-dev I'll be merging that change soon, could you please verify that it doesn’t break anything that would not be covered by tests ?

@hop-dev
Copy link
Contributor

hop-dev commented Oct 9, 2024

@klacabane please go ahead and merge when you're ready, we have a big PR incoming and will fix any issues in that

@elasticmachine
Copy link
Contributor

elasticmachine commented Oct 9, 2024

💛 Build succeeded, but was flaky

Failed CI Steps

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
inventory 222.7KB 222.7KB -30.0B

History

Copy link
Member

@simianhacker simianhacker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some comments about changing the entity.id generation with this change. Since this PR makes a significant change, I'd advocate for changing that too.

Comment on lines +142 to +190
{
script: {
description: 'Generated the entity.id field',
source: cleanScript(`
// This function will recursively collect all the values of a HashMap of HashMaps
Collection collectValues(HashMap subject) {
Collection values = new ArrayList();
// Iterate through the values
for(Object value: subject.values()) {
// If the value is a HashMap, recurse
if (value instanceof HashMap) {
values.addAll(collectValues((HashMap) value));
} else {
values.add(String.valueOf(value));
}
}
return values;
}

// Create the string builder
StringBuilder entityId = new StringBuilder();

if (ctx["entity"]["identity"] != null) {
// Get the values as a collection
Collection values = collectValues(ctx["entity"]["identity"]);

// Convert to a list and sort
List sortedValues = new ArrayList(values);
Collections.sort(sortedValues);

// Create comma delimited string
for(String instanceValue: sortedValues) {
entityId.append(instanceValue);
entityId.append(":");
}

// Assign the entity.id
ctx["entity"]["id"] = entityId.length() > 0 ? entityId.substring(0, entityId.length() - 1) : "unknown";
}
`),
},
},
{
fingerprint: {
fields: ['entity.id'],
target_field: 'entity.id',
method: 'MurmurHash3',
},
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tommyers-elastic I kind of wonder if we shouldn't take this opportunity to simplify the entity.id generation? I purpose we remove the hashing AND change the entity.id to be:

{
  set: {
     field: 'entity.id',
     value: definition.identityFields.map((identityField) => `{{entity.identity.${identityField.field)}}`).join(':'),
  }
}

For something like ['service.name', 'service.environment'] it would look like {{entity.identity.service.name}}:{{entity.identity.service.environment}}

Thoughts @miltonhultgren @klacabane

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense and it is implemented in #193652. I'd love to keep these separate as this PR has been opened for a while now and all the lights are finally green

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

➕ ➕ lets get this merged!

Copy link
Member

@simianhacker simianhacker Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MERGE IT!

image

@klacabane Good call... ❤️

@hop-dev hop-dev requested review from simianhacker and removed request for machadoum October 9, 2024 20:47
@simianhacker simianhacker merged commit 8f8e988 into elastic:main Oct 9, 2024
51 checks passed
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 8.x

https://github.com/elastic/kibana/actions/runs/11263190736

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Oct 9, 2024
### Summary

Remove history and backfill transforms, leaving latest transform in
place.

Notable changes to latest transform:
- it does not read from history output anymore but source indices
defined on the definition
- it defines a `latest.lookbackPeriod` to limit the amount of data
ingested, which defaults to 24h
- each metadata aggregation now accepts a
`metadata.aggregation.lookbackPeriod` which defaults to the
`latest.lookbackPeriod`
- `entity.firstSeenTimestamp` is removed. this should be temporary until
we have a solution for
elastic/elastic-entity-model#174
- latest metrics used to get the latest pre-computed value from history
data, but is it now aggregating over the `lookbackPeriod` in the source
indices (which can be filtered down with `metrics.filter`)
- `latest` block on the entity definition is now mandatory

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Mark Hopkin <mark.hopkin@elastic.co>
(cherry picked from commit 8f8e988)
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Oct 9, 2024
# Backport

This will backport the following commits from `main` to `8.x`:
- [[eem] remove history transforms
(#193999)](#193999)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Kevin
Lacabane","email":"kevin.lacabane@elastic.co"},"sourceCommit":{"committedDate":"2024-10-09T21:15:33Z","message":"[eem]
remove history transforms (#193999)\n\n### Summary\r\n\r\nRemove history
and backfill transforms, leaving latest transform
in\r\nplace.\r\n\r\nNotable changes to latest transform:\r\n- it does
not read from history output anymore but source indices\r\ndefined on
the definition\r\n- it defines a `latest.lookbackPeriod` to limit the
amount of data\r\ningested, which defaults to 24h\r\n- each metadata
aggregation now accepts a\r\n`metadata.aggregation.lookbackPeriod` which
defaults to the\r\n`latest.lookbackPeriod`\r\n-
`entity.firstSeenTimestamp` is removed. this should be temporary
until\r\nwe have a solution
for\r\nhttps://github.com/elastic/elastic-entity-model/issues/174\r\n-
latest metrics used to get the latest pre-computed value from
history\r\ndata, but is it now aggregating over the `lookbackPeriod` in
the source\r\nindices (which can be filtered down with
`metrics.filter`)\r\n- `latest` block on the entity definition is now
mandatory\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine
<42973632+kibanamachine@users.noreply.github.com>\r\nCo-authored-by:
Mark Hopkin
<mark.hopkin@elastic.co>","sha":"8f8e9883e0a8e78a632418a0677980f758450351","branchLabelMapping":{"^v9.0.0$":"main","^v8.16.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","backport:prev-minor","ci:project-deploy-observability","Feature:EEM","team:obs-entities"],"title":"[eem]
remove history
transforms","number":193999,"url":"https://github.com/elastic/kibana/pull/193999","mergeCommit":{"message":"[eem]
remove history transforms (#193999)\n\n### Summary\r\n\r\nRemove history
and backfill transforms, leaving latest transform
in\r\nplace.\r\n\r\nNotable changes to latest transform:\r\n- it does
not read from history output anymore but source indices\r\ndefined on
the definition\r\n- it defines a `latest.lookbackPeriod` to limit the
amount of data\r\ningested, which defaults to 24h\r\n- each metadata
aggregation now accepts a\r\n`metadata.aggregation.lookbackPeriod` which
defaults to the\r\n`latest.lookbackPeriod`\r\n-
`entity.firstSeenTimestamp` is removed. this should be temporary
until\r\nwe have a solution
for\r\nhttps://github.com/elastic/elastic-entity-model/issues/174\r\n-
latest metrics used to get the latest pre-computed value from
history\r\ndata, but is it now aggregating over the `lookbackPeriod` in
the source\r\nindices (which can be filtered down with
`metrics.filter`)\r\n- `latest` block on the entity definition is now
mandatory\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine
<42973632+kibanamachine@users.noreply.github.com>\r\nCo-authored-by:
Mark Hopkin
<mark.hopkin@elastic.co>","sha":"8f8e9883e0a8e78a632418a0677980f758450351"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/193999","number":193999,"mergeCommit":{"message":"[eem]
remove history transforms (#193999)\n\n### Summary\r\n\r\nRemove history
and backfill transforms, leaving latest transform
in\r\nplace.\r\n\r\nNotable changes to latest transform:\r\n- it does
not read from history output anymore but source indices\r\ndefined on
the definition\r\n- it defines a `latest.lookbackPeriod` to limit the
amount of data\r\ningested, which defaults to 24h\r\n- each metadata
aggregation now accepts a\r\n`metadata.aggregation.lookbackPeriod` which
defaults to the\r\n`latest.lookbackPeriod`\r\n-
`entity.firstSeenTimestamp` is removed. this should be temporary
until\r\nwe have a solution
for\r\nhttps://github.com/elastic/elastic-entity-model/issues/174\r\n-
latest metrics used to get the latest pre-computed value from
history\r\ndata, but is it now aggregating over the `lookbackPeriod` in
the source\r\nindices (which can be filtered down with
`metrics.filter`)\r\n- `latest` block on the entity definition is now
mandatory\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine
<42973632+kibanamachine@users.noreply.github.com>\r\nCo-authored-by:
Mark Hopkin
<mark.hopkin@elastic.co>","sha":"8f8e9883e0a8e78a632418a0677980f758450351"}}]}]
BACKPORT-->

Co-authored-by: Kevin Lacabane <kevin.lacabane@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:prev-minor Backport to (9.0) the previous minor version (i.e. one version back from main) ci:project-deploy-observability Create an Observability project Feature:EEM Elastic Entity Model release_note:skip Skip the PR/issue when compiling release notes Team:obs-entities Observability Entities Team v8.16.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.