[Bug] When loading server after upgrades and reverts, the Endpoint package failed to install momentarily in Fleet start #89802

kevinlog · 2021-01-29T22:12:57Z

Kibana version:
8.x

Elasticsearch version:
8.x

Describe the bug:
After upgrading and reverting the SIEM server, the Endpoint package failed to load with the following error message:

[resource_already_exists_exception] Transform with id [endpoint.metadata_current-default-0.17.0] already exists response from /_transform/endpoint.metadata_current-default-0.17.0: {"error":{"root_cause":[{"type":"resource_already_exists_exception","reason":"Transform with id [endpoint.metadata_current-default-0.17.0] already exists"}],"type":"resource_already_exists_exception","reason":"Transform with id [endpoint.metadata_current-default-0.17.0] already exists"},"status":400}

I am not completely sure which steps went into getting the SIEM server in that state. @EricDavisX any help is appreciated.

Steps to reproduce:

1. Go to Fleet after upgrading and reverting the SIEM server
2. See Endpoint package error

Expected behavior:
Fleet should load correctly.

Screenshots (if relevant):

Errors in browser console (if relevant):

Provide logs and/or server output (if relevant):

Any additional context:


elasticmachine · 2021-01-29T22:13:48Z

Pinging @elastic/security-onboarding-and-lifecycle-mgt (Team:Onboarding and Lifecycle Mgt)

kevinlog · 2021-01-29T22:14:54Z

I'm not sure how reproducible this is, or what the exact steps were in getting the SIEM server into this state. @EricDavisX any help is appreciated.

That particular transform error I hadn't seen before so I thought it was important to capture. FYI @pzl

EricDavisX · 2021-01-29T22:31:48Z

Hi. I do have some minimal info on the server state and on reproducing this.

The server was on a 2-3 day old 8.x Kibana snapshot and was alive and accessible. This error in Fleet may already have been showing, I don't know, but I doubt it: the server has since been reset to that same precondition and the error is not showing now.
I ran the 'kibana updater' Ansible script that lives in the siem-team/cm repo, which the engineering productivity group uses for that server and for the other endpoint server.

That's it. In detail, the 'updater' downloads the latest Kibana snapshot, installs it, and restarts the Kibana service. The latest snapshot is known to be broken unless you modify some of the Kibana startup values. I did apply a recommended modification to turn off 'v2 templates', along with some other common (and long-standing) startup variables. We could try to reproduce it once it is easier to get the server back into a known good working state (right now that is a bit manual), if we want to follow up more. I'm not really sure this is a situation users will find themselves in on production clusters.

nchaulet · 2021-02-01T17:50:18Z

We are running into the same issue in the Fleet test suite: #89776

In the test suite we remove the .kibana index and perform setup again. It's probably something that can happen to a user one day.

Should we ignore the error when installing an already existing transform?

kevinlog · 2021-02-01T19:05:52Z

@nchaulet

> Should we ignore the error when installing an already existing transform?

Yes I think we should do that. I don't see a reason to fail when installing an Endpoint package and a certain asset already exists. Ideally, we remove whatever is existing and install the asset as it is in the package we're currently trying to install.

cc @pzl
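
A minimal sketch of the behavior proposed above, under stated assumptions: `TransformClient`, `installTransform`, `responseBodyOf`, and `isAlreadyExists` are hypothetical names for illustration, not Fleet's or the Elasticsearch client's actual API. It detects the `resource_already_exists_exception` type from the error body in the report, removes the existing transform, and installs the one from the package; any other error is still surfaced.

```typescript
// Hypothetical minimal client interface (an assumption for this sketch,
// not the real Elasticsearch client API).
interface TransformClient {
  putTransform(id: string, body: unknown): Promise<void>;
  deleteTransform(id: string): Promise<void>;
}

// Subset of the error response body shown in the bug report.
interface EsErrorBody {
  error?: { type?: string; reason?: string };
  status?: number;
}

// True only for the "already exists" error type seen in the report.
function isAlreadyExists(body: EsErrorBody | undefined): boolean {
  return body?.error?.type === "resource_already_exists_exception";
}

// Assumes a thrown error carries the parsed response as a `body` property.
function responseBodyOf(err: unknown): EsErrorBody | undefined {
  if (typeof err === "object" && err !== null && "body" in err) {
    return (err as { body: EsErrorBody }).body;
  }
  return undefined;
}

// Install a transform; if one with the same id already exists,
// remove it and install the packaged definition instead of failing.
async function installTransform(
  client: TransformClient,
  id: string,
  body: unknown
): Promise<"created" | "replaced"> {
  try {
    await client.putTransform(id, body);
    return "created";
  } catch (err) {
    if (!isAlreadyExists(responseBodyOf(err))) {
      throw err; // unrelated failures are not swallowed
    }
    // Note: a running transform generally has to be stopped (or the
    // delete forced) before it can be removed; omitted here for brevity.
    await client.deleteTransform(id);
    await client.putTransform(id, body);
    return "replaced";
  }
}
```

Replacing the transform, rather than only ignoring the error, follows the "remove whatever is existing" suggestion; simply ignoring the error would leave a possibly stale transform definition in place.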

kevinlog added the bug and Team:Defend Workflows labels Jan 29, 2021

kevinlog self-assigned this Jan 29, 2021

kevinlog added the impact:medium label Jan 29, 2021

nchaulet mentioned this issue Feb 1, 2021

Updating package registry snapshot distribution version #89776

Merged
