[Bug] When loading server after upgrades and reverts, the Endpoint package failed to install momentarily in Fleet start #89802

Open
kevinlog opened this issue Jan 29, 2021 · 5 comments
Labels: bug, impact:medium, Team:Defend Workflows

Comments

@kevinlog
Contributor

kevinlog commented Jan 29, 2021

Kibana version:
8.x

Elasticsearch version:
8.x

Describe the bug:
After upgrading and reverting the SIEM server, the Endpoint package failed to load with the following error message:

[resource_already_exists_exception] Transform with id [endpoint.metadata_current-default-0.17.0] already exists response from /_transform/endpoint.metadata_current-default-0.17.0: {"error":{"root_cause":[{"type":"resource_already_exists_exception","reason":"Transform with id [endpoint.metadata_current-default-0.17.0] already exists"}],"type":"resource_already_exists_exception","reason":"Transform with id [endpoint.metadata_current-default-0.17.0] already exists"},"status":400}
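
As a rough illustration, the error body quoted above can be recognized by its `error.type` field rather than by matching on the message text. The sketch below is an assumption-laden example, not code from Kibana or Fleet: the `EsErrorBody` interface and the `isTransformAlreadyExists` helper are hypothetical names, and only the JSON payload is taken from the report.

```typescript
// Partial shape of the error body quoted in this report (only the fields used here).
interface EsErrorBody {
  error?: { type?: string; reason?: string };
  status?: number;
}

// Hypothetical helper: true when the body describes an "already exists" conflict.
function isTransformAlreadyExists(body: unknown): boolean {
  if (typeof body !== "object" || body === null) return false;
  const { error } = body as EsErrorBody;
  return error?.type === "resource_already_exists_exception";
}

// The JSON below is copied from the error message in this report.
const reported: unknown = JSON.parse(
  '{"error":{"root_cause":[{"type":"resource_already_exists_exception","reason":"Transform with id [endpoint.metadata_current-default-0.17.0] already exists"}],"type":"resource_already_exists_exception","reason":"Transform with id [endpoint.metadata_current-default-0.17.0] already exists"},"status":400}'
);

// Evaluates to true for the body quoted above.
const matchesReport = isTransformAlreadyExists(reported);
```

Checking the structured `type` field keeps the check independent of the transform id and package version that appear in the `reason` string.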

I am not completely sure which steps went into getting the SIEM server in that state. @EricDavisX any help is appreciated.

Steps to reproduce:

  1. Go to Fleet after upgrading and reverting the SIEM server
  2. See Endpoint package error

Expected behavior:
Fleet should load correctly.

Screenshots (if relevant):

Errors in browser console (if relevant):

Provide logs and/or server output (if relevant):

Any additional context:

@kevinlog kevinlog added bug Fixes for quality problems that affect the customer experience Team:Defend Workflows “EDR Workflows” sub-team of Security Solution labels Jan 29, 2021
@elasticmachine
Contributor

Pinging @elastic/security-onboarding-and-lifecycle-mgt (Team:Onboarding and Lifecycle Mgt)

@kevinlog kevinlog self-assigned this Jan 29, 2021
@kevinlog kevinlog added the impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. label Jan 29, 2021
@kevinlog
Contributor Author

I'm not sure how reproducible this is or what the exact steps are for getting the SIEM server into this state. @EricDavisX any help is appreciated.

I hadn't seen that particular transform error before, so I thought it was important to capture. FYI @pzl

@EricDavisX
Contributor

Hi. I do have some minimal info on the server state and on reproducing this.

  1. The server was on a 2-3 day old 8.x Kibana snapshot and was alive and accessible. This error in Fleet may already have been showing, I don't know, but I doubt it: the server has since been reset to this same precondition and the error is not showing now.
  2. I ran the 'kibana updater' Ansible script, which lives in the siem-team/cm repo and which the engineering productivity group uses for that server and for the other endpoint server.

That's it. In more detail, the 'updater' downloads the latest Kibana snapshot, installs it, and restarts the Kibana service. The latest snapshot is known to be broken unless you modify some Kibana startup values. I did apply a recommended modification there to turn off 'v2 templates', along with some other common (and long-standing) startup variables. If we want to follow up more, we could try to reproduce this once it is easier to get the server back into a known good working state (right now that is a bit manual). I'm not really sure whether this is a situation users will find themselves in on production clusters.

@nchaulet
Member

nchaulet commented Feb 1, 2021

We are running into the same issue in the Fleet test suite #89776

In the test suite we remove the .kibana index and perform setup again, so it's probably something that can happen to a user one day.

Should we ignore the error when installing an already existing transform?

@kevinlog
Contributor Author

kevinlog commented Feb 1, 2021

@nchaulet

Should we ignore the error when installing an already existing transform?

Yes I think we should do that. I don't see a reason to fail when installing an Endpoint package and a certain asset already exists. Ideally, we remove whatever is existing and install the asset as it is in the package we're currently trying to install.
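
The two options discussed here (ignore the conflict, or remove the existing transform and install the packaged one) could be sketched roughly as follows. This is only an illustration under stated assumptions: `TransformClient`, `installTransform`, `ConflictPolicy`, and `InMemoryTransforms` are hypothetical names invented for this example, not the Fleet or Elasticsearch client API, and the in-memory class merely imitates the "already exists" failure from the report.

```typescript
// Hypothetical minimal client surface; NOT the real Elasticsearch client API.
interface TransformClient {
  putTransform(id: string, body: object): Promise<void>;
  deleteTransform(id: string): Promise<void>;
}

type ConflictPolicy = "ignore" | "replace";

// True when an error carries the type seen in the report.
function isAlreadyExists(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { type?: unknown }).type === "resource_already_exists_exception"
  );
}

// Install a transform; on conflict either keep the existing one or replace it.
async function installTransform(
  client: TransformClient,
  id: string,
  body: object,
  onConflict: ConflictPolicy = "replace"
): Promise<"created" | "kept" | "replaced"> {
  try {
    await client.putTransform(id, body);
    return "created";
  } catch (err) {
    if (!isAlreadyExists(err)) throw err; // unrelated failures still surface
    if (onConflict === "ignore") return "kept";
    // Assumption: a real cluster may require stopping the transform before deleting it.
    await client.deleteTransform(id);
    await client.putTransform(id, body);
    return "replaced";
  }
}

// In-memory stand-in used only to exercise the sketch.
class InMemoryTransforms implements TransformClient {
  readonly store = new Map<string, object>();

  async putTransform(id: string, body: object): Promise<void> {
    if (this.store.has(id)) {
      throw Object.assign(new Error(`Transform with id [${id}] already exists`), {
        type: "resource_already_exists_exception",
      });
    }
    this.store.set(id, body);
  }

  async deleteTransform(id: string): Promise<void> {
    this.store.delete(id);
  }
}
```

With the in-memory stand-in, a first `installTransform` call for a given id resolves to `"created"`, a repeat call with `"ignore"` resolves to `"kept"`, and a repeat call with the default policy resolves to `"replaced"`. Any other error is rethrown, so only the specific conflict from this issue is tolerated.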

cc @pzl
