Host reports incorrect encryption key status when operating system is re-installed but Fleet host record is not deleted #24654

Closed
allenhouchins opened this issue Dec 11, 2024 · 5 comments
Labels
~backend Backend-related issue. bug Something isn't working as documented customer-pingouin #g-mdm MDM product group :incoming New issue in triage process. :product Product Design department (shows up on 🦢 Drafting board) prospect-quantz ~released bug This bug was found in a stable release.

Comments

@allenhouchins
Member

allenhouchins commented Dec 11, 2024

UPDATE: @noahtalerman: We decided to document the expected workflow as the expected behavior. PR is here.


Fleet version: 4.60.1

Web browser and operating system: Any


💥  Actual behavior

If a user wipes and re-installs their OS and then reinstalls Fleet, Fleet reports an incorrect encryption key status because it attaches itself to the existing host ID record. I have had customers/prospects report this on both Linux and Windows.

🧑‍💻  Steps to reproduce

  1. Set up a Linux or Windows host in Fleet and verify successful encryption key escrow
  2. Wipe the device and re-install the OS without deleting the host record in Fleet
  3. Enroll the device to Fleet and notice the encryption key status is incorrect. Fleet also still stores the old encryption key, which is now unusable.

🕯️ More info (optional)

Semi-related to this: #24592
Slack thread (Linux): https://fleetdm.slack.com/archives/C07GLME5P7C/p1733868862303489
Slack thread (Windows): https://fleetdm.slack.com/archives/C080TU8UP45/p1733934648582779

🛠️ To fix

@marko-lisica:
Docs improvement as this is expected behavior
#26377

@allenhouchins allenhouchins added bug Something isn't working as documented :release Ready to write code. Scheduled in a release. See "Making changes" in handbook. ~released bug This bug was found in a stable release. #g-endpoint-ops Endpoint ops product group :incoming New issue in triage process. customer-pingouin prospect-quantz labels Dec 11, 2024
@sharon-fdm sharon-fdm added the ~backend Backend-related issue. label Dec 11, 2024
@sharon-fdm
Collaborator

Hey team! Please add your planning poker estimate with Zenhub @iansltx @ksykulev @lucasmrod @mostlikelee

@sharon-fdm sharon-fdm added Epic DO NOT USE. Auto-created by ZenHub, cannot be disabled. and removed Epic DO NOT USE. Auto-created by ZenHub, cannot be disabled. labels Dec 11, 2024
@lukeheath lukeheath added #g-mdm MDM product group and removed #g-endpoint-ops Endpoint ops product group labels Dec 19, 2024
@georgekarrv georgekarrv added this to the 4.63.0-tentative milestone Jan 3, 2025
@georgekarrv georgekarrv modified the milestones: 4.63.0, 4.64.0-tentative Jan 14, 2025
@gillespi314
Contributor

@georgekarrv, this issue probably requires additional product input to define expected behavior as it relates to some long-standing questions regarding the host lifecycle and how/if Fleet should be responsible for differentiating wiped hosts from other hosts during fleetd enrollment/re-enrollment.

Generally speaking, Fleet preserves host records unless and until the host is deleted in Fleet. The reason for this, as I understand it, is that there are myriad scenarios where fleetd enrollment/re-enrollment can happen (e.g., fleetd is uninstalled and re-installed on the same host) and so far at least there has been a bias toward preserving as much of the historical record as possible as the default.
The current approach runs into questionable UX when it comes to workflows that involve re-imaging devices from the IT warehouse as it introduces some friction to the process when we expect admins to delete the host from Fleet manually.

There is a wide range of scenarios for wiping a host, some of which have limited visibility to Fleet. And so the challenge becomes how to best support the re-imaging workflow in a way that is clearly defined and deterministic, while at the same time not falling into the trap of relying on a bunch of ad hoc, platform-specific heuristics that will be difficult to maintain as the platforms drift over time. With that in mind, when these sorts of issues have been raised in the past, the bright-line, keep-it-simple approach that depends on the IT admin deleting the host in Fleet has been considered "good enough" (at least that's where things ended up the last time we were working through this with Roberto). There's room for improvement to be sure, but I think it will require a fair bit of design thinking plus development effort to get it right.
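
To make the lifecycle above concrete, here is a toy model of the behavior. It is only a sketch and not Fleet's actual enrollment code: the `ToyFleet` class, the idea of matching on a `hardware_uuid` string, and the key values are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class HostRecord:
    host_id: int
    hardware_uuid: str  # hypothetical matching key for this sketch
    escrowed_key: Optional[str] = None  # key escrowed by a previous OS install


@dataclass
class ToyFleet:
    """Toy stand-in for a server that preserves host records until deleted."""
    hosts: Dict[str, HostRecord] = field(default_factory=dict)
    next_id: int = 1

    def enroll(self, hardware_uuid: str) -> HostRecord:
        # Re-enrollment reuses the existing record when the identifier matches.
        record = self.hosts.get(hardware_uuid)
        if record is None:
            record = HostRecord(self.next_id, hardware_uuid)
            self.hosts[hardware_uuid] = record
            self.next_id += 1
        return record

    def delete_host(self, hardware_uuid: str) -> None:
        # Deleting the record is what lets the next enrollment start clean.
        self.hosts.pop(hardware_uuid, None)


fleet = ToyFleet()
first = fleet.enroll("HW-123")
first.escrowed_key = "key-from-old-install"

# Wipe and re-install without deleting: the old record, and its key, come back.
stale = fleet.enroll("HW-123")

# Delete first, then re-enroll: a new record with no escrowed key.
fleet.delete_host("HW-123")
fresh = fleet.enroll("HW-123")
```

In this model the stale key survives re-enrollment only because the record was never deleted, which is the friction the re-imaging workflow runs into.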

@lukeheath lukeheath added :product Product Design department (shows up on 🦢 Drafting board) and removed :release Ready to write code. Scheduled in a release. See "Making changes" in handbook. labels Feb 7, 2025
@noahtalerman noahtalerman removed this from the 4.64.0 milestone Feb 10, 2025
marko-lisica added a commit that referenced this issue Feb 17, 2025
@marko-lisica
Member

Hey @allenhouchins, this is expected behavior for now. I talked with @noahtalerman and we decided to improve the Wipe and lock guide: #26377

If you think we should improve this further, please file a feature request.

noahtalerman pushed a commit that referenced this issue Feb 17, 2025
Related to: #24654

Added a callout to describe that the host should be deleted after it's
wiped if a user wants to re-enroll the host and escrow a new disk
encryption key.
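
For admins who want to script the "delete the host after it's wiped" step, a minimal sketch follows. It assumes Fleet's REST API exposes `DELETE /api/v1/fleet/hosts/:id` with bearer-token authentication, so check the REST API docs for your Fleet version before relying on it. The server URL, token, and host ID below are placeholders, and `build_delete_host_request` is a hypothetical helper name.

```python
import urllib.request


def build_delete_host_request(base_url: str, api_token: str, host_id: int) -> urllib.request.Request:
    """Build (but do not send) a request that deletes one host record.

    Assumes the endpoint DELETE /api/v1/fleet/hosts/:id and a bearer API token.
    """
    url = f"{base_url.rstrip('/')}/api/v1/fleet/hosts/{host_id}"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {api_token}"},
    )


# Placeholder values for illustration only.
req = build_delete_host_request("https://fleet.example.com", "EXAMPLE_TOKEN", 42)

# To actually send the request against a real server:
#     urllib.request.urlopen(req)
```

Running this before re-enrolling the wiped device should let the host come back as a new record, so that a new disk encryption key can be escrowed.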
@noahtalerman
Member

FYI @marko-lisica I merged in your PR so I'm closing this bug.

@fleet-release
Contributor

Host reborn anew,
Keys align in cloud city,
Fleet's truth shines through.
