rpm-ostreed is a single point of failure that can potentially cripple vital functions, such as installing/removing packages or applying OS updates. Hence rpm-ostree-based systems need additional safeguards before they can be safely used by non-technical users.
My laptop running Silverblue 33 is normally set to update automatically in the background using rpm-ostreed-automatic. Recently, however, rpm-ostree managed to wedge itself in a mess from which it was unable to extricate itself (#2548). If this bug had afflicted non-technical users, their systems would have been frozen in time and unable to receive any security updates. This fragility would be unacceptable for production systems; rpm-ostreed must never fail.
Since test suites could still unintentionally let bugs through (who thought about #2548 before it occurred?), the system itself should be designed to automatically recover from problems with rpm-ostreed. For instance, could crashes of rpm-ostreed (such as reported recently in #2603) trigger an automatic rollback?
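To make the question concrete, here is a minimal sketch of one possible recovery policy: count daemon crashes in a sliding time window and request a rollback once a threshold is reached. Everything in it is hypothetical. The `CrashWatchdog` class, the threshold of 3 crashes, and the 10-minute window are assumptions for illustration and are not part of rpm-ostree. How crashes would be detected and how the rollback would actually be performed are left out.

```python
from collections import deque


class CrashWatchdog:
    """Hypothetical policy: ask for a rollback after repeated crashes.

    Not part of rpm-ostree; the defaults below are illustrative assumptions.
    """

    def __init__(self, max_crashes=3, window_seconds=600.0):
        self.max_crashes = max_crashes
        self.window_seconds = window_seconds
        self._crashes = deque()  # crash timestamps in seconds, oldest first

    def record_crash(self, timestamp):
        """Record a crash; return True if a rollback should be requested."""
        self._crashes.append(timestamp)
        # Forget crashes that have fallen out of the sliding window.
        while timestamp - self._crashes[0] > self.window_seconds:
            self._crashes.popleft()
        return len(self._crashes) >= self.max_crashes
```

A policy like this would only help if the previous deployment does not have the same bug, so it would complement testing rather than replace it.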
Hi, thanks for the issue! But I think it's the responsibility of the OS vendor to ship ostree commits (filesystem trees) that have been tested as a coherent unit together. Enabling that is in fact one of the major goals of the project, and we actually do it with e.g. Fedora/RHEL CoreOS. Our testing system caught this bug and prevented it from shipping there. But Fedora IoT and Silverblue do not currently have integrated gating test systems.
I wanted to clarify this a bit more: I am very embarrassed by the libsolv issue, but rpm-ostree issues are for things that need to be solved here rather than elsewhere, and I think this one is more of a Fedora issue than an rpm-ostree issue. Here we provide all the tools and techniques needed to solve this problem; they just aren't wired up together there.
This also relates a lot to https://github.com/cgwalters/fedora-silverblue-config which bases Silverblue on FCOS (although in this case we should actually use the FCOS lockfiles to validate that it's the same libsolv + rpm-ostree etc.)