It would be nice to have a 'safe mode' for zfs and zpool commands #4134
👍
I would be happy to take a shot at this...
I think that's a totally reasonable idea which would be a great feature. Can you flesh out in a little more detail how it should work? Also, I'd be cautious about a…
Hmm, well, I am not thinking of locking the entity per se, just making it harder to screw yourself. Although, thinking about this more, I kind of like your idea. If I want to blow away 'tank/vsphere', where all my VMs live, it's not unreasonable for me to have to run 'zfs set locked=off tank/vsphere' first. So, yeah, that sounds good to me. Was there anything else?
I think you'll want to think through exactly what commands this should apply to.
-f flags are bad UI design; you just end up training people to always type -f (see also kill -9). In general, if you don't want to destroy a dataset, don't type "destroy" :-p
NB, there are UIs that mark datasets for later destruction, giving you an opportunity to change your mind. However, the most common use case for dataset destroy is to reclaim space, so a deferred destroy is not a general solution. zpool destroy is reversible; no need for more UI complexity.
-- richard
On 2015-12-22 13:40, Richard Elling wrote:
Yes, thank you. I'm familiar with the chainsaw with no safety guard :)
Fair enough, I'd forgotten about that. That said, I don't think it's…
What other subcommand can you confuse with "destroy"? Answer: none. So the problem you're trying to solve is one of naming datasets. Since there is… Fully automated systems, when designed well, are not subject to this problem as…
Low effort for sure, but also not effective, which is why it doesn't exist. For those listening at home, this was one of the first great debates when ZFS was first released.
Well, to pitch in my $.02, I am always terrified of needing to use "zfs destroy foo/bar@baz" to nuke snapshots, and consider the overloading of "destroy" here to be a little on the hazardous side of things. I'd much rather have an "unsnapshot" command, or something that errored if it was given a name without an '@'.
As nwf said, 'destroy' is used to destroy both datasets and snapshots. Removing a snapshot from a dataset thus always feels a bit risky: until you enter the '@', you have typed a valid command to accidentally destroy your dataset.
But destroy will not destroy a dataset with children unless passed "-r" as well? So if you want to type…
Next step: train yourself to put "-r" AFTER the dataset name, not in front (to prevent the same problem of a prematurely ended command when you actually want to recursively delete a dataset).
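Sketching the suggested habit (the dataset names are illustrative, and whether a trailing option is accepted depends on the zfs build's argument parsing, so treat this as the commenter's recipe rather than guaranteed behavior):

```shell
# risky ordering: "zfs destroy -r tank/data" is already a complete, valid
# recursive destroy, so hitting Enter before typing "@2016-01-01" is fatal
zfs destroy -r tank/data@2016-01-01

# suggested habit: options last; a premature Enter omits -r, so a dataset
# that has children (or snapshots) is refused rather than destroyed
zfs destroy tank/data@2016-01-01 -r
```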
But if there is no snapshot for some reason, the dataset will still be destroyed. I am using dummy datasets like… Thanks for the recommendation to put the '-r' flag after the target name.
I would like to vote for adding a settable flag to pools, datasets, & snapshots, that is something like "protected", that can be set to on/off or yes/no. If the object is marked as destroyable, it behaves as normal: a call to zfs destroy myThing destroys it, no questions asked. However, if you manually set the flag to protect the object, it behaves differently. It is still writable/modifiable if the readOnly flag is off, but any call to zfs destroy myThing will abort and error out; something like: "Cannot destroy 'myThing'. Pool/Dataset/Snapshot is protected." Also, the ability to do a recursive set=on would be useful, but a recursive set=off seems somewhat dangerous. Perhaps a confirmation message, but maybe I'm just paranoid.
Where entering "no", or pressing Ctrl+C, aborts the change. @behlendorf An additional flag of "locked" to prevent all admin changes also sounds good. I imagine each being set independently, but if "locked" is on, maybe it forces "protected" to true? -Yurelle
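A minimal sketch of how such a guard could be approximated today from userland, using a user property (user properties require a colon in the name; `example:protected` and the `safe_destroy` wrapper are both hypothetical names, not existing ZFS features):

```shell
# hypothetical wrapper: refuse to destroy anything whose example:protected
# user property is set to "on" (user properties need a ":" in the name)
safe_destroy() {
    target="$1"
    if [ "$(zfs get -H -o value example:protected "$target" 2>/dev/null)" = "on" ]; then
        echo "cannot destroy '$target': protected" >&2
        return 1
    fi
    zfs destroy "$target"
}
```

A `zfs set example:protected=on tank/vsphere` would then make `safe_destroy tank/vsphere` error out until the property is cleared; of course this only protects you while you stick to the wrapper.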
Holds are only good for snapshots, though. While you could make a snapshot and put a hold on it to prevent the main dataset's destruction, that snapshot references data you might want to free up. It works, but it's not great.
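For reference, the hold mechanism mentioned here works roughly like this (tag and dataset names are illustrative):

```shell
# a hold pins a snapshot; destroying a held snapshot fails
zfs hold keepme tank/vms@2018-04-15
zfs destroy tank/vms@2018-04-15   # refused: dataset is busy

zfs holds tank/vms@2018-04-15     # list the tags holding it
zfs release keepme tank/vms@2018-04-15
zfs destroy tank/vms@2018-04-15   # now succeeds
```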
It seems like you are requesting a new command for something that you can do today with a procedure.
For example, it is not uncommon for service providers to use a "delete = move to trash" approach, because
a zfs destroy is destructive. This is trivially implemented thusly:
+ move to trash:
zfs set readonly=on
zfs set volmode=none # for volumes
zfs set canmount=noauto|off # for datasets
+ empty trash:
zfs destroy
I can see no justification for a new option to zfs destroy; use the existing features.
-- richard
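Filled in with illustrative names, the trash-then-empty procedure above might look like the following (the rename into a dedicated 'trash' parent is an addition for clarity, not part of the quoted steps):

```shell
# "move to trash": park the dataset so nothing mounts or writes to it
zfs rename tank/projects/old tank/trash/old
zfs set readonly=on tank/trash/old
zfs set canmount=noauto tank/trash/old     # for filesystems
# zfs set volmode=none tank/trash/oldvol   # for volumes

# "empty trash": reclaim the space once you are sure
zfs destroy -r tank/trash/old
```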
@richardelling That doesn't work for snapshots, since we can't set readonly: "this property can not be modified for snapshots". While I know that some sysadmins don't make mistakes, snapshots are, ostensibly, delegatable to users, and we know that users like to typo commands.
I would totally love a protected flag. On my desktop system, I'm often playing around with pools, creating and destroying them. It's easy for muscle memory to kick in and accidentally type, or mistype, the wrong dataset name, resulting in destroying the main system's data. Or perhaps you are writing a script that is slightly buggy? A protected flag would help to alleviate the terror I have anytime I'm working with zfs destroy.
Just throwing it out there, but zfs create and destroy, as well as zpool create, all have the no-op -n flag.
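For reference, a dry run with the no-op flag looks something like this (dataset name illustrative):

```shell
# -n: don't actually destroy anything; -v: print what would happen
zfs destroy -nv tank/data@old
# prints something like "would destroy tank/data@old" plus an estimate
# of the space that would be reclaimed
```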
The difference between a command that destroys a single snapshot and one that destroys the entire dataset is the difference between my two-year-old running in and "practicing" his typing while I pause to sip my coffee, contemplating if that is actually the command I want to run... Silly... but it absolutely happened. (There was actually no snapshot on the precious dataset in this case.) Would love to see a protected flag available on the dataset. @richardelling I'm mostly wanting to protect myself from myself vs an operational procedure. Thoughts?
automate it away (DRY)
I agree with the comment about -f, as this particular flag often prevents people from thinking about what they actually asked the machine to do versus what they thought they were asking. I actually think we already have a more zfs/solaris-esque way of doing this, although I have not yet used it: zfs hold.
I like how something like this would be lean and clean, and how there's no flag to force it; we explicitly have to run a separate and unambiguous command to remove that protection again. By making the command separate, which I feel is a pattern I have seen and liked in illumos, it means that at the point when I hit enter, my mind is thinking about one thing and one thing only: unholding that dataset. Using a flag means that I could go back through my terminal history, edit a command, and accidentally force-overwrite a different dataset that I didn't want to; or it could mean that we're still thinking about the original command, where we were about to damage a filesystem irreparably, rather than the simple action of "I'm now marking this dataset as okay to be destroyed; but then I must have marked this as not to be overwritten about 9 months ago and can't remember why - what was that reason...?" However the feature might be implemented, an optional comment on why the dataset is marked as not to be destroyed would be fantastic for admins IMO. Using a property on the dataset could also offer all the above features in a different format, I believe (e.g. …). I feel an argument that we simply shouldn't run destroy if we're not sure isn't a particularly good one. I recently had a similar issue where I was hoping to make a read-only "copy" of /dev on linux, so I could open device files for reading but not writing when I needed to back up my partition tables.
I asked about this on the chat for my university's computing society, and the only replies I had were that I should use a backup tool (not possible when you need access to raw bytes, of course), that… Our job is often to put in as many sensible measures as we see fit for the job at hand to protect our machine from ourselves. So, in my opinion, if a user would like to add another layer of protection because she knows she is likely to make mistakes or errors, perhaps on a system that has not been touched in a very long time, perhaps at a particularly unholy time of day, I think we should absolutely facilitate this if there is little to be compromised by doing so. We have a common policy of offering a program or user as little as they need to get the job done, and nothing more. Unix unfortunately throws this out on many occasions, where you are either a normal user called… In my case, I have been writing a script for the last several hours, destroying and creating datasets over and over, necessarily as root, because I cannot mount them on linux without root when I can on illumos. In an ideal world, yes, I would create a pool that I could junk later, but I wanted the job done as quickly as possible, having spent 12 hours non-stop on this already. It turns out I have both a filesystem called…
Polkit, Udev, etc. also came to mind. It might be quite nice to be able to write small functions in something that looks a bit like a scripting language, run via hooks before/after various commands, allowing administrators to customise what happens around specific operations according to their own organisational policies. Perhaps a data controller is required by law to keep a copy of data for a minimum of, say, 30 days. If a small piece of code could be triggered before deleting a snapshot, it could verify that the snapshot in its arguments is less than 30 days old and spit out an error, before zfs itself goes and deletes the snapshot, if this isn't the case. In this case, we could easily add a property like the above, and prevent destroying any dataset unless the property is removed first.
Just thinking there's already precedent for ZFS environment variables controlling how some commands work (see the zpool scripts code). What about a ZFS_SAFE_MODE variable which users can export to make potentially destructive operations act as though "-n" is set?
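As a purely hypothetical sketch of that suggestion (no such variable exists today):

```shell
# if ZFS_SAFE_MODE were implemented, destructive subcommands would behave
# as though -n had been passed while the variable is exported
export ZFS_SAFE_MODE=1
zfs destroy -r tank/scratch    # would only report what it would destroy
unset ZFS_SAFE_MODE            # restore normal behavior
```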
So you can use… @gdevenyi Actually, a default-deny policy for write operations would be kinda useful, but I could see that being problematic if one has it set by default in one terminal and goes to another expecting the same restrictions. It is similar to why shell aliases are dangerous.
Since the zpool/zfs commands are just open source programs, nothing prevents a user from providing their own that has different policies. For appliances, it is quite common to implement direct calls, like zpool does, from policy elements. |
I found this thread after having accidentally run… How do I prevent shooting myself in the foot with stuff like this? PS: In reality, the pool name of the source is…
I think ZFS is the only FS that has such destructive commands with absolutely no warnings or questions. Even deleting all the snapshots, etc... Sorry, but that's a big minus on ZFS currently, as is the fact that this has been an open bug for 6 years....
Especially because the commands to destroy datasets & snapshots are basically identical (only the path is different), and the command to destroy a small, narrow thing includes the command to destroy a broader thing. If you accidentally hit enter partway through typing a command (or copy & paste a command template and accidentally get a carriage return in the clipboard), you can end up deleting the entire dataset when you only meant to destroy a single snapshot.
But a dataset with children won't be deleted even now, so that's not the case.
And one more comment on an older one:
Unfortunately, even with… After some years I began to agree with @richardelling here: basic…
I think the easiest thing would be to simply add a "protected" setting: no fs or pool can be deleted while the flag is set; you have to manually unset the flag to delete; and the flag is not propagated to children. I never ever accidentally deleted a Ceph pool.
But it's not, at least CLI-interface-wise.
You can't delete all snapshots unless you explicitly run the command with…
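For context, snapshot deletion does require explicit snapshot syntax; the forms look like this (names illustrative):

```shell
zfs destroy tank/data@old1         # one snapshot
zfs destroy tank/data@old1%old5    # an inclusive range of snapshots
zfs destroy tank/data@%            # every snapshot of tank/data
```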
I'd vote for…
It's scary typing… But a new fs flag is potential overkill. Making the… The default behavior could be set via an env variable, e.g.:
ZFS is all about protecting data, and this is one issue that is actually really worrisome! I know, in the good sense of Unix, you can shoot yourself in the foot with… When I have:
And I no longer need the snapshots until I make a new one, doing this by mistake is REALLY SCARY (even with the…
All it takes is for my son to run into the room while I am typing, making me lose my concentration for a second, and I forget to provide the name of the snapshot rather than the… Having an option is the best solution; it is backwards compatible and it can protect a dataset optimally:
Please add this feature! UPDATE: It seems like Oracle has added this feature as well, if I read the doc correctly.
This is my main concern handling my pools right now and doing more scripting. I don't trust myself that much, because I know I'll make a typo sooner or later that will destroy my 30TB of data..
When is this coming? ZFS should not make it this easy to lose your data. A protected flag is absolutely essential. |
Come on - issue opened in Dec 2015, now it’s 2023. SEVEN f*ing years later we’re still debating. What’s the problem with solving this? |
While that puts us back in bad-error-message country (#14538), it solves this issue.
I, too, would like to see such an option. But, as nearly always, my view is a bit different. What I am lacking is something which changes depending on where I am and what I am doing. Hence just a flag like… The idea is to protect against a workflow: something you have done a billion times, but this time you happen to be on the wrong side of the cluster, where the complete layout of both sides looks identical, so the same command sequence would work. Or does it need to be this way? No! My proposal for how to protect ZFS is something like
with the inverse
(Oh no, the second command fails. What happened?) I'd also vote that pools, filesystems, snapshots and zvols can be independently protected/unprotected with more than just one word, such that different pieces of software (or different roles of people) can set different locks independently and do not need to form complex, error-prone cooperation meshes with scripts, which easily introduce nasty race conditions and so on.
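The proposed commands themselves were lost in formatting; purely as a hypothetical sketch (neither `zfs protect` nor `zfs unprotect` exists, and the tag names are invented), the idea seems to be something like:

```shell
# hypothetical syntax: attach a named protection tag to a dataset
zfs protect -t ticket-1234 tank/db
zfs destroy -r tank/db            # would fail while any tag remains

# hypothetical inverse: remove exactly that one tag
zfs unprotect -t ticket-1234 tank/db
```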
The important part is that this setting is not replicated with… Why? Because of…
Think about somebody accidentally deleting the source with all snapshots. Then after a… But with protected snapshots the… Also, it is always a PITA to find out which snapshot may be important in the archive while it is already destroyed on the source, BEFORE you apply… But with some independent protection feature you no longer puzzle, as you can see the ticket ID on the protection without needing to keep additional informational hints. So you can easily spot important snapshots held in the archive and save the important parts before releasing the protection. And independent people can set independent locks this way, too, as you can set multiple protections. Note that I am talking about production which has several TiB online with an incremental of several GiB per day. And some law which perhaps says you must keep the backup history for 30 years or so. Hence backing up the archive to recreate it, with several 100 TiB, isn't very feasible. So neither is possible:
In 99% of cases there will be no problem, as you can just… To sum it up:
It is right that you can always write scripts which are plain too clever and hence automatically adapt to the safeguards. But that then is no longer a ZFS issue; it is purely on your own local side. But without such protections, and with the critical need of… (Perhaps fix the… PS: For those who think…
To answer this: it only happens for streams created using…
You can use the following method to protect datasets:
First, you create a child dataset, which will be empty and occupy no space.
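Sketched with illustrative names (this relies on destroy's documented refusal to remove a filesystem that has children):

```shell
# an empty child costs nothing but blocks a plain destroy of the parent
zfs create tank/important/guard

zfs destroy tank/important      # refused: filesystem has children
zfs destroy -r tank/important   # still possible, but now unmistakably deliberate
```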
I really don't like that I can mistype something and vaporize datasets, and even entire pools. I would like to propose a property defined for datasets, zvols and pools that basically says 'if this is set to ON, any destructive operation will require use of the -f flag'. Thinking of destroying datasets, zvols and pools. Maybe snapshots too? This could really be a totally userland abstraction. What do y'all think?