zfs send -R (non-raw) of encrypted dataset does not work #10507

Open
johnnyjacq16 opened this issue Jun 27, 2020 · 23 comments
Labels
Component: Send/Recv "zfs send/recv" feature Type: Feature Feature request or new feature

Comments

@johnnyjacq16

System information

| Type                 | Version/Name                |
| -------------------- | --------------------------- |
| Distribution Name    | CentOS & Arch               |
| Distribution Version |                             |
| Linux Kernel         | 4.18.0-193.6.3.el8_2.x86_64 |
| Architecture         | x86_64                      |
| ZFS Version          | zstd-zfs                    |
| SPL Version          | --                          |

Problem 1

Describe the problem you're observing

zfs send -R SyS_Backup/ROOT/Arch_OS@22.06.2020_Harddrive-Backup | ssh 10.10.10.1 | zfs recv -v ztesting/Test 
&& echo 'done!'   
Pseudo-terminal will not be allocated because stdin is not a terminal.
cannot send SyS_Backup/ROOT/Arch_OS@22.06.2020_Harddrive-Backup: encrypted dataset SyS_Backup/ROOT/Arch_OS 
may not be sent with properties without the raw flag
root@10.10.10.1's password:
cannot receive: failed to read from stream

Describe how to reproduce the problem

@ahrens I tried that but it did not work. I am using the zstd-zfs version of ZFS, git-cloned from the master branch of BrainSlayer's repository (I thought GitHub was renaming master to main, but the branch is still called master). I have been using this version of ZFS for the past 6 months with no issues on the zstd side.

Shown below is my setup:

[root@Cent_OS-box ~]# lsmod | grep zfs
zfs                  4530176  8
zunicode              335872  1 zfs
zzstd                 565248  1 zfs
zlua                  188416  1 zfs
zcommon                94208  1 zfs
znvpair                98304  2 zfs,zcommon
zavl                   16384  1 zfs
icp                   327680  1 zfs
spl                   118784  6 zfs,icp,zzstd,znvpair,zcommon,zavl
[root@Cent_OS-box ~]# zfs get compression,encryption,keyformat,keylocation SyS_Backup/ROOT/Arch_OS
NAME                     PROPERTY     VALUE           SOURCE
SyS_Backup/ROOT/Arch_OS  compression  zstd-19         inherited from SyS_Backup
SyS_Backup/ROOT/Arch_OS  encryption   aes-256-gcm     -
SyS_Backup/ROOT/Arch_OS  keyformat    passphrase      -
SyS_Backup/ROOT/Arch_OS  keylocation  prompt          local
[root@Cent_OS-box ~]# 


Problem 2

Describe the problem you're observing

pool cannot import

Describe how to reproduce the problem

Created a disk file using dd if=/dev/zero of= bs=4k
Created two partitions on the disk file using a GPT partition table
The first, small partition contained vfat, and the second partition contained zfs
I imported the pool successfully with the command zpool import -R <directory> -N <Original name> <New name>
Removed some zfs datasets

Include any warning/errors/backtraces from the system logs

@ahrens ahrens added the Component: Send/Recv "zfs send/recv" feature label Jun 29, 2020
@ahrens ahrens changed the title from "ZFS send Encryption & Pool not importing" to "zfs send -R (non-raw) of encrypted dataset does not work" Jun 29, 2020
@ahrens ahrens added the Type: Feature Feature request or new feature label Jun 29, 2020
@ahrens
Member

ahrens commented Jun 29, 2020

The error message seems to be saying that what you're asking for is not implemented:
encrypted dataset ... may not be sent with properties without the raw flag
Note that -R means (among other things) "send with properties".
Maybe @tcaputi can comment on why this restriction exists and what it might take to implement this. Presumably we would not want to send the encryption-related properties.

@tcaputi
Contributor

tcaputi commented Jun 29, 2020

Yeah... So this was a bit complicated, but the gist of the reasoning goes something like this:

  • Encryption is a property, but it is a setonce property. This allows us to enforce the fact that the encryption algorithm can't change over the life of the dataset (which is required for various reasons). setonce properties are not sent in a zfs send -p / -R stream.
  • Encryption is also a property that "feels" like compression / checksum, which are not setonce properties. These properties ARE sent in a zfs send stream that includes properties. Users would reasonably expect that these properties would all "travel" together in a send stream.
  • The only way to "send" the encryption property is to use raw streams.
  • It would be bad if a user thinks that they were going to send data to an encrypted destination but that didn't happen. This could open up the end user to liability / security problems.
  • In order to ensure the user gets reasonable results we force them to use zfs send -w when sending with properties.

If we can think of a better way to do this, I would be all for pursuing it. The key reasoning behind all this is that we really want to prevent the user from accidentally decrypting their data at all costs.
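
For readers landing here from the error message, a minimal sketch of the form that is accepted today, based on the reasoning above: -R combined with -w. The helper name raw_replicate_cmd is hypothetical and only prints the pipeline for review. It also assumes zfs recv is meant to run on the remote host, so it is passed to ssh as the remote command instead of being piped to locally. The received datasets stay encrypted with the source's key.

```shell
# Dry-run helper (hypothetical name): prints the replication pipeline
# instead of running it, so it can be reviewed before touching a pool.
raw_replicate_cmd() {
    src_snap=$1   # source snapshot, e.g. pool/dataset@snapshot
    host=$2       # receiving host, reachable over ssh
    dst=$3        # target dataset on the receiving host
    # -R: replicate descendants, snapshots and properties
    # -w: raw stream; blocks are sent still encrypted, which is what
    #     makes -R acceptable for an encrypted dataset
    printf 'zfs send -R -w %s | ssh %s zfs recv -v %s\n' "$src_snap" "$host" "$dst"
}

# Names taken from the original report.
raw_replicate_cmd 'SyS_Backup/ROOT/Arch_OS@22.06.2020_Harddrive-Backup' 10.10.10.1 ztesting/Test
```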

@bghira

bghira commented Jun 29, 2020

you would do that with an additional option, -X|--allow-decrypt

@clhedrick

clhedrick commented Jul 14, 2021

I see why this happens, but I think it might be useful to be able to do an unencrypted backup. Our backup system is pretty secure. An extra option to zfs send that suppresses the encryption properties seems like the right approach. You can already omit or override specific properties.

In addition to allowing unencrypted backup, it lets you change your mind if you decide you don't want something encrypted.

@PhilZ-cwm6

PhilZ-cwm6 commented Oct 15, 2021

This also would make it possible to immediately apply a new encryption scheme/key on the receive side
When we want a backup with a different encryption, currently:

  • we need to create an encrypted pool on receive side, with its own encryption key
  • send with -w
  • we receive the migrated data with source encryption key, so different from the root receiving pool
  • we cannot change the target encryption key, as it will be reset on next incremental replication tasks

Sadly, this is actually the only way to preserve dataset properties with encryption.

As said above, something like -R/-p --no-preserve-encryption would be useful and in that case, -w would be implicit with -p/-R so that things are not so confusing/unconventional for end users

@stewartadam

stewartadam commented Nov 14, 2021

+1 for a flag to send an unencrypted stream. I wanted to replicate existing datasets (including all properties and snapshots) from one pool to another with a new encrypted root (different algo + key) and as I understand it, I have to choose between two bad options:

  1. send/recv all data with -R --raw which replicates everything but keeps the same key/algorithm on the new pool
  2. send/recv all data without --raw which does re-encrypt the data under the new encryption root, but fails to replicate all previous snapshots or the dataset properties due to the inability to use -R

A --no-preserve-encryption flag that decrypts the encrypted dataset at the source (and would therefore permit -R to be used alongside it) would allow encrypted->unencrypted dataset replication, or re-encrypting datasets under new encryption roots.
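
Until such a flag exists, option 2 can at least carry the snapshots if it is done per dataset. The following is a rough sketch, not a tested migration tool: plain_send_cmds is a hypothetical helper that prints one full send of the oldest snapshot followed by one -I incremental covering the rest. It assumes the source key is loaded and that the snapshot names arrive oldest first. Dataset properties are still not carried over.

```shell
# Prints (does not run) the commands to copy one dataset with all of its
# snapshots, without -R/-p and without --raw.
# Argument: target dataset. Snapshot names (oldest first) are read from
# stdin, for example from:
#   zfs list -H -t snapshot -o name -s creation -d 1 <source-dataset>
plain_send_cmds() {
    dst=$1
    first=''
    last=''
    while read -r snap; do
        [ -n "$snap" ] || continue
        [ -n "$first" ] || first=$snap
        last=$snap
    done
    # Nothing to send if the dataset has no snapshots.
    [ -n "$first" ] || return 1
    # A full stream of the oldest snapshot creates the target dataset.
    printf 'zfs send %s | zfs recv %s\n' "$first" "$dst"
    # -I includes every intermediate snapshot up to the newest one.
    if [ "$first" != "$last" ]; then
        printf 'zfs send -I %s %s | zfs recv %s\n' "$first" "$last" "$dst"
    fi
}
```

Each descendant dataset needs its own pass, which is why this ends up as a script and not a single command.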

@clhedrick

clhedrick commented Nov 14, 2021 via email

@DrGeek

DrGeek commented Mar 14, 2022

Hi ! Any news for this problem :( ?

@sotiris-bos

sotiris-bos commented Apr 14, 2022

I too need to replicate a system that has root on encrypted ZFS to a new system that will be unencrypted. Is there no way to do this other than rsync everything and manually set all the dataset properties?

This goes against the documentation for the -w flag: https://openzfs.github.io/openzfs-docs/man/8/zfs-send.8.html

@PhilZ-cwm6

> I too need to replicate a system that has root on encrypted ZFS to a new system that will be unencrypted. Is there no way to do this other than rsync everything and manually set all the dataset properties?
>
> This goes against the documentation for the -w flag: https://openzfs.github.io/openzfs-docs/man/8/zfs-send.8.html

That's in fact another case where --no-preserve-encryption would be helpful: sending data to an unencrypted dataset. Currently it is not possible to decrypt on the target while using -R. So, in all cases, you lose all the dataset properties. It is as if, once you encrypt, you can never go back using zfs send. That is OK, but you have to be aware of the limitations until a new option is added.

@clhedrick

clhedrick commented Apr 14, 2022 via email

@PhilZ-cwm6

PhilZ-cwm6 commented Apr 14, 2022

> You can send individual filesystems. It will send all snapshots. You need a script rather than a single command

No, you cannot send, as he asked, from encrypted, using -R or -p, without --raw
So, for encrypted to decrypted operations, your only option is sending without preserving any property
The alternative to preserve properties is the -R or -p WITH --raw, but you end up with an encrypted target having the same encryption as source

@sotiris-bos

sotiris-bos commented Apr 14, 2022

> No, you cannot send, as he asked, from encrypted, using -R or -p, without --raw
> So, for encrypted to decrypted operations, your only option is sending without preserving any property
> The alternative to preserve properties is the -R or -p WITH --raw, but you end up with an encrypted target having the same encryption as source

This. Encryption is recursive, so if any of the datasets above is encrypted, you cannot send the individual filesystems while retaining dataset properties.

I ended up sending without the -R flag and setting all the dataset properties manually.
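
The manual step can be scripted to some extent. A minimal sketch, assuming the locally set properties are what matters: props_to_set_cmds is a hypothetical helper that turns the tab-separated output of zfs get -H -o property,value -s local all into zfs set commands for the target, skipping the encryption-related properties. It only prints the commands so they can be reviewed first, and it does not handle values that contain single quotes.

```shell
# Reads "property<TAB>value" lines on stdin, e.g. from
#   zfs get -H -o property,value -s local all <source-dataset>
# and prints matching `zfs set` commands for the target dataset.
props_to_set_cmds() {
    dst=$1
    while IFS="$(printf '\t')" read -r prop value; do
        case $prop in
            # Skip blank lines and properties tied to the source's encryption.
            ''|encryption|keyformat|keylocation|pbkdf2iters) continue ;;
        esac
        printf "zfs set '%s=%s' '%s'\n" "$prop" "$value" "$dst"
    done
}

# Usage (prints commands only):
#   zfs get -H -o property,value -s local all SyS_Backup/ROOT/Arch_OS \
#       | props_to_set_cmds ztesting/Test
```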

@specious-logic

I just ran into this as well. This is pretty bad for a number of the use cases other folks mentioned but also one I think is even worse.

The current operation forces you to accept either a) fragmented encryptionroot or b) you must always backup and restore the entire tree of datasets from encryptionroot down.

Imagine I have the following tree of datasets
encrypt_root
encrypt_root/parent1
encrypt_root/parent1/descendants
...
encrypt_root/parent2
encrypt_root/parent2/descendants
...

I zfs raw send -Rw encrypt_root | zfs recv to another zpool. Maybe not desirable[1] but at least the structure of the tree is preserved.

Now for whatever reason I need to restore just encrypt_root/parent2 and its descendants. Given current operation encrypt_root/parent2 and its descendants will now have a variant encryptionroot that cannot be merged back to encrypt_root. And there is no way around this unless I write a script that grabs encrypt_root/parent2 and all its descendants one by one to restore.

Unless I have severely misunderstood something, this seems wrong.

[1] As other folks mentioned, it's entirely plausible I want the encryption harmonized to the zfs target not the source.
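
To see whether a restore has left a subtree with its own encryption root, the encryptionroot property can be listed recursively. A small sketch: foreign_encryption_roots is a hypothetical filter over the tab-separated output of zfs get, and encrypt_root is the example name from the tree above.

```shell
# Reads "name<TAB>encryptionroot" lines on stdin, e.g. from
#   zfs get -H -r -t filesystem,volume -o name,value encryptionroot encrypt_root
# and prints every dataset whose encryption root is not the expected one.
foreign_encryption_roots() {
    expected=$1
    awk -F '\t' -v want="$expected" '$2 != want { print $1 " -> " $2 }'
}

# Usage:
#   zfs get -H -r -t filesystem,volume -o name,value encryptionroot encrypt_root \
#       | foreign_encryption_roots encrypt_root
```

An empty result means every listed dataset still reports the expected encryption root.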

@PhilZ-cwm6

PhilZ-cwm6 commented Jul 19, 2022

> I just ran into this as well. This is pretty bad for a number of the use cases other folks mentioned but also one I think is even worse.
>
> The current operation forces you to accept either a) fragmented encryptionroot or b) you must always backup and restore the entire tree of datasets from encryptionroot down.
>
> Imagine I have the following tree of datasets
> encrypt_root
> encrypt_root/parent1
> encrypt_root/parent1/descendants
> ...
> encrypt_root/parent2
> encrypt_root/parent2/descendants
> ...
>
> I zfs raw send -Rw encrypt_root | zfs recv to another zpool. Maybe not desirable[1] but at least the structure of the tree is preserved.
>
> Now for whatever reason I need to restore just encrypt_root/parent2 and its descendants. Given current operation encrypt_root/parent2 and its descendants will now have a variant encryptionroot that cannot be merged back to encrypt_root. And there is no way around this unless I write a script that grabs encrypt_root/parent2 and all its descendants one by one to restore.
>
> Unless I have severely misunderstood something, this seems wrong.
>
> [1] As other folks mentioned, it's entirely plausible I want the encryption harmonized to the zfs target not the source.

No, the encryption is the same on the target, unless you modified it. So the data in your case should be restored the same; that's what I found in all my tests, if I remember correctly.
The only issue is in fact the inverse: using a different encryption on the target while still preserving the other properties of the dataset.
Currently, encryption is tied to all properties.

@alexsmartens

I'm trying to back up an encrypted zpool to an external drive and this doesn't work because of the aforementioned issue. Any ideas how to get around it in the meantime?

@alexsmartens

seems like it would make sense to have a reference to a related problem on the other end of the pipe zfs receive -F

@wb14123

wb14123 commented Apr 14, 2024

I'm trying to migrate the encryption from ccm to gcm, but this limitation prevents me from sending the dataset to another one with a different encryption algorithm.

@blackwood821

> I ended up sending without the -R flag and setting all the dataset properties manually.

@sotiris-bos I'm going to try something similar but I'm mainly concerned about the origin property. For example, if the source file system that is being sent was ZFS cloned from another file system, I want to preserve that inheritance. Is there any way to do that after the fact by manually setting origin afterward or is that too late?

@blackwood821

Looks like that can be handled by using -o origin=<snapshot> on the zfs receive side.
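
A sketch of what that could look like for a single clone, assuming the origin snapshot has already been received on the target under the name given as the third argument. clone_recv_cmd is a hypothetical dry-run helper, and all dataset names in the usage line are placeholders.

```shell
# Prints (does not run) a pipeline that sends a clone as an incremental from
# its origin snapshot and receives it as a clone of the matching target snapshot.
clone_recv_cmd() {
    src_origin=$1   # origin snapshot on the source, e.g. pool/fs@base
    src_snap=$2     # snapshot of the clone to send, e.g. pool/clone@now
    dst_origin=$3   # already received copy of the origin on the target
    dst_clone=$4    # name of the clone to create on the target
    printf 'zfs send -i %s %s | zfs recv -o origin=%s %s\n' \
        "$src_origin" "$src_snap" "$dst_origin" "$dst_clone"
}

# Usage (placeholder names):
#   clone_recv_cmd pool/fs@base pool/clone@now backup/fs@base backup/clone
```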

@tschoening81

My use-case: I don't care much about the individual properties, but I do care about the snapshots associated with each dataset. The pool holds lots of datasets and lots of regularly created snapshots for backup purposes, which I would like to keep.

At some point I enabled encryption and XATTR=ON, which mapped to LEGACY instead of SA in my setup, and I've noticed poor performance with RSYNC-based backups. So my plan was to simply rewrite all datasets with an explicit setting of XATTR=SA using SEND+RECEIVE. This would include -R to preserve the existing snapshots.

I'm totally fine if the encrypted data is sent and received encrypted (the new datasets encrypt it again anyway), as long as the XATTR-related data is rewritten in the new dataset so that it no longer uses additional file system objects but follows the new SA setting.

Does anyone know if that is the case? Or does raw-sent data mean that really nothing changes, especially not the XATTR-related data and how it's stored?

@PhilZ-cwm6

> My use-case:
> Does anyone know if that is the case? Or does raw-sent data mean that really nothing changes, especially not the XATTR-related data and how it's stored?

As explained above, you cannot send an encrypted dataset preserving attributes without --raw. So either you preserve encryption AND attributes, or you lose both during send.

@tschoening81

tschoening81 commented Nov 29, 2024

> As explained above, you cannot send an encrypted dataset preserving attributes without --raw. So either you preserve encryption AND attributes, or you lose both during send.

That doesn't explain how data related to XATTR is stored in the received dataset in the end: the property in the old dataset is XATTR=SA, while the actual stored data might be DIR on disk. I'm able to pass -o xattr=sa to the receive command as well, and the dataset gets created with that property. That is not the case with -x encryption, for example, which results in an error, and most of the properties of the source are used. Not even all of them, because the keylocation is changed to PROMPT for some reason instead of using a file as the source dataset does. So it's not the case that all properties are taken 1:1 anyway.

But the question is: how is the XATTR-related data actually stored on disk in the newly received dataset? As SA, as the property already says, or as DIR, because it might be that way in the sent dataset? Property value and on-disk layout are simply different in this case.

How do I check the on-disk storage of XATTR on my own? I found some articles using ZDB, but the output for my datasets was different, and I didn't understand what I saw for existing files (very likely using DIR) compared to new files using SA. Or is the data really replaced already after changing to SA? The ZDB output for my files looked a little like that, but the performance is still unexpectedly bad.

> If I read your patch correctly, it will replace xattrs according to the current xattr setting whenever they are changed / when zpl_xattr_set is called. That still does not seem to be the case in the current official code base...

#3472 (comment)
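
On the question of checking the on-disk xattr storage per file, one possible approach is to dump the file's object with zdb and look at how the xattrs are reported. This is a sketch under assumptions, not a verified procedure: it assumes zdb prints an "SA xattrs:" line when xattrs live in the system attributes and an "xattr <object number>" line when the file points at a separate xattr directory object, and classify_xattr_storage is a hypothetical helper. On an encrypted dataset zdb may not be able to decode these attributes at all, which could explain output that looks different from the articles.

```shell
# Reads `zdb -ddddd <pool/dataset> <object>` output on stdin and reports how
# xattrs appear to be stored for that object: sa, dir, both, or none.
classify_xattr_storage() {
    awk '
        /SA xattrs:/                  { sa = 1 }   # xattrs stored as system attributes
        /^[ \t]*xattr[ \t]+[0-9]+/    { dir = 1 }  # pointer to an xattr directory object
        END {
            if (sa && dir) print "both"
            else if (sa)   print "sa"
            else if (dir)  print "dir"
            else           print "none"
        }'
}

# Usage (the object number of a file is its inode number):
#   zdb -ddddd <pool/dataset> "$(stat -c %i /path/to/file)" | classify_xattr_storage
```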
