ZFS Receive of encrypted incremental data stream causes a PANIC #13445

Open
cyberpower678 opened this issue May 10, 2022 · 53 comments
Labels: Type: Defect (Incorrect behavior, e.g. crash, hang)

@cyberpower678

System information

Type Version/Name
Distribution Name Debian
Distribution Version Bullseye
Kernel Version 5.10.109+truenas
Architecture amd64
OpenZFS Version zfs-2.1.2-95_g1d2cdd23b zfs-kmod-2.1.2-95_g1d2cdd23b

Describe the problem you're observing

During an incremental receive, ZFS caused a panic and a system hangup.

Describe how to reproduce the problem

It happens randomly.

Include any warning/errors/backtraces from the system logs

[456678.240841] VERIFY3(0 == dmu_object_set_blocksize(rwa->os, drro->drr_object, drro->drr_blksz, drro->drr_indblkshift, tx)) failed (0 == 95)
[456678.243815] PANIC at dmu_recv.c:1776:receive_object()
[456678.245141] Showing stack for process 2936808
[456678.246532] CPU: 10 PID: 2936808 Comm: receive_writer Tainted: P           OE     5.10.109+truenas #1
[456678.247840] Hardware name: Supermicro X9QR7-TF+/X9QRi-F+/X9QR7-TF+/X9QRi-F+, BIOS 3.0b 05/20/2015
[456678.249138] Call Trace:
[456678.250421]  dump_stack+0x6b/0x83
[456678.251676]  spl_panic+0xd4/0xfc [spl]
[456678.253038]  ? arc_buf_access+0x14c/0x250 [zfs]
[456678.254276]  ? dnode_hold_impl+0x4e9/0xef0 [zfs]
[456678.255493]  ? dnode_set_blksz+0x13b/0x300 [zfs]
[456678.256677]  ? dnode_rele_and_unlock+0x5c/0xc0 [zfs]
[456678.257846]  receive_object+0xc2c/0xca0 [zfs]
[456678.258984]  ? dmu_object_next+0xd6/0x120 [zfs]
[456678.260098]  ? receive_writer_thread+0xbd/0xad0 [zfs]
[456678.261160]  ? kfree+0x40c/0x480
[456678.262202]  ? _cond_resched+0x16/0x40
[456678.263244]  receive_writer_thread+0x1cc/0xad0 [zfs]
[456678.264280]  ? thread_generic_wrapper+0x62/0x80 [spl]
[456678.265252]  ? kfree+0x40c/0x480
[456678.266242]  ? receive_process_write_record+0x190/0x190 [zfs]
[456678.267177]  ? thread_generic_wrapper+0x6f/0x80 [spl]
[456678.268092]  thread_generic_wrapper+0x6f/0x80 [spl]
[456678.268988]  ? __thread_exit+0x20/0x20 [spl]
[456678.269864]  kthread+0x11b/0x140
[456678.270706]  ? __kthread_bind_mask+0x60/0x60
[456678.271538]  ret_from_fork+0x22/0x30
cyberpower678 added the Type: Defect (Incorrect behavior, e.g. crash, hang) label on May 10, 2022
@rincebrain
Contributor

I would suggest trying a 2.1.4-based build, as there were fixes for encrypted send/recv and blocksize changing in the interim.

@Nemezor

Nemezor commented May 15, 2022

I am experiencing the same issue. I copied the bug issue template to organize my response better, but since it is a duplicate I don't want to open a new issue.

System information

Type Version/Name
Distribution Name Debian
Distribution Version Bullseye
Kernel Version 5.10.0-14-amd64
Architecture amd64
OpenZFS Version zfs-2.1.4-1 zfs-kmod-2.1.4-1

Describe the problem you're observing

Same result - receiving system locks up in a ZFS panic.

Describe how to reproduce the problem

Happens reliably with both of my TrueNAS SCALE datasets. I vaguely remember it happening after I changed permissions of some folders, but nothing more concrete.

The source dataset was created by TrueNAS with no special manual changes or settings; it is fully encrypted.
The destination dataset was created using:

openssl rand -out /key 32
zpool create -O encryption=aes-256-gcm -O keyformat=raw -O keylocation=file:///key pool /dev/sdb3
zfs create pool/test

Dataset was then sent to pool/test, resulting in a panic and lockup of the system.

If required, I can try to provide the dataset that causes this (it is about 135 GB in size and in active use); I'm currently not sure how, since zfs sending it is not an option... The encryption keys will of course not be provided.
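
One option might be to capture the raw stream to a file instead of piping it straight into a receiver, and inspect or hand it over from there; a rough sketch with placeholder snapshot names (untested here):

# Write the raw (still-encrypted) incremental stream to a file
zfs send -w -i pool/test@snap1 pool/test@snap2 > /tmp/test-incr.zfs

# Peek at the record headers; no encryption key is needed for this
zstream dump /tmp/test-incr.zfs | head -50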

Include any warning/errors/backtraces from the system logs

From my production server running ZFS 2.1.2 where this was found (Sorry for the image, not sure why I took a screenshot instead of copying the text):
[screenshot of the VERIFY3/PANIC backtrace from the 2.1.2 system]

Reproduced in a virtual machine running ZFS 2.1.4:

[ 1712.885113] VERIFY3(0 == dmu_object_set_blocksize(rwa->os, drro->drr_object, drro->drr_blksz, drro->drr_indblkshift, tx)) failed (0 == 95)
[ 1712.885147] PANIC at dmu_recv.c:1784:receive_object()
[ 1712.885159] Showing stack for process 359011
[ 1712.885161] CPU: 2 PID: 359011 Comm: receive_writer Tainted: P           OE     5.10.0-14-amd64 #1 Debian 5.10.113-1
[ 1712.885161] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 1712.885162] Call Trace:
[ 1712.885168]  dump_stack+0x6b/0x83
[ 1712.885173]  spl_panic+0xd4/0xfc [spl]
[ 1712.885204]  ? arc_buf_access+0x14c/0x250 [zfs]
[ 1712.885231]  ? dnode_hold_impl+0x4e9/0xef0 [zfs]
[ 1712.885258]  ? dnode_set_blksz+0x13b/0x300 [zfs]
[ 1712.885283]  ? dnode_rele_and_unlock+0x64/0xc0 [zfs]
[ 1712.885316]  receive_object+0xc2c/0xca0 [zfs]
[ 1712.885342]  ? dnode_rele_and_unlock+0x64/0xc0 [zfs]
[ 1712.885368]  ? flush_write_batch+0x1de/0x560 [zfs]
[ 1712.885393]  receive_writer_thread+0x1cc/0xad0 [zfs]
[ 1712.885395]  ? kfree+0xba/0x480
[ 1712.885420]  ? receive_process_write_record+0x190/0x190 [zfs]
[ 1712.885423]  ? thread_generic_wrapper+0x6f/0x80 [spl]
[ 1712.885425]  thread_generic_wrapper+0x6f/0x80 [spl]
[ 1712.885428]  ? __thread_exit+0x20/0x20 [spl]
[ 1712.885430]  kthread+0x11b/0x140
[ 1712.885431]  ? __kthread_bind_mask+0x60/0x60
[ 1712.885432]  ret_from_fork+0x22/0x30

Additionally, attempting to delete the partially received dataset on the receiving side results in another ZFS panic:

[  173.583242] PANIC: zfs: adding existent segment to range tree (offset=3c2937b800 size=400)
[  173.583267] Showing stack for process 1342
[  173.583269] CPU: 3 PID: 1342 Comm: txg_sync Tainted: P           OE     5.10.0-14-amd64 #1 Debian 5.10.113-1
[  173.583269] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  173.583270] Call Trace:
[  173.583275]  dump_stack+0x6b/0x83
[  173.583281]  vcmn_err.cold+0x58/0x80 [spl]
[  173.583284]  ? sysvec_apic_timer_interrupt+0x36/0x80
[  173.583286]  ? irq_exit_rcu+0x3e/0xc0
[  173.583287]  ? sysvec_apic_timer_interrupt+0x36/0x80
[  173.583288]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[  173.583317]  ? zfs_btree_insert_into_leaf+0x233/0x2a0 [zfs]
[  173.583352]  zfs_panic_recover+0x6d/0x90 [zfs]
[  173.583384]  range_tree_add_impl+0x305/0xe40 [zfs]
[  173.583414]  metaslab_free_concrete+0x11d/0x250 [zfs]
[  173.583441]  metaslab_free_impl+0xa9/0xe0 [zfs]
[  173.583467]  metaslab_free+0x168/0x190 [zfs]
[  173.583501]  zio_free_sync+0xda/0xf0 [zfs]
[  173.583533]  zio_free+0xb3/0xf0 [zfs]
[  173.583563]  delete_blkptr_cb+0x22/0x110 [zfs]
[  173.583589]  bplist_iterate+0xc9/0x130 [zfs]
[  173.583617]  ? sublist_delete_sync+0x50/0x50 [zfs]
[  173.583644]  sublist_delete_sync+0x2e/0x50 [zfs]
[  173.583673]  dsl_sync_task_sync+0xa6/0xf0 [zfs]
[  173.583701]  dsl_pool_sync+0x40d/0x520 [zfs]
[  173.583730]  spa_sync+0x542/0xfa0 [zfs]
[  173.583731]  ? mutex_lock+0xe/0x30
[  173.583762]  ? spa_txg_history_init_io+0x101/0x110 [zfs]
[  173.583792]  txg_sync_thread+0x2e0/0x4a0 [zfs]
[  173.583821]  ? txg_fini+0x250/0x250 [zfs]
[  173.583824]  thread_generic_wrapper+0x6f/0x80 [spl]
[  173.583827]  ? __thread_exit+0x20/0x20 [spl]
[  173.583828]  kthread+0x11b/0x140
[  173.583829]  ? __kthread_bind_mask+0x60/0x60
[  173.583831]  ret_from_fork+0x22/0x30

@schreiberstein

I experience the same issue on ZFS 2.1.4 on Debian Bullseye amd64 and on another TrueNAS system. This practically renders my ZFS setup useless because I cannot perform any snapshot based replication and/or backup. Is it something dataset specific that could be fixed? Is there a workaround for this? It's been nearly three months since this has been reported. 😢

@cyberpower678
Author

I experience the same issue on ZFS 2.1.4 on Debian Bullseye amd64 and on another TrueNAS system. This practically renders my ZFS setup useless because I cannot perform any snapshot based replication and/or backup. Is it something dataset specific that could be fixed? Is there a workaround for this? It's been nearly three months since this has been reported. 😢

You can definitely work around it, especially if you know the date of the last snapshot that was successfully sent. The idea is to find all files and folders that were created or modified since the last snapshot was sent, back them up, roll the dataset back, and then put the backed-up files back in place. Create a new incremental snapshot and send it. It should work.
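
Roughly, with placeholder names (pool/data for the dataset, @last-good for the last snapshot that reached the destination, a remote host called backup), the steps look something like this; adapt the paths to your own layout:

# 1. List what changed since the last snapshot that made it to the destination
zfs diff pool/data@last-good pool/data

# 2. Copy those files somewhere outside the pool, preserving relative paths
rsync -aR /mnt/pool/data/./path/to/changed/stuff /tmp/saved/

# 3. Roll the source back to the last snapshot the destination already has
#    (this destroys the newer, unsendable snapshots)
zfs rollback -r pool/data@last-good

# 4. Put the saved files back, snapshot again, and retry the incremental send
rsync -a /tmp/saved/ /mnt/pool/data/
zfs snapshot pool/data@retry
zfs send -w -i pool/data@last-good pool/data@retry | ssh backup zfs receive -F backup/data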

@schreiberstein

You can definitely work around it, especially if you know the date of the last snapshot that was successfully sent. The idea is to find all files and folders that were created or modified since the last snapshot was sent, back them up, roll the dataset back, and then put the backed-up files back in place. Create a new incremental snapshot and send it. It should work.

Interesting approach! But sadly wouldn't work for me.
I sent a fresh snapshot to a brand-new system, and after the transfer completed I wanted to send a newer (final) snapshot of all the data that had changed in the meantime, to completely switch over to said new system. (Lots of small files; rsync is not viable, as indexing would take longer than the permissible weekend of downtime.) However, no matter what I do, all incremental sends end in a PANIC. I also tried different snapshots. The original dataset was transferred a long time ago from a different TrueNAS system. The only thing that changed was switching the encryption from passphrase to keyfile on the previous system (the system I am now trying to send from) some time ago. There are dozens of snapshots and I cannot try them all brute-force-style. My only hope would be to copy (send and receive) the dataset locally and pray that it may fix itself.

I wonder how this is possible. Userspace operations causing a literal kernel panic. I know everybody focuses on the issues/bugs they experience themselves, but I think this bug is really, really serious. It literally crashed a TrueNAS storage server (reboot!) and causes a panic message in Linux. 😢

@rincebrain
Contributor

The original bug appears to be #13699; people seem to be piling various other things into this one.

I am going to bite my tongue and only say that if you find a way to convince OpenZFS leadership that native encryption bugs are worth more than "I hope someone fixes it", please let me know.

@schreiberstein

The original bug appears to be #13699; people seem to be piling various other things into this one.

I am going to bite my tongue and only say that if you find a way to convince OpenZFS leadership that native encryption bugs are worth more than "I hope someone fixes it", please let me know.

I get what you are saying...
Do you think I could work around it by sending the dataset decrypted instead? I mean, complete an initial sync decrypted, then perform decrypted incremental send? Is it just the encryption that's causing it?

@rincebrain
Contributor

rincebrain commented Aug 5, 2022 via email

@schreiberstein

I haven't dug into this bug, so I don't know if the flaw in this code is specific to encryption, but it doesn't seem like trying it can make things worse...


I sent an unencrypted (that is, not raw (-w)) version of the dataset and can confirm that sending incremental snapshots to it works flawlessly... Given that I have sent the dataset raw and encrypted twice and I was unable to send incremental snapshots, I conclude the issue lies with the encryption / sending "raw".

@cyberpower678
Author

cyberpower678 commented Aug 15, 2022

I haven't dug into this bug, so I don't know if the flaw in this code is specific to encryption, but it doesn't seem like trying it can make things worse...


I sent an unencrypted (that is, not raw (-w)) version of the dataset and can confirm that sending incremental snapshots to it works flawlessly... Given that I have sent the dataset raw and encrypted twice and I was unable to send incremental snapshots, I conclude the issue lies with the encryption / sending "raw".

Actually it’s an issue with receiving encrypted streams. The send function works as expected.

If you downgraded the receiver to 2.0.2 it would probably receive just fine.


@AttilaFueloep
Contributor

Tentatively (without linking): might https://github.com/openzfs/zfs/pull/14119 partially fix this issue #13445?

I don't think so. #14119 fixes an issue where receiving non-raw, unencrypted incremental sends into an encrypted target without the key loaded was accidentally allowed, causing havoc down the stack.

@dobsonj

dobsonj commented Nov 12, 2022

I ran into this issue on FreeBSD 12.2-RELEASE-p14 with zfs-2.0.7-1 (TrueNAS-12.0-U8.1).
The source system does a raw send (i.e. zfs send -Rw) as an incremental update with intermediary snapshots (-I).
Properties recordsize=128K, compression=lz4, encryption=aes-256-gcm are set on the source dataset, so it sends raw compressed and encrypted blocks to the target system. The target system does not have the encryption key in this case; it should just store the encrypted blocks and never mount the dataset.
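
For reference, the replication is essentially of this shape (placeholder names and flags, not the exact TrueNAS invocation):

# Raw, recursive replication stream with all intermediary snapshots
zfs send -Rw -I tank/data@2022-05-01 tank/data@2022-06-01 \
    | ssh target zfs receive -s -F backup/data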
The target system (running the same release) panics during zfs receive:

panic: VERIFY3(0 == dmu_object_set_blocksize(rwa->os, drro->drr_object, drro->drr_blksz, drro->drr_indblkshift, tx)) failed (0 == 45)

cpuid = 7
time = 1654421418
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe017d8ae880
vpanic() at vpanic+0x17b/frame 0xfffffe017d8ae8d0
spl_panic() at spl_panic+0x3a/frame 0xfffffe017d8ae930
receive_object() at receive_object+0x99f/frame 0xfffffe017d8aea40
receive_writer_thread() at receive_writer_thread+0x115/frame 0xfffffe017d8aeab0
fork_exit() at fork_exit+0x7e/frame 0xfffffe017d8aeaf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe017d8aeaf0
--- trap 0xc, rip = 0x800876caa, rsp = 0x7fffffff7508, rbp = 0x7fffffff7570 ---
KDB: enter: panic
panic.txt06000017714247073652  7237 ustarrootwheelVERIFY3(0 == dmu_object_set_blocksize(rwa->os, drro->drr_object, drro->drr_blksz, drro->drr_indblkshift, tx)) failed (0 == 45)
version.txt0600006314247073652  7544 ustarrootwheelFreeBSD 12.2-RELEASE-p11 75566f060d4(HEAD) TRUENAS

The value returned by dmu_object_set_blocksize() on FreeBSD is EOPNOTSUPP:

     45 EOPNOTSUPP Operation not supported.  The attempted operation is not supported for the type of object referenced.  Usually this occurs when
             a file descriptor refers to a file or socket that cannot support this operation, for example, trying to accept a connection on a
             datagram socket.

On Debian, it's 95 (consistent with the panic in the original description of this issue).

#define EOPNOTSUPP      95      /* Operation not supported on transport endpoint */

It's hitting this VERIFY statement, which is only called when raw send is enabled:

VERIFY0(dmu_object_set_blocksize(rwa->os, drro->drr_object,
    drro->drr_blksz, drro->drr_indblkshift, tx));

EOPNOTSUPP appears to be coming from dnode_set_blksz:
return (SET_ERROR(ENOTSUP));

I'll post more details if I can narrow it down to a simple test case; I don't know yet exactly how to reproduce it in isolation.

@TBhatti90

Hi, I just wanted to report that I think I'm having the same issue. Has there been any progress in tackling it? I'm having trouble figuring out how I'm going to make a backup of my main pool, since replication always fails. I can post the dmesg log I get.
dmesg_zfs.txt

@cyberpower678
Author

Hi, I just wanted to report that I think I'm having the same issue. Has there been any progress in tackling it? I'm having trouble figuring out how I'm going to make a backup of my main pool, since replication always fails. I can post the dmesg log I get.
dmesg_zfs.txt

Are you sending an incremental snapshot, or the initial snapshot?

If the former, identify which files were changed since the last snapshot, copy them out of the dataset/pool, delete them from the pool, put them back, delete the newest snapshot, and make a new one. Replication should work now.

If the latter, create a new dataset to replace the one that is failing to replicate and copy all of the files from the old dataset to the new one. Delete the old dataset once the data is copied.

@TBhatti90

Hold on, I'm confused about your former option. That's the problem. It's an incremental snapshot, a series of them actually.
"I copy the diff files from the newest snapshot, delete them from the newest snapshot, put them back in the newest snapshost, delete the newest snapshot, make a new snapshot". This is weird. Why is it like this? This isn't how sending and replication is supposed to behave. What's going on?
I'm using sanoid for my snapshot management, so it creates automatic snapshots at a set time frame, like daily, or monthly. Should I delete every snapshot, except the initial one, that's the same on the server and the backup?

@cyberpower678
Author

cyberpower678 commented Jan 30, 2023

Hold on, I'm confused about your former option. That's the problem. It's an incremental snapshot, a series of them actually.
"I copy the diff files from the newest snapshot, delete them from the newest snapshot, put them back in the newest snapshost, delete the newest snapshot, make a new snapshot". This is weird. Why is it like this? This isn't how sending and replication is supposed to behave. What's going on?
I'm using sanoid for my snapshot management, so it creates automatic snapshots at a set time frame, like daily, or monthly. Should I delete every snapshot, except the initial one, that's the same on the server and the backup?

I’m venturing a guess your affected dataset is an older dataset from an older ZFS version. On newer versions of ZFS, some dataset metadata left behind by older versions don’t seem to play nice with the new versions upon sending and receiving them. The idea here is to go back as many snapshots as needed, to get to the point where the last snapshot sent was still working on the backup. Make a note of every file that’s been modified, or created on the dataset since the last working snapshot, and copy them out of the pool. Roll back the source dataset to the last snapshot that sent successfully to the backup. Do the same for the destination. Delete all the files on source that you copied out, that still existed at that point. Copy files back into the dataset, and create a new snapshot. Sending the snapshot should be successful this time around. I’ve discovered this to be an issue that eventually goes away on its own. Just deal with it a couple of times, and it will likely go away for you too.

@TBhatti90

TBhatti90 commented Jan 30, 2023

I’m venturing a guess your affected dataset is an older dataset from an older ZFS version. On newer versions of ZFS, some dataset metadata left behind by older versions don’t seem to play nice with the new versions upon sending and receiving them. The idea here is to go back as many snapshots as needed, to get to the point where the last snapshot sent was still working on the backup. Make a note of every file that’s been modified, or created on the dataset since the last working snapshot, and copy them out of the pool. Roll back the source dataset to the last snapshot that sent successfully to the backup. Do the same for the destination. Delete all the files on source that you copied out, that still existed at that point. Copy files back into the dataset, and create a new snapshot. Sending the snapshot should be successful this time around. I’ve discovered this to be an issue that eventually goes away on its own. Just deal with it a couple of times, and it will likely go away for you too.

So is there a way to detect which version of zfs the snapshot is from? Now, I don't mind losing the old snapshots. I'm ok with only having relatively recent snapshots. But I'm on Debian Bullseye, so my zfs version shouldn't have changed. Think it's 2.0.3.

The pool itself appears to be from December 2021, created on TrueNAS CORE (FreeBSD). I'm not sure what version of ZFS they were using back then.

If it helps, here's my syncoid script

#!/bin/sh
FILESYSTEMS="Data"
LOG=/var/log/syncoid.log
for i in $FILESYSTEMS; do
echo "--------------- date - syncing $i -------------" >> $LOG 2>&1
/usr/sbin/syncoid -r --exclude=Data/Encrypted/Media --exclude=Data/Unencrypted/Media/Collections --exclude="Data/Unencrypted/Media/TV Shows" --exclude=Data/Unencrypted/Media/Movies --exclude=Data/Unencrypted/Media/Music --exclude=Data/Unencrypted/Media/Temp --exclude=Data/Encrypted/Users/USER/Backups/Desktop --exclude=Data/Encrypted/Users/USER/Backups/Laptop --sendoptions="wp" --skip-parent --no-sync-snap $i root@backup:Backup/$i >> $LOG 2>&1
echo "--------------- date - done $i -------------" >> $LOG 2>&1
done

@rincebrain
Contributor

rincebrain commented Jan 30, 2023

I don't really think your description of the problem, or how to work around it, is accurate, @cyberpower678.

More precisely, if someone who can reproduce this reliably wants to add a few debug prints the next time this happens to them, it's probably pretty straightforward to see why it's getting upset? I have a couple of wild guesses, but without a reproducer I can take home with me, I'm going to get to ask whoever to try things and see how it breaks.

e: reading back, no, this is just the same known issue, probably. Go ask Debian to cherrypick #13782 if you want it to work, I think, or run 2.1.7 or newer.

@cyberpower678
Author

I don't really think your description of the problem, or how to work around it, is accurate, @cyberpower678.

More precisely, if someone who can reproduce this reliably wants to add a few debug prints the next time this happens to them, it's probably pretty straightforward to see why it's getting upset? I have a couple of wild guesses, but without a reproducer I can take home with me, I'm going to get to ask whoever to try things and see how it breaks.

e: reading back, no, this is just the same known issue, probably. Go ask Debian to cherrypick #13782 if you want it to work, I think, or run 2.1.7 or newer.

It’s accurate for me. When I did this, whatever was causing zfs receive to either corrupt the dataset or throw a panic was gone, and the subsequent receive was successful. After 3 more instances of this happening, and doing my tedious workaround, it stopped happening entirely. It’s been almost a year since the last occurrence.

@cyberpower678
Author

I’m venturing a guess your affected dataset is an older dataset from an older ZFS version. On newer versions of ZFS, some dataset metadata left behind by older versions don’t seem to play nice with the new versions upon sending and receiving them. The idea here is to go back as many snapshots as needed, to get to the point where the last snapshot sent was still working on the backup. Make a note of every file that’s been modified, or created on the dataset since the last working snapshot, and copy them out of the pool. Roll back the source dataset to the last snapshot that sent successfully to the backup. Do the same for the destination. Delete all the files on source that you copied out, that still existed at that point. Copy files back into the dataset, and create a new snapshot. Sending the snapshot should be successful this time around. I’ve discovered this to be an issue that eventually goes away on its own. Just deal with it a couple of times, and it will likely go away for you too.

So is there a way to detect which version of zfs the snapshot is from? Now, I don't mind losing the old snapshots. I'm ok with only having relatively recent snapshots. But I'm on Debian Bullseye, so my zfs version shouldn't have changed. Think it's 2.0.3.

The pool itself appears to be from December 2021, created on TrueNAS CORE (FreeBSD). I'm not sure what version of ZFS they were using back then.

If it helps, here's my syncoid script

#!/bin/sh
FILESYSTEMS="Data"
LOG=/var/log/syncoid.log
for i in $FILESYSTEMS; do
echo "--------------- date - syncing $i -------------" >> $LOG 2>&1
/usr/sbin/syncoid -r --exclude=Data/Encrypted/Media --exclude=Data/Unencrypted/Media/Collections --exclude="Data/Unencrypted/Media/TV Shows" --exclude=Data/Unencrypted/Media/Movies --exclude=Data/Unencrypted/Media/Music --exclude=Data/Unencrypted/Media/Temp --exclude=Data/Encrypted/Users/USER/Backups/Desktop --exclude=Data/Encrypted/Users/USER/Backups/Laptop --sendoptions="wp" --skip-parent --no-sync-snap $i root@backup:Backup/$i >> $LOG 2>&1
echo "--------------- date - done $i -------------" >> $LOG 2>&1
done

@rincebrain may have a point that I may not be on point here. I’m laying out to you what worked for me. The general idea is if older snapshots made it to the destination just fine, you keep those. Only delete the snapshots, that are failing to send/receive successfully. Identify what has changed since the last successful snapshot, rollback to the last working snapshot after copying the changed/added files out, and then completely replace said files (delete and copy back) in the dataset, and create a new snapshot based on the last successful one. Try sending that to the destination. If it still panics, then the only option for now is to simply create a new dataset on the source, and move the files to the new dataset, deleting the old one and send the whole thing again. Really annoying, but most effective in the long-run, from my experience.

@rincebrain
Contributor

Or you could just apply the patch that fixes the error on the receiver.

@cyberpower678
Author

cyberpower678 commented Jan 30, 2023

Or you could just apply the patch that fixes the error on the receiver.

That would be a great solution. Too bad I'm not familiar with how ZFS works internally to actually fix it myself. :-( So, I just work around issues until they are fixed, or are no longer issues.

@rincebrain
Contributor

I wasn't suggesting you write a fix. I was suggesting you use 2.1.7 or newer, which have this fixed, or get your distro du jour to apply the relevant fix.

@TBhatti90

Hey so just want to make sure I'm understanding correctly, 2.1.7 fixes this right? Debian has that in the backports, along with a newer kernel. I can check that out. Would I have to upgrade my pool too?

If there's any way I can get some debug print statements for you guys, I can provide that. The error seems pretty consistent. Always with the one dataset.

@cyberpower678
Author

Or you could just apply the patch that fixes the error on the receiver.

That would be a great solution. Too bad I'm not familiar with how ZFS works internally to actually fix it myself. So, I just work around issues until they are fixed, or are no longer issues.

I wasn't suggesting you write a fix. I was suggesting you use 2.1.7 or newer, which have this fixed, or get your distro du jour to apply the relevant fix.

Oh. Whoops. Now that I reread your initial comment, that does make more sense. lol That being said, I haven't dealt with this issue since I initially opened it here. My encrypted incrementals have been replicating without issue.

@rincebrain
Contributor

rincebrain commented Jan 30, 2023

You don't need to pull a newer kernel to use the newer version of ZFS, you can just pull the zfs packages from backports.

You also shouldn't need to run "zpool upgrade" or otherwise tweak any feature flags.
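
On Bullseye that amounts to something like the following, assuming the bullseye-backports source (with contrib enabled) is already configured in apt:

sudo apt update
sudo apt install -t bullseye-backports zfs-dkms zfsutils-linux
# reboot (or unload/reload the module) so the newer zfs.ko is actually the one in use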

My understanding is that redoing the snapshot, essentially, would just reroll the dice on whether this triggered, because if I understand the bug correctly, in certain cases it wasn't setting that it used large blocks before attempting to make a large block, so at which point specific blocks got included in the send stream (plus variation in your local system) could affect whether this broke for you or not. So I would speculate it's conceivably possible for someone to write a zstream command that reordered when various things were in the send stream so this didn't come up? I'm just spitballing though.

@cyberpower678
Author

cyberpower678 commented Jan 30, 2023

Hey so just want to make sure I'm understanding correctly, 2.1.7 fixes this right? Debian has that in the backports, along with a newer kernel. I can check that out. Would I have to upgrade my pool too?

If there's any way I can get some debug print statements for you guys, I can provide that. The error seems pretty consistent. Always with the one dataset.

Pool upgrades shouldn't be needed, but it shouldn't hurt. Though I'm not sure about upgrading ZFS in your particular installation since it's TrueNAS.

The latest TrueNAS SCALE is on 2.1.6

zfs --version

zfs-2.1.6-1
zfs-kmod-2.1.6-1

@TBhatti90

Just wanted to report that the same issue happened with 2.1.7 on Debian with kernel 6.0.0. This is the same on the source and destination. The source and destination have an initial snapshot from 12/21/21. The source has many snapshots after that, as it's the primary server. The source was sending a snapshot from 1/30/23, and that's when the kernel panic happened on the destination.

So you want me to delete the new snapshots and paste the changed files in the old snapshot? That sounds difficult.

[23638.869168] VERIFY3(0 == dmu_object_set_blocksize(rwa->os, drro->drr_object, drro->drr_blksz, drro->drr_indblkshift, tx)) failed (0 == 95)
[23638.869231] PANIC at dmu_recv.c:1784:receive_object()
[23638.869255] Showing stack for process 21420
[23638.869256] CPU: 0 PID: 21420 Comm: receive_writer Tainted: P OE 6.0.0-0.deb11.6-amd64 #1 Debian 6.0.12-1~bpo11+1
[23638.869259] Hardware name: ASUS All Series/Z97I-PLUS, BIOS 2704 02/18/2016
[23638.869260] Call Trace:
[23638.869261] <TASK>
[23638.869263] dump_stack_lvl+0x45/0x5e
[23638.869270] spl_panic+0xd1/0xe9 [spl]
[23638.869279] ? _raw_spin_unlock+0x15/0x30
[23638.869282] ? dnode_hold_impl+0x4eb/0xee0 [zfs]
[23638.869355] ? preempt_count_add+0x70/0xa0
[23638.869358] ? preempt_count_add+0x70/0xa0
[23638.869360] ? _raw_spin_lock+0x13/0x40
[23638.869362] ? dnode_rele_and_unlock+0x59/0xf0 [zfs]
[23638.869406] ? _raw_spin_unlock+0x15/0x30
[23638.869408] receive_object+0xc30/0xca0 [zfs]
[23638.869452] ? dnode_rele_and_unlock+0x59/0xf0 [zfs]
[23638.869515] ? preempt_count_add+0x70/0xa0
[23638.869517] receive_writer_thread+0x1cc/0xad0 [zfs]
[23638.869559] ? set_next_task_fair+0x2d/0xd0
[23638.869563] ? _raw_spin_unlock_irqrestore+0x23/0x40
[23638.869565] ? receive_process_write_record+0x1a0/0x1a0 [zfs]
[23638.869606] ? __thread_exit+0x20/0x20 [spl]
[23638.869621] ? thread_generic_wrapper+0x59/0x70 [spl]
[23638.869626] thread_generic_wrapper+0x59/0x70 [spl]
[23638.869631] kthread+0xe7/0x110
[23638.869633] ? kthread_complete_and_exit+0x20/0x20
[23638.869635] ret_from_fork+0x22/0x30
[23638.869639] </TASK>

@rincebrain
Contributor

rincebrain commented Jan 31, 2023

Neat. Could you share precisely what zfs version reports on the receiver? I want to go see what's in your precise version...

e: My next suggestion if 2.1.7 didn't resolve it for you would be to try 2.1.9, since 2.1.8 includes f806306 which further revised the code that c8d2ab0 touched.

@TBhatti90

Neat. Could you share precisely what zfs version reports on the receiver? I want to go see what's in your precise version...

e: My next suggestion if 2.1.7 didn't resolve it for you would be to try 2.1.9, since 2.1.8 includes f806306 which further revised the code that c8d2ab0 touched.

Hey,
So on both the source and receiver this is the zfs version:

zfs-2.1.7-1~bpo11+1
zfs-kmod-2.1.7-1~bpo11+1

I'm not sure if Debian is going to include the later versions. Hopefully they do. Didn't that stack trace provide more information? I do have the blocksize set to greater than or equal to 1M on all my datasets. Thing is, all the snapshots, from when I first created the pool, should have had that block size.
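
If it helps, the record sizes in play and the pool-level feature state can be checked with something like this (tank is a placeholder for the pool name):

# Record size / volume block size configured on each dataset
zfs get -r -t filesystem,volume recordsize,volblocksize tank
# Whether the pool's large_blocks feature is enabled or active
zpool get feature@large_blocks tank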

@rincebrain
Contributor

rincebrain commented Feb 1, 2023

The bug that first patch, at least, wants to fix, is that it wasn't correctly setting the "we can use large blocks here" flag on the dataset in a timely fashion, so then attempting to create a large block on it would, naturally, fail.

I would expect Debian to pull in 2.1.9 eventually, though I can't make any promises on what timetable.

@mcflis

mcflis commented Mar 9, 2023

I've run into a similar issue replicating from TrueNAS SCALE to FreeBSD 12.2-RELEASE-p3:
Panic String: Solaris(panic): zfs: adding existent segment to range tree (offset=276dff5000 size=1000)

Out of nowhere, I had a presumably broken snapshot of the replicated dataset on the FreeBSD side. I've been able to resolve the issue by destroying all snapshots of the dataset except for the broken one and by running:

zfs rollback <dataset@broken-snap> # <- rollback to the broken snapshot was the key for me
zfs destroy -r <dataset>

Afterwards, I've just issued another replication run and Truenas was happily replicating my snapshots to FreeBSD including the previously destroyed dataset.

I know this github issue is about Linux but maybe it can help someone struggling for days trying to get their replication fixed without the possibility to update to zfs 2.1.9.

Note: before the rollback, I upgraded the OS to FreeBSD 13.1-RELEASE-p5 and ran zpool upgrade <pool>

A little more info

Truenas version: TrueNAS-SCALE-22.12.0

Truenas zfs version:

zfs-2.1.6-1
zfs-kmod-2.1.6-1

FreeBSD zfs version:

zfs-2.1.4-FreeBSD_g52bad4f23
zfs-kmod-2.1.4-FreeBSD_g52bad4f23

FreeBSD crash info:

# cat /var/crash/info.last
Dump header from device: /dev/vtbd0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 416559104
  Blocksize: 512
  Compression: none
  Dumptime: 2023-03-08 21:40:01 +0100
  Hostname: REDACTED
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 13.1-RELEASE-p3 GENERIC
  Panic String: Solaris(panic): zfs: adding existent segment to range tree (offset=276dff5000 size=1000)
  Dump Parity: 3924248330
  Bounds: 0
  Dump Status: good

@rincebrain
Contributor

Since FreeBSD 13 is running OpenZFS in the tree now, seems perfectly germane to mention.

Did you try just the rollback and then receive again, out of curiosity, before destroying the rest?

@mcflis

mcflis commented Mar 9, 2023

Ok cool, well I've destroyed it first before receiving, just to make sure the replication for this particular dataset starts from scratch.

@mcflis

mcflis commented Apr 15, 2023

Did you try just the rollback and then receive again, out of curiosity, before destroying the rest?

I've run into the same issue once more. This time I only ran zfs rollback -r <dataset@broken-snap> without destroying the dataset afterwards: replication succeeded.

To find the broken snapshot, I used https://github.com/bahamas10/zfs-prune-snapshots and started from oldest to newest, using the -v option to print each action. It would delete snapshots until it hit the broken one and then simply become unresponsive (over SSH).

@pvelati

pvelati commented May 28, 2023

Having the same issue replicating a dataset from TrueNAS Scale to PBS.

System information SENDER

Type Version/Name
Distribution Name TrueNAS Scale (Debian 11)
Distribution Version 22.12.2
Kernel Version 5.15.79+truenas
Architecture amd64
OpenZFS Version zfs-2.1.9-1 zfs-kmod-2.1.9-1

System information RECEIVER

Type Version/Name
Distribution Name Proxmox Backup Server (Debian 11)
Distribution Version 2.4-2
Kernel Version 6.1.15-1-pve
Architecture amd64
OpenZFS Version zfs-2.1.11-pve1 zfs-kmod-2.1.9-pve1

Describe the problem you're observing

During an incremental receive, ZFS caused a panic. There was no system hang-up; only the dataset is busy.

Describe how to reproduce the problem

It happens when I replicate a new snapshot with a big change (~100 GB), but not always. I can't really reproduce it on demand.

Include any warning/errors/backtraces from the system logs

[  404.517496] VERIFY3(0 == dmu_object_set_blocksize(rwa->os, drro->drr_object, drro->drr_blksz, drro->drr_indblkshift, tx)) failed (0 == 95)
[  404.517510] PANIC at dmu_recv.c:1784:receive_object()
[  404.517514] Showing stack for process 3250
[  404.517518] CPU: 2 PID: 3250 Comm: receive_writer Tainted: P           O       6.1.15-1-pve #1
[  404.517524] Hardware name: retsamarret 000-F2423-FBA002-2000-N/Default string, BIOS 5.19 06/24/2022
[  404.517527] Call Trace:
[  404.517531]  <TASK>
[  404.517536]  dump_stack_lvl+0x49/0x63
[  404.517547]  dump_stack+0x10/0x16
[  404.517553]  spl_dumpstack+0x29/0x2f [spl]
[  404.517570]  spl_panic+0xd1/0xe9 [spl]
[  404.517586]  ? dnode_rele_and_unlock+0x68/0x100 [zfs]
[  404.517711]  ? dmu_object_set_blocksize+0x64/0x90 [zfs]
[  404.517811]  receive_object+0xcf5/0xdb0 [zfs]
[  404.517913]  ? dnode_rele+0x3d/0x50 [zfs]
[  404.518015]  ? flush_write_batch+0x1cb/0x5a0 [zfs]
[  404.518117]  receive_writer_thread+0x1cc/0xb50 [zfs]
[  404.518217]  ? set_next_task_fair+0x70/0xa0
[  404.518224]  ? raw_spin_rq_unlock+0x10/0x40
[  404.518229]  ? set_user_nice.part.0+0x167/0x430
[  404.518235]  ? receive_process_write_record+0x190/0x190 [zfs]
[  404.518334]  ? receive_process_write_record+0x190/0x190 [zfs]
[  404.518435]  ? __thread_exit+0x20/0x20 [spl]
[  404.518450]  thread_generic_wrapper+0x61/0x80 [spl]
[  404.518463]  ? thread_generic_wrapper+0x61/0x80 [spl]
[  404.518477]  kthread+0xee/0x120
[  404.518482]  ? kthread_complete_and_exit+0x20/0x20
[  404.518486]  ret_from_fork+0x1f/0x30
[  404.518493]  </TASK>

Message from syslogd@pbs at May 28 19:17:37 ...
kernel:[  404.517496] VERIFY3(0 == dmu_object_set_blocksize(rwa->os, drro->drr_object, drro->drr_blksz, drro->drr_indblkshift, tx)) failed (0 == 95)

Message from syslogd@pbs at May 28 19:17:37 ...
kernel:[  404.517510] PANIC at dmu_recv.c:1784:receive_object()

@sfatula

sfatula commented Jul 9, 2023

OpenZFS 2.1.11 on both the sender (TrueNAS) and receiver (Debian); same panic as the OP. So I have to scrap using ZFS replication, as we cannot store the data unencrypted and it's too much effort to send all the data again.

@amotin
Member

amotin commented Jul 10, 2023

@sfatula and others, can you try this patch: #15039 ? It should fix at least some cases.

@cyberpower678
Author

I would love to, but I worked around the issue by reloading the changed data, recreating a snapshot of it, and sending again, which avoided the panic. I haven’t had a panic like this lately, but rather a large number of range-tree panics, which I recently fixed/worked around too.

@cyberpower678
Author

So right now, my systems are running fine without any panics or ZFS warnings.

@sfatula

sfatula commented Jul 10, 2023

@sfatula and others, can you try this patch: #15039 ? It should fix at least some cases.

I should be able to try that sometime this week and report back; I've never built ZFS, but it doesn't seem too bad on Debian. I am fortunate in that ZFS on our backup machine is only used for backups, which are not working now anyway, so it doesn't matter if ZFS doesn't run if I mess something up.
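
From what I can tell, the build would be roughly along these lines (the generic upstream procedure, not TrueNAS-specific; the PR branch is fetched from GitHub and may need hand-applying if it doesn't cherry-pick cleanly onto 2.1.x):

# build dependencies per the OpenZFS "Building ZFS" docs must be installed first
git clone https://github.com/openzfs/zfs.git
cd zfs
git checkout zfs-2.1.12                    # base release tag to patch
git fetch origin pull/15039/head:pr-15039  # grab the PR branch
git cherry-pick pr-15039                   # may need manual fix-ups
sh autogen.sh && ./configure && make -s -j"$(nproc)"
sudo make install && sudo ldconfig && sudo depmod -a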

@sfatula

sfatula commented Jul 10, 2023

I applied said patch to current master and first wanted to roll back as I did previously, to have the same conditions, so I typed:

zfs rollback -R Backup/Truenas/tank/Data@backup-20230626033001

And got:

kernel:[ 252.381901] PANIC: zfs: adding existent segment to range tree (offset=1e24c8c000 size=1000)

Before the patch, I could in fact rollback. zfs -V shows (if this is correct):

zfs-2.2.99-1
zfs-kmod-2.2.99-1

Now I also cannot even zpool import the pool anymore as that panics with same message. Heck, can't even boot as on boot I get the same message.

@amotin
Member

amotin commented Jul 10, 2023

@sfatula My patch is supposed to affect only receive. I can't imagine how it could affect the rollback, especially in a way that would mess up the pool space map. I wonder if you have more than one problem, like bad RAM. Unfortunately there is no solution to already corrupted space maps other than a read-only pool import and data evacuation.

@sfatula

sfatula commented Jul 10, 2023

I restored 2.1.11 and the machine boots fine; the pool works fine. Obviously zfs receive still doesn't work, but everything else does. Not sure what the issue might be then. Any other ideas?

@cyberpower678
Author

I applied said patch to current master and first wanted to roll back as I did previously, to have the same conditions, so I typed:

zfs rollback -R Backup/Truenas/tank/Data@backup-20230626033001

And got:

kernel:[ 252.381901] PANIC: zfs: adding existent segment to range tree (offset=1e24c8c000 size=1000)

Before the patch, I could in fact rollback. zfs -V shows (if this is correct):

zfs-2.2.99-1 zfs-kmod-2.2.99-1

Now I also cannot even zpool import the pool anymore as that panics with same message. Heck, can't even boot as on boot I get the same message.

That’s a bug that exists for many, including myself. You can set ZFS tunables to turn those panics into warnings. The space map isn’t corrupted; it’s being corrupted by newer versions of ZFS on older datasets.
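
The relevant tunable should be zfs_recover, which downgrades zfs_panic_recover() panics (like the range-tree one above) to warnings; use it with care, since it masks the inconsistency rather than fixing it:

# turn zfs_panic_recover() panics into warnings on the running module
echo 1 | sudo tee /sys/module/zfs/parameters/zfs_recover
# or make it persistent across module reloads
echo "options zfs zfs_recover=1" | sudo tee -a /etc/modprobe.d/zfs.conf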

@sfatula My patch is supposed to affect only receive. I can't imagine how it could affect the rollback, especially in a way that would mess up the pool space map. I wonder if you have more than one problem, like bad RAM. Unfortunately there is no solution to already corrupted space maps other than a read-only pool import and data evacuation.

It’s not related to your patch but rather something else in newer ZFS versions corrupting older ZFS datasets. This happened to me and I was able to recover from it. It tends to happen with datasets containing large amounts of files.

@rincebrain
Contributor

Do you have any pointers to when this happened to you, and what you did? "Newer versions eat my data" is a pretty serious statement.

@sfatula

sfatula commented Jul 11, 2023

I can't find any evidence it's ZFS 2.2, but I did a zdb -b and nothing was detected; no leaks, etc. were reported. Would it possibly make sense to try applying the patch to a ZFS 2.1 version? Maybe I could retry then.

zdb -AAA -b Backup

Traversing all blocks to verify nothing leaked ...

loading concrete vdev 0, metaslab 245 of 246 ...
1.08T completed ( 611MB/s) estimated time remaining: 0hr 00min 04sec
No leaks (block sum matches space maps exactly)

bp count:               5157941
ganged count:                 0
bp logical:       1239224202240      avg: 240255
bp physical:      1169053898240      avg: 226651     compression:   1.06
bp allocated:     1185692446720      avg: 229877     compression:   1.05
bp deduped:                   0    ref>1:      0   deduplication:   1.00
Normal class:     1186744664064     used: 56.39%
Embedded log class              0     used:  0.00%

additional, non-pointer bps of type 0:        132
Dittoed blocks on same vdev: 1836315

@sfatula

sfatula commented Jul 11, 2023

I applied the patch to zfs 2.1.12 (manually, of course) and I was able to get the send/recv to work, whereas it repeatedly failed previously. Same snapshots, so it would appear that #15039 had a positive effect for me! Thanks @amotin

@sfatula

sfatula commented Jul 15, 2023

@amotin I hope you can get the patch into a future version of TrueNAS SCALE. I now just got the exact same error replicating to drives attached to the same machine, so source and destination are encrypted but on the same SCALE machine. So my current SCALE backup strategy is not working, and there is no good way to apply the patch to SCALE's ZFS.

@amotin
Member

amotin commented Jul 15, 2023

@sfatula Yes, that is the plan: https://ixsystems.atlassian.net/browse/NAS-122583 .

@sfatula

sfatula commented Jul 15, 2023

Am I to understand this issue only presents when using raw sends? i.e., if I decrypt and re-encrypt on a send/recv, the issue should not manifest?

@amotin
Member

amotin commented Jul 15, 2023

My PR fixes only raw send issues. Non-raw does not care about indirect block size and should not be affected. I am not sure the "re-encrypt" part actually works (though in theory I think it could), but if it does, it should work.
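
For anyone trying the non-raw route, a sketch (illustrative names; the destination parent backup/enc is assumed to be an encrypted dataset with its key loaded, so received data gets re-encrypted with the destination's keys):

# initial full send: plaintext in the pipe, re-encrypted on the destination because
# the received dataset inherits encryption from its (already encrypted) parent
zfs send -L tank/data@snap1 | ssh target zfs receive -x encryption backup/enc/data
# subsequent incrementals, likewise non-raw
zfs send -L -i @snap1 tank/data@snap2 | ssh target zfs receive -x encryption backup/enc/data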
