permanent errors after upgrading ZFS #13763
Comments
I have similar issues. I created a new encrypted dataset under ZFS 2.1.5 (with syncoid) and destroyed the former encrypted dataset (initially created under ZFS 0.8.3 with syncoid). So far so good :-) No permanent errors anymore. |
Same problems here after upgrading from Ubuntu 20.04 to 22.04. |
A new flavour of this problem: on another machine, after the update, I got hangs and these log entries: Aug 21 20:44:08 backup kernel: [ 4841.628971] VERIFY3(0 == zap_add(mos, dsl_dir_phys(pds)->dd_child_dir_zapobj, name, sizeof (uint64_t), 1, &ddobj, tx)) failed (0 == 17) |
Are you sure this is the same problem?
On Aug 21, 2022, at 3:56 PM, Volker Süß ***@***.***> wrote:
A new flavour of this problem: on another machine, after the update, I got hangs and these log entries:
Aug 21 20:44:08 backup kernel: [ 4841.628971] VERIFY3(0 == zap_add(mos, dsl_dir_phys(pds)->dd_child_dir_zapobj, name, sizeof (uint64_t), 1, &ddobj, tx)) failed (0 == 17)
Aug 21 20:44:08 backup kernel: [ 4841.629271] PANIC at dsl_dir.c:951:dsl_dir_create_sync()
Aug 21 20:44:08 backup kernel: [ 4841.629338] Showing stack for process 675
Aug 21 20:44:08 backup kernel: [ 4841.629340] CPU: 0 PID: 675 Comm: txg_sync Tainted: P O 5.15.0-46-generic #49-Ubuntu
Aug 21 20:44:08 backup kernel: [ 4841.629344] Hardware name: Gigabyte Technology Co., Ltd. GA-A55M-S2V/GA-A55M-S2V, BIOS F6 11/18/2011
Aug 21 20:44:08 backup kernel: [ 4841.629346] Call Trace:
Aug 21 20:44:08 backup kernel: [ 4841.629349]
Aug 21 20:44:08 backup kernel: [ 4841.629352] show_stack+0x52/0x5c
Aug 21 20:44:08 backup kernel: [ 4841.629357] dump_stack_lvl+0x4a/0x63
Aug 21 20:44:08 backup kernel: [ 4841.629363] dump_stack+0x10/0x16
Aug 21 20:44:08 backup kernel: [ 4841.629366] spl_dumpstack+0x29/0x2f [spl]
Aug 21 20:44:08 backup kernel: [ 4841.629382] spl_panic+0xd1/0xe9 [spl]
Aug 21 20:44:08 backup kernel: [ 4841.629394] ? dmu_buf_rele+0xe/0x20 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.629598] ? zap_unlockdir+0x46/0x60 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.629777] ? zap_add_impl+0x96/0x160 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.629957] ? zap_add+0x7b/0xb0 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.630138] dsl_dir_create_sync+0x1ff/0x280 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.630306] ? spl_kmem_free_impl+0x29/0x40 [spl]
Aug 21 20:44:08 backup kernel: [ 4841.630319] dsl_dataset_create_sync+0x52/0x380 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.630498] dmu_recv_begin_sync+0x374/0xa00 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.630659] ? spa_get_slop_space+0x6e/0xc0 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.630833] ? __cond_resched+0x1a/0x50
Aug 21 20:44:08 backup kernel: [ 4841.630838] dsl_sync_task_sync+0xb9/0x110 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.631010] dsl_pool_sync+0x369/0x400 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.631177] spa_sync_iterate_to_convergence+0xe0/0x1f0 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.631353] spa_sync+0x2dc/0x5b0 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.631526] txg_sync_thread+0x266/0x2f0 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.631703] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.631883] thread_generic_wrapper+0x64/0x80 [spl]
Aug 21 20:44:08 backup kernel: [ 4841.631896] ? __thread_exit+0x20/0x20 [spl]
Aug 21 20:44:08 backup kernel: [ 4841.631907] kthread+0x12a/0x150
Aug 21 20:44:08 backup kernel: [ 4841.631912] ? set_kthread_struct+0x50/0x50
Aug 21 20:44:08 backup kernel: [ 4841.631914] ret_from_fork+0x22/0x30
Aug 21 20:44:08 backup kernel: [ 4841.631919]
Aug 21 20:48:03 backup kernel: [ 5076.258637] INFO: task txg_sync:675 blocked for more than 120 seconds.
Aug 21 20:48:03 backup kernel: [ 5076.258829] Tainted: P O 5.15.0-46-generic #49-Ubuntu
Aug 21 20:48:03 backup kernel: [ 5076.259007] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 21 20:48:03 backup kernel: [ 5076.259149] task:txg_sync state:D stack: 0 pid: 675 ppid: 2 flags:0x00004000
Aug 21 20:48:03 backup kernel: [ 5076.259163] Call Trace:
Aug 21 20:48:03 backup kernel: [ 5076.259169]
Aug 21 20:48:03 backup kernel: [ 5076.259176] __schedule+0x23d/0x590
Aug 21 20:48:03 backup kernel: [ 5076.259197] schedule+0x4e/0xc0
Aug 21 20:48:03 backup kernel: [ 5076.259206] spl_panic+0xe7/0xe9 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.259254] ? dmu_buf_rele+0xe/0x20 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.259710] ? zap_unlockdir+0x46/0x60 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.260216] ? zap_add_impl+0x96/0x160 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.260722] ? zap_add+0x7b/0xb0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.261229] dsl_dir_create_sync+0x1ff/0x280 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.261690] ? spl_kmem_free_impl+0x29/0x40 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.261728] dsl_dataset_create_sync+0x52/0x380 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.262192] dmu_recv_begin_sync+0x374/0xa00 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.262696] ? spa_get_slop_space+0x6e/0xc0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.263289] ? __cond_resched+0x1a/0x50
Aug 21 20:48:03 backup kernel: [ 5076.263303] dsl_sync_task_sync+0xb9/0x110 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.263773] dsl_pool_sync+0x369/0x400 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.264239] spa_sync_iterate_to_convergence+0xe0/0x1f0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.264726] spa_sync+0x2dc/0x5b0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.265213] txg_sync_thread+0x266/0x2f0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.265712] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.266207] thread_generic_wrapper+0x64/0x80 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.266246] ? __thread_exit+0x20/0x20 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.266284] kthread+0x12a/0x150
Aug 21 20:48:03 backup kernel: [ 5076.266295] ? set_kthread_struct+0x50/0x50
Aug 21 20:48:03 backup kernel: [ 5076.266305] ret_from_fork+0x22/0x30
Aug 21 20:48:03 backup kernel: [ 5076.266318]
Aug 21 20:48:03 backup kernel: [ 5076.266351] INFO: task zfs:1782 blocked for more than 120 seconds.
Aug 21 20:48:03 backup kernel: [ 5076.266561] Tainted: P O 5.15.0-46-generic #49-Ubuntu
Aug 21 20:48:03 backup kernel: [ 5076.266714] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 21 20:48:03 backup kernel: [ 5076.266857] task:zfs state:D stack: 0 pid: 1782 ppid: 1781 flags:0x00004002
Aug 21 20:48:03 backup kernel: [ 5076.266870] Call Trace:
Aug 21 20:48:03 backup kernel: [ 5076.266874]
Aug 21 20:48:03 backup kernel: [ 5076.266878] __schedule+0x23d/0x590
Aug 21 20:48:03 backup kernel: [ 5076.266887] ? autoremove_wake_function+0x12/0x40
Aug 21 20:48:03 backup kernel: [ 5076.266897] schedule+0x4e/0xc0
Aug 21 20:48:03 backup kernel: [ 5076.266905] io_schedule+0x46/0x80
Aug 21 20:48:03 backup kernel: [ 5076.266913] cv_wait_common+0xab/0x130 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.266953] ? wait_woken+0x70/0x70
Aug 21 20:48:03 backup kernel: [ 5076.266962] __cv_wait_io+0x18/0x20 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.267002] txg_wait_synced_impl+0x9b/0x120 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.267520] txg_wait_synced+0x10/0x50 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.268016] dsl_sync_task_common+0x1c6/0x2a0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.268486] ? recv_begin_check_existing_impl+0x590/0x590 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.268924] ? recv_check_large_blocks+0x60/0x60 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.269365] ? recv_begin_check_existing_impl+0x590/0x590 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.269804] ? recv_check_large_blocks+0x60/0x60 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.270242] dsl_sync_task+0x1a/0x20 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.270754] dmu_recv_begin+0x1e2/0x390 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.271292] zfs_ioc_recv_impl.constprop.0+0x106/0xb20 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.271898] zfs_ioc_recv_new+0x310/0x3b0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.272498] ? spl_kmem_alloc_impl+0xbe/0xd0 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.272542] ? spl_vmem_alloc+0x19/0x20 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.272586] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Aug 21 20:48:03 backup kernel: [ 5076.272629] ? nv_mem_zalloc+0x33/0x50 [znvpair]
Aug 21 20:48:03 backup kernel: [ 5076.272668] ? nvlist_xalloc+0x51/0xa0 [znvpair]
Aug 21 20:48:03 backup kernel: [ 5076.272707] ? nvlist_alloc+0x28/0x40 [znvpair]
Aug 21 20:48:03 backup kernel: [ 5076.272747] zfsdev_ioctl_common+0x285/0x740 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.273270] ? _copy_from_user+0x2e/0x70
Aug 21 20:48:03 backup kernel: [ 5076.273281] zfsdev_ioctl+0x57/0xf0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.273790] __x64_sys_ioctl+0x95/0xd0
Aug 21 20:48:03 backup kernel: [ 5076.273803] do_syscall_64+0x5c/0xc0
Aug 21 20:48:03 backup kernel: [ 5076.273812] ? do_user_addr_fault+0x1e7/0x670
Aug 21 20:48:03 backup kernel: [ 5076.273821] ? do_syscall_64+0x69/0xc0
Aug 21 20:48:03 backup kernel: [ 5076.273828] ? exit_to_user_mode_prepare+0x37/0xb0
Aug 21 20:48:03 backup kernel: [ 5076.273838] ? irqentry_exit_to_user_mode+0x9/0x20
Aug 21 20:48:03 backup kernel: [ 5076.273847] ? irqentry_exit+0x1d/0x30
Aug 21 20:48:03 backup kernel: [ 5076.273856] ? exc_page_fault+0x89/0x170
Aug 21 20:48:03 backup kernel: [ 5076.273865] entry_SYSCALL_64_after_hwframe+0x61/0xcb
Aug 21 20:48:03 backup kernel: [ 5076.273876] RIP: 0033:0x7faa82a99aff
Aug 21 20:48:03 backup kernel: [ 5076.273884] RSP: 002b:00007ffcd73c4bb0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Aug 21 20:48:03 backup kernel: [ 5076.273893] RAX: ffffffffffffffda RBX: 00007ffcd73c8280 RCX: 00007faa82a99aff
Aug 21 20:48:03 backup kernel: [ 5076.273899] RDX: 00007ffcd73c4c30 RSI: 0000000000005a46 RDI: 0000000000000005
Aug 21 20:48:03 backup kernel: [ 5076.273904] RBP: 00007ffcd73c8220 R08: 0000000000000000 R09: 0000555b46c32d70
Aug 21 20:48:03 backup kernel: [ 5076.273909] R10: 00007faa82b98da0 R11: 0000000000000246 R12: 0000000000005a46
Aug 21 20:48:03 backup kernel: [ 5076.273914] R13: 00007ffcd73c4c30 R14: 0000000000005a46 R15: 0000555b46c0f7a0
Aug 21 20:48:03 backup kernel: [ 5076.273923]
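A side note on the VERIFY3 line in the log above: the `failed (0 == 17)` part shows the two values that were compared, so `zap_add` returned 17, which is `EEXIST` in the standard errno table. Here is a minimal Python sketch that decodes it. The helper name `decode_verify_failure` is made up for illustration and is not part of any ZFS tooling, and the regular expression only assumes the `failed (A == B)` shape seen in this log.

```python
import errno
import os
import re

# The VERIFY3 line copied from the kernel log in this thread.
VERIFY_LINE = (
    "VERIFY3(0 == zap_add(mos, dsl_dir_phys(pds)->dd_child_dir_zapobj, "
    "name, sizeof (uint64_t), 1, &ddobj, tx)) failed (0 == 17)"
)


def decode_verify_failure(line):
    """Return (code, errno name, message) for the right-hand value
    in a 'failed (A == B)' suffix, or None if the suffix is absent.

    Hypothetical helper: it only pattern-matches the text of the log line.
    """
    match = re.search(r"failed \((-?\d+) == (-?\d+)\)", line)
    if match is None:
        return None
    code = int(match.group(2))
    # errno.errorcode maps numeric codes to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return code, name, os.strerror(code)


code, name, message = decode_verify_failure(VERIFY_LINE)
print(code, name, message)
```

Read together with `dsl_dir_create_sync` and `dmu_recv_begin_sync` in the trace, this suggests the receive tried to add a child directory entry under a name that was already present. That is a reading of the log, not a confirmed diagnosis.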
|
In the meantime, I no longer believe that. The only connection is that the upgrade from Ubuntu 20 to 22 took place just on Saturday and there were problems with zfs send/receive. But no more errors are reported, so it is probably not the same problem. |
If you're talking about send / receive of encrypted data, that could be the known issues that were also present in 20.04. It's unsafe to send or receive from or into an encrypted file system. It's unclear whether it is safe to use encryption without using send / receive. I'm currently skeptical. |
I'm talking about "send from unencrypted to encrypted dataset". I have used this setup for about a year now without any problems. And now, after the upgrade, I run into this problem... |
I just had the same experience as @clhedrick - I upgraded to Ubuntu 22, and lost a few ZFS datasets in my ZFS pool!
(and a couple of other datasets)... In my case, the root (/) filesystem is OK, whereas e.g. the /home/ filesystem is not, although both reside in the same pool. - I suspect this could be the same issue as #13709 . |
+1 @jonryk , this looks like #13709
|
System information
Describe the problem you're observing
After upgrading from Ubuntu 20 to 22, zpool status shows 143 permanent errors. I've never had an issue with the devices: no device errors were shown then or after a scrub.
This is a backup system. I back up to it by send | receive. Originally one of the systems backed up was encrypted. After a crash I reconstructed it unencrypted, but I didn't reconstruct the backup system, as it had no errors. I did create unencrypted versions of the file systems on the backup system, but kept some of the encrypted ones around. They caused no problems under Ubuntu 20. But under 22, I got failures to mount, and 143 permanent errors. zpool status -v showed file names that were all in encrypted file systems.
I destroyed the encrypted file systems and then ran a scrub. Now I've got 2 permanent errors:
<0x1c336>:<0x0>
<0x2b49c>:<0x0>
Based on other reports I'll do a second scrub this weekend.
Note that the root file system is encrypted. It has no data, not even mount points. It's not mounted, although it will mount.
It would be useful to be able to clear the errors. We have monitoring scripts that check for problems with our ZFS file systems. This shows as a problem. We can ignore it, but that would hide any new errors that might occur.
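Until the errors can be cleared, one possible workaround for the monitoring side is to keep a baseline of the already-known entries and alert only on entries that are not in it. The sketch below is a rough illustration, not our actual monitoring script. It assumes the usual `zpool status -v` layout, where the "errors: Permanent errors ..." line is followed by a blank line and then one entry per line. The function names, the sample file path, and the sample text (apart from the two entries quoted above) are made up for illustration.

```python
import subprocess

# Entries already reported in this issue; treated as the known baseline.
KNOWN_ERRORS = {"<0x1c336>:<0x0>", "<0x2b49c>:<0x0>"}

# Illustrative sample in the assumed `zpool status -v` layout.
# The pool name and the file path are hypothetical.
SAMPLE_STATUS = """\
  pool: backup
 state: ONLINE
errors: Permanent errors have been detected in the following files:

        <0x1c336>:<0x0>
        <0x2b49c>:<0x0>
        /backup/data/example.bin
"""


def read_zpool_status():
    """Run `zpool status -v` and return its text (not called in this sketch)."""
    result = subprocess.run(
        ["zpool", "status", "-v"], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_permanent_errors(status_text):
    """Collect the entries listed under each 'Permanent errors' header."""
    entries = []
    in_list = False
    seen_entry = False
    for line in status_text.splitlines():
        if line.startswith("errors: Permanent errors"):
            in_list, seen_entry = True, False
            continue
        if not in_list:
            continue
        stripped = line.strip()
        if stripped:
            entries.append(stripped)
            seen_entry = True
        elif seen_entry:
            # A blank line after at least one entry ends this pool's list.
            in_list = False
    return entries


def new_errors(status_text, baseline):
    """Return entries that are not part of the known baseline."""
    return [e for e in parse_permanent_errors(status_text) if e not in baseline]


found = parse_permanent_errors(SAMPLE_STATUS)
unexpected = new_errors(SAMPLE_STATUS, KNOWN_ERRORS)
print("all entries:", found)
print("new entries:", unexpected)
```

In real use, `read_zpool_status()` would replace `SAMPLE_STATUS`. The baseline is keyed on the raw entry text, so it would need updating if a later scrub changes how the stale entries are displayed.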
Describe how to reproduce the problem
Include any warning/errors/backtraces from the system logs