ZFS writers go into d-state #7924

Closed
lkateley opened this issue Sep 18, 2018 · 35 comments
Labels
Status: Stale (No recent activity for issue)

Comments

@lkateley

I have run into this issue several times, and only with virtual disks. This one is with 4 EBS disks on AWS.

This has been hanging the instance hard every few days.

I have also seen this several times with VMware and native Xen, but was not able to capture this quality of data.

System information

Type                  Version/Name
Distribution Name     CentOS
Distribution Version  7.5
Linux Kernel          3.10.0-862.11.6.el7.x86_64
Architecture          x86_64
ZFS Version           0.7.9-1
SPL Version           0.7.9-1

[root@ip-10-1-10-62 etc]# modinfo zfs
filename: /lib/modules/3.10.0-862.11.6.el7.x86_64/extra/zfs.ko
version: 0.7.9-1
license: CDDL
author: OpenZFS on Linux
description: ZFS
retpoline: Y
rhelversion: 7.5

Describe the problem you're observing

ZFS I/O hangs in D-state
[422180.591613] INFO: task z_wr_iss:1645 blocked for more than 120 seconds.
[422180.597666] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[422180.605080] z_wr_iss D ffff89c033f5bf40 0 1645 2 0x00000000
[422180.612627] Call Trace:
[422180.616032] [] ? mutex_lock+0x12/0x2f
[422180.621871] [] schedule+0x29/0x70
[422180.627354] [] rwsem_down_write_failed+0x225/0x3a0
[422180.634310] [] call_rwsem_down_write_failed+0x17/0x30
[422180.641826] [] ? spl_kmem_free+0x35/0x40 [spl]
[422180.648809] [] down_write+0x2d/0x3d
[422180.654544] [] dbuf_write_ready+0x1e7/0x2f0 [zfs]
[422180.661888] [] arc_write_ready+0xac/0x300 [zfs]
[422180.668970] [] ? mutex_lock+0x12/0x2f
[422180.675001] [] zio_ready+0x65/0x3d0 [zfs]
[422180.681564] [] ? tsd_get_by_thread+0x2e/0x50 [spl]
[422180.688748] [] ? taskq_member+0x18/0x30 [spl]
[422180.695932] [] zio_execute+0xa2/0x100 [zfs]
[422180.702588] [] taskq_thread+0x2ac/0x4f0 [spl]
[422180.709225] [] ? wake_up_state+0x20/0x20
[422180.715461] [] ? zio_taskq_member.isra.7.constprop.10+0x80/0x80 [zfs]
[422180.724316] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[422180.731705] [] kthread+0xd1/0xe0
[422180.737123] [] ? insert_kthread_work+0x40/0x40
[422180.744223] [] ret_from_fork_nospec_begin+0x21/0x21
[422180.751584] [] ? insert_kthread_work+0x40/0x40

root 1636 6.8 0.0 0 0 ? D< Sep13 484:46 [z_wr_iss]
root 1637 6.8 0.0 0 0 ? D< Sep13 484:32 [z_wr_iss]
root 1638 6.8 0.0 0 0 ? D< Sep13 484:20 [z_wr_iss]
root 1639 6.8 0.0 0 0 ? D< Sep13 484:33 [z_wr_iss]
root 1640 6.8 0.0 0 0 ? D< Sep13 484:06 [z_wr_iss]
root 1641 6.8 0.0 0 0 ? D< Sep13 484:24 [z_wr_iss]
root 1642 6.8 0.0 0 0 ? D< Sep13 484:33 [z_wr_iss]
root 1643 6.8 0.0 0 0 ? D< Sep13 484:15 [z_wr_iss]
root 1644 6.8 0.0 0 0 ? D< Sep13 484:26 [z_wr_iss]
root 1645 6.8 0.0 0 0 ? D< Sep13 484:34 [z_wr_iss]
root 1646 6.8 0.0 0 0 ? D< Sep13 484:23 [z_wr_iss]
root 1647 6.8 0.0 0 0 ? D< Sep13 484:41 [z_wr_iss]
root 1648 6.8 0.0 0 0 ? D< Sep13 484:27 [z_wr_iss]
root 1649 6.8 0.0 0 0 ? D< Sep13 484:36 [z_wr_iss]
root 1650 6.8 0.0 0 0 ? D< Sep13 484:27 [z_wr_iss]
root 1651 6.8 0.0 0 0 ? D< Sep13 484:29 [z_wr_iss]
root 1652 6.8 0.0 0 0 ? D< Sep13 484:05 [z_wr_iss]
root 1653 6.8 0.0 0 0 ? D< Sep13 484:20 [z_wr_iss]
root 1654 6.8 0.0 0 0 ? D< Sep13 484:20 [z_wr_iss]
root 1655 6.8 0.0 0 0 ? D< Sep13 484:28 [z_wr_iss]
root 1656 6.8 0.0 0 0 ? D< Sep13 484:32 [z_wr_iss]
root 1657 6.8 0.0 0 0 ? D< Sep13 484:18 [z_wr_iss]
root 1658 6.8 0.0 0 0 ? D< Sep13 484:23 [z_wr_iss]
root 1659 6.8 0.0 0 0 ? D< Sep13 484:36 [z_wr_iss]

Describe how to reproduce the problem

This does seem to happen at close to the same amount of CPU time. I know it doesn't make sense, but it seems like I have seen this at 480-something minutes of total CPU time.

It happens every couple of days on this log server. I have seen something similar in very write-centric workloads onto virtual disks.

@lkateley (Author)

lkateley commented Sep 18, 2018

We have 3k IOPS per disk; we are getting about 1500 read and write IOPS combined per disk.

If I hit that limit will it hang the box?

This is a log server so it will peak with load.

@lkateley (Author)

lkateley commented Sep 18, 2018

Wouldn't I see that?

If it is a bug... what next?

@lkateley (Author)

lkateley commented Sep 18, 2018

Dang, will do. It's been freezing about every 2 days, so I should be able to grab it. Do I need all the stacks, or just the writers in D-state?

@behlendorf (Contributor)

Are you hitting throttle limits for the EBS volumes? You can find this information in the AWS Console.

This would be my guess as well, given the bug description and stack trace above. The IO pipeline appears to be waiting on the disks, which is what prompted the console message. Both a zpool iostat -lv and a zpool iostat -w should give you more visibility into how quickly the IOs are being processed.
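For anyone following along, a minimal sketch of what that looks like (the pool name "tank" and the 5-second interval are placeholders, not from this report):

zpool iostat -lv tank 5    # per-vdev average wait and queue latencies, every 5 seconds
zpool iostat -w tank 5     # per-vdev latency histograms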

@kpande by default no, we don't need all of the stacks. The most relevant ones are typically dumped to the system logs automatically; if we need more, it's easy to ask.
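For completeness, one hedged sketch of grabbing just the writer stacks on a hung box (run as root; the sysrq line assumes sysrq is enabled on the host):

# print the kernel stack of every z_wr_iss thread currently in D-state
for pid in $(ps -eo pid,stat,comm | awk '$2 ~ /^D/ && $3 == "z_wr_iss" {print $1}'); do
    echo "=== pid $pid ==="
    cat /proc/$pid/stack
done
# or ask the kernel to log all blocked (D-state) tasks to the system log
echo w > /proc/sysrq-trigger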

@bunder2015 (Contributor)

Does the system ever come back from these situations, or does it hang indefinitely? I had a problem once on VirtualBox where it would just decide the guest had no disks and the guest would hang. If it never comes back, it could be an EBS bug. I've been setting failmode=panic on some of my systems to see if it gives any useful information as soon as ZFS detects that all the disks have suddenly disappeared.
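(For reference, that property is set per pool, roughly like this; the pool name "tank" is a placeholder:)

zpool get failmode tank        # defaults to "wait"
zpool set failmode=panic tank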

@rwyse

rwyse commented Sep 19, 2018

The instance type we are using is r4.8xlarge. Once it hangs, it doesn't come back until we start bouncing services or the instance itself.

@lkateley (Author)

@kpande What version of ZFS are you running?

We also know we have a thing, I can't remember what it's called, but it's something like burst forgiveness, which allows us to go over the limits. Some of the people on the team said that we would see if we were bursting. We haven't seen a burst; we run at half our limit.

Our instance size is quite large: it has 32 processors and 256 GB of RAM. I only have 4 EBS drives, which I am in the process of scaling out a little, going to 8 drives.

I thought I put the zpool iostat output in... but I am running consistently, day and night, 6-7k each of reads and writes, and about 1.5k IOPS per disk of each. The drives are running at 90+% busy, which is why I am rolling out the larger set.
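(Side note: a quick way to cross-check those busy numbers on the Linux side, assuming the sysstat package is installed, is something like the following.)

iostat -x 5          # %util and await per block device, every 5 seconds
zpool iostat -lv 5   # the same picture from ZFS's side, per vdev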

I have seen this several times. I will see if I can get a new kernel.

We have been compiling the ZFS modules; I have heard that the RPMs are pretty safe to use?

Thanks for your help. We need to code freeze by Oct 1st and would hate to lose the ZFS footprint here.

@lkateley (Author)

Oh, I should add that when the z_wr_iss threads go into D-state there is no I/O to the pool.

@lkateley (Author)

We had this happen again. This is the info that was captured.
z_wr_iss.stacks.pdf
zed.messages.pdf
zfs.versions.pdf
zpool.pdf

I was also thinking back to times I have seen this and wondered if cache flushing might cause this.

@lkateley (Author)

Since we have a pretty massive infrastructure, we don't have the ability to roll outside of "standards" in AWS. We have to run standard CentOS 7.5.

@lkateley (Author)

I did see that cache flushing is on. In my ZFS history I have run into almost every bug with that.

@lkateley (Author)

Still hitting this bug almost every day. Ideas?

@Avishay999

Avishay999 commented Nov 20, 2018

Hi,
we are getting the same issue:
Red Hat Enterprise Linux 7.5
EBS-based storage (up to 80k IOPS and 14 Gb bandwidth) - m5.24xlarge
version: 0.7.12-1
license: CDDL
author: OpenZFS on Linux
description: ZFS
retpoline: Y
rhelversion: 7.5

NFS 4.1 server, not too heavily loaded at the time of the kernel hang.

kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: intel_powerclamp: No package C-state available
kernel: z_wr_iss D
kernel: ffffa12c37d38000 0 1782 2 0x00000000
kernel: Call Trace:
kernel: [] ? mutex_lock+0x12/0x2f
kernel: [] schedule+0x29/0x70
kernel: [] rwsem_down_write_failed+0x225/0x3a0
kernel: [] call_rwsem_down_write_failed+0x17/0x30
kernel: [] ? spl_kmem_free+0x35/0x40 [spl]
kernel: [] down_write+0x2d/0x3d
kernel: [] dbuf_write_ready+0x1e7/0x2f0 [zfs]
kernel: [] arc_write_ready+0xac/0x300 [zfs]
kernel: [] ? mutex_lock+0x12/0x2f
kernel: [] zio_ready+0x60/0x3e0 [zfs]
kernel: [] ? taskq_member+0x18/0x30 [spl]
kernel: [] zio_execute+0x9f/0x100 [zfs]
kernel: [] taskq_thread+0x2ac/0x4f0 [spl]
kernel: [] ? wake_up_state+0x20/0x20
kernel: [] ? zio_taskq_member.isra.7.constprop.10+0x80/0x80 [zfs]
kernel: [] ? taskq_thread_spawn+0x60/0x60 [spl]
kernel: [] kthread+0xd1/0xe0
kernel: [] ? insert_kthread_work+0x40/0x40
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: txg_sync D ffffa12c29914f10 0 2216 2 0x00000000
kernel: Call Trace:
kernel: [] ? taskq_dispatch_ent+0x57/0x170 [spl]
kernel: [] ? zio_taskq_member.isra.7.constprop.10+0x80/0x80 [zfs]
kernel: [] schedule+0x29/0x70
kernel: [] schedule_timeout+0x239/0x2c0
kernel: [] ? zio_taskq_dispatch+0x8f/0xa0 [zfs]
kernel: [] ? zio_issue_async+0x12/0x20 [zfs]
kernel: [] ? zio_nowait+0xc2/0x160 [zfs]
kernel: [] io_schedule_timeout+0xad/0x130
kernel: [] ? prepare_to_wait_exclusive+0x56/0x90
kernel: [] io_schedule+0x18/0x20
kernel: [] cv_wait_common+0xb2/0x150 [spl]
kernel: [] ? wake_up_atomic_t+0x30/0x30
kernel: [] __cv_wait_io+0x18/0x20 [spl]
kernel: [] zio_wait+0x11b/0x1c0 [zfs]
kernel: [] dsl_pool_sync+0xbf/0x440 [zfs]
kernel: [] spa_sync+0x437/0xd90 [zfs]
kernel: [] ? default_wake_function+0x12/0x20
kernel: [] ? __wake_up+0x44/0x50
kernel: [] txg_sync_thread+0x301/0x510 [zfs]
kernel: [] ? txg_fini+0x2a0/0x2a0 [zfs]
kernel: [] thread_generic_wrapper+0x73/0x80 [spl]
kernel: [] ? __thread_exit+0x20/0x20 [spl]
kernel: [] kthread+0xd1/0xe0
kernel: [] ? insert_kthread_work+0x40/0x40

@Avishay999

Avishay999 commented Nov 20, 2018

Hi,
I don't think so; there is plenty of RAM in this instance and a lot of it is free (368 GB of RAM).
Moreover, I am limiting the ZFS max ARC size to 260 GB, so I leave about 120 GB for the OS.
About 12 TB out of 30 TB is in use.

P.S.
Yes, it's a metadata-heavy pool, lots of small files between 1k and 16k; that's why I used 16k as the record size.
options zfs zfs_arc_max=280687091200
options zfs zfs_vdev_sync_write_max_active=32
options zfs zfs_vdev_async_read_max_active=32
options zfs zfs_vdev_async_write_max_active=32
options zfs zfs_txg_timeout=5
options zfs zfs_top_maxinflight=128
options zfs metaslab_lba_weighting_enabled=0
options zfs zfs_vdev_aggregation_limit=131072
options zfs l2arc_noprefetch=0
options zfs zfs_prefetch_disable=1
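(Side note: most of these module options can also be read or changed at runtime through /sys/module/zfs/parameters; a sketch reusing the values above:)

cat /sys/module/zfs/parameters/zfs_arc_max
echo 280687091200 > /sys/module/zfs/parameters/zfs_arc_max
echo 1 > /sys/module/zfs/parameters/zfs_prefetch_disable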

@Avishay999

time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c

15:31:16 0 0 0 0 0 0 0 0 0 261G 261G
15:31:17 4.1K 12 0 12 0 0 0 12 0 261G 261G
15:31:18 2.3K 0 0 0 0 0 0 0 0 261G 261G
15:31:19 15K 18 0 18 0 0 0 18 0 261G 261G
15:31:20 2.9K 12 0 12 0 0 0 12 0 261G 261G
15:31:21 18K 33 0 33 0 0 0 33 0 261G 261G
15:31:22 4.6K 34 0 17 0 17 94 34 0 261G 261G
15:31:23 9.6K 82 0 82 0 0 0 18 0 261G 261G
15:31:24 13K 5.5K 42 5.5K 42 0 0 6 0 261G 261G
15:31:25 22K 6.5K 28 6.5K 28 0 0 38 0 261G 261G
15:31:26 17K 5.8K 33 5.8K 33 0 0 6 0 261G 261G
15:31:27 17K 5.0K 27 5.0K 27 0 0 19 0 261G 261G
15:31:28 13K 2.6K 20 2.6K 20 0 0 14 0 261G 261G
15:31:29 2.3K 0 0 0 0 0 0 0 0 261G 261G
15:31:30 11K 28 0 28 0 0 0 28 0 261G 261G
15:31:31 4.0K 16 0 16 0 0 0 16 0 261G 261G
15:31:32 14K 28 0 28 0 0 0 28 0 261G 261G
15:31:33 18K 25 0 25 0 0 0 25 0 261G 261G
15:31:34 10K 16 0 16 0 0 0 16 0 261G 261G
15:31:35 4.5K 0 0 0 0 0 0 0 0 261G 261G
15:31:36 4.3K 16 0 16 0 0 0 16 0 261G 261G
15:31:37 10K 18 0 18 0 0 0 18 0 261G 261G
15:31:38 12K 16 0 16 0 0 0 16 0 261G 261G

@Avishay999

m5.24xlrage.txt

@Avishay999

arcstat-Nov20.txt

@Avishay999

arcstat-proc.txt

@Avishay999

3 x raidz1 (each vdev is comprised of 4 x 5000 GB EBS volumes)
10000 IOPS per EBS volume as a baseline

SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
58.5T 16.4T 42.1T - 13% 28% 1.00x ONLINE -

@Avishay999

pool properties:
recordsize 16K local
compression lz4 local
atime off local
xattr sa local
logbias latency default
dedup off default
sync disabled local
acltype off default
redundant_metadata most local
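(For context, the properties with source "local" above are set per dataset, roughly like this; the dataset name "tank/data" is a placeholder:)

zfs set recordsize=16K tank/data
zfs set sync=disabled tank/data
zfs get recordsize,compression,atime,xattr,sync,redundant_metadata tank/data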

@Avishay999

Hi,
thanks for your suggestion.
The 16 KB record size is to avoid unneeded whole-record fetches during reads and writes.
Will ask on zfs-discuss.

@lkateley (Author)

@kpande Why did you ask "are you hitting any metadata memory limits?" We have lots of small files. We are hitting demand metadata at 75%, but I can't see a limit?
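(One way to check whether a metadata limit is actually in play, assuming the 0.7.x arcstats counters and tunables; a sketch:)

grep -E '^arc_meta_(used|limit|max)' /proc/spl/kstat/zfs/arcstats
cat /sys/module/zfs/parameters/zfs_arc_meta_limit_percent   # defaults to 75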

@Avishay999

Avishay999 commented Nov 21, 2018

Hi,
since the nvme module is used by default for EBS starting with Red Hat 7.5,
the timeout is too short: 30 seconds by default.
I have modified it to 255:
echo 255 > /sys/module/nvme_core/parameters/io_timeout
Now we will wait and see if it still gets stuck.
See the following thread:
https://unix.stackexchange.com/questions/445519/how-to-increase-nvme-core-io-timeout-on-my-c5-ec2-instacnce
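(Note that the echo above does not survive a reboot. A hedged sketch of making it persistent is passing nvme_core.io_timeout=255 as a kernel boot parameter, e.g. appended to GRUB_CMDLINE_LINUX in /etc/default/grub, then regenerating the config:)

grub2-mkconfig -o /boot/grub2/grub.cfg
cat /sys/module/nvme_core/parameters/io_timeout   # verify after reboot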

@rwyse

rwyse commented Nov 22, 2018

Some further information for the original issue. We resized the instance to an r5.12xlarge in AWS just over 24 hours ago. We grew the L2ARC to ~300 GB. We ran into another crash similar to what we had seen before, with the exception that this time we actually had something written to /var/log/messages about it:

2018-11-21T23:32:24.989855-06:00 ip- kernel: [93485.086350] Call Trace:
2018-11-21T23:32:24.989857-06:00 ip-2 kernel: [93485.089122] [<ffffffffa9f67bc9>] schedule+0x29/0x70
2018-11-21T23:32:24.997188-06:00 ip- kernel: [93485.092616] [<ffffffffc02172d5>] cv_wait_common+0x125/0x150 [spl]
2018-11-21T23:32:25.000985-06:00 ip- kernel: [93485.096457] [<ffffffffa98c2d00>] ? wake_up_atomic_t+0x30/0x30
2018-11-21T23:32:25.004656-06:00 ip- kernel: [93485.100258] [<ffffffffc0217315>] __cv_wait+0x15/0x20 [spl]
2018-11-21T23:32:25.008474-06:00 ip- kernel: [93485.103950] [<ffffffffc079bb33>] dmu_tx_wait+0x213/0x3c0 [zfs]
2018-11-21T23:32:25.012300-06:00 ip- kernel: [93485.107767] [<ffffffffc0827021>] zfs_create+0x311/0x8f0 [zfs]
2018-11-21T23:32:25.016032-06:00 ip- kernel: [93485.111591] [<ffffffffc0847103>] zpl_create+0xc3/0x190 [zfs]
2018-11-21T23:32:25.016035-06:00 ip- kernel: [93485.115300] [<ffffffffa9a4e343>] vfs_create+0xd3/0x140
2018-11-21T23:32:25.023185-06:00 ip- kernel: [93485.118882] [<ffffffffa9a5041d>] do_last+0x10cd/0x12a0
2018-11-21T23:32:25.023188-06:00 ip- kernel: [93485.122452] [<ffffffffa9a523f7>] path_openat+0xd7/0x640
2018-11-21T23:32:25.030609-06:00 ip- kernel: [93485.126085] [<ffffffffa9a53cd2>] ? user_path_at_empty+0x72/0xc0
2018-11-21T23:32:25.030612-06:00 ip- kernel: [93485.129876] [<ffffffffa9a53dfd>] do_filp_open+0x4d/0xb0
2018-11-21T23:32:25.037888-06:00 ip- kernel: [93485.133507] [<ffffffffa9a61487>] ? __alloc_fd+0x47/0x170
2018-11-21T23:32:25.037889-06:00 ip- kernel: [93485.137147] [<ffffffffa9a3ff27>] do_sys_open+0x137/0x240
2018-11-21T23:32:25.045241-06:00 ip- kernel: [93485.140842] [<ffffffffa9a40064>] SyS_openat+0x14/0x20
2018-11-21T23:32:25.045245-06:00 ip- kernel: [93485.144509] [<ffffffffa9f74ddb>] system_call_fastpath+0x22/0x27
2018-11-21T23:32:25.053116-06:00 ip- kernel: [93485.148332] INFO: task go-carbon:21781 blocked for more than 120 seconds.
2018-11-21T23:32:25.053120-06:00 ip- kernel: [93485.152382] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-11-21T23:32:25.066456-06:00 ip- kernel: [93485.158726] go-carbon D ffff96379ae4e180 0 21781 1 0x00000080

@lkateley (Author)

We moved to a new instance type with NVMe, hoping that might help. I have now moved on to a new error: we are hung on txg_quiesce. I did notice that txg_sync sits in D-state most of the time, and then quiesce will hang every few days. This looks like it is from a known bug, but we are fairly current at 0.7.11.

Here is the txg_quiesce
[] cv_wait_common+0x125/0x150 [spl]
[] __cv_wait+0x15/0x20 [spl]
[] txg_quiesce_thread+0x2fb/0x410 [zfs]
[] thread_generic_wrapper+0x73/0x80 [spl]
[] kthread+0xd1/0xe0
[] ret_from_fork_nospec_end+0x0/0x39
[] 0xffffffffffffffff
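(To see which stage of the txg train is actually stalling, the per-pool txg kstat can help; a sketch assuming a pool named "tank" and that txg history is enabled:)

echo 100 > /sys/module/zfs/parameters/zfs_txg_history
cat /proc/spl/kstat/zfs/tank/txgs   # per-txg open/quiesce/sync times and states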

@lkateley (Author)

I would like to point out why I would want raidz on EBS: I want to be able to correct bad checksums. That was actually what I thought my error might be. I believe all checksum errors show up as events, though.
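(For what it's worth, a quick way to confirm whether any checksum errors are being seen, with the pool name as a placeholder:)

zpool status -v tank            # per-vdev READ/WRITE/CKSUM counters
zpool events | grep checksum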

@lkateley (Author)

@gmelikov Are you saying that a box could potentially hang due to a poor metadata-to-data ratio? All of my ZFS experience has been on FreeBSD and Solaris, and there are very few things that can hang a box, usually only driver or physical hardware issues. Even in those cases you can usually recover.

@lkateley (Author)

lkateley commented Dec 20, 2018

We have adjusted instance size and type. We have gone Nitro. We tuned the ARC size and zfs_nocacheflush, and turned off prefetch. Stayed up for 2 days with an instance with 48 vCPUs and 192 GB of RAM.

We have the call traces from go-carbon. Still hanging on txg_quiesce.

INFO: task txg_quiesce:12980 blocked for more than 120 seconds.
2018-12-20T09:11:46.554767-06:00 graphite kernel: [182399.878095] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-12-20T09:11:46.561540-06:00 graphite kernel: [182399.884297] txg_quiesce D ffff98e2b7edc100 0 12980 2 0x00000000
2018-12-20T09:11:46.561546-06:00 graphite kernel: [182399.888390] Call Trace:
2018-12-20T09:11:46.561547-06:00 graphite kernel: [182399.891175] [] schedule+0x29/0x70
2018-12-20T09:11:46.565059-06:00 graphite kernel: [182399.894656] [] cv_wait_common+0x125/0x150 [spl]
2018-12-20T09:11:46.568854-06:00 graphite kernel: [182399.898448] [] ? wake_up_atomic_t+0x30/0x30
2018-12-20T09:11:46.572575-06:00 graphite kernel: [182399.902162] [] __cv_wait+0x15/0x20 [spl]
2018-12-20T09:11:46.580131-06:00 graphite kernel: [182399.905816] [] txg_quiesce_thread+0x2fb/0x410 [zfs]
2018-12-20T09:11:46.580148-06:00 graphite kernel: [182399.909750] [] ? txg_init+0x2b0/0x2b0 [zfs]
2018-12-20T09:11:46.583828-06:00 graphite kernel: [182399.913407] [] thread_generic_wrapper+0x73/0x80 [spl]
2018-12-20T09:11:46.591489-06:00 graphite kernel: [182399.917313] [] ? __thread_exit+0x20/0x20 [spl]
2018-12-20T09:11:46.591496-06:00 graphite kernel: [182399.921051] [] kthread+0xd1/0xe0
2018-12-20T09:11:46.594885-06:00 graphite kernel: [182399.924477] [] ? insert_kthread_work+0x40/0x40
2018-12-20T09:11:46.598621-06:00 graphite kernel: [182399.928210] [] ret_from_fork_nospec_begin+0x21/0x21
2018-12-20T09:11:46.602484-06:00 graphite kernel: [182399.932066] [] ? insert_kthread_work+0x40/0x40
2018-12-20T09:11:46.606283-06:00 graphite kernel: [182399.935851] INFO: task go-carbon:15569 blocked for more than 120 seconds.
2018-12-20T09:11:46.610344-06:00 graphite kernel: [182399.939931] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-12-20T09:11:46.623594-06:00 graphite kernel: [182399.946322] go-carbon D ffff98e2e9f88000 0 15569 1 0x00000080
2018-12-20T09:11:46.623600-06:00 graphite kernel: [182399.950433] Call Trace:
2018-12-20T09:11:46.623601-06:00 graphite kernel: [182399.953230] [] ? path_lookupat+0x122/0x8b0
2018-12-20T09:11:46.627245-06:00 graphite kernel: [182399.956878] [] schedule_preempt_disabled+0x29/0x70
2018-12-20T09:11:46.631149-06:00 graphite kernel: [182399.960729] [] __mutex_lock_slowpath+0xc7/0x1d0
2018-12-20T09:11:46.634910-06:00 graphite kernel: [182399.964491] [] mutex_lock+0x1f/0x2f
2018-12-20T09:11:46.638424-06:00 graphite kernel: [182399.968012] [] filename_create+0x85/0x180
2018-12-20T09:11:46.642034-06:00 graphite kernel: [182399.971614] [] ? getname_flags+0x4f/0x1a0
2018-12-20T09:11:46.645682-06:00 graphite kernel: [182399.975258] [] ? getname_flags+0xc4/0x1a0
2018-12-20T09:11:46.649279-06:00 graphite kernel: [182399.978870] [] user_path_create+0x41/0x60
2018-12-20T09:11:46.656439-06:00 graphite kernel: [182399.982489] [] SyS_mkdirat+0x48/0x100
2018-12-20T09:11:46.656446-06:00 graphite kernel: [182399.986011] [] system_call_fastpath+0x22/0x27
2018-12-20T09:11:46.660157-06:00 graphite kernel: [182399.989746] INFO: task go-carbon:15570 blocked for more than 120 seconds.
2018-12-20T09:11:46.664105-06:00 graphite kernel: [182399.993694] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-12-20T09:11:46.722564-06:00 graphite kernel: [182399.999988] go-carbon D ffff98e2eb015140 0 15570 1 0x00000080
2018-12-20T09:11:46.722569-06:00 graphite kernel: [182400.049749] Call Trace:
2018-12-20T09:11:46.722570-06:00 graphite kernel: [182400.052195] [] schedule+0x29/0x70
2018-12-20T09:11:46.725682-06:00 graphite kernel: [182400.055265] [] cv_wait_common+0x125/0x150 [spl]
2018-12-20T09:11:46.732266-06:00 graphite kernel: [182400.058614] [] ? wake_up_atomic_t+0x30/0x30
2018-12-20T09:11:46.732272-06:00 graphite kernel: [182400.061901] [] __cv_wait+0x15/0x20 [spl]
2018-12-20T09:11:46.738795-06:00 graphite kernel: [182400.065111] [] dmu_tx_wait+0x213/0x3c0 [zfs]
2018-12-20T09:11:46.738802-06:00 graphite kernel: [182400.068424] [] zfs_mkdir+0x22a/0x700 [zfs]
2018-12-20T09:11:46.745259-06:00 graphite kernel: [182400.071691] [] ? lookup_dcache+0x62/0xc0
2018-12-20T09:11:46.745265-06:00 graphite kernel: [182400.074886] [] ? __lookup_hash+0x2d/0x60
2018-12-20T09:11:46.748593-06:00 graphite kernel: [182400.078208] [] zpl_mkdir+0xb5/0x180 [zfs]
2018-12-20T09:11:46.751818-06:00 graphite kernel: [182400.081449] [] vfs_mkdir+0xbd/0x170
2018-12-20T09:11:46.754881-06:00 graphite kernel: [182400.084515] [] SyS_mkdirat+0xca/0x100
2018-12-20T09:11:46.761352-06:00 graphite kernel: [182400.087661] [] system_call_fastpath+0x22/0x27
2018-12-20T09:11:46.761369-06:00 graphite kernel: [182400.090938] INFO: task go-carbon:15572 blocked for more than 120 seconds.
2018-12-20T09:11:46.764853-06:00 graphite kernel: [182400.094442] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-12-20T09:11:46.771529-06:00 graphite kernel: [182400.099944] go-carbon D ffff98e2f2a79040 0 15572 1 0x00000080
2018-12-20T09:11:46.776499-06:00 graphite kernel: [182400.103596] Call Trace:
2018-12-20T09:11:46.776506-06:00 graphite kernel: [182400.106073] [] ? dequeue_task_fair+0x41e/0x660
2018-12-20T09:11:46.779829-06:00 graphite kernel: [182400.109418] [] schedule_preempt_disabled+0x29/0x70
2018-12-20T09:11:46.783278-06:00 graphite kernel: [182400.112862] [] __mutex_lock_slowpath+0xc7/0x1d0
2018-12-20T09:11:46.786624-06:00 graphite kernel: [182400.116213] [] mutex_lock+0x1f/0x2f
2018-12-20T09:11:46.789744-06:00 graphite kernel: [182400.119334] [] lookup_slow+0x33/0xa7
2018-12-20T09:11:46.792906-06:00 graphite kernel: [182400.122493] [] link_path_walk+0x80f/0x8b0
2018-12-20T09:11:46.796275-06:00 graphite kernel: [182400.125842] [] ? sched_clock_cpu+0x85/0xc0
2018-12-20T09:11:46.799597-06:00 graphite kernel: [182400.129171] [] ? check_preempt_curr+0x80/0xa0
2018-12-20T09:11:46.802937-06:00 graphite kernel: [182400.132510] [] path_openat+0xb5/0x640
2018-12-20T09:11:46.806136-06:00 graphite kernel: [182400.135709] [] do_filp_open+0x4d/0xb0
2018-12-20T09:11:46.809308-06:00 graphite kernel: [182400.138880] [] ? __alloc_fd+0x47/0x170
2018-12-20T09:11:46.812520-06:00 graphite kernel: [182400.142092] [] do_sys_open+0x137/0x240
2018-12-20T09:11:46.815723-06:00 graphite kernel: [182400.145296] [] SyS_openat+0x14/0x20
2018-12-20T09:11:46.818846-06:00 graphite kernel: [182400.148434] [] system_call_fastpath+0x22/0x27
2018-12-20T09:11:46.822275-06:00 graphite kernel: [182400.151842] INFO: task go-carbon:15575 blocked for more than 120 seconds.
2018-12-20T09:11:46.825818-06:00 graphite kernel: [182400.155405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-12-20T09:11:46.832570-06:00 graphite kernel: [182400.160965] go-carbon D ffff98e2e9524100 0 15575 1 0x00000080
2018-12-20T09:11:46.837537-06:00 graphite kernel: [182400.164639] Call Trace:
2018-12-20T09:11:46.837545-06:00 graphite kernel: [182400.167166] [] schedule_preempt_disabled+0x29/0x70
2018-12-20T09:11:46.841092-06:00 graphite kernel: [182400.170680] [] __mutex_lock_slowpath+0xc7/0x1d0
2018-12-20T09:11:46.844555-06:00 graphite kernel: [182400.174129] [] mutex_lock+0x1f/0x2f
2018-12-20T09:11:46.850875-06:00 graphite kernel: [182400.177253] [] lookup_slow+0x33/0xa7
2018-12-20T09:11:46.850882-06:00 graphite kernel: [182400.180448] [] link_path_walk+0x80f/0x8b0
2018-12-20T09:11:46.854152-06:00 graphite kernel: [182400.183739] [] path_openat+0xb5/0x640
2018-12-20T09:11:46.860597-06:00 graphite kernel: [182400.186982] [] do_filp_open+0x4d/0xb0
2018-12-20T09:11:46.860604-06:00 graphite kernel: [182400.190171] [] ? __alloc_fd+0x47/0x170
2018-12-20T09:11:46.863805-06:00 graphite kernel: [182400.193380] [] do_sys_open+0x137/0x240
2018-12-20T09:11:46.870149-06:00 graphite kernel: [182400.196582] [] SyS_openat+0x14/0x20
2018-12-20T09:11:46.870156-06:00 graphite kernel: [182400.199720] [] system_call_fastpath+0x22/0x27
2018-12-20T09:11:46.873488-06:00 graphite kernel: [182400.203076] INFO: task go-carbon:15576 blocked for more than 120 seconds.
2018-12-20T09:11:46.877017-06:00 graphite kernel: [182400.206645] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-12-20T09:11:46.888800-06:00 graphite kernel: [182400.212256] go-carbon D ffff98e2e98db0c0 0 15576 1 0x00000080
2018-12-20T09:11:46.888817-06:00 graphite kernel: [182400.215924] Call Trace:
2018-12-20T09:11:46.888817-06:00 graphite kernel: [182400.218387] [] ? dequeue_task_fair+0x41e/0x660
2018-12-20T09:11:46.892167-06:00 graphite kernel: [182400.221744] [] schedule_preempt_disabled+0x29/0x70
2018-12-20T09:11:46.895638-06:00 graphite kernel: [182400.225208] [] __mutex_lock_slowpath+0xc7/0x1d0
2018-12-20T09:11:46.899014-06:00 graphite kernel: [182400.228599] [] mutex_lock+0x1f/0x2f
2018-12-20T09:11:46.902111-06:00 graphite kernel: [182400.231696] [] lookup_slow+0x33/0xa7
2018-12-20T09:11:46.905266-06:00 graphite kernel: [182400.234825] [] link_path_walk+0x80f/0x8b0
2018-12-20T09:11:46.908510-06:00 graphite kernel: [182400.238091] [] ? sched_clock_cpu+0x85/0xc0
2018-12-20T09:11:46.911790-06:00 graphite kernel: [182400.241361] [] ? check_preempt_curr+0x80/0xa0
2018-12-20T09:11:46.915111-06:00 graphite kernel: [182400.244682] [] path_openat+0xb5/0x640
2018-12-20T09:11:46.918269-06:00 graphite kernel: [182400.247854] [] do_filp_open+0x4d/0xb0
2018-12-20T09:11:46.921428-06:00 graphite kernel: [182400.251014] [] ? __alloc_fd+0x47/0x170
2018-12-20T09:11:46.924617-06:00 graphite kernel: [182400.254205] [] do_sys_open+0x137/0x240
2018-12-20T09:11:46.930941-06:00 graphite kernel: [182400.257368] [] SyS_openat+0x14/0x20
2018-12-20T09:11:46.930958-06:00 graphite kernel: [182400.260525] [] system_call_fastpath+0x22/0x27
2018-12-20T09:11:46.934360-06:00 graphite kernel: [182400.263946] INFO: task go-carbon:15660 blocked for more than 120 seconds.
2018-12-20T09:11:46.938025-06:00 graphite kernel: [182400.267594] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-12-20T09:11:46.950163-06:00 graphite kernel: [182400.273366] go-carbon D ffff98e2b4005140 0 15660 1 0x00000080
2018-12-20T09:11:46.950180-06:00 graphite kernel: [182400.277180] Call Trace:
2018-12-20T09:11:46.950181-06:00 graphite kernel: [182400.279747] [] ? path_lookupat+0x122/0x8b0
2018-12-20T09:11:46.953563-06:00 graphite kernel: [182400.283137] [] schedule_preempt_disabled+0x29/0x70
2018-12-20T09:11:46.957095-06:00 graphite kernel: [182400.286721] [] __mutex_lock_slowpath+0xc7/0x1d0
2018-12-20T09:11:46.960611-06:00 graphite kernel: [182400.290195] [] mutex_lock+0x1f/0x2f
2018-12-20T09:11:46.963762-06:00 graphite kernel: [182400.293335] [] filename_create+0x85/0x180
2018-12-20T09:11:46.966942-06:00 graphite kernel: [182400.296567] [] ? getname_flags+0x4f/0x1a0
2018-12-20T09:11:46.970249-06:00 graphite kernel: [182400.299829] [] ? getname_flags+0xc4/0x1a0
2018-12-20T09:11:46.976725-06:00 graphite kernel: [182400.303096] [] user_path_create+0x41/0x60
2018-12-20T09:11:46.976731-06:00 graphite kernel: [182400.306350] [] SyS_mkdirat+0x48/0x100
2018-12-20T09:11:46.979893-06:00 graphite kernel: [182400.309519] [] system_call_fastpath+0x22/0x27
2018-12-20T09:11:46.983262-06:00 graphite kernel: [182400.312889] INFO: task go-carbon:15665 blocked for more than 120 seconds.
2018-12-20T09:11:46.986833-06:00 graphite kernel: [182400.316455] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-12-20T09:11:46.993652-06:00 graphite kernel: [182400.322055] go-carbon D ffff98e2b3ebb0c0 0 15665 1 0x00000080
2018-12-20T09:11:46.998649-06:00 graphite kernel: [182400.325730] Call Trace:
2018-12-20T09:11:46.998656-06:00 graphite kernel: [182400.328221] [] ? dequeue_task_fair+0x41e/0x660
2018-12-20T09:11:47.002007-06:00 graphite kernel: [182400.331579] [] schedule_preempt_disabled+0x29/0x70
2018-12-20T09:11:47.005462-06:00 graphite kernel: [182400.335034] [] __mutex_lock_slowpath+0xc7/0x1d0
2018-12-20T09:11:47.008852-06:00 graphite kernel: [182400.338424] [] mutex_lock+0x1f/0x2f
2018-12-20T09:11:47.011976-06:00 graphite kernel: [182400.341547] [] lookup_slow+0x33/0xa7
2018-12-20T09:11:47.015080-06:00 graphite kernel: [182400.344708] [] link_path_walk+0x80f/0x8b0
2018-12-20T09:11:47.018435-06:00 graphite kernel: [182400.348015] [] ? sched_clock_cpu+0x85/0xc0
2018-12-20T09:11:47.025010-06:00 graphite kernel: [182400.351289] [] ? check_preempt_curr+0x80/0xa0
2018-12-20T09:11:47.025016-06:00 graphite kernel: [182400.354643] [] path_openat+0xb5/0x640
2018-12-20T09:11:47.031338-06:00 graphite kernel: [182400.357806] [] do_filp_open+0x4d/0xb0
2018-12-20T09:11:47.031343-06:00 graphite kernel: [182400.360971] [] ? __alloc_fd+0x47/0x170
2018-12-20T09:11:47.037760-06:00 graphite kernel: [182400.364176] [] do_sys_open+0x137/0x240
2018-12-20T09:11:47.037766-06:00 graphite kernel: [182400.367395] [] SyS_openat+0x14/0x20
2018-12-20T09:11:47.044294-06:00 graphite kernel: [182400.370561] [] system_call_fastpath+0x22/0x27
2018-12-20T09:11:47.044298-06:00 graphite kernel: [182400.373927] INFO: task go-carbon:15808 blocked for more than 120 seconds.
2018-12-20T09:11:47.053520-06:00 graphite kernel: [182400.377511] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-12-20T09:11:47.059693-06:00 graphite kernel: [182400.383151] go-carbon D ffff98e2f9899040 0 15808 1 0x00000080
2018-12-20T09:11:47.059697-06:00 graphite kernel: [182400.386851] Call Trace:
2018-12-20T09:11:47.059699-06:00 graphite kernel: [182400.389328] [] schedule_preempt_disabled+0x29/0x70
2018-12-20T09:11:47.066609-06:00 graphite kernel: [182400.392828] [] __mutex_lock_slowpath+0xc7/0x1d0
2018-12-20T09:11:47.066615-06:00 graphite kernel: [182400.396242] [] mutex_lock+0x1f/0x2f
2018-12-20T09:11:47.072901-06:00 graphite kernel: [182400.399362] [] do_last+0x28f/0x12a0
2018-12-20T09:11:47.072906-06:00 graphite kernel: [182400.402534] [] ? check_preempt_curr+0x80/0xa0
2018-12-20T09:11:47.079638-06:00 graphite kernel: [182400.405964] [] path_openat+0xd7/0x640
2018-12-20T09:11:47.079646-06:00 graphite kernel: [182400.409270] [] do_filp_open+0x4d/0xb0
2018-12-20T09:11:47.086096-06:00 graphite kernel: [182400.412527] [] ? __alloc_fd+0x47/0x170
2018-12-20T09:11:47.086102-06:00 graphite kernel: [182400.415729] [] do_sys_open+0x137/0x240
2018-12-20T09:11:47.092445-06:00 graphite kernel: [182400.418940] [] SyS_openat+0x14/0x20
2018-12-20T09:11:47.092450-06:00 graphite kernel: [182400.422070] [] system_call_fastpath+0x22/0x27
2018-12-20T09:11:47.099570-06:00 graphite kernel: [182400.425595] INFO: task go-carbon:15812 blocked for more than 120 seconds.
2018-12-20T09:11:47.099575-06:00 graphite kernel: [182400.429200] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-12-20T09:11:47.111398-06:00 graphite kernel: [182400.434785] go-carbon D ffff98e2eb28c100 0 15812 1 0x00000080
2018-12-20T09:11:47.111404-06:00 graphite kernel: [182400.438561] Call Trace:
2018-12-20T09:11:47.111406-06:00 graphite kernel: [182400.441032] [] ? dequeue_task_fair+0x41e/0x660
2018-12-20T09:11:47.118266-06:00 graphite kernel: [182400.444410] [] schedule_preempt_disabled+0x29/0x70
2018-12-20T09:11:47.118273-06:00 graphite kernel: [182400.447897] [] __mutex_lock_slowpath+0xc7/0x1d0
2018-12-20T09:11:47.124795-06:00 graphite kernel: [182400.451277] [] mutex_lock+0x1f/0x2f
2018-12-20T09:11:47.124802-06:00 graphite kernel: [182400.454427] [] lookup_slow+0x33/0xa7
2018-12-20T09:11:47.131260-06:00 graphite kernel: [182400.457634] [] link_path_walk+0x80f/0x8b0
2018-12-20T09:11:47.131267-06:00 graphite kernel: [182400.460892] [] ? sched_clock_cpu+0x85/0xc0
2018-12-20T09:11:47.137922-06:00 graphite kernel: [182400.464208] [] ? check_preempt_curr+0x80/0xa0
2018-12-20T09:11:47.137927-06:00 graphite kernel: [182400.467552] [] path_openat+0xb5/0x640
2018-12-20T09:11:47.144356-06:00 graphite kernel: [182400.470727] [] ? getname_flags+0x4f/0x1a0
2018-12-20T09:11:47.144363-06:00 graphite kernel: [182400.473987] [] do_filp_open+0x4d/0xb0
2018-12-20T09:11:47.150785-06:00 graphite kernel: [182400.477152] [] ? __alloc_fd+0x47/0x170
2018-12-20T09:11:47.150791-06:00 graphite kernel: [182400.480416] [] do_sys_open+0x137/0x240
2018-12-20T09:11:47.157162-06:00 graphite kernel: [182400.483653] [] SyS_openat+0x14/0x20
2018-12-20T09:11:47.157170-06:00 graphite kernel: [182400.486793] [] system_call_fastpath+0x22/0x27

@bogorad

bogorad commented Aug 18, 2020

I have exactly the same problem.

Raspberry Pi 4, 8 GiB RAM, Ubuntu 20.04 64-bit, latest patches.
4 WD Red (the good ones) drives, in a cheap external USB3 box by Orico.

Works fine, but any sustained (moderately) heavy load, like 30-40 MiB/sec for half an hour, causes this error. Since my Docker files are also on ZFS, it basically locks the system up.

Any ideas/hopes?

@cvkline

cvkline commented Oct 20, 2020

I have an almost identical system configuration to bogorad, and I see the exact same problem as well, FWIW.

Works amazingly well most of the time, but when I set up a filesystem on my raidz pool for Time Machine backups via smbd, the first big backup (200+ GB) locks up ZFS with all tasks blocked and the "Tainted: P C OE" message. The traceback for each hung task ends with schedule() -> __schedule() -> __switch_to().

@bogorad

bogorad commented Oct 20, 2020

I have an almost identical system configuration to bogorad, and I see the exact same problem as well, FWIW.

Funny thing: after a couple of updates it just stabilized. I've been downloading heavy torrents and also re-mixing stuff with mkvmerge (20-60 GB in and out), and all is good. Knock on wood ;)

@cvkline

cvkline commented Oct 20, 2020

Luckily the desk I'm sitting at is solid wood, so I just literally did that. 👍

I also just did apt upgrade to be sure I was fully up to date, and it moved my Ubuntu kernel from 5.4.0-1021-raspi to 1022. Doesn't seem like much, but we'll see, I guess.

Thanks for the report.

@cvkline

cvkline commented Oct 21, 2020

I also found a PPA that someone used to build the latest ZFS kmod (0.8.4). That showed some promise, but I'm still getting lockups after about 250 GB of data transfer. It's quite consistent, actually, about the point at which it fails. I wonder if it's a counter or buffer-reuse issue.

@stale

stale bot commented Oct 22, 2021

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the "Status: Stale" label Oct 22, 2021
@stale stale bot closed this as completed Jan 20, 2022