ZFS send dead for over 12 hours #3676

Closed
eolson78 opened this issue Aug 10, 2015 · 4 comments

@eolson78

CentOS 6.5, kernel 3.14.4
Running 3 zpools
with an NFS-based Filebench load running
and send/receive running at a 1-minute interval for replication

The zfs send tasks sit in a D state as seen below, and the stacks indicate constant traversal of the prefetch thread.

root 70913 0.0 0.0 201104 1564 ? D Aug09 0:46 zfs send -PRpv -i PoolLoad1/vol2@snaprep64 PoolLoad1/vol2@snaprep65
root 70917 0.0 0.0 201104 1568 ? D Aug09 0:47 zfs send -PRpv -i PoolLoad1/vol3@snaprep61 PoolLoad1/vol3@snaprep62
root 70919 0.0 0.0 201104 1568 ? D Aug09 0:46 zfs send -PRpv -i PoolLoad1/vol1@snaprep72 PoolLoad1/vol1@snaprep73
root 70920 0.0 0.0 201104 1568 ? D Aug09 0:41 zfs send -PRpv -i PoolLoad1/vol5@snaprep75 PoolLoad1/vol5@snaprep76
root 125338 0.0 0.0 0 0 ? D 12:58 0:00 [kworker/2:0]
root@ip-50-0-0-62:/home/ec2-user# cat /proc/70913/stack
[] taskq_wait_id+0x4d/0x90 [spl]
[] spa_taskq_dispatch_sync+0x8d/0xc0 [zfs]
[] dump_bytes+0x42/0x50 [zfs]
[] dump_write+0xff/0x260 [zfs]
[] backup_cb+0x466/0x570 [zfs]
[] traverse_visitbp+0x4a6/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_dnode+0x71/0xd0 [zfs]
[] traverse_visitbp+0x687/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_dnode+0x71/0xd0 [zfs]
[] traverse_visitbp+0x730/0x810 [zfs]
[] traverse_impl+0x168/0x390 [zfs]
[] traverse_dataset+0x45/0x50 [zfs]
[] dmu_send_impl+0x3cb/0x4e0 [zfs]
[] dmu_send_obj+0x131/0x1e0 [zfs]
[] zfs_ioc_send+0x97/0x260 [zfs]
[] zfsdev_ioctl+0x495/0x4d0 [zfs]
[] do_vfs_ioctl+0x73/0x380
[] SyS_ioctl+0xa1/0xb0
[] system_call_fastpath+0x16/0x1b
[] 0xffffffffffffffff
root@ip-50-0-0-62:/home/ec2-user# cat /proc/70917/stack
[] taskq_wait_id+0x4d/0x90 [spl]
[] spa_taskq_dispatch_sync+0x8d/0xc0 [zfs]
[] dump_bytes+0x42/0x50 [zfs]
[] dump_write+0xff/0x260 [zfs]
[] backup_cb+0x466/0x570 [zfs]
[] traverse_visitbp+0x4a6/0x810 [zfs]
[] traverse_dnode+0x71/0xd0 [zfs]
[] traverse_visitbp+0x687/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_dnode+0x71/0xd0 [zfs]
[] traverse_visitbp+0x730/0x810 [zfs]
[] traverse_impl+0x168/0x390 [zfs]
[] traverse_dataset+0x45/0x50 [zfs]
[] dmu_send_impl+0x3cb/0x4e0 [zfs]
[] dmu_send_obj+0x131/0x1e0 [zfs]
[] zfs_ioc_send+0x97/0x260 [zfs]
[] zfsdev_ioctl+0x495/0x4d0 [zfs]
[] do_vfs_ioctl+0x73/0x380
[] SyS_ioctl+0xa1/0xb0
[] system_call_fastpath+0x16/0x1b
[] 0xffffffffffffffff
root@ip-50-0-0-62:/home/ec2-user# cat /proc/70919/stack
[] taskq_wait_id+0x4d/0x90 [spl]
[] spa_taskq_dispatch_sync+0x8d/0xc0 [zfs]
[] dump_bytes+0x42/0x50 [zfs]
[] dump_write+0xff/0x260 [zfs]
[] backup_cb+0x466/0x570 [zfs]
[] traverse_visitbp+0x4a6/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_dnode+0x71/0xd0 [zfs]
[] traverse_visitbp+0x687/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_dnode+0x71/0xd0 [zfs]
[] traverse_visitbp+0x730/0x810 [zfs]
[] traverse_impl+0x168/0x390 [zfs]
[] traverse_dataset+0x45/0x50 [zfs]
[] dmu_send_impl+0x3cb/0x4e0 [zfs]
[] dmu_send_obj+0x131/0x1e0 [zfs]
[] zfs_ioc_send+0x97/0x260 [zfs]
[] zfsdev_ioctl+0x495/0x4d0 [zfs]
[] do_vfs_ioctl+0x73/0x380
[] SyS_ioctl+0xa1/0xb0
[] system_call_fastpath+0x16/0x1b
[] 0xffffffffffffffff
root@ip-50-0-0-62:/home/ec2-user# cat /proc/70920/stack
[] taskq_wait_id+0x4d/0x90 [spl]
[] spa_taskq_dispatch_sync+0x8d/0xc0 [zfs]
[] dump_bytes+0x42/0x50 [zfs]
[] dump_write+0xff/0x260 [zfs]
[] backup_cb+0x466/0x570 [zfs]
[] traverse_visitbp+0x4a6/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_dnode+0x71/0xd0 [zfs]
[] traverse_visitbp+0x687/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_visitbp+0x41b/0x810 [zfs]
[] traverse_dnode+0x71/0xd0 [zfs]
[] traverse_visitbp+0x730/0x810 [zfs]
[] traverse_impl+0x168/0x390 [zfs]
[] traverse_dataset+0x45/0x50 [zfs]
[] dmu_send_impl+0x3cb/0x4e0 [zfs]
[] dmu_send_obj+0x131/0x1e0 [zfs]
[] zfs_ioc_send+0x97/0x260 [zfs]
[] zfsdev_ioctl+0x495/0x4d0 [zfs]
[] do_vfs_ioctl+0x73/0x380
[] SyS_ioctl+0xa1/0xb0
[] system_call_fastpath+0x16/0x1b
[] 0xffffffffffffffff
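
For context, a replication loop of the kind described above (an incremental zfs send piped to a receive once a minute) would typically look something like the following sketch; the target host, the use of ssh, and the snapshot naming are illustrative assumptions, not details from this system:

prev=""                                  # previous snapshot name, empty on the first pass
while true; do
    cur="snaprep_$(date +%s)"
    zfs snapshot "PoolLoad1/vol1@$cur"
    if [ -n "$prev" ]; then
        # incremental send, as in the ps output above
        zfs send -PRpv -i "PoolLoad1/vol1@$prev" "PoolLoad1/vol1@$cur" | \
            ssh target-host zfs receive -F PoolLoad1/vol1
    else
        # initial full send
        zfs send -PRpv "PoolLoad1/vol1@$cur" | \
            ssh target-host zfs receive -F PoolLoad1/vol1
    fi
    prev="$cur"
    sleep 60
done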

@eolson78 (Author)

This really appears similar to #3655.

@dweeezil (Contributor)

@eolson78 The issue is that the threads on which it's waiting are sending the data through the pipe and on to the receiver. You'll see this type of stack when the receiving end is blocked. For example, running zfs send <whatever>@snap | sleep 3600 will produce the exact same stack in the zfs process. The problem is on the receiving end.
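
To illustrate with a minimal sketch (the dataset name is a placeholder, not from this report): any consumer that stops reading from the pipe leaves the sender blocked in exactly this stack once the pipe buffer fills.

zfs send tank/fs@snap | sleep 3600 &              # sleep never reads, so the pipe fills up
ps -o pid,stat,cmd -C zfs                         # the zfs send process sits in state D
cat /proc/$(pgrep -f 'zfs send tank/fs')/stack    # same taskq_wait_id/dump_bytes stack as above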

@dweeezil (Contributor)

And for completeness, yes, #3655 is the exact same issue. You need to check out what's happening on the receive end. This can be caused by poor network connectivity, stuck TCP sessions, or, of course, a problem with ZFS on the receiving side (assuming the send stream is targeting a zfs receive rather than being written to a file or a device).
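
On the receiving host, a few generic checks can show where things are stuck (these are ordinary diagnostics offered as suggestions, not steps taken in this issue):

ps axo pid,stat,wchan:32,cmd | grep '[z]fs receive'   # is the receiver itself in state D, and waiting on what?
cat /proc/<receive_pid>/stack                         # kernel stack of the blocked zfs receive
ss -tnpo | grep <replication_peer>                    # stuck TCP session or a send queue that never drains
zpool status -v                                       # receive-side pool suspended or reporting errors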

@behlendorf (Contributor)

Closing as duplicate of #3655.
