
[0.7.0-rc4] PANIC at abd.c:589:abd_alloc() VERIFY3(size <= (1ULL << 24)) failed (16781312 <= 16777216) #6259

Closed
NateCrawford opened this issue Jun 22, 2017 · 2 comments

Comments

@NateCrawford

System information

Type Version/Name
Distribution Name CentOS
Distribution Version 7.3
Linux Kernel 3.10.0-514.21.1.el7.x86_64
Architecture x86_64
ZFS Version 0.7.0-rc4
SPL Version 0.7.0-rc4

Describe the problem you're observing

During a benchmark/parameter scan of a Lustre 2.10 system (actually 2.9.58_57_gc252b3b) using obdfilter-survey, I am getting panics in the ABD code when the size-larger-than-SPA_MAXBLOCKSIZE assertion in abd_alloc() fails.

Describe how to reproduce the problem

Summary of disk arrangement:
90 SAS disks in 3 JBODs on two SAS cards, all using multipath.
Eight 11-disk raidz2 pools, plus two global spares.
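A rough sketch of one pool's layout, with invented pool and multipath device names (the real devices are dm-multipath aliases):

# Hypothetical layout of one of the eight 11-disk raidz2 pools (names invented)
zpool create ost0pool raidz2 \
    mpatha mpathb mpathc mpathd mpathe mpathf mpathg mpathh mpathi mpathj mpathk
# The two global spares would be added to each pool separately, e.g.:
zpool add ost0pool spare mpathx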

Non-default kernel module parameters:
metaslab_aliquot=2097152
metaslab_debug_unload=1
zfs_dirty_data_max=2147483648
zfs_dirty_data_sync=134217728
zfs_max_recordsize=16777216
zfs_prefetch_disable=1
zfs_txg_history=10
zfs_vdev_aggregation_limit=16777216
zfs_vdev_async_write_min_active=5
zfs_vdev_async_write_max_active=15
zfs_vdev_async_write_active_min_dirty_percent=20
zfs_vdev_scheduler=deadline
zfs_arc_max=51539607552
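A sketch of how a few of these can be applied at runtime through the standard sysfs module-parameter interface; persistent settings would normally live in /etc/modprobe.d/zfs.conf instead:

# Sketch: apply some of the tunables above via sysfs (zfs module already loaded)
echo 16777216    > /sys/module/zfs/parameters/zfs_max_recordsize
echo 16777216    > /sys/module/zfs/parameters/zfs_vdev_aggregation_limit
echo 2147483648  > /sys/module/zfs/parameters/zfs_dirty_data_max
echo 51539607552 > /sys/module/zfs/parameters/zfs_arc_max
# ...and similarly for the remaining parameters; boot-time settings go in
# /etc/modprobe.d/zfs.conf as "options zfs <param>=<value>" lines.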

Non-default zfs dataset settings:
atime=off
dnodesize=auto
xattr=sa
recordsize=2M
compression=lz4
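For completeness, these map onto ordinary zfs set commands, e.g. (the dataset name below is invented):

# Sketch: dataset properties as above; "ost0pool/ost0" is a hypothetical name
zfs set atime=off ost0pool/ost0
zfs set dnodesize=auto ost0pool/ost0
zfs set xattr=sa ost0pool/ost0
zfs set recordsize=2M ost0pool/ost0   # needs the large_blocks feature and zfs_max_recordsize >= 2M
zfs set compression=lz4 ost0pool/ost0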

Lustre automatically detected the zfs backend recordsize:
obdfilter.DFS-L-OST0000.blocksize=2097152
osd-zfs.DFS-L-OST0000.blocksize=2097152
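For reference, those values can be read back on the OSS with lctl (sketch, using the parameter names shown above):

lctl get_param obdfilter.DFS-L-OST0000.blocksize osd-zfs.DFS-L-OST0000.blocksize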


On this Object Storage Server (dfs-data-1), this command was run as part of a scan over zfs recordsizes:

size=24576 rszlo=512 rszhi=16384 rszmax=16384 nobjlo=2 nobjhi=8 thrhi=32 tests_str="write read" obdfilter-survey

On ZFS 0.7.0-rc3 with Lustre 2.9, this ran successfully with zfs dataset recordsizes from 512K to 16M.
On ZFS 0.7.0-rc4 with Lustre 2.9.58_57_gc252b3b, it ran fine with zfs recordsizes of 512K and 1M, but with a zfs recordsize of 2M it failed once the Lustre record size reached 2M (see the obdfilter-survey output below).

Looking at the panic and assertion failure, and digging around for similar problem reports, I suspect the unusually large zfs_vdev_aggregation_limit=16777216 is causing the issue. Reducing it to 8388608 (see the sketch below) allowed the obdfilter-survey run to succeed. It was originally set that high in an attempt to improve Lustre small-file performance, but is probably not appropriate.
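At runtime that is just (sketch; assumes the standard sysfs module-parameter path):

# Workaround sketch: drop the aggregation limit from 16M back to 8M at runtime
echo 8388608 > /sys/module/zfs/parameters/zfs_vdev_aggregation_limit
cat /sys/module/zfs/parameters/zfs_vdev_aggregation_limit   # should now print 8388608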

This could very well be a bug in Lustre, or at least a missing sanity check for unusually ill-advised combinations of parameters.
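The numbers in the failed assertion also point at the aggregation path: the limit is exactly 1 << 24 bytes, and the allocation that trips it is 4096 bytes larger, which I read as a single optional/gap I/O being aggregated past the limit (my interpretation; the arithmetic is just):

# 16781312 = 16777216 + 4096: the aggregated ABD exceeds SPA_MAXBLOCKSIZE by 4K
echo $(( 1 << 24 ))           # 16777216, the enforced maximum in abd_alloc()
echo $(( 16777216 + 4096 ))   # 16781312, the size reported in the panic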

Include any warning/errors/backtraces from the system logs

obdfilter-survey output:

Wed Jun 21 11:56:15 PDT 2017 Obdfilter-survey for case=disk from dfs-data-1.gp.local
ost  8 sz 201326592K rsz  512K obj   16 thr   16 write 3574.78 [ 202.48,  684.82] read 4533.26 [ 289.99, 1183.93]
ost  8 sz 201326592K rsz  512K obj   16 thr   32 write 3644.97 [ 129.90, 1210.57] read 3307.48 [ 248.99,  608.98]
ost  8 sz 201326592K rsz  512K obj   16 thr   64 write 4473.54 [   0.00, 1385.39] read 3926.88 [ 303.97,  744.47]
ost  8 sz 201326592K rsz  512K obj   16 thr  128 write 4767.80 [   0.00, 1601.13] read 5519.96 [ 606.95,  835.46]
ost  8 sz 201326592K rsz  512K obj   16 thr  256 write 4719.14 [   0.00, 1649.28] read 4689.06 [ 523.96,  674.89]
ost  8 sz 201326592K rsz  512K obj   32 thr   32 write 3717.30 [ 185.26,  772.67] read 4883.16 [ 501.97,  991.95]
ost  8 sz 201326592K rsz  512K obj   32 thr   64 write 4296.86 [ 119.43, 1084.03] read 4318.96 [ 277.98,  784.93]
ost  8 sz 201326592K rsz  512K obj   32 thr  128 write 4414.14 [   0.00, 1341.92] read 5065.33 [ 375.98,  869.96]
ost  8 sz 201326592K rsz  512K obj   32 thr  256 write 4488.83 [   0.00, 1707.83] read 4708.66 [ 515.43,  844.95]
ost  8 sz 201326592K rsz  512K obj   64 thr   64 write 3460.39 [  81.48, 1057.32] read 5545.02 [ 357.97, 1093.94]
ost  8 sz 201326592K rsz  512K obj   64 thr  128 write 4525.58 [ 117.75, 1146.90] read 5506.45 [ 366.98,  888.91]
ost  8 sz 201326592K rsz  512K obj   64 thr  256 write 4731.65 [   0.00, 1227.37] read 4636.78 [ 401.99,  772.93]
ost  8 sz 201326592K rsz 1024K obj   16 thr   16 write 3494.20 [ 263.96,  618.93] read 3609.39 [ 381.98,  581.98]
ost  8 sz 201326592K rsz 1024K obj   16 thr   32 write 4356.40 [   0.00, 1509.96] read 3605.28 [ 251.98,  724.97]
ost  8 sz 201326592K rsz 1024K obj   16 thr   64 write 4666.73 [   0.00, 1805.68] read 5216.97 [ 449.97,  902.97]
ost  8 sz 201326592K rsz 1024K obj   16 thr  128 write 4911.76 [   0.00, 2131.19] read 6057.66 [ 606.98,  988.93]
ost  8 sz 201326592K rsz 1024K obj   16 thr  256 write 5011.22 [   0.00, 1868.84] read 5681.46 [ 584.91,  930.94]
ost  8 sz 201326592K rsz 1024K obj   32 thr   32 write 3516.98 [ 269.97,  728.90] read 4432.11 [ 385.98,  693.96]
ost  8 sz 201326592K rsz 1024K obj   32 thr   64 write 4035.67 [ 140.99, 1231.48] read 4355.16 [ 301.97,  836.95]
ost  8 sz 201326592K rsz 1024K obj   32 thr  128 write 4599.95 [   0.00, 1640.92] read 5686.64 [ 429.99,  961.93]
ost  8 sz 201326592K rsz 1024K obj   32 thr  256 write 4738.75 [   0.00, 1981.02] read 5448.43 [ 576.91,  849.87]
ost  8 sz 201326592K rsz 1024K obj   64 thr   64 write 3791.99 [ 203.99,  979.81] read 4907.09 [ 247.98,  841.97]
ost  8 sz 201326592K rsz 1024K obj   64 thr  128 write 3968.92 [   0.00, 1256.57] read 5098.25 [ 373.97,  928.70]
ost  8 sz 201326592K rsz 1024K obj   64 thr  256 write 4998.60 [   0.00, 1681.67] read 5494.18 [ 574.94,  934.90]
ost  8 sz 201326592K rsz 2048K obj   16 thr   16 write
Message from syslogd@dfs-data-1 at Jun 21 12:35:04 ...
 kernel:VERIFY3(size <= (1ULL << 24)) failed (16781312 <= 16777216)

Message from syslogd@dfs-data-1 at Jun 21 12:35:04 ...
 kernel:PANIC at abd.c:589:abd_alloc()

Message from syslogd@dfs-data-1 at Jun 21 12:35:15 ...
 kernel:VERIFY3(size <= (1ULL << 24)) failed (16781312 <= 16777216)

Message from syslogd@dfs-data-1 at Jun 21 12:35:15 ...
 kernel:PANIC at abd.c:589:abd_alloc()

Syslog:

Jun 21 12:35:04 dfs-data-1 kernel: VERIFY3(size <= (1ULL << 24)) failed (16781312 <= 16777216)
Jun 21 12:35:04 dfs-data-1 kernel: PANIC at abd.c:589:abd_alloc()
Jun 21 12:35:04 dfs-data-1 kernel: Showing stack for process 32601
Jun 21 12:35:04 dfs-data-1 kernel: CPU: 14 PID: 32601 Comm: z_wr_int_7 Tainted: P           OE  ------------   3.10.0-514.21.1.el7.x86_64 #1
Jun 21 12:35:04 dfs-data-1 kernel: Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
Jun 21 12:35:04 dfs-data-1 kernel: ffffffffa3001563 0000000088ac422b ffff88be738b7a60 ffffffff81686f13
Jun 21 12:35:04 dfs-data-1 kernel: ffff88be738b7a70 ffffffffa04b3234 ffff88be738b7bf8 ffffffffa04b32ff
Jun 21 12:35:04 dfs-data-1 kernel: ffffffff812f01d3 ffffc90000000030 ffff88be738b7c08 ffff88be738b7ba8
Jun 21 12:35:04 dfs-data-1 kernel: Call Trace:
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffff81686f13>] dump_stack+0x19/0x1b
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffffa04b3234>] spl_dumpstack+0x44/0x50 [spl]
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffffa04b32ff>] spl_panic+0xbf/0xf0 [spl]
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffff812f01d3>] ? blk_peek_request+0x83/0x280
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffff812eb7e3>] ? __blk_run_queue+0x33/0x40
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffff812eb8a7>] ? queue_unplugged+0x37/0xa0
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffffa2eb393b>] abd_alloc+0x44b/0x470 [zfs]
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffff811dc8e1>] ? __slab_free+0x81/0x2f0
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffff811dc8e1>] ? __slab_free+0x81/0x2f0
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffffa2eb3bae>] abd_alloc_for_io+0xe/0x10 [zfs]
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffffa2f3057f>] vdev_queue_io_to_issue+0x60f/0x8c0 [zfs]
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffff810caebe>] ? account_entity_dequeue+0xae/0xd0
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffffa2f30ce5>] vdev_queue_io_done+0x155/0x200 [zfs]
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffffa2f72a48>] zio_vdev_io_done+0xd8/0x1e0 [zfs]
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffffa2f733cc>] zio_execute+0x9c/0x100 [zfs]
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffffa04b0ec6>] taskq_thread+0x246/0x470 [spl]
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffffa04b0c80>] ? taskq_thread_spawn+0x60/0x60 [spl]
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffff810b0a4f>] kthread+0xcf/0xe0
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffff81697518>] ret_from_fork+0x58/0x90
Jun 21 12:35:04 dfs-data-1 kernel: [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
Jun 21 12:35:15 dfs-data-1 kernel: VERIFY3(size <= (1ULL << 24)) failed (16781312 <= 16777216)
Jun 21 12:35:15 dfs-data-1 kernel: PANIC at abd.c:589:abd_alloc()
Jun 21 12:35:15 dfs-data-1 kernel: Showing stack for process 9785
Jun 21 12:35:15 dfs-data-1 kernel: CPU: 33 PID: 9785 Comm: z_wr_int_2 Tainted: P           OE  ------------   3.10.0-514.21.1.el7.x86_64 #1
Jun 21 12:35:15 dfs-data-1 kernel: Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
Jun 21 12:35:15 dfs-data-1 kernel: ffffffffa3001563 000000002683e823 ffff88bcfe587a60 ffffffff81686f13
Jun 21 12:35:15 dfs-data-1 kernel: ffff88bcfe587a70 ffffffffa04b3234 ffff88bcfe587bf8 ffffffffa04b32ff
Jun 21 12:35:15 dfs-data-1 kernel: ffff885ee85b8000 ffff88bc00000030 ffff88bcfe587c08 ffff88bcfe587ba8
Jun 21 12:35:15 dfs-data-1 kernel: Call Trace:
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff81686f13>] dump_stack+0x19/0x1b
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffffa04b3234>] spl_dumpstack+0x44/0x50 [spl]
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffffa04b32ff>] spl_panic+0xbf/0xf0 [spl]
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff810c9bf8>] ? __enqueue_entity+0x78/0x80
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff810d065c>] ? enqueue_entity+0x26c/0xb60
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff810585ee>] ? physflat_send_IPI_mask+0xe/0x10
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff810cc049>] ? wakeup_gran.isra.53+0x29/0x30
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff810cabe4>] ? select_task_rq_fair+0x584/0x720
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff81033669>] ? sched_clock+0x9/0x10
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffffa2eb393b>] abd_alloc+0x44b/0x470 [zfs]
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff811dc8e1>] ? __slab_free+0x81/0x2f0
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffffa2eb3bae>] abd_alloc_for_io+0xe/0x10 [zfs]
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffffa2f3057f>] vdev_queue_io_to_issue+0x60f/0x8c0 [zfs]
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff810caebe>] ? account_entity_dequeue+0xae/0xd0
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffffa2f30ce5>] vdev_queue_io_done+0x155/0x200 [zfs]
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffffa2f72a48>] zio_vdev_io_done+0xd8/0x1e0 [zfs]
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffffa2f733cc>] zio_execute+0x9c/0x100 [zfs]
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffffa04b0ec6>] taskq_thread+0x246/0x470 [spl]
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffffa04b0c80>] ? taskq_thread_spawn+0x60/0x60 [spl]
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff810b0a4f>] kthread+0xcf/0xe0
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff81697518>] ret_from_fork+0x58/0x90
Jun 21 12:35:15 dfs-data-1 kernel: [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
@NateCrawford NateCrawford changed the title PANIC at abd.c:589:abd_alloc() VERIFY3(size <= (1ULL << 24)) failed (16781312 <= 16777216) [0.7.0-rc4] PANIC at abd.c:589:abd_alloc() VERIFY3(size <= (1ULL << 24)) failed (16781312 <= 16777216) Jun 22, 2017
behlendorf added a commit to behlendorf/zfs that referenced this issue Jun 26, 2017
Commit 8542ef8 allowed optional IOs to be aggregated beyond
the specified aggregation limit.  Since the aggregation limit
was also used to enforce the maximum block size, setting
`zfs_vdev_aggregation_limit=16777216` could result in an
attempt to allocate an ABD larger than 16M.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#6259
@behlendorf
Contributor

@NateCrawford thanks for the detailed bug report. This issue was introduced by 8542ef8 and it can only occur when the zfs_vdev_aggregation_limit is set close to the maximum block size as you observed. I've opened #6270 with a proposed fix.
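Until the fix is merged, a quick way to check whether a system is exposed is to make sure the current limit sits well below the maximum block size, roughly (sketch; standard sysfs path assumed):

# Sketch: warn if the aggregation limit is close enough to SPA_MAXBLOCKSIZE to hit this
limit=$(cat /sys/module/zfs/parameters/zfs_vdev_aggregation_limit)
if [ "$limit" -gt 8388608 ]; then
    echo "zfs_vdev_aggregation_limit=$limit may trigger the abd_alloc() panic on 0.7.0-rc4"
fi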

@NateCrawford
Author

Thanks! Once it gets into a release candidate (rc5?), I'll give it another whirl. Continuing for now with an 8M aggregation limit.
