__taskq_destroy() hang #71
Comments
Have you only seen this the one time we talked about?
I seem to be able to reliably reproduce this issue running ZFS's zpios-sanity script on my ARCH VM. Here's another stack:
And it looks to be hitting the BUG_ON here:

```c
void kfree(const void *x)
{
	struct page *page;
	void *object = (void *)x;

	trace_kfree(_RET_IP_, x);

	if (unlikely(ZERO_OR_NULL_PTR(x)))
		return;

	page = virt_to_head_page(x);
	if (unlikely(!PageSlab(page))) {
		BUG_ON(!PageCompound(page));
		kmemleak_free(x);
		put_page(page);
		return;
	}
	slab_free(page->slab, page, object, _RET_IP_);
}
EXPORT_SYMBOL(kfree);
```
Just a thought, but since you're in a kmem_free() you might enable the basic SPL debugging and memory tracking. This will help you immediately catch any memory handling mistakes. In addition, adding an
Hmm, this may be revealing.. The first time I ran zpios-sanity with debug enabled on the SPL side I hit this ASSERT:
I wonder if I goofed something by introducing this change:

```diff
@@ -481,10 +481,6 @@ taskq_thread(void *args)
 		if (pend_list) {
 			t = list_entry(pend_list->next, taskq_ent_t, tqent_list);
 			list_del_init(&t->tqent_list);
+			/* In order to support recursively dispatching a
+			 * preallocated taskq_ent_t, tqent_id must be
+			 * stored prior to executing tqent_func. */
+			id = t->tqent_id;
 			tqt->tqt_ent = t;
 			taskq_insert_in_order(tq, tqt);
 			tq->tq_nactive++;
@@ -497,6 +493,7 @@ taskq_thread(void *args)
 		tq->tq_nactive--;
 		list_del_init(&tqt->tqt_active_list);
 		tqt->tqt_ent = NULL;
-		id = t->tqent_id;
 		task_done(tq, t);

 		/* When the current lowest outstanding taskqid is
```
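For context, here is a hedged illustration (not from this thread) of why the id needs to be captured before tqent_func runs: a preallocated taskq_ent_t that is recursively re-dispatched gets a new, higher tqent_id while its previous invocation is still executing. It assumes the taskq_dispatch_prealloc() interface of this era; recurse_func, my_tq, and my_ent are hypothetical names.

```c
static taskq_t *my_tq;
static taskq_ent_t my_ent;	/* one preallocated entry, reused across dispatches */

static void
recurse_func(void *arg)
{
	/*
	 * Re-dispatching the same preallocated entry assigns it a fresh
	 * tqent_id, so anything read from my_ent.tqent_id after this call
	 * no longer describes the invocation that is currently running.
	 */
	taskq_dispatch_prealloc(my_tq, recurse_func, arg, TQ_SLEEP, &my_ent);
}
```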
That's a good start. It would be worthwhile to change that to an ASSERT3S so we can actually see the bogus values; perhaps the splat tests would hit this as well with debugging enabled. As for the change you mention, I don't see how it could cause this unless the tqent_func is tinkering with the tqent_id, which seems very unlikely. I'd be more suspicious of something like the taskq_insert_in_order() changes, although those look good too.
Well
Well, I ran the SPL's taskq splat tests with debugging enabled. I also changed it to an ASSERT3S as you suggested and hit it again:
So we're sure taskq_lowest_id() went backwards somehow. I think your idea of performing a cross check on taskq_lowest_id(), making sure we really are getting the lowest value, is a good one. Perhaps we have an issue on the taskq_insert_in_order() side of things.
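A brute-force cross check along those lines might look like the following sketch. The field names (tq_pend_list, tq_active_list, tqent_id, tqt_ent) are assumptions matching the spl-taskq.c of this era, and the helper name is hypothetical.

```c
/*
 * Hypothetical cross check: independently walk the pending and active
 * lists and assert that nothing outstanding has an id lower than the
 * value taskq_lowest_id() just computed.  Caller must hold tq->tq_lock.
 * (A priority list, if present, would be checked the same way.)
 */
static void
taskq_verify_lowest_id(taskq_t *tq, taskqid_t lowest)
{
	taskq_ent_t *t;
	taskq_thread_t *tqt;

	list_for_each_entry(t, &tq->tq_pend_list, tqent_list)
		ASSERT3S(lowest, <=, t->tqent_id);

	list_for_each_entry(tqt, &tq->tq_active_list, tqt_active_list)
		ASSERT3S(lowest, <=, tqt->tqt_ent->tqent_id);
}
```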
Can you provide a sanity check for me on this patch:

```diff
diff --git a/module/spl/spl-taskq.c b/module/spl/spl-taskq.c
index b2b0e6c..1ac0e60 100644
--- a/module/spl/spl-taskq.c
+++ b/module/spl/spl-taskq.c
@@ -410,6 +410,9 @@ taskq_insert_in_order(taskq_t *tq, taskq_thread_t *tqt)
 	taskq_thread_t *w;
 	struct list_head *l;
 
+	taskq_thread_t *big = NULL;
+	taskq_thread_t *sml = NULL;
+
 	SENTRY;
 	ASSERT(tq);
 	ASSERT(tqt);
@@ -425,6 +428,14 @@ taskq_insert_in_order(taskq_t *tq, taskq_thread_t *tqt)
 	if (l == &tq->tq_active_list)
 		list_add(&tqt->tqt_active_list, &tq->tq_active_list);
 
+	list_for_each_prev(l, &tq->tq_active_list) {
+		sml = big;
+		big = list_entry(l, taskq_thread_t, tqt_active_list);
+		if (sml != NULL) {
+			ASSERT3S(big->tqt_ent->tqent_id, >, sml->tqt_ent->tqent_id);
+		}
+	}
+
 	SEXIT;
 }
```

Am I traversing the list correctly (i.e. in the
I must be confusing the order direction; I swapped the comparison operator and it passed.
Right, your operator was just wrong. Using list_for_each_next instead of list_for_each_prev might make it easier to read.
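For reference, a forward-walking variant of the same check, using the standard list_for_each() iterator, could read as the sketch below. It relies on the same field-name assumptions as the patch above, and the helper name is hypothetical.

```c
/*
 * Walk the active list head to tail; if the list is sorted, the ids
 * must be non-decreasing, so each entry is >= the one before it.
 */
static void
taskq_check_active_order(taskq_t *tq)
{
	struct list_head *l;
	taskq_thread_t *prev = NULL, *cur;

	list_for_each(l, &tq->tq_active_list) {
		cur = list_entry(l, taskq_thread_t, tqt_active_list);
		if (prev != NULL)
			ASSERT3S(cur->tqt_ent->tqent_id, >=,
			    prev->tqt_ent->tqent_id);
		prev = cur;
	}
}
```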
Ok, so it does appear the active list isn't staying sorted as it should.. I hit this:
With this patch applied:

```diff
diff --git a/module/spl/spl-taskq.c b/module/spl/spl-taskq.c
index b2b0e6c..79e0c9b 100644
--- a/module/spl/spl-taskq.c
+++ b/module/spl/spl-taskq.c
@@ -410,6 +410,9 @@ taskq_insert_in_order(taskq_t *tq, taskq_thread_t *tqt)
 	taskq_thread_t *w;
 	struct list_head *l;
 
+	taskq_thread_t *big = NULL;
+	taskq_thread_t *sml = NULL;
+
 	SENTRY;
 	ASSERT(tq);
 	ASSERT(tqt);
@@ -425,6 +428,14 @@ taskq_insert_in_order(taskq_t *tq, taskq_thread_t *tqt)
 	if (l == &tq->tq_active_list)
 		list_add(&tqt->tqt_active_list, &tq->tq_active_list);
 
+	list_for_each_prev(l, &tq->tq_active_list) {
+		big = sml;
+		sml = list_entry(l, taskq_thread_t, tqt_active_list);
+		if (big != NULL) {
+			ASSERT3S(big->tqt_ent->tqent_id, >=, sml->tqt_ent->tqent_id);
+		}
+	}
+
 	SEXIT;
 }
```
With the taskq_thread id field changes we talked about, it got past the previous issues, but got caught up here:
Hmm, yeah so far it seems that I may have picked up an old build artifact which caused the above assertion. I cleaned out my git repos (
Well, maybe I'm wrong about the
Aha! So yes.. As we suspected, the flags going in are not necessarily the flags coming out. In this particular case,
The taskq_t's active thread list is sorted based on its tqt_ent->tqent_id field. The list is kept sorted solely by inserting new taskq_thread_t's in their correct sorted location; no other means is used. This means that once inserted, if a taskq_thread_t's tqt_ent->tqent_id field changes, the list runs the risk of no longer being sorted.

Prior to the introduction of the taskq_dispatch_prealloc() interface, this was not a problem as a taskq_ent_t actively being serviced under the old interface should always have a static tqent_id field. Thus, once the taskq_thread_t is added to the taskq_t's active thread list, the taskq_thread_t's tqt_ent->tqent_id field would remain constant.

Now, this is no longer the case. Currently, if using the taskq_dispatch_prealloc() interface, any given taskq_ent_t actively being serviced _may_ have its tqent_id value incremented. This happens when the preallocated taskq_ent_t structure is recursively dispatched. Thus, a taskq_thread_t could potentially have its tqt_ent->tqent_id field silently modified from under its feet. If this were to happen to a taskq_thread_t on a taskq_t's active thread list, this would compromise the integrity of the order of the list (as the list _may_ no longer be sorted).

To get around this, the taskq_thread_t's taskq_ent_t pointer was replaced with its own static copy of the tqent_id. So, as a taskq_ent_t is pulled off of the taskq_t's pending list, a static copy of its tqent_id is made and this copy is used to sort the active thread list. Using a static copy is key in ensuring the integrity of the order of the active thread list. Even if the underlying taskq_ent_t is recursively dispatched (and has its tqent_id modified), this static copy stored inside the taskq_thread_t will remain constant.

Signed-off-by: Prakash Surya <surya1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

Issue openzfs#71
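A minimal sketch of the structural change that commit message describes follows. The surrounding fields are illustrative rather than the exact header of the time; only the swap of the taskq_ent_t pointer for a static taskqid_t copy is the point.

```c
/*
 * Before the fix, the active list was ordered by dereferencing
 * tqt_ent->tqent_id; after it, each thread carries its own copy of the
 * id, so a recursive re-dispatch of the entry cannot reorder the list.
 */
typedef struct taskq_thread {
	struct list_head	tqt_thread_list;
	struct list_head	tqt_active_list;
	struct task_struct	*tqt_thread;
	struct taskq		*tqt_tq;
	taskqid_t		tqt_id;		/* was: taskq_ent_t *tqt_ent */
} taskq_thread_t;

/*
 * In taskq_thread(), when a task is pulled off the pending list, the
 * copy is taken before the task runs:
 *
 *	tqt->tqt_id = t->tqent_id;
 *
 * and taskq_insert_in_order() compares tqt_id instead of
 * tqt_ent->tqent_id when placing the thread on the active list.
 */
```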
The following hang was observed (rarely) after running all of xfstests, when it was unloading the zfs modules and destroying the pools. It looks like this was accidentally introduced by the recent taskq optimization in issue #65. There appears to be a case where we're waiting for a work item we think was queued but that never gets executed.
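For context, here is a rough sketch of the wait that hangs. The field names (tq_lock, tq_lowest_id, tq_waitq) are assumptions about the spl-taskq.c of this era, and the helper names are hypothetical.

```c
/*
 * Waiters (including the destroy path) block until the lowest
 * outstanding taskqid passes the id they are waiting on.
 */
static int
example_wait_check(taskq_t *tq, taskqid_t id)
{
	unsigned long flags;
	int rc;

	spin_lock_irqsave(&tq->tq_lock, flags);
	rc = (id < tq->tq_lowest_id);
	spin_unlock_irqrestore(&tq->tq_lock, flags);

	return (rc);
}

static void
example_wait_id(taskq_t *tq, taskqid_t id)
{
	/*
	 * If a mis-sorted active list makes taskq_lowest_id() return a
	 * stale, lower value, tq_lowest_id never advances past 'id' and
	 * this wait_event() sleeps forever -- the hang observed in
	 * __taskq_destroy().
	 */
	wait_event(tq->tq_waitq, example_wait_check(tq, id));
}
```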