Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tiling->unsplit from the context menu crashes notion #334

Closed
wilhelmy opened this issue Apr 14, 2021 · 9 comments
Closed

tiling->unsplit from the context menu crashes notion #334

wilhelmy opened this issue Apr 14, 2021 · 9 comments
Labels

Comments

@wilhelmy
Copy link
Collaborator

wilhelmy commented Apr 14, 2021

Apr 14 19:15:11 fukushima systemd-coredump[2607]: Process 1905 (notion) of user 1000 dumped core.
                                                  
                                                  Stack trace of thread 1905:
                                                  #0  0x000000000043c9d7 lookup_dynfun (notion + 0x3c9d7)
                                                  #1  0x000000000042465e region_rescue_clientwins (notion + 0x2465e)
                                                  #2  0x0000000000424c6d region_rescue_needed (notion + 0x24c6d)
                                                  #3  0x00007f3dbc8da150 mod_tiling_untile (mod_tiling.so + 0x12150)
                                                  #4  0x00007f3dbc8da21c l2chnd_b_o__WTiling (mod_tiling.so + 0x1221c)
                                                  #5  0x000000000043aa2a extl_l1_call_handler2 (notion + 0x3aa2a)
                                                  #6  0x00007f3dbcccd573 n/a (liblua5.4.so.5 + 0xf573)
                                                  #7  0x00007f3dbccce129 n/a (liblua5.4.so.5 + 0x10129)
                                                  #8  0x00007f3dbcccb6cb n/a (liblua5.4.so.5 + 0xd6cb)
                                                  #9  0x00007f3dbccd039e n/a (liblua5.4.so.5 + 0x1239e)
                                                  #10 0x00007f3dbccd0486 lua_pcallk (liblua5.4.so.5 + 0x12486)
                                                  #11 0x000000000043b505 extl_l1_call_handler (notion + 0x3b505)
                                                  #12 0x00007f3dbcccd573 n/a (liblua5.4.so.5 + 0xf573)
                                                  #13 0x00007f3dbccdceb9 n/a (liblua5.4.so.5 + 0x1eeb9)
                                                  #14 0x00007f3dbccce142 n/a (liblua5.4.so.5 + 0x10142)
                                                  #15 0x00007f3dbcccb6cb n/a (liblua5.4.so.5 + 0xd6cb)
                                                  #16 0x00007f3dbccd039e n/a (liblua5.4.so.5 + 0x1239e)
                                                  #17 0x00007f3dbccd0486 lua_pcallk (liblua5.4.so.5 + 0x12486)
                                                  #18 0x000000000043af8c extl_dodo_call_vararg (notion + 0x3af8c)
                                                  #19 0x000000000043a17c extl_docpcall (notion + 0x3a17c)
                                                  #20 0x00007f3dbcccd573 n/a (liblua5.4.so.5 + 0xf573)
                                                  #21 0x00007f3dbccce129 n/a (liblua5.4.so.5 + 0x10129)
                                                  #22 0x00007f3dbcccb6cb n/a (liblua5.4.so.5 + 0xd6cb)
                                                  #23 0x00007f3dbccd039e n/a (liblua5.4.so.5 + 0x1239e)
                                                  #24 0x00007f3dbccd0486 lua_pcallk (liblua5.4.so.5 + 0x12486)
                                                  #25 0x0000000000439341 extl_cpcall (notion + 0x39341)
                                                  #26 0x000000000043bc81 extl_cpcall_call (notion + 0x3bc81)
                                                  #27 0x000000000043bd66 extl_call (notion + 0x3bd66)
                                                  #28 0x00007f3dbc8fffbf menu_do_finish (mod_menu.so + 0x4fbf)
                                                  #29 0x000000000043765e do_execute (notion + 0x3765e)
                                                  #30 0x000000000041a200 ioncore_mainloop (notion + 0x1a200)
                                                  #31 0x00000000004183dd main (notion + 0x183dd)
                                                  #32 0x00007f3dbc9cab25 __libc_start_main (libc.so.6 + 0x27b25)
                                                  #33 0x000000000041850e _start (notion + 0x1850e)

Edit: can anyone reproduce?

@knixeur
Copy link
Collaborator

knixeur commented Jul 6, 2021

I didn't catch the stacktrace but I confirm it crashed :(

Btw: the menuitem is called Untile

@raboof raboof added the bug label Jul 6, 2021
@kristopolous
Copy link
Contributor

kristopolous commented Feb 25, 2023

This is super reproducible and I've been looking into it for a few days (I'm pretty lazy). The "bug" is in ops.c

FOR_ALL_MANAGED_BY_TILING(reg, tiling, tmp)

That for loop gets corrupted by this line:

reg2=group_do_attach(grp, &param, &data); 

Which frees the tiling->managed_list object from underneath it and thus corrupts the pointers.

Here's a really reproducible version.

  1. do rm -fr ~/.notion
  2. start notion in the default tiled mode with 1 application window (such as xterm)
  3. right click on the title bar, go to "Tiling" => "Untile"
  4. enjoy your core.

Somehow the reparenting has to be done without the rug pull here. It has been there since at least 2014. I didn't follow the full path down though. I could bisect it if we really care.

ops.c may not be the fix point, it's just the first common parent between the defect and crash.

Here's some more details:

(gdb) b ops.c:135
(gdb) c
(gdb) watch tiling->managed_list
(gdb) c

Notice how you'll get something like this:

#0  0x000055cd3c8e4e0a in free_node (ptrlist=0x55cd3dfdb2d0, node=0x55cd3e055210) at ptrlist.c:19
#1  0x000055cd3c8e53a0 in ptrlist_remove (ptrlist=0x55cd3dfdb2d0, ptr=0x55cd3df85180) at ptrlist.c:121
#2  0x00007f2656a257e1 in tiling_do_managed_remove (ws=0x55cd3dfdb210, reg=0x55cd3df85180) at tiling.c:673
#3  0x00007f2656a258f8 in tiling_managed_remove (ws=0x55cd3dfdb210, reg=0x55cd3df85180) at tiling.c:697
#4  0x000055cd3c8b33fe in region_managed_remove (mgr=0x55cd3dfdb210, reg=0x55cd3df85180) at region.c:234
#5  0x000055cd3c8b3b39 in region_detach_manager (reg=0x55cd3df85180) at region.c:576
#6  0x000055cd3c8b5641 in doit_reparent (mgr=0x55cd3e056ba0, par=0x55cd3df455a0, fp=0x7ffe1f188240, cont=0x55cd3c8ce580 <group_do_attach_final>, cont_param=0x7ffe1f1882b0, reg=0x55cd3df85180) at attach.c:86
#7  0x000055cd3c8b58f2 in region_attach_helper (mgr=0x55cd3e056ba0, par=0x55cd3df455a0, fp=0x7ffe1f188240, fn=0x55cd3c8ce580 <group_do_attach_final>, fn_param=0x7ffe1f1882b0, data=0x7ffe1f188290) at attach.c:171
#8  0x000055cd3c8ce96f in group_do_attach (ws=0x55cd3e056ba0, param=0x7ffe1f1882b0, data=0x7ffe1f188290) at group.c:729
#9  0x00007f2656a31cfb in mod_tiling_untile (tiling=0x55cd3dfdb210) at ops.c:148

That's where your memory corruption occurs. I don't know enough about the mechanics here. Maybe a simple copy of the ptr list and then a freeing at the end of the loop would do it?

@kristopolous
Copy link
Contributor

back. did the git bisect. defect came in at 8d3f262, it's the rug pull as theorized. It's in tiling_managed_remove right at splittree_remove, author is @raboof .

I'll look into it more.

@raboof
Copy link
Owner

raboof commented Feb 27, 2023

Ouch, sorry about that. Before 8d3f262 that splittree_remove was already there, but perhaps triggered under different/fewer conditions. That if(!reused) that got removed looks potentially relevant I guess...

I'll look into it more

Thanks!

@kristopolous
Copy link
Contributor

it's calling ptrlist_remove(&(ws->managed_list), reg); in tiling_do_managed_remove which we identified from above as the issue here.

Here's a really cheap solution

a/mod_tiling/ops.c b/mod_tiling/ops.c
index 0bd21912..bc5059d9 100644
--- a/mod_tiling/ops.c
+++ b/mod_tiling/ops.c
@@ -147,13 +147,20 @@ bool mod_tiling_untile(WTiling *tiling)
 
         reg2=group_do_attach(grp, &param, &data);
 
+        // See #334: tiling->unsplit from the context menu crashes notion
+        if(tiling->managed_list == NULL) {
+            break;
+        }
+
         if(reg2==NULL)
             warn(TR("Unable to move a region from tiling to group."));
     }
 
     tiling->batchop=FALSE;
 
-    region_dispose((WRegion*)tiling);
+    if(tiling->managed_list != NULL) {
+        region_dispose((WRegion*)tiling);
+    }
 
     return TRUE;
 }

we can probably do better than that

@kristopolous
Copy link
Contributor

I'm also open to just using this and then opening up a more minor memory leak bug which I'm sure is sitting around somewhere in this hack and then just move on.

kristopolous added a commit to kristopolous/notion that referenced this issue Mar 21, 2023
raboof added a commit that referenced this issue Mar 21, 2023
A cheap fix to prevent the crash described in #334 from happening
@raboof
Copy link
Owner

raboof commented Mar 21, 2023

I merged #357 - do you want to keep this issue open to keep looking for a clearer solution or rather close it?

@kristopolous
Copy link
Contributor

no ... let me handle that

@kristopolous
Copy link
Contributor

apparently only you have the privilege of closing. Feel free to do that now that we have documented and opened up the more major and now less visible issue.

@raboof raboof closed this as completed Mar 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants