Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

work around skin change crash on Linux #3144

Merged
merged 6 commits into from
Nov 2, 2020

Conversation

ronso0
Copy link
Member

@ronso0 ronso0 commented Oct 1, 2020

in the end the skin change bug on Linux (at least ~ubuntu 20.04 with Qt 5.12.8) was caused by the way we use Singletons.
I did look into the skin engine briefly a few times but -with my limited knowledge- I did not find the cause.
Tbh it's still a mystery to me why it failed with Deere since LateNight uses many more Singletons and nests them way more complicated.
Someone else may investigate this following the clues I'll post below.

Shade
fixed by removing the useless waveforms Singleton which were each loaded only ONCE in the skin.

Deere
fixed by replacing
two Spinny Singletons (with/without cover art) which were loaded in a heavily nested structure to accomodate show_paralel_waveforms, show_coverart, show_4decks
with
3 Spinnies in one Singleton: 1 for stacked waveforms, 2/3 for split waveforms (with/without cover art). I don't notice any mentionable memory impact, if at all.
Along the way I accumulated all deck Singletons in one common template (visual row, cover art, overview).

separate Spinny instances.

After almost a day of testing endless variations of how to instatiate, nest and recall Spinny singletons I ended up with this solution.

I could switch through all skins multiple times with 2 decks playing, Spinnies and CoverArt enabled without crashing.
I can also cycle through all overview renderer types (which reloads the skin instantly).
(though it did crash once when shutting down in --dev mode! idk why)

Please test!

@ronso0 ronso0 added this to the 2.3.0 milestone Oct 1, 2020
@ronso0
Copy link
Member Author

ronso0 commented Oct 1, 2020

Hopefully this brings us closer to releasing 2.3.
I'm not very keen on diff'ing Deere 2.2 // 2.3 and having to port the changes back to 2.3 before merging this.

@ywwg
Copy link
Member

ywwg commented Oct 3, 2020

It's good that we're narrowing this down, but I wonder if this hints at the root cause, which is probably some mis-ordering in the destructors?

@ronso0
Copy link
Member Author

ronso0 commented Oct 3, 2020

@foss- Did this branch fix the crash for you on macOS, as well?

@ronso0
Copy link
Member Author

ronso0 commented Oct 3, 2020

@ywwg Please check #3148 for simple steps that would also fix in Deere. Maybe those help debugging the issue.
I wonder if we should invest too much time searching & trying to fix the root cause since the crash only affects *ubuntu 20.04 systems with Qt 5.12. It has not been reported that issue occurs with higher Qt versions.
Rather we should document the DON'TS for skin creators
Even though those are smaller changes I'd like to keep the refactoring done here since it makes the sin structure more clear and smaller, too.

@ronso0 ronso0 force-pushed the skin-change-crash-fix branch from cd3ebb7 to 14d8126 Compare October 4, 2020 05:42
@ywwg
Copy link
Member

ywwg commented Oct 4, 2020

Ah, I see what you are saying. Yes if we're just exposing a bug in QT then we shouldn't work too hard around it.

@ronso0 ronso0 force-pushed the skin-change-crash-fix branch from 14d8126 to cd1eff5 Compare October 5, 2020 10:41
@ronso0
Copy link
Member Author

ronso0 commented Oct 5, 2020

I'm really at a loss here:
A few days ago, the fixes worked flawlessly, I could switch skins endlessly.
Now, Deere crashes again...
This puzzles me even more since the complex use of Singletons works perfectly in LateNight.

@ywwg
Copy link
Member

ywwg commented Oct 5, 2020

Have you triesd adding debug prints to the destructors of the relevant skin widgets, starting with the singleton class? That would help you get a more fine-grained picture of where the crash is actually coming from. The backtrace is useless because it's a response to some sort of X event, but that event is getting triggered by something... And the singletons use a tricksy model of storing pointers it doesn't own, which could be a problem. (Maybe converting those to shared_ptr instead of bare pointer would help?)

@ywwg
Copy link
Member

ywwg commented Oct 5, 2020

ah, it seems the whole skin system assumes bare pointers. welp.

@ronso0
Copy link
Member Author

ronso0 commented Oct 5, 2020

I experience the crash after the new skin is loaded / same skin is reloaded. I added debug output for each Singleton that is added to the SingletonMap, and it's empty when the new skin is loaded. All goes fine until the skin is about to be displayed.
The skin crashes even though no Spinny Singleton is shown in the skin (just instantiated).

Any more hints are welcome!

@ywwg
Copy link
Member

ywwg commented Oct 5, 2020

All goes fine until the skin is about to be displayed.

weird. maybe we're not releasing a context somewhere? if we didn't clean up properly during skin destruction, we could be in a bad state.

Then again, if this only affects a single version of QT, can we blocklist that version? do we know if ubuntu is going to update their version?

@ronso0
Copy link
Member Author

ronso0 commented Oct 5, 2020

weird. maybe we're not releasing a context somewhere? if we didn't clean up properly during skin destruction, we could be in a bad state.

Looks just like it.
Yet don't know where exactly to start digging.

@ronso0
Copy link
Member Author

ronso0 commented Oct 5, 2020

It is definitely related to WSpinny: if -in originally crashing Deere- I replace Spinny with CoverArt and leave everything else as is...
it doesn't crash!

@ronso0
Copy link
Member Author

ronso0 commented Oct 5, 2020

Then again, if this only affects a single version of QT, can we blocklist that version? do we know if ubuntu is going to update their version?

IIRC they didn't update Qt in the last Ubuntu LTS releases I used, so I wouldn't count on it

@ywwg
Copy link
Member

ywwg commented Oct 5, 2020

In the QT discord chat, someone says: " in Krita we provide appimages with our own Qt patches (five right now, if this page is up to date - https://phabricator.kde.org/T10838) which solves the biggest issues we had
(not sure if your issue would justify the amount of work this solution would need, but I just wanted to mention that there is this option)"

@ywwg
Copy link
Member

ywwg commented Oct 5, 2020

if you want to talk to them directly I can invite you

@ywwg
Copy link
Member

ywwg commented Oct 5, 2020

It is definitely related to WSpinny: if -in originally crashing Deere- I replace Spinny with CoverArt and leave everything else as is...
it doesn't crash!

this is good to know -- so it's not the singleton code, it's the spinny. Next I would try disabling different elements of the spinny -- the vinyl visualizer, the background, the indicator, etc

@ywwg
Copy link
Member

ywwg commented Oct 5, 2020

Can you find a piece of the spinny that makes it crash vs not

@ronso0
Copy link
Member Author

ronso0 commented Oct 5, 2020

this is good to know -- so it's not the singleton code, it's the spinny. Next I would try disabling different elements of the spinny -- the vinyl visualizer, the background, the indicator, etc

I can try that.

what definitely makes a difference is how the skin is structured / where&when the spinny is displayed. thus my previous report of success. (btw found another way to fix it by restructuring the skin)

@ywwg
Copy link
Member

ywwg commented Oct 5, 2020

even if this is only a crasher in one version of QT, it does seem like we might be doing something wrong if it's that fragile

@ronso0
Copy link
Member Author

ronso0 commented Oct 29, 2020

re documentation the assumed cause of the crash:
as soon as this is confirmed for fixing working around the crash I'll squash the Deere commits and add an my findings to the commit message.

the difference to other skins is that Deere had nested singletons per deck with multiple GL widgets with complicated 'visible' connnections:

deck_visual_row.xml [ waveform 
                    [ spinny container [ Spinny Singletons with cover    ]
                    [                  [ Spinny Singletons without cover ]

Additionally, from watching how some complex templates show/hide widgets (like the LateNight mixer) I think it may be that some singletons may be visible twice for an instant.
So I assume (what @Be-ing suspected earlier, too) that reparenting the singleton in a way conflicts with the GL context. I have absolutely no clue what exactly causes this, because other skins -especially LateNight nowadays- has rather complicated nested singletons as well (with spinnies, too), but none with multiple GL widgets.

re: tsan/asan
I have never used those before, so I'm the wrong guy to investigate the issue that way.
it would be a waste of time for me. someone else needs to try that.

@Holzhaus
Copy link
Member

I meant @ywwg :D

@ywwg
Copy link
Member

ywwg commented Oct 29, 2020

Still crashes on exit. But I have a backtrace!

==101246==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs operator delete) on 0x60400326c210
    #0 0x59798d in operator delete(void*) (/home/owen/src/github/mixxx/cbuild/mixxx+0x59798d)
    #1 0xac86bc in LV2Manifest::~LV2Manifest() /home/owen/src/github/mixxx/src/effects/lv2/lv2manifest.cpp:155:5
    #2 0xab5157 in LV2Backend::~LV2Backend() /home/owen/src/github/mixxx/src/effects/lv2/lv2backend.cpp:18:9
    #3 0xab5668 in LV2Backend::~LV2Backend() /home/owen/src/github/mixxx/src/effects/lv2/lv2backend.cpp:12:27
    #4 0x108f15c in EffectsManager::~EffectsManager() /home/owen/src/github/mixxx/src/effects/effectsmanager.cpp:53:9
    #5 0x1091078 in EffectsManager::~EffectsManager() /home/owen/src/github/mixxx/src/effects/effectsmanager.cpp:44:35
    #6 0x5bd49e in MixxxMainWindow::finalize() /home/owen/src/github/mixxx/src/mixxx.cpp:786:5
    #7 0x5b9926 in MixxxMainWindow::~MixxxMainWindow() /home/owen/src/github/mixxx/src/mixxx.cpp:204:5
    #8 0x59a323 in (anonymous namespace)::runMixxx(MixxxApplication*, CmdlineArgs const&) /home/owen/src/github/mixxx/src/main.cpp:56:1
    #9 0x59a323 in main /home/owen/src/github/mixxx/src/main.cpp:130:18
    #10 0x7fb7545610b2 in __libc_start_main /build/glibc-ZN95T4/glibc-2.31/csu/../csu/libc-start.c:308:16
    #11 0x4ef27d in _start (/home/owen/src/github/mixxx/cbuild/mixxx+0x4ef27d)

0x60400326c210 is located 0 bytes inside of 48-byte region [0x60400326c210,0x60400326c240)
allocated by thread T0 here:
    #0 0x59723d in operator new[](unsigned long) (/home/owen/src/github/mixxx/cbuild/mixxx+0x59723d)
    #1 0xac2aed in LV2Manifest::LV2Manifest(LilvPluginImpl const*, QHash<QString, LilvNodeImpl*>&) /home/owen/src/github/mixxx/src/effects/lv2/lv2manifest.cpp:27:17
    #2 0xab47a9 in LV2Backend::enumeratePlugins() /home/owen/src/github/mixxx/src/effects/lv2/lv2backend.cpp:29:40
    #3 0xab35c1 in LV2Backend::LV2Backend(QObject*) /home/owen/src/github/mixxx/src/effects/lv2/lv2backend.cpp:9:5

SUMMARY: AddressSanitizer: alloc-dealloc-mismatch (/home/owen/src/github/mixxx/cbuild/mixxx+0x59798d) in operator delete(void*)
==101246==HINT: if you don't care about these errors you may set ASAN_OPTIONS=alloc_dealloc_mismatch=0
==101246==ABORTING

@ywwg
Copy link
Member

ywwg commented Oct 29, 2020

this is a different bug so no need to fix it in this PR

@Holzhaus
Copy link
Member

this is a different bug so no need to fix it in this PR

Please open a bug report then :D

@ronso0
Copy link
Member Author

ronso0 commented Oct 29, 2020

Still crashes on exit. But I have a backtrace!

while you were runnig this branch / 2.3?

@ywwg
Copy link
Member

ywwg commented Oct 29, 2020

while you were runnig this branch / 2.3?

This branch. But the issue affects 2.3 and master. I filed #3236

@ywwg
Copy link
Member

ywwg commented Oct 29, 2020

(with this type of bad destruction, the memory corruption could cause random issues on some builds and not on others.)

Copy link
Contributor

@uklotzde uklotzde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if this PR is only an improvement and does not finally fix the crashes I propose to merge it now.

Since I am not able to assess all the changes in detail I would like to get confirmation from another reviewer.

@ronso0
Copy link
Member Author

ronso0 commented Nov 2, 2020

If this is confirmed to fix the crash on macOS I'll polish it, move some structure out of skin.xml.
Then we can merge.

@ronso0 ronso0 marked this pull request as draft November 2, 2020 10:24
@foss-
Copy link
Contributor

foss- commented Nov 2, 2020

Crashes on macOS 10.15.7 with mixxx master and beta persisting

@ronso0 ronso0 marked this pull request as ready for review November 2, 2020 14:26
@ronso0 ronso0 changed the title fix skin change crash work around skin change crash Nov 2, 2020
@ronso0 ronso0 changed the title work around skin change crash work around skin change crash on Linux Nov 2, 2020
@ronso0
Copy link
Member Author

ronso0 commented Nov 2, 2020

renamed to specify this workaround is for Linux only.
macOS workaround is #3252

Ready for merge.

@ronso0
Copy link
Member Author

ronso0 commented Nov 2, 2020

I'd appreciate if someone could test this, should take like a minute.
Without this change it's getting annoying to edit mixxx.cfg for testing Deere.

To test:

  • switch through all skins (not all schemes) without a crash
  • shutdown with Deere without crashing

Deere

  • spinnies and waveforms work both with parallel & split waves
  • spinnies next to parallel waves should always show the cover art
  • spinnies next to parallel waves are 50x50 xp with 2 decks, 75x75 px with 4 decks

@Holzhaus
Copy link
Member

Holzhaus commented Nov 2, 2020

I cannot reproduce the crashes anymore (I think some past mesa updates fixed them, but don't quote me on that). Neither this branch nor the current 2.3 crash for me. If it works for you, we can merge IMHO.

@ronso0
Copy link
Member Author

ronso0 commented Nov 2, 2020

my request was rathe rabout someone double-checking nothing broke.
but hej, I checked it a few times by now..
LGTM :)

@Be-ing Be-ing merged commit 0e0c72e into mixxxdj:2.3 Nov 2, 2020
@ronso0 ronso0 deleted the skin-change-crash-fix branch November 2, 2020 23:11
@ronso0 ronso0 mentioned this pull request Nov 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants