[Bug][Solved] Mono.Unix.UnixIOException: Resource temporarily unavailable on Arch Linux after Util.Invoke and long chain of runtime calls #3343

Closed
Blazingi opened this issue Apr 10, 2021 · 16 comments
Labels: Linux (issues specific for Linux), Mono (issues specific for Mono), Support (issues that are support requests)

Comments

@Blazingi

Blazingi commented Apr 10, 2021

Background

  • Operating System: Arch Linux
  • CKAN Version: 1.30.0-1 (AUR)
  • KSP Version: 1.11.2

Have you made any manual changes to your GameData folder (i.e., not via CKAN)?

  • No

Problem

Crash when downloading multiple mods at once.

exception inside UnhandledException handler: (null) assembly:/usr/lib/mono/gac/Mono.Posix/4.0.0.0__0738eb9f132ed756/Mono.Posix.dll type:UnixIOException member:(null)

exception inside UnhandledException handler: (null) assembly:/usr/lib/mono/gac/Mono.Posix/4.0.0.0__0738eb9f132ed756/Mono.Posix.dll type:UnixIOException member:(null)

exception inside UnhandledException handler: (null) assembly:/usr/lib/mono/gac/Mono.Posix/4.0.0.0__0738eb9f132ed756/Mono.Posix.dll type:UnixIOException member:(null)

exception inside UnhandledException handler: (null) assembly:/usr/lib/mono/gac/Mono.Posix/4.0.0.0__0738eb9f132ed756/Mono.Posix.dll type:UnixIOException member:(null)

exception inside UnhandledException handler: (null) assembly:/usr/lib/mono/gac/Mono.Posix/4.0.0.0__0738eb9f132ed756/Mono.Posix.dll type:UnixIOException member:(null)

[ERROR] FATAL UNHANDLED EXCEPTION: Mono.Unix.UnixIOException: Resource temporarily unavailable [EWOULDBLOCK].
  at Mono.Unix.UnixMarshal.ThrowExceptionForLastError () [0x00005] in <7bcd6d815ebc477ab0521e0361c44f6c>:0 
  at Mono.Unix.UnixStream.Write (System.Byte[] buffer, System.Int32 offset, System.Int32 count) [0x00055] in <7bcd6d815ebc477ab0521e0361c44f6c>:0 
  at System.Windows.Forms.XplatUIX11.WakeupMain () [0x00000] in <4b6c441381804088ab0fff508a3fbabf>:0 
  at System.Windows.Forms.XplatUIX11.SendAsyncMethod (System.Windows.Forms.AsyncMethodData method) [0x00080] in <4b6c441381804088ab0fff508a3fbabf>:0 
  at System.Windows.Forms.XplatUI.SendAsyncMethod (System.Windows.Forms.AsyncMethodData data) [0x00000] in <4b6c441381804088ab0fff508a3fbabf>:0 
  at System.Windows.Forms.Control.BeginInvokeInternal (System.Delegate method, System.Object[] args, System.Windows.Forms.Control control) [0x0003f] in <4b6c441381804088ab0fff508a3fbabf>:0 
  at System.Windows.Forms.Control.Invoke (System.Delegate method, System.Object[] args) [0x00017] in <4b6c441381804088ab0fff508a3fbabf>:0 
  at (wrapper remoting-invoke-with-check) System.Windows.Forms.Control.Invoke(System.Delegate,object[])
  at System.Windows.Forms.Control.Invoke (System.Delegate method) [0x0001d] in <4b6c441381804088ab0fff508a3fbabf>:0 
  at (wrapper remoting-invoke-with-check) System.Windows.Forms.Control.Invoke(System.Delegate)
  at CKAN.Util.Invoke[T] (T obj, System.Action action) [0x0000d] in <7c20c4e9497444e4a499ca87b8a6fb81>:0 
  at CKAN.Wait.SetDescription (System.String message) [0x00014] in <7c20c4e9497444e4a499ca87b8a6fb81>:0 
  at (wrapper remoting-invoke-with-check) CKAN.Wait.SetDescription(string)
  at CKAN.GUIUser.RaiseProgress (System.String message, System.Int32 percent) [0x00017] in <7c20c4e9497444e4a499ca87b8a6fb81>:0 
  at CKAN.NetAsyncDownloader.FileProgressReport (System.Int32 index, System.Int32 percent, System.Int64 bytesDownloaded, System.Int64 bytesToDownload) [0x0019e] in <7c20c4e9497444e4a499ca87b8a6fb81>:0 
  at CKAN.NetAsyncDownloader+<>c__DisplayClass16_0.<DownloadModule>b__0 (System.Object sender, System.Net.DownloadProgressChangedEventArgs args) [0x0001e] in <7c20c4e9497444e4a499ca87b8a6fb81>:0 
  at CKAN.NetAsyncDownloader+NetAsyncDownloaderDownloadPart.<ResetAgent>b__20_0 (System.Object sender, System.Net.DownloadProgressChangedEventArgs args) [0x00008] in <7c20c4e9497444e4a499ca87b8a6fb81>:0 
  at System.Net.WebClient.OnDownloadProgressChanged (System.Net.DownloadProgressChangedEventArgs e) [0x0000a] in <1af2426e91d1474d92e5d471c4ca8f95>:0 
  at System.Net.WebClient.<StartAsyncOperation>b__78_9 (System.Object arg) [0x00000] in <1af2426e91d1474d92e5d471c4ca8f95>:0 
  at System.Threading.QueueUserWorkItemCallback.WaitCallback_Context (System.Object state) [0x00007] in <efe941bb62534dc3a62ceb1a818964a0>:0 
  at System.Threading.ExecutionContext.RunInternal (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, System.Boolean preserveSyncCtx) [0x00071] in <efe941bb62534dc3a62ceb1a818964a0>:0 
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, System.Boolean preserveSyncCtx) [0x00000] in <efe941bb62534dc3a62ceb1a818964a0>:0 
  at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem () [0x00021] in <efe941bb62534dc3a62ceb1a818964a0>:0 
  at System.Threading.ThreadPoolWorkQueue.Dispatch () [0x00074] in <efe941bb62534dc3a62ceb1a818964a0>:0 
  at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback () [0x00000] in <efe941bb62534dc3a62ceb1a818964a0>:0 

Solution

ulimit -s 32768
ulimit -n 10240
did the trick, but the new limits only apply to the current bash session, so you need to launch ckan from that same terminal after adjusting them.
For a permanent solution you need to change /etc/security/limits.conf
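
To check which limits a session actually ended up with, a minimal sketch could look like the following. It uses only Python's standard `resource` module and is not part of CKAN; the helper name `describe_limit` is made up for illustration. Note that `ulimit -s` reports KiB while `getrlimit` reports bytes.

```python
import resource


def describe_limit(name, rlimit):
    """Return a one-line summary of the soft and hard value of one resource limit."""
    soft, hard = resource.getrlimit(rlimit)

    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


# Same limits that `ulimit -n` and `ulimit -s` change in the shell.
print(describe_limit("open files", resource.RLIMIT_NOFILE))
print(describe_limit("stack size (bytes)", resource.RLIMIT_STACK))
```

Running it from the terminal that will launch ckan shows whether the `ulimit` calls took effect there.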


@HebaruSan added the Support label on Apr 10, 2021
@Blazingi
Author

@HebaruSan Pastebin seems to be the standard but sure, edited.

@Blazingi
Author

Also, it only seems to be a problem when downloading 5+ mods at once.

@HebaruSan changed the title from "[Bug] Briefly summarize your issue here" to "[Bug] Mono.Unix.UnixIOException: Resource temporarily unavailable after Util.Invoke and long chain of runtime calls" on Apr 11, 2021
@HebaruSan added the Linux and Mono labels on Apr 11, 2021
@HebaruSan
Member

This looks like a problem with the Mono install rather than a CKAN code issue. A fellow Arch user may know what sorts of things to check; I'll see if there are any on the Discord.

@HebaruSan
Member

Here's a possibility. It looks like some Arch users had problems with applications being able to create threads, and CKAN creates one thread for each download:

https://bugs.archlinux.org/task/47662

We had an AUR user report this on the Discord, but they didn't really figure anything out about it:

screenshot

Make sure your Mono can create threads as needed and see if there are any similar known issues with threading on your version of Arch.

@HebaruSan changed the title from "[Bug] Mono.Unix.UnixIOException: Resource temporarily unavailable after Util.Invoke and long chain of runtime calls" to "[Bug] Mono.Unix.UnixIOException: Resource temporarily unavailable on Arch Linux after Util.Invoke and long chain of runtime calls" on Apr 11, 2021
@Blazingi
Author

Ok, it seems that raising the open files limit to 10240 and the stack size to 32768 enabled it to work, thank you for pointing me in the right direction! I will test it some more and then close the issue.

Fix note:
ulimit -s 32768
ulimit -n 10240
did the trick, but the new limits only apply to the current bash session, so you need to launch ckan from that same terminal after adjusting them.
For a permanent solution you need to change /etc/security/limits.conf

@Blazingi changed the title from "[Bug] Mono.Unix.UnixIOException: Resource temporarily unavailable on Arch Linux after Util.Invoke and long chain of runtime calls" to "[Bug][Solved] Mono.Unix.UnixIOException: Resource temporarily unavailable on Arch Linux after Util.Invoke and long chain of runtime calls" on Apr 11, 2021
@HebaruSan
Member

Cool! If you do find a good solution, would you mind updating the wiki? I can copy/paste from here to there if need be, but it would be better if the person investigating the issue was the one doing the editing:

https://github.com/KSP-CKAN/CKAN/wiki/Installing-CKAN-on-Arch

@Blazingi
Author

Sure! Probably not today, but I can also write up the permanent solution :) . Also, everything seems to work, so closing!

@jasnyj

jasnyj commented Aug 7, 2021

This bug is not specific to Arch Linux; it is an upstream bug present in mono version 6.12.0.86 and later. That it only affects Arch Linux for now is incidental: Arch is bleeding edge and uses the latest mono version, while other distributions tend to use older versions.

The "fix" proposed by BlazingPL unfortunately doesn't and cannot work. The bug is unrelated to the number of open files or the size of the stack (or any other ulimit for that matter). The bug happens because a pipe buffer gets full and the exception triggered by this condition is erroneously not caught. As far as I know there is no way to control the size of pipes' buffers, so we cannot prevent the exception. If you want to know more, see the mono pull request that fixes this bug: mono/mono#21136.
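
The underlying condition is easy to see outside of mono. The following is a minimal sketch in Python (not mono's code; the function name `fill_pipe` is invented for illustration) that writes to a non-blocking pipe nobody reads from until the kernel refuses with the same "would block" error that appears in the stack trace.

```python
import errno
import os


def fill_pipe():
    """Write to a non-blocking pipe with no reader until the kernel refuses."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)
    written = 0
    try:
        while True:
            # os.write returns how many bytes the pipe accepted.
            written += os.write(write_fd, b"x" * 4096)
    except BlockingIOError as exc:
        # Raised once the pipe buffer is full (EAGAIN / EWOULDBLOCK).
        return written, exc.errno
    finally:
        os.close(read_fd)
        os.close(write_fd)


written, err = fill_pipe()
print(f"pipe accepted {written} bytes, then failed with errno {err}")
```

The number of bytes accepted depends on the system, which fits the observation that open-file and stack limits have no influence on it.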

In the meantime, while the pull request is being reviewed, the only solution is to use a mono version older than 6.12.0.86 or to build a custom version that includes the patch from mono's pull request 21136.

@DasSkelett
Member

DasSkelett commented Aug 7, 2021

> this is an upstream bug presents in mono version 6.12.0.86 and upward. The fact that it only affects Arch Linux for now because Arch is bleeding edge and uses the latest mono version while other distributions tend to use older versions is only tangential.

I have mono 6.12.0.147 from mono's preview repo installed on my Kubuntu machine, and am not encountering this error. So there must be another unknown that influences whether this bug is triggered or not (?)
Maybe a different/newer X11 version on Arch? Or kernel? Or, since I see EWOULDBLOCK in that PR, does it depend on threading circumstances, or on CPU power / core count or something?

In any case, thanks for posting here and shedding some light onto this elusive bug. And confirming that it's indeed not something we messed up 😄

@jasnyj

jasnyj commented Aug 7, 2021

The bug is indeed threading-related. I don't know a lot about mono's internals but they use a pipe to enable communication between the main UI thread and other threads. When CKAN downloads a lot of mods at once, each downloader thread regularly reports its progress to the main UI thread which causes the mono framework to send a message to the main UI thread by writing to this pipe. If the downloader threads are too numerous or the main UI thread is, for some reason, not scheduled often enough to read the pipe, the buffer of the pipe will fill up. This causes the Mono.Unix.UnixIOException which is, due to a bug in mono, currently not caught correctly and makes CKAN crash.
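
As an illustration only, here is a rough Python model of that pattern: several worker threads each write a one-byte wake-up to a shared non-blocking pipe while the "UI thread" is not reading. The names `wake_main` and `worker` are hypothetical, and treating a full pipe as harmless is an assumption about one possible way to handle it, not a description of what mono/mono#21136 does.

```python
import os
import threading

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)
os.set_blocking(write_fd, False)

stats = {"dropped": 0}
stats_lock = threading.Lock()


def wake_main():
    """Hypothetical wake-up call: one byte asks the UI thread to check its queue."""
    try:
        os.write(write_fd, b"\0")
    except BlockingIOError:
        # Pipe is full, so wake-ups are already pending; skipping this one is
        # assumed to be harmless. Letting the error escape is the crash case.
        with stats_lock:
            stats["dropped"] += 1


def worker(reports):
    for _ in range(reports):
        wake_main()


# Four "downloader" threads report progress while nothing drains the pipe.
threads = [threading.Thread(target=worker, args=(50_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The "UI thread" finally gets scheduled and drains whatever is pending.
pending = 0
while True:
    try:
        chunk = os.read(read_fd, 65536)
    except BlockingIOError:
        break
    pending += len(chunk)

os.close(read_fd)
os.close(write_fd)
print(f"{pending} wake-ups pending, {stats['dropped']} hit a full pipe")
```

How many writes hit a full pipe depends on the pipe capacity and on scheduling, which matches the intermittent nature of the crash.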

Therefore the bug is intermittent, as it depends on how many threads there are, how they are scheduled, etc. The first time I used CKAN on mono 6.12.0.122 I encountered it immediately and it happened each time I tried downloading mods. Later on, I wanted to reproduce the bug to get the stack trace, but I had difficulty making it happen again and it took several tries.

Also note the bug (mostly) only appears when downloading multiple mods at once. If your download cache contains a lot of already downloaded mods, CKAN might not have to download any mods or not enough for parallel download to trigger.

If you want to reproduce the bug on Kubuntu:

  • make sure CKAN's download cache is empty (or temporarily set it to an empty directory).
  • use a machine with multiple cores, mine has 4 cores (i5-6500). This might not be necessary to trigger the bug but since this is threading-related it might be possible that the bug doesn't happen on a low core count (or even, on the other hand, on a "too high" core count).
  • try to install the Realism Overhaul mod collection, as per https://github.com/KSP-RO/RP-0/wiki/RO-&-RP-1-Installation-for-1.10.1 (RP-1 should also trigger the bug). That's the mod I discovered the bug with.
  • (Also, use the CKAN GUI, not the console UI. It seemed to me that the console UI only downloads mods one by one; in any case, when using the console UI the downloader threads don't report their progress in the same way and the bug doesn't happen.)
  • Try a few times, each time emptying the download cache, if it doesn't happen on first try.

Now, it might be the case that the conditions for the bug to happen depend on a lot of things: kernel version, X.org version, window manager, etc... Still the bug lies in mono and should (I hope) be corrected soon.


@DasSkelett
Member

Wow, thanks for this detailed explanation!
Indeed, preferably your fix will be merged in Mono sooner rather than later.
In the meantime, I'm thinking about a few ways we could mitigate this in CKAN, as even after the fix is merged, it might take some time until it arrives everywhere:

  • Work on [Feature] Download Limits #3314, which proposes an option to limit parallel downloads. Should at least reduce the frequency of this bug appearing.
  • Reduce the number of GUI updates on progress reports. I think right now, every progress report from the WebClient triggers a GUI update. Might be a good idea for performance reasons alone, even aside from this bug.
  • Catch and ignore that error in SetProgress() and hope that Mono just carries on?
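
The second idea, fewer GUI updates per progress report, could be sketched roughly as below. This is Python for illustration rather than CKAN's C#, and `ProgressThrottle`, `update_ui` and `min_interval` are made-up names, not existing CKAN API.

```python
import threading
import time


class ProgressThrottle:
    """Forward a progress report only if enough time has passed or it is the final one."""

    def __init__(self, update_ui, min_interval=0.1, clock=time.monotonic):
        self._update_ui = update_ui        # hypothetical callback that touches the GUI
        self._min_interval = min_interval  # seconds between forwarded reports
        self._clock = clock
        self._last = None
        self._lock = threading.Lock()

    def report(self, percent):
        with self._lock:
            now = self._clock()
            due = self._last is None or now - self._last >= self._min_interval
            if not (due or percent >= 100):
                return  # drop this report, a newer one will follow
            self._last = now
        self._update_ui(percent)


# Usage with a fake clock so the behaviour is deterministic.
shown = []
fake_time = iter([0.00, 0.01, 0.02, 0.15, 0.16])
throttle = ProgressThrottle(shown.append, min_interval=0.1, clock=lambda: next(fake_time))
for pct in (1, 2, 3, 50, 100):
    throttle.report(pct)
print(shown)  # [1, 50, 100]
```

Only three of the five reports reach the GUI callback here. The final report always goes through, so the display can still end at 100%.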

@jasnyj

jasnyj commented Aug 7, 2021

> • (Also, use the CKAN GUI, not the console UI as it seems the console UI only download mods one by one.)
>
> The same downloader component is shared by all the UIs; ConsoleUI's downloads are just as parallel as GUI's are, we just don't have a UI component displaying per-download progress in ConsoleUI yet.

Oh I see, I'm indeed not familiar with CKAN's implementation. Then if I understand correctly, the bug doesn't happen because the downloader threads are not reporting their progress to the main UI thread, thus not writing to the communication pipe, or at least not as frequently as in the full GUI?

> • Work on [Feature] Download Limits #3314, which proposes an option to limit parallel downloads. Should at least reduce the frequency of this bug appearing.
>
> • Reduce the number of GUI updates on progress reports. I think right now, every progress report from the WebClient triggers a GUI update. Might be a good idea for performance reasons alone, even aside this bug.

Implementing either of these two solutions should indeed fix the problem in 99% of cases. However, the remaining 1% might cause problems. Now, keeping in mind this is only a temporary solution while waiting for the mono fix, this should render the bug rare enough for CKAN to be perfectly usable.
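
For the first of these, a cap on parallel downloads could look roughly like the following sketch. It is Python for illustration, with `MAX_PARALLEL` and `fake_download` as invented stand-ins; how #3314 would implement this in CKAN is not specified here.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL = 3  # hypothetical user setting for the download limit

state = {"active": 0, "peak": 0}
state_lock = threading.Lock()


def fake_download(name):
    """Stand-in for a real download; records how many run at the same time."""
    with state_lock:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
    time.sleep(0.01)  # pretend to transfer data
    with state_lock:
        state["active"] -= 1
    return name


mods = [f"mod-{i}" for i in range(10)]
# The pool never runs more than MAX_PARALLEL downloads concurrently.
with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
    finished = list(pool.map(fake_download, mods))

print(f"downloaded {len(finished)} mods, at most {state['peak']} at a time")
```

Fewer concurrent downloads means fewer threads reporting progress at once, which lowers the pressure on the pipe without removing the root cause.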

> • Catch and ignore that error in SetProgress() and hope that Mono just carries on?

I don't know if it's possible to catch Mono-generated exceptions in user code, but it probably is. If so, this will for sure fix the problem, but I think it carries the risk of catching other exceptions that shouldn't be caught.
EDIT: this might well cause Mono to crash, or at least introduce new bugs: once the exception has been caught in CKAN, control flow will not return to the point where the exception occurred, and Mono is probably not prepared for that.
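
The risk of catching too much can at least be narrowed by filtering on the error code. A minimal Python sketch of that idea follows; `notify_ui_safely` and the two failing callbacks are hypothetical, and it does not address the concern that the framework's internal state may be inconsistent after the exception.

```python
import errno


def notify_ui_safely(notify):
    """Call notify(); swallow only 'would block' errors and re-raise everything else."""
    try:
        notify()
        return True
    except OSError as exc:
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return False  # skipped one GUI update, the next report will retry
        raise


def full_pipe():
    # Simulates the failure from the stack trace.
    raise BlockingIOError(errno.EAGAIN, "Resource temporarily unavailable")


def other_failure():
    # Simulates an unrelated error that must not be hidden.
    raise FileNotFoundError(errno.ENOENT, "No such file or directory")


print(notify_ui_safely(full_pipe))  # False
try:
    notify_ui_safely(other_failure)
except FileNotFoundError:
    print("unrelated error still propagates")
```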

@HebaruSan
Member

HebaruSan commented Aug 7, 2021

> Then if I understand correctly, the bug doesn't happen because the downloader threads are not reporting their progresses to the main UI thread, thus not writing to the communication pipe, or at least not as frequently as in the full GUI ?

ConsoleUI doesn't use anything from the System.Windows.Forms namespace (which is why it still works on Mac despite mono/mono#6701), so the socket or pipe (depending on version) doesn't exist at all to be filled up. It also is unencumbered by a notion of "main UI thread."

@Oman395

Oman395 commented Nov 5, 2022

This is currently closed, but I've been having the issue on Arch 6.0.6 and KDE plasma, and the command mentioned didn't work (maybe bc zsh?). Regardless, I have found a pretty easy solution: for whatever reason, installing the mods through CLI just... works. I don't even know how that is, or how the code is different, but it's really weird.
Edit: This happened primarily with ReStock, and I haven't really had it for any other mod. Maybe that sheds some insight??? I don't even
