-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Use vfork() improve performance when starting processes. #33289
Conversation
…improvement due to getting rid of page faults.
Benchmark code written specifically to demonstrate vfork()'s performance superiority:
|
Gaaa; 3 builds failed because they're cross compiles which is literally documented as not supported. |
// ptrace() is used on the child, thus making setuid() safe to use after vfork(). The fabled vfork() | ||
// security hole is the other way around; if a multithreaded host were to execute setuid() | ||
// on another thread while a vfork() child is still pending, bad things are possible; however we | ||
// do not do that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this important to mention? Can you phrase it as something we mustn't do?
I did a review and my comments were mostly about the comments.
I think it would be interesting to see a .NET Code benchmark with these changes, using Process.Start API. Does OSX HAVE_VFORK_SHM? |
I'm concerned by some of the man page comments.
and
Seems dangerous to rely on this purely by checking whether the function exists. |
+1 The C example showing the 150x improvement is a worst-case scenario where the parent starts writing to 1GB of data (at 1 byte per page) just after the fork. |
The behavior of calling vfork is not defined if setuid is used, which we use if the process is being launched with explicit credentials. It's probably also not defined if change working directory is set. Given that vfork self-describes these cautions, I'd want to see it not used during the cred-set or CWD-set paths. My gut feel is that it's just inherently more risk than reward. |
chdir is defined for the same reason dup2 is. I deliberately did not check for the existence of vfork() but put a cmake check for the behavior itself. I am now debating ripping out the specific dependency because cross compile. ptrace() makes a security demand against the parent process is why setuid() is safe. |
The equivalent worst case is guaranteed to happen in .NET sooner or later: pack the heap right after process start. |
+1 @joshudson, what impact does this have on a "normal" .NET process using Process.Start? For example, does this make a measurable improvement to the |
@bartonjs: Any platform in which calling setuid in the vfork child is a security vulnerability is already unsafe because calling setuid in the fork child is an information disclosure vulnerability. The vfork+setuid referenced in the man page I already called out. Don't ever call setuid in the parent while a child is in vfork; which we don't do. |
There are many articles that warn about vfork security and portability issues. From https://wiki.sei.cmu.edu/confluence/pages/viewpage.action?pageId=87152373: Do not use vfork() I agree that checking for vfork presence is not enough given this. I think vfork would be ok to use only on platforms or situations where vfork does not suffer from these ugly issues. I do not know whether such cases exist. |
That article's just wrong. It lists the original call chain vfork was added for as undefined. Incidentally I found out why Go arranged to have the vfork parent and child in different functions. Go has a M:N threading model and implicit async. |
@joshudson, can you point to any official documentation / man pages / etc. for key platforms that specifically states vfork as being safe / recommended? At the moment, the potential risk does not seem to be worth the potential benefit, which I expect to be limited in the common case, as @tmds outlined. |
https://gist.github.com/nicowilliams/a8a07b0fc75df05f684c23c18d7db234 Quite a surprising document really. He starts talking about adding another system call about half way through but he starts off with the address space problem. https://sourceware.org/bugzilla/show_bug.cgi?id=10354 Glibc switching to vfork. https://gitlab.gnome.org/GNOME/glib/merge_requests/95 Gnome switching to vfork via posix_spawn I don't know if you will ever support such processors but sometimes you don't get fork at all. It's vfork or don't compile. |
@joshudson thank you for the links to the articles. I have no prior knowledge of vfork, so I have to catch up and also read the articles. But one thing that seems to be needed if we wanted to add the vfork is to make sure no signal handlers can be called while running in the child before the exec, as described by Rich Felker in https://sourceware.org/bugzilla/show_bug.cgi?id=14750. @stephentoub have we ever considered using posix_spawn instead of fork / exec? It seems that if that was possible, we would get the same benefit as with using vfork, taking an advantage of the fact that GLIBC / musl developers have already done the necessary steps to ensure it is safe. |
You can't use posix_spawn because chdir. Show me where you set signal handlers and I'll deal with that. (It's almost certaibly broken now because the fork() child rarely can inherit signal handlers either). |
We deal with signal handlers in coreclr PAL in https://github.com/dotnet/coreclr/blob/master/src/pal/src/exception/signal.cpp and also here in corefx in https://github.com/dotnet/corefx/blob/master/src/Native/Unix/System.Native/pal_signal.c |
I read your signal handlers. They're not safe in fork children either, as expected. On considering third party signal handlers for the first time, looks like blanket mask is correct whether fork or vfork is used. |
Can you elaborate? What's broken? |
If the fork child we're to receive a handled signal, it would write to the controlling pipe just like the parent does, but the other end is not expecting a spurious pipe read. Pretty much any signal handlers that doesn't terminate the process and does more than write to a global variable isn't safe in a fork child so I'm not surprised. I was expecting to find only SIGSEGV and SIGBUS -> (if managed block throw exception else die) which would have been safe even for vfork. |
…nt; also took care of pthread cancellation mask in case third-party native code tries to use it
There's the signal mask, the signal handling cleaning, and the pthread_cancel mask for good measure. Portability is a harsh mistress. (Once considering third party libraries trying to use signals, this stuff is clearly broken even before trying to use vfork()). |
if ((processId = fork()) == -1) | ||
// The fork child must not be signalled until it calls exec(): our signal handlers do not | ||
// handle being raised in the child process correctly | ||
sigfillset(&signal_set); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nit - use SIGALL_SET
instead of creating your own set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
joshua@novaϟ find /usr -name '*.h' -print0 | xargs -0 grep -i SIGALL_SET /dev/null
joshua@novaϟ
SIGALL_SET is not in my system header files.
{ | ||
if (sig != SIGKILL && sig != SIGSTOP) | ||
break; // No more signals | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A code style nit - the curly braces should be on separate lines.
: (sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL)) | ||
{ | ||
// It has a pre-defined handler -- put it back | ||
sigaction(sig, &sa_old, &sa_trash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the signals will fall into this category. So it would be better to query the previous handler fist and then only set it for signals that don't have it set to SIG_IGN or SIG_DFL, instead of setting it and then resetting back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried and failed to locate a portable API call for reading the signal handler w/o setting it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you pass NULL as the act
parameter and non-NULL as the oldact, it just gets the current value.
sigaction Linux man page says:
If act is non-NULL, the new action for signal signum is installed
from act. If oldact is non-NULL, the previous action is saved in
oldact.
FreeBSD man page says:
If act is non-NULL, it specifies an action ...
OSX man page says:
If act is non-zero, it specifies an action ...
@stephentoub thank you for reminding me that. I have not looked into that yet. |
} | ||
if (sigaction(sig, NULL, &sa_old)) | ||
{ | ||
break; // No more signals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Breaking here is incorrect because sigaction
in glibc returns -1
for signals used by it internally. Since in practice those signals are 32/33 and the total number of signals is >=64, the loop will stop early.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well that's undocumented nonsense. I'll just go use NSIG then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's documented: http://man7.org/linux/man-pages/man2/sigaction.2.html (section C library/kernel differences).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't want to cause any controversy on this; but somehow my man page on sigaction has about half the content. I guess they fixed their docs.
The WSL bug doesn't appear to be so bad. Simple testing on
|
@izbyshev : OK then. The documentation said it was worse. |
…he middle of the signal list
@stephentoub so I did experiments with vfork / fork. Here is what my test app does:
I ran the tests on my native Linux box with 24GB of RAM and swap enabled / disabled. I've tried to increase the amount of memory the test was eating until the
As you can see, the vfork have worked fine with much more memory consumed by the parent with both the swap on and off. |
Thanks, @janvorli. And you're comfortable with the change, such that if vfork is available, we use it? i.e. all of the previously raised concerns are no longer applicable on any platform? |
On the other hand I was expecting somebody to tell me what particular build magic you wanted to use to exclude Mac (the only suspect |
@joshudson |
Something's up with master builds; looks more like the build chain is broken than anything I did. My machine yields: /home/joshua/netcore/dotnet/Tools/tests.targets(579,5): error : One or more tests failed while running tests from 'System.Drawing.Common.Tests' please check /home/joshua/netcore/dotnet/bin/tests/System.Drawing.Common.Tests/netcoreapp-Linux-Debug-x64/testResults.xml for details! [/home/joshua/netcore/dotnet/src/System.Drawing.Common/tests/System.Drawing.Common.Tests.csproj] which are the same tests that fail on master for me. |
@stephentoub : One last commit to get rid of the remaining typos. I'm a poor speller. |
@janvorli, does this look good to you? |
@dotnet-bot test Packaging All Configurations x64 Debug Build please |
Thanks, @joshudson. At this point we'll merge it. If we need to revert for some reason, thankfully it's a one character change to delete the 'v' :-) |
Yes, it does. I am sorry for a late response, I was OOF last week. Additionally, I was wondering if it would make sense to add an env variable that would enable switching back to |
Why would someone be uncomfortable? Your comment makes me worried again we shouldn't be using it. |
I am not worried about it, but I have thought some people might be due to the negative articles mentioned in the comments above. |
That's a fun tuning knob. Somebody writes this code (2007 me), gets the memory corruption, changes the knob, slows down the GC, and it goes away:
In 2007 there was this bug where the callback function went out of scope immediately despite the documentation saying when the Enum function returned. I don't know if it still exists. Things like these can be set up that do easily enough. In general, expect a few timing issues being reported as
|
I think we should either be confident enough in the change to be able to explain to everyone why it's safe, or we should revert it. Having a knob for this seems wrong to me.
Can you elaborate on this? What kinds of timing-related issues are you expecting to see reported? |
It changes GC speed so use-after-free p/invoke bugs can look like this knob does something. |
Yeah, I think you are right. |
…fx#33289) * Use vfork() to start child processes where this yields a performance improvement due to getting rid of page faults. * Remove specific dependency on shared memory vfork so that cross compiles work again. * Added signal mask code so that a child process can't confuse the parent; also took care of pthread cancellation mask in case third-party native code tries to use it * Fix issues from vfork() pull request review * Check handler before replacing it * Improve readability of signal handler removing * Convert tabs to spaces * use NSIG instead of dynamic probing because glibc punches a hole in the middle of the signal list * Exclude Mac OSX from vfork() because we don't quite trust it. * Fix one last batch of typos Commit migrated from dotnet/corefx@0a561e3
Use
vfork()
to start child processes where this yields a performance improvement due to getting rid of page faults.The larger the host process, the bigger the improvement. For a one gigabyte process,
vfork()
is literally 150 times faster thanfork()
; however most of the performance penalty is incorrectly being charged to the garbage collector.