This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Use vfork() improve performance when starting processes. #33289

Merged
merged 10 commits into from
Feb 15, 2019


joshudson
Contributor

Use vfork() to start child processes where this yields a performance improvement due to getting rid of page faults.

The larger the host process, the bigger the improvement. For a one gigabyte process, vfork() is literally 150 times faster than fork(); however most of the performance penalty is incorrectly being charged to the garbage collector.

…improvement due to getting rid of page faults.
@joshudson
Contributor Author

joshudson commented Nov 7, 2018

Benchmark code written specifically to demonstrate vfork()'s performance superiority:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <time.h>
#include <sys/wait.h>

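/* FORK is expected to be defined at build time, e.g. -DFORK=fork or -DFORK=vfork. */
int main(void)
{
        char *args[] = { "/bin/false", NULL };
        /* Allocate 1 GiB and touch every page so it is resident before timing. */
        volatile char *buffer = malloc(1024 * 1024 * 1024);
        if (buffer == NULL)
                return 2;
        for (int i = 0; i < 1024 * 1024 * 1024; i += 4096)
                buffer[i] = 1;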
        time_t start = time(NULL);
        pid_t pid;
        int status;
        for (int j = 0; j < 1000; j++) {
#ifndef DUMMY
                if ((pid = FORK()) == 0) {
                        execve(args[0], args, NULL);
                        write(2, "Oops\n", 5);
                        _exit(3);
                }
                if (pid < 0)
                        _exit(3);
                waitpid(pid, &status, 0);
                if ((status & 0xFF) > 1) _exit(1);
#endif
#ifndef NOPAGEFAULT
                for (int i = 0; i < 1024 * 1024 * 1024; i += 4096)
                        buffer[i] = 2;
#endif
        }
        printf("%d\n", (int)(time(NULL) - start));
        exit(0);
}

@joshudson
Contributor Author

Gaaa; 3 builds failed because they're cross compiles which is literally documented as not supported.

@danmoseley danmoseley requested a review from tmds November 7, 2018 03:53
// ptrace() is used on the child, thus making setuid() safe to use after vfork(). The fabled vfork()
// security hole is the other way around; if a multithreaded host were to execute setuid()
// on another thread while a vfork() child is still pending, bad things are possible; however we
// do not do that.
Member

Is this important to mention? Can you phrase it as something we mustn't do?

@tmds
Member

tmds commented Nov 7, 2018

I did a review and my comments were mostly about the comments.

For a one gigabyte process, vfork() is literally 150 times faster than fork();

I think it would be interesting to see a .NET Code benchmark with these changes, using Process.Start API.

Does OSX HAVE_VFORK_SHM?

@stephentoub
Member

I'm concerned by some of the man page comments.

The requirements put on vfork() by the standards are weaker than
       those put on fork(2), so an implementation where the two are
       synonymous is compliant.  In particular, the programmer cannot rely
       on the parent remaining blocked until the child either terminates or
       calls execve(2), and cannot rely on any specific behavior with
       respect to shared memory.

and

When vfork() is called in a multithreaded process, only the calling
       thread is suspended until the child terminates or executes a new
       program.  This means that the child is sharing an address space with
       other running code.  This can be dangerous if another thread in the
       parent process changes credentials (using setuid(2) or similar),
       since there are now two processes with different privilege levels
       running in the same address space.  As an example of the dangers,
       suppose that a multithreaded program running as root creates a child
       using vfork().  After the vfork(), a thread in the parent process
       drops the process to an unprivileged user in order to run some
       untrusted code (e.g., perhaps via plug-in opened with dlopen(3)).  In
       this case, attacks are possible where the parent process uses mmap(2)
       to map in code that will be executed by the privileged child process.

Seems dangerous to rely on this purely by checking whether the function exists.

@tmds
Member

tmds commented Nov 7, 2018

I'm concerned by some of the man page comments.

+1 vfork semantics are not portable. I think it will work fine on Linux and BSDs when implemented 'correctly'.
It may be hard to figure out what is correct. For example, the go implementation makes the parent function return immediately after the vfork call: https://github.com/golang/go/blob/50bd1c4d4eb4fac8ddeb5f063c099daccfb71b26/src/syscall/exec_linux.go#L162-L167. We're not doing that, maybe we should?

The C example showing the 150x improvement is a worst-case scenario where the parent starts writing to 1GB of data (at 1 byte per page) just after the fork.
In the .NET implementation the parent thread is waiting for the child to exec (so it can't write). While other threads may be writing, probably they are doing it sequentially, limiting the number of pages to be copied.
So the gain in .NET will be much lower, and maybe not worth the complexity and non-portability issues.

@bartonjs
Member

bartonjs commented Nov 7, 2018

The behavior of calling vfork is not defined if setuid is used, which we use if the process is being launched with explicit credentials. It's probably also not defined if change working directory is set.

Given that vfork self-describes these cautions, I'd want to see it not used during the cred-set or CWD-set paths.

My gut feel is that it's just inherently more risk than reward.

@joshudson
Contributor Author

chdir is defined for the same reason dup2 is.

I deliberately did not check for the existence of vfork() but put a cmake check for the behavior itself. I am now debating ripping out the specific dependency because cross compile.

setuid() is safe because ptrace() makes a security demand against the parent process.

@joshudson
Contributor Author

The equivalent worst case is guaranteed to happen in .NET sooner or later: pack the heap right after process start.

@stephentoub
Member

My gut feel is that it's just inherently more risk than reward.

+1

@joshudson, what impact does this have on a "normal" .NET process using Process.Start? For example, does this make a measurable improvement to the dotnet command when it spawns processes?

@joshudson
Contributor Author

@bartonjs: Any platform in which calling setuid in the vfork child is a security vulnerability is already unsafe because calling setuid in the fork child is an information disclosure vulnerability.

The vfork+setuid referenced in the man page I already called out. Don't ever call setuid in the parent while a child is in vfork; which we don't do.

@stephentoub
Member

@jkotas, @janvorli, do you have an opinion on this one?

@jkotas
Member

jkotas commented Dec 13, 2018

There are many articles that warn about vfork security and portability issues. From https://wiki.sei.cmu.edu/confluence/pages/viewpage.action?pageId=87152373:

Do not use vfork()

I agree that checking for vfork presence is not enough given this. I think vfork would be ok to use only on platforms or situations where vfork does not suffer from these ugly issues. I do not know whether such cases exist.

@joshudson
Contributor Author

joshudson commented Dec 13, 2018

That article's just wrong. It lists the original call chain vfork was added for as undefined.

Incidentally I found out why Go arranged to have the vfork parent and child in different functions. Go has a M:N threading model and implicit async.

@stephentoub
Member

@joshudson, can you point to any official documentation / man pages / etc. for key platforms that specifically states vfork as being safe / recommended? At the moment, the potential risk does not seem to be worth the potential benefit, which I expect to be limited in the common case, as @tmds outlined.

@joshudson
Contributor Author

https://gist.github.com/nicowilliams/a8a07b0fc75df05f684c23c18d7db234

Quite a surprising document really. He starts talking about adding another system call about half way through but he starts off with the address space problem.

https://sourceware.org/bugzilla/show_bug.cgi?id=10354

Glibc switching to vfork.

https://gitlab.gnome.org/GNOME/glib/merge_requests/95

Gnome switching to vfork via posix_spawn

http://nommu.org

I don't know if you will ever support such processors but sometimes you don't get fork at all. It's vfork or don't compile.

@janvorli
Member

@joshudson thank you for the links to the articles. I have no prior knowledge of vfork, so I have to catch up and also read the articles. But one thing that seems to be needed if we wanted to add the vfork is to make sure no signal handlers can be called while running in the child before the exec, as described by Rich Felker in https://sourceware.org/bugzilla/show_bug.cgi?id=14750.

@stephentoub have we ever considered using posix_spawn instead of fork / exec? It seems that if that were possible, we would get the same benefit as with using vfork, taking advantage of the fact that GLIBC / musl developers have already done the necessary steps to ensure it is safe.

@joshudson
Contributor Author

You can't use posix_spawn because of chdir. Show me where you set signal handlers and I'll deal with that. (It's almost certainly broken now because the fork() child rarely can inherit signal handlers either).

@janvorli
Member

Show me where you set signal handlers

We deal with signal handlers in coreclr PAL in https://github.com/dotnet/coreclr/blob/master/src/pal/src/exception/signal.cpp and also here in corefx in https://github.com/dotnet/corefx/blob/master/src/Native/Unix/System.Native/pal_signal.c
But it is possible that third-party libraries used by .NET applications also register their own signal handlers, so I am not sure why you wanted to know where we deal with signals. My understanding (based on the web page I've linked above) was that all signals must be blocked before calling vfork; then, in the child, all signals that are not set to SIG_IGN are set to SIG_DFL, and the signal mask is restored before calling execve, and also in the parent. I understand "all" as really all, no matter which ones we set and don't set.
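
The sequence described above (block everything, vfork, reset handlers in the child, restore the mask on both sides) can be sketched roughly as follows. This is an illustrative sketch only, not the code from this pull request: `spawn_with_signals_blocked` is a hypothetical name, it uses `sigprocmask` for brevity where a multithreaded process would normally use `pthread_sigmask`, and it assumes a Linux-style `vfork()` in which the child gets its own signal dispositions and mask.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>   /* waitpid(), for the caller */
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: start `path` with every signal blocked across vfork().
 * Returns the child's pid, or -1 on failure. */
static pid_t spawn_with_signals_blocked(const char *path, char *const argv[])
{
    sigset_t all_signals, saved_mask;
    pid_t pid;

    assert(path != NULL && argv != NULL);

    /* 1. Block all signals so no handler can run in the child before exec. */
    sigfillset(&all_signals);
    if (sigprocmask(SIG_SETMASK, &all_signals, &saved_mask) != 0)
        return -1;

    pid = vfork();
    if (pid == 0) {
        /* 2. Child: reset every caught signal to its default action.
         *    Signals already at SIG_IGN or SIG_DFL are left untouched. */
        for (int sig = 1; sig < NSIG; sig++) {
            struct sigaction current;
            if (sig == SIGKILL || sig == SIGSTOP)
                continue;   /* cannot be caught or changed */
            if (sigaction(sig, NULL, &current) != 0)
                continue;   /* e.g. signals reserved by the C library */
            if (current.sa_handler == SIG_IGN || current.sa_handler == SIG_DFL)
                continue;
            current.sa_handler = SIG_DFL;
            current.sa_flags = 0;
            sigaction(sig, &current, NULL);
        }
        /* 3. Restore the original mask so the new program does not inherit
         *    a fully blocked mask, then exec. */
        sigprocmask(SIG_SETMASK, &saved_mask, NULL);
        execve(path, argv, environ);
        _exit(127);         /* exec failed */
    }

    /* 4. Parent (also reached when vfork() fails): restore the mask. */
    sigprocmask(SIG_SETMASK, &saved_mask, NULL);
    return pid;
}
```

A caller would pass an absolute path and an argument vector, then collect the child with `waitpid(pid, &status, 0)` as usual.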

@joshudson
Contributor Author

I read your signal handlers. They're not safe in fork children either, as expected. On considering third party signal handlers for the first time, looks like blanket mask is correct whether fork or vfork is used.

@stephentoub
Member

They're not safe in fork children either, as expected.

Can you elaborate? What's broken?

@joshudson
Contributor Author

joshudson commented Dec 13, 2018

If the fork child were to receive a handled signal, it would write to the controlling pipe just like the parent does, but the other end is not expecting a spurious pipe read.

Pretty much any signal handler that doesn't terminate the process and does more than write to a global variable isn't safe in a fork child, so I'm not surprised.

I was expecting to find only SIGSEGV and SIGBUS -> (if managed block throw exception else die) which would have been safe even for vfork.

…nt; also took care of pthread cancellation mask in case third-party native code tries to use it
@joshudson
Contributor Author

joshudson commented Dec 14, 2018

There's the signal mask, the signal handling cleaning, and the pthread_cancel mask for good measure.

Portability is a harsh mistress.

(Once considering third party libraries trying to use signals, this stuff is clearly broken even before trying to use vfork()).

if ((processId = fork()) == -1)
// The fork child must not be signalled until it calls exec(): our signal handlers do not
// handle being raised in the child process correctly
sigfillset(&signal_set);
Member

A nit - use SIGALL_SET instead of creating your own set

Contributor Author

joshua@novaϟ find /usr -name '*.h' -print0 | xargs -0 grep -i SIGALL_SET /dev/null
joshua@novaϟ

SIGALL_SET is not in my system header files.

{
if (sig != SIGKILL && sig != SIGSTOP)
break; // No more signals
} else {
Member

A code style nit - the curly braces should be on separate lines.

: (sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL))
{
// It has a pre-defined handler -- put it back
sigaction(sig, &sa_old, &sa_trash);
Member

Most of the signals will fall into this category. So it would be better to query the previous handler first and then only set it for signals that don't have it set to SIG_IGN or SIG_DFL, instead of setting it and then resetting it back.

Contributor Author

I tried and failed to locate a portable API call for reading the signal handler w/o setting it.

Member

When you pass NULL as the act parameter and non-NULL as the oldact, it just gets the current value.
sigaction Linux man page says:

If act is non-NULL, the new action for signal signum is installed
from act. If oldact is non-NULL, the previous action is saved in
oldact.

FreeBSD man page says:

If act is non-NULL, it specifies an action ...

OSX man page says:

If act is non-zero, it specifies an action ...
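
The query-only form of `sigaction` quoted above can be sketched as follows. The helper names `has_custom_handler` and `noop_handler` are hypothetical and exist only for illustration; the sketch also assumes that comparing `sa_handler` against `SIG_DFL` and `SIG_IGN` is sufficient, which holds on common platforms where `sa_handler` and `sa_sigaction` share storage.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical do-nothing handler, used only to demonstrate the query. */
static void noop_handler(int sig)
{
    (void)sig;
}

/* Returns 1 if `sig` currently has a user-installed handler, 0 if it is
 * SIG_DFL or SIG_IGN, and -1 if the disposition cannot be queried.
 * Passing NULL as the new action leaves the disposition unchanged. */
static int has_custom_handler(int sig)
{
    struct sigaction current;

    assert(sig > 0 && sig < NSIG);
    if (sigaction(sig, NULL, &current) != 0)
        return -1;
    return current.sa_handler != SIG_DFL && current.sa_handler != SIG_IGN;
}
```

With this, a reset loop only needs to call `sigaction` with a non-NULL `act` for signals where `has_custom_handler` returns 1, for example after something like `signal(SIGUSR1, noop_handler)` has been called.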

@janvorli
Member

@stephentoub thank you for reminding me of that. I have not looked into that yet.

}
if (sigaction(sig, NULL, &sa_old))
{
break; // No more signals


Breaking here is incorrect because sigaction in glibc returns -1 for signals used by it internally. Since in practice those signals are 32/33 and the total number of signals is >=64, the loop will stop early.

Contributor Author

Well that's undocumented nonsense. I'll just go use NSIG then.


It's documented: http://man7.org/linux/man-pages/man2/sigaction.2.html (section C library/kernel differences).

Contributor Author

I really don't want to cause any controversy on this; but somehow my man page on sigaction has about half the content. I guess they fixed their docs.
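
To illustrate the point raised in this thread, here is a small sketch that walks the signal numbers up to `NSIG` and compares stopping at the first `sigaction` failure with skipping over it. The function name `count_queryable_signals` is hypothetical. On a glibc system, where the thread says signals 32 and 33 are rejected, the early-stopping variant would be expected to see fewer signals than the skipping one; the exact counts depend on the C library.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Counts the signals whose disposition can be queried with sigaction().
 * If stop_at_first_failure is nonzero, the loop breaks on the first error,
 * which mimics the dynamic-probing approach criticised in the review;
 * otherwise it skips the failing signal and keeps going up to NSIG. */
static int count_queryable_signals(int stop_at_first_failure)
{
    int count = 0;

    for (int sig = 1; sig < NSIG; sig++) {
        struct sigaction current;
        if (sigaction(sig, NULL, &current) != 0) {
            if (stop_at_first_failure)
                break;      /* would miss everything above a reserved signal */
            continue;       /* skip reserved signals, keep scanning */
        }
        count++;
    }
    return count;
}
```

Comparing `count_queryable_signals(1)` with `count_queryable_signals(0)` shows how many signals an early `break` would leave unprocessed.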

@izbyshev

The WSL bug is so bad that we should sit on this until the fix makes it to the main release. We are not fine as the bug randomly corrupts memory because it doesn't block but also doesn't replace with fork.

The WSL bug doesn't appear to be so bad. Simple testing on 1709 indicates that vfork behaves just like fork:

$ cat test.c
#include <stdio.h>
#include <unistd.h>

int main(void) {
    volatile int x = 0;
    if (vfork() == 0) {
        sleep(1);
        write(1, "child\n", 6);
        x = 42;
        _exit(0);
    }
    printf("parent\n");
    sleep(2);
    printf("%d\n", x);
    return 0;
}
$ gcc test.c
$ ./a.out
parent
child
0

0 indicates that address space is not actually shared. On 1803, where vfork was fixed, the output is correct:

child
parent
42

@joshudson
Contributor Author

@izbyshev : OK then. The documentation said it was worse.

@janvorli
Member

janvorli commented Feb 1, 2019

@stephentoub so I did experiments with vfork / fork. Here is what my test app does:

  • mmap certain amount of memory and writes something to every page.
  • call fork or vfork
  • in the child path, call execve that just calls ls for simplicity
  • in the parent path, wait for the child to exit and then exit too

I ran the tests on my native Linux box with 24GB of RAM and swap enabled / disabled. I've tried to increase the amount of memory the test was eating until the execve forking failed.
Here are the results

function  swap  max_mem
fork      on    17GB
vfork     on    24GB
fork      off   4.6GB
vfork     off   9.47GB

As you can see, vfork worked fine with much more memory consumed by the parent, with swap both on and off.
That means that if a process that consumes a lot of memory spawns a child process, it has much higher probability of failing with fork than with vfork.
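
The test app described in this comment was not posted, so the following is only a guess at its general shape: map some anonymous memory, write to every page, call `fork` or `vfork`, `execve` a program in the child, and wait in the parent. The function name `run_child_after_touching` and its parameters are hypothetical, and the sketch makes no attempt to reproduce the memory limits reported in the table.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Maps `bytes` of anonymous memory, dirties every page, then starts argv[0]
 * via fork() or vfork() and waits for it.
 * Returns the child's exit code, or -1 on any failure. */
static int run_child_after_touching(size_t bytes, int use_vfork, char *const argv[])
{
    long page = sysconf(_SC_PAGESIZE);
    volatile char *mem;
    pid_t pid;
    int result = -1;

    assert(bytes > 0 && argv != NULL && argv[0] != NULL);
    if (page <= 0)
        return -1;

    mem = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    /* Write to every page so the memory is actually committed. */
    for (size_t off = 0; off < bytes; off += (size_t)page)
        mem[off] = 1;

    if (use_vfork)
        pid = vfork();
    else
        pid = fork();

    if (pid == 0) {
        /* Child: only exec or _exit, nothing else. */
        execve(argv[0], argv, environ);
        _exit(127);
    }

    if (pid > 0) {
        int status;
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            result = WEXITSTATUS(status);
    }

    munmap((void *)mem, bytes);
    return result;
}
```

Increasing `bytes` until the call starts returning -1, once with `use_vfork` set to 0 and once set to 1, would be one way to approximate the experiment.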

@stephentoub
Member

so I did experiments with vfork / fork

Thanks, @janvorli. And you're comfortable with the change, such that if vfork is available, we use it? i.e. all of the previously raised concerns are no longer applicable on any platform?

@joshudson
Contributor Author

On the other hand I was expecting somebody to tell me what particular build magic you wanted to use to exclude Mac (the only suspect vfork() implementation even being considered for .NET support). I went over the modern BSD documents and there's no issue there.

@janvorli
Member

janvorli commented Feb 8, 2019

@joshudson #ifdef __APPLE__ is all the magic you need.

@joshudson
Contributor Author

Something's up with master builds; looks more like the build chain is broken than anything I did. My machine yields:

/home/joshua/netcore/dotnet/Tools/tests.targets(579,5): error : One or more tests failed while running tests from 'System.Drawing.Common.Tests' please check /home/joshua/netcore/dotnet/bin/tests/System.Drawing.Common.Tests/netcoreapp-Linux-Debug-x64/testResults.xml for details! [/home/joshua/netcore/dotnet/src/System.Drawing.Common/tests/System.Drawing.Common.Tests.csproj]
/home/joshua/netcore/dotnet/Tools/tests.targets(579,5): error : One or more tests failed while running tests from 'System.Net.NameResolution.Pal.Tests' please check /home/joshua/netcore/dotnet/bin/tests/System.Net.NameResolution.Pal.Tests/netcoreapp-Linux-Debug-x64/testResults.xml for details! [/home/joshua/netcore/dotnet/src/System.Net.NameResolution/tests/PalTests/System.Net.NameResolution.Pal.Tests.csproj]
/home/joshua/netcore/dotnet/dir.traversal.targets(77,5): error : (No message specified) [/home/joshua/netcore/dotnet/src/tests.builds]
4 Warning(s)
3 Error(s)

which are the same tests that fail on master for me.

@joshudson
Contributor Author

@stephentoub : One last commit to get rid of the remaining typos. I'm a poor speller.

@stephentoub
Member

@janvorli, does this look good to you?

@stephentoub
Member

@dotnet-bot test Packaging All Configurations x64 Debug Build please
@dotnet-bot test UWP CoreCLR x64 Debug Build please

@stephentoub stephentoub reopened this Feb 14, 2019
@stephentoub
Member

Thanks, @joshudson. At this point we'll merge it. If we need to revert for some reason, thankfully it's a one character change to delete the 'v' :-)

@stephentoub stephentoub merged commit 0a561e3 into dotnet:master Feb 15, 2019
@janvorli
Member

@janvorli, does this look good to you?

Yes, it does. I am sorry for a late response, I was OOF last week.

Additionally, I was wondering if it would make sense to add an env variable that would enable switching back to fork for users that would not be comfortable with the fact we use vfork for some reason.

@stephentoub
Member

Additionally, I was wondering if it would make sense to add an env variable that would enable switching back to fork for users that would not be comfortable with the fact we use vfork for some reason.

Why would someone be uncomfortable? Your comment makes me worried again we shouldn't be using it.

@janvorli
Member

I am not worried about it, but I have thought some people might be due to the negative articles mentioned in the comments above.

@joshudson
Contributor Author

joshudson commented Feb 18, 2019

That's a fun tuning knob. Somebody writes this code (2007 me), gets the memory corruption, changes the knob, slows down the GC, and it goes away:

NativeMethods.EnumSomething(myCallback);

In 2007 there was a bug where the callback function went out of scope immediately, despite the documentation saying it stayed alive until the Enum function returned. I don't know if it still exists. Situations like this can arise easily enough.

In general, expect a few timing issues being reported as vfork problems. Otherwise I'd be meh about

if (0==(pid=((strcmp(getenv("COREFX_USE_FORK")?:"","1")?vfork:fork)())))

vfork is like setjmp; you can't write (condition? vfork ():fork()) so you have to write (condition?vfork:fork)() . See K&R on setjmp for details.
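
The one-liner above relies on the GNU `?:` extension. A more portable spelling of the same idea might look like the sketch below. `COREFX_USE_FORK` is the variable name from the comment, which the thread went on to reject; `fork_requested` and `exit_code_roundtrip` are hypothetical helpers, and the child here only calls `_exit` so the example stays self-contained.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 when COREFX_USE_FORK is set to exactly "1", otherwise 0. */
static int fork_requested(void)
{
    const char *value = getenv("COREFX_USE_FORK");
    return value != NULL && strcmp(value, "1") == 0;
}

/* Starts a child that immediately exits with `code` and returns the exit
 * code observed by the parent, or -1 on failure. */
static int exit_code_roundtrip(int code)
{
    /* Select the function first, then call it from this frame, following
     * the (condition ? vfork : fork)() pattern from the comment. */
    pid_t (*spawn)(void) = fork_requested() ? fork : vfork;
    pid_t pid;
    int status;

    assert(code >= 0 && code <= 255);

    pid = spawn();
    if (pid == 0)
        _exit(code);        /* child: no return, no other work */
    if (pid < 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Setting `COREFX_USE_FORK=1` in the environment before the call selects `fork`; any other value, or no value, selects `vfork`.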

@stephentoub
Member

I am not worried about it, but I have thought some people might be due to the negative articles mentioned in the comments above.

I think we should either be confident enough in the change to be able to explain to everyone why it's safe, or we should revert it. Having a knob for this seems wrong to me.

In general, expect a few timing issues being reported as vfork problems.

Can you elaborate on this? What kinds of timing-related issues are you expecting to see reported?

@joshudson
Contributor Author

It changes GC speed so use-after-free p/invoke bugs can look like this knob does something.

@janvorli
Member

I think we should either be confident enough in the change to be able to explain to everyone why it's safe, or we should revert it.

Yeah, I think you are right.

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…fx#33289)

* Use vfork() to start child processes where this yields a performance improvement due to getting rid of page faults.

* Remove specific dependency on shared memory vfork so that cross compiles work again.

* Added signal mask code so that a child process can't confuse the parent; also took care of pthread cancellation mask in case third-party native code tries to use it

* Fix issues from vfork() pull request review

* Check handler before replacing it

* Improve readability of signal handler removing

* Convert tabs to spaces

* use NSIG instead of dynamic probing because glibc punches a hole in the middle of the signal list

* Exclude Mac OSX from vfork() because we don't quite trust it.

* Fix one last batch of typos


Commit migrated from dotnet/corefx@0a561e3

9 participants