Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process.WaitForExit() hangs forever even though the process has exited in some cases #29232

Closed
madelson opened this issue Apr 11, 2019 · 39 comments

Comments

@madelson
Copy link
Contributor

Setup

Launch a process with stream redirection where that process itself launches a long-running process:

// in the host process (process1.exe)
var process2 = new Process
{
    StartInfo = 
    {
        FileName = "process2.exe"
        UseShellExecute = false,
        RedirectStandardOutput = true
    }
};
process2.OutputDataReceived += (o, e) => Console.WriteLine(e.Data);
process2.Start();
process2.BeginOutputReadLine();
process2.WaitForExit(); // never returns, even though process2 does exit fairly quickly
return; // returns from Main(), exiting

// in process2.exe
Console.WriteLine("Launching process3.exe!");
var process3 = new Process { StartInfo = { FileName = "process3.exe", UseShellExecute = false } };
process3.Start();
process3.Dispose();

// in process3.exe
while (true) // loop basically forever performing background tasks
{
     ...
}

Expected Behavior
Process1 receives the output of Process2. Once Process2 has exited, Process1 can move on from its WaitForExit() call. Meanwhile Process3 continues to run.

Actual Behavior
WaitForExit() never returns.

Notes
I believe the problem is that WaitForExit() waits for EOF to be returned from redirected streams that are being consumed asynchronously. However, I think that Process3 is somehow inheriting the stdio handles from Process2 which prevents EOF from being sent even after Process2 exits.

Note that unlike #26165 and #27115 , I am observing the behavior on Windows. I'm not sure if the same would happen on Linux.

@tmds
Copy link
Member

tmds commented Apr 12, 2019

I believe the problem is that WaitForExit() waits for EOF to be returned from redirected streams that are being consumed asynchronously. However, I think that Process3 is somehow inheriting the stdio handles from Process2 which prevents EOF from being sent even after Process2 exits.

Yes, Process2 and Process3 share the same standard out. What happens when you RedirectStandardOutput = true for Process3?

@tmds
Copy link
Member

tmds commented Apr 12, 2019

WaitForExit involves waiting for the output to reach EOF, which won't happen because Process3 is keeping it alive.

https://github.com/dotnet/corefx/blob/007571b90c3cc555cd30f3b467b87c22f9f665e0/src/System.Diagnostics.Process/src/System/Diagnostics/Process.Unix.cs#L210-L228

@danmoseley
Copy link
Member

@tmds the code above is Unix, the report is Windows (you probably intended this but just pointing out in case)

Looks like the code has a typo "WaitUtilEOF"

@tmds
Copy link
Member

tmds commented Apr 13, 2019

@danmosemsft what is the rationale for the WaitUtilEOF? From the code I think it is not really needed on Unix, but it seems to serve a role on Windows: the handle is Disposed after the WaitUtilEOF.

@madelson these two options should make it do what you want:

  • RedirectStandardOutput&Error = true of Process3
  • Close() StandardOutput&Error of Process2

@danmoseley
Copy link
Member

@tmds I do not know, and an not able to examine the code at the moment. Perhaps @krwq @stephentoub know.

@madelson
Copy link
Contributor Author

madelson commented Apr 13, 2019

What happens when you RedirectStandardOutput = true for Process3?

@tmds I agree that this would probably fix this specific situation. The problem is that in this case I don't control the source of process dotnet/corefx#2; that's just what it does.

@tmds
Copy link
Member

tmds commented Apr 13, 2019

@madelson you can do the Close, that should work too.

@danmoseley
Copy link
Member

@madelson can you please confirm this fixes it?

@madelson
Copy link
Contributor Author

@danmosemsft I will confirm soon when I next get a chance to retry the code. Thinking about this more I’m not convinced it will fix

@tmds
Copy link
Member

tmds commented Apr 17, 2019

Perhaps @krwq @stephentoub know.

@krwq @stephentoub do you know why WaitForExit reads standard out and error until EOF?

@krwq
Copy link
Member

krwq commented Apr 20, 2019

@tmds got no context on that, perhaps some simplification to avoid races between events and exit. If the question is just about linux then I suspect the code was ported from Windows and it's there for parity - if you think it shouldn't be there I think it makes sense to try to remove it and try running tests for some time in a loop and perhaps add some more targeted to see if there are any issues.

@madelson
Copy link
Contributor Author

@krwq I imagine that partly what it's trying to achieve is to give you some reasonable way of knowing when everything is done and no more data is coming. With WaitForExit() not returning until this happens, it allows you to set up data received events, wait for the process to finish, and then move on.

@tmds
Copy link
Member

tmds commented Apr 23, 2019

@madelson did the Close of process2 stdout/stderr work for you?

I think it makes sense to try to remove it and try running tests for some time in a loop and perhaps add some more targeted to see if there are any issues.

We should keep it in both, or remove it in both, to have the same behavior. I wonder if on Windows the Dispose of the handle would affect the behavior of standard out and standard error if they were not yet read until EOF. I'll make a PR that removes the readuntileof and see if some tests fail.

@stephentoub
Copy link
Member

Yeah, the "eof" part is a bit of a misnomer. It's really waiting until all background processing completes.

@madelson
Copy link
Contributor Author

@stephentoub perhaps a solution would be for the process streams to support cancellation / timeout. If they did, then something could be set up where the process exiting cancels any outstanding reads (although even in that case you might have to try one more read on the stream to be sure that there's nothing left in the pipe).

@Peperud
Copy link

Peperud commented Dec 11, 2019

Is there a workaround for this?

@danmoseley
Copy link
Member

@madelson
Copy link
Contributor Author

Close() StandardOutput&Error of Process2

did the Close of process2 stdout/stderr work for you?

@tmds I don't think this is a workaround because I am trying to capture the standard output of process2. If I close stdout before waiting for exit, I won't get the output. If I wait for exit before closing, then the program hangs.

@tmds
Copy link
Member

tmds commented Dec 17, 2019

@tmds I don't think this is a workaround because I am trying to capture the standard output of process2. If I close stdout before waiting for exit, I won't get the output. If I wait for exit before closing, then the program hangs.

Then you need to redirect the output of process 3, otherwise you can't know when process2 output ends.

@madelson
Copy link
Contributor Author

Then you need to redirect the output of process 3, otherwise you can't know when process2 output ends.

@tmds is this because processes 2 and 3 share a console? Otherwise, I don't see how one would affect the other.

@tmds
Copy link
Member

tmds commented Dec 18, 2019

Yes, when starting process2, process1 creates a pipe (RedirectStandardOutput) and both process2 and process3 stdout write to it.

@Peperud
Copy link

Peperud commented Jan 7, 2020

Am I reading this right? I must redirect the output of any process launched down the chain?

I can only ensure compliance for the processes that I start directly. But if any of them launches a process of their own I'll be doomed?

Doesn't sound quite right to me.

@tmds
Copy link
Member

tmds commented Jan 8, 2020

Am I reading this right? I must redirect the output of any process launched down the chain?

You are reading it right.
If you want to distinguish between the output of a direct child, and their children, those outputs need to write to different pipes.

I can only ensure compliance for the processes that I start directly. But if any of them launches a process of their own I'll be doomed?

It's a bug when a program launches a child that keeps stdout open when the child isn't going to provide output (especially if that child out-lives the parent process).

Related API proposal: https://github.com/dotnet/corefx/issues/35685

@Peperud
Copy link

Peperud commented Jan 8, 2020

You are reading it right.
If you want to distinguish between the output of a direct child, and their children, those outputs need to write to different pipes.

Understood, but I'm not concerned with distinguishing the outputs here.

It's a bug when a program launches a child that keeps stdout open when the child isn't going to provide output (especially if that child out-lives the parent process).

Not sure if this applies to what I see. No child down the line outlives its parent. All children (potentially) write output. Yet, the parent hangs on WaitForExit

Related API proposal: #28838

@tmds
Copy link
Member

tmds commented Jan 8, 2020

I'm not concerned with distinguishing the outputs here

I mean you don't want grand children keeping standard output open (even if they don't actively write to it).

No child down the line outlives its parent. All children (potentially) write output. Yet, the parent hangs on WaitForExit

If the child and everything down the line has exited, WaitForExit should return.
Are you running in a container?
Can you reproduce this?

@Peperud
Copy link

Peperud commented Jan 9, 2020

If the child and everything down the line has exited, WaitForExit should return.
Are you running in a container?

Yes.

Can you reproduce this?

I've moved away to a different arrangement, but if I get some time I'll try to reproduce again.

@tmds
Copy link
Member

tmds commented Jan 9, 2020

Are you running in a container?
Yes

WaitForExit can be quirky in containers when there is no init process. The init process reaps orphaned child processes. You can add one with docker run by adding an --init argument.

@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the Future milestone Feb 1, 2020
@riku76
Copy link

riku76 commented Feb 13, 2020

I have the exact same issue on Linux that repeats when a native daemon is running my .NET Core 3.1.1 application as a child and this child is spawning another native child (eg. df or grep) the WaitForExit never returns even if the StandardOutput/Error are NOT redirected. From htop I can see that the child exits, shows quickly as a Zombie and then goes away, so it looks like .NET Core is closing the handle, but somehow ignoring that it exited. Also, the same code works fine if executed on command-line OR under Mono 6.4.0 (even under daemon). Platform is Raspberry Pi 4 (armv7).

@tmds
Copy link
Member

tmds commented Feb 13, 2020

@riku76 the issue described here is related to how a redirected stream keeps WaitForExit from returning. Since you are not redirecting streams, the root cause is different. Can you create a new issue?

@riku76
Copy link

riku76 commented Feb 13, 2020

@tmds created: #32225

@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@adamsitnik adamsitnik removed the untriaged New issue has not been triaged by the area owner label Jul 6, 2020
@EHerzog76
Copy link

Process.WaitForExit() hangs forever
Process.WaitForExit(60000) hangs until the timeout is reached.

In the case, when you use RedirectStandardOutput
-) and you read the result syncron with Proess.ReadToEnd()
-) and you start a child-process e.g.:
Process.StartInfo.FileName = "powershell.exe";
Process.StartInfo.Arguments = "-NonInteractive -NoProfile -Command "& {Write-Output "Create a session with New-PSSession ..."; Invoke-Command -Session $session -ScriptBlock { Get-Process | ConvertTo-Json }; Write-Output "Delete your session ..."}";

Until now the only work-a-round is to read the StandardOutput asynchron with Process.BeginOutputReadLine();
Working code pattern can be found here:
https://stackoverflow.com/questions/139593/processstartinfo-hanging-on-waitforexit-why/53504707#53504707

@Evengard
Copy link

Evengard commented Mar 2, 2023

Any news on that one? dotnet test seems to be affected by that, and every dotnet test in my CI is running for 15 minutes because of that.

@tmds
Copy link
Member

tmds commented Mar 2, 2023

The expected behavior is that WaitForExit, when output is redirected, waits until the output has finished.

If the are grandchildren keeping the output open, then they keep the WaitForExit from returning. Though that may seem weird, output is output independent of it being produced by the direct child or a grandchild. If the grandchild isn't actually providing output, then the child should start it with redirected output. That is an issue with the child process, not with the .NET process that started the child.

For WaitForExit not working well in containers, there was a fix in .NET 7.0.3+.

@Evengard
Copy link

Evengard commented Mar 2, 2023

I just tried a .NET 7.0 docker container (sdk:7.0), it still hangs. So the problem probably is still here...

@tmds
Copy link
Member

tmds commented Mar 2, 2023

Does dotnet test work fine for you when running outside of a container, and do you only have an issue when it runs in a container?

@Evengard
Copy link

Evengard commented Mar 2, 2023

Seems like so. I've run the same command on the same host system on which I run the (kubernetes with k3s) containers with the same projects, in containers it rarely doesn't hang but most of the times still hang, on the host system seems to be fine.

@tmds
Copy link
Member

tmds commented Mar 3, 2023

I'm going to close this issue, as it was about standard out being kept open by grand children, and that causing WaitForExit not to return.

As I mentioned in #29232 (comment), that is the expected behavior.
And, if the grand child is not supposed to keep standard output open, then the child is responsible for redirecting standard output when it starts the grand child.

@Evengard if you have an issue with WaitForExit hanging, that occurs only in containers, and reproduces with .NET 7.0.3+, then create a new issue and include some information on how to reproduce the problem.

@tmds tmds closed this as completed Mar 3, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Apr 2, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests