dapr run shut down app and daprd gracefully #696

Closed
wants to merge 2 commits into from

Conversation

pkedy
Member

@pkedy pkedy commented Apr 23, 2021

Send the running app and dapr an interrupt signal instead of killing them immediately. This allows the processes to clean up appropriately.

Description

This PR sends interrupt signals to the application and daprd processes and waits for them to shut down gracefully instead of killing them immediately. The user will see the 5-second delay while outstanding requests finish.

Issue reference

Resolves #695

Checklist

Please make sure you've completed the relevant tasks for this PR, out of the following list:

  • Code compiles correctly
  • Created/updated tests
  • Extended the documentation

@codecov

codecov bot commented Apr 23, 2021

Codecov Report

Merging #696 (3824ddb) into master (4d75238) will increase coverage by 0.10%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #696      +/-   ##
==========================================
+ Coverage   21.82%   21.92%   +0.10%     
==========================================
  Files          29       29              
  Lines        1535     1537       +2     
==========================================
+ Hits          335      337       +2     
  Misses       1159     1159              
  Partials       41       41              
Impacted Files          Coverage Δ
pkg/standalone/run.go   60.16% <100.00%> (+0.65%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4d75238...3824ddb.

Collaborator

@mukundansundar mukundansundar left a comment

@pkedy Is there a specific reason to load a DLL file to raise an event on Windows? Is there a simpler way to do this?
@artursouza @wcs1only Thoughts on this change?

Member

@artursouza artursouza left a comment

nit: there is a commented-out line of code. If it is not needed, please remove it.

@@ -359,10 +359,9 @@ func testRun(t *testing.T) {
output, err := spawn.Command(daprPath, "run", "--dapr-http-port", "9999", "--", "bash", "-c", "curl -v http://localhost:9999/v1.0/shutdown; sleep 10; exit 1")
t.Log(output)
require.NoError(t, err, "run failed")
assert.Contains(t, output, "Exited App successfully", "App should be shutdown before it has a chance to return non-zero")
//assert.Contains(t, output, "Exited App successfully", "App should be shutdown before it has a chance to return non-zero")
Member

Why is this commented out?

Member Author

I was trying to keep that check, but I could not find a way to. I'll put a comment there instead.

Collaborator

Is the output different from what was expected?

Member Author

It's been a while since I was working on this, but the issue is that if you send curl a CTRL-C/interrupt, it still comes back with a non-zero exit code depending on the OS. We need a process that exits with 0 even if sent an interrupt/CTRL-C.

Contributor

@wcs1only wcs1only May 5, 2021

Have you tried trap? Also, it is not curl that is getting interrupted here; it is the sleep.

Member Author

I thought about it but avoided it in case we can get Windows Docker working for GitHub Actions. (That would also assume Windows has some form of curl installed.)

Member Author

In the grand scheme of things, I'm not 💯 sure we really should test the exit code of the app, just that it ran and exited. Daprd has explicit code to exit with 0 if it is interrupted. Most apps don't do that.

Contributor

@wcs1only wcs1only May 7, 2021

Windows GitHub Actions has curl. This test works on Windows today.

Contributor

In the grand scheme of things, I'm not 💯 sure we really should test the exit code of the app, just that it ran and exited. Daprd has explicit code to exit with 0 if it is interrupted. Most apps don't do that.

That's fine if they don't do that. There is no harm in telling the user what the return code was.

@artursouza
Member

@mukundansundar Depending on a DLL is a problem. We should avoid that. I saw that the latest version does not have that so I am fine with this.

assert.Contains(t, output, "Exited App successfully", "App should be shutdown before it has a chance to return non-zero")
// It would be ideal to check that the spawned app exited with a zero
// exit code; however, when sending a SIGINT/CTRL-C to curl, it does not.
// In the future we could find a program that returns 0 in this scenario.
Contributor

You might as well not have this test at all without this check. The exit 1 is how we tell the difference between "shutdown received and respected" and "shutdown ignored".

willardstanley@Willards-MacBook-Pro cli % bash -c 'trap "exit 0" SIGINT; sleep 200; exit 12'; echo $? 
^C0

Contributor

@wcs1only wcs1only May 7, 2021

But it doesn't even have to be that sophisticated. You could just change exit 1 to echo -e "\nSHOULD_NEVER_GET_EXECUTED" and then do a NotContains check for "== APP == SHOULD_NEVER_GET_EXECUTED" on the output. The point is, we shouldn't just turn off a test like this...
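
A minimal sketch of that sentinel check, using only the standard library. `shutdownRespected` is a hypothetical helper, and the sample output strings are made up for illustration; in the real e2e test this would more likely be a testify `assert.NotContains` call on the `dapr run` output.

```go
package main

import (
	"fmt"
	"strings"
)

// sentinel is the marker the app would echo only if it kept running after
// the shutdown request.
const sentinel = "SHOULD_NEVER_GET_EXECUTED"

// shutdownRespected is a hypothetical helper. It reports whether the captured
// output is free of the sentinel line, i.e. the app was stopped before it
// reached the echo.
func shutdownRespected(output string) bool {
	return !strings.Contains(output, "== APP == "+sentinel)
}

func main() {
	// Illustrative output only; not captured from a real run.
	stopped := "== APP == some app log line\nExited App successfully\n"
	ignored := "== APP == some app log line\n== APP == " + sentinel + "\n"

	fmt.Println(shutdownRespected(stopped)) // true
	fmt.Println(shutdownRespected(ignored)) // false
}
```

This keeps the test meaningful without depending on the app's exit code after an interrupt.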

Contributor

Litmus test: If you mangle the curl URL, so that the shutdown API doesn't get called, does the test fail like it should?

@wcs1only
Contributor

wcs1only commented May 7, 2021

Did you run the E2E tests on Windows? When I run them on my Windows box for this change, they fail quite hard.

C:\Users\Charlie\cli>make e2e-build-run-sh
CGO_ENABLED=0 GOOS=windows GOARCH=amd64 go build  -ldflags "-X main.version=edge -X main.apiVersion=1.0" \
-o ./dist/windows_amd64/release/dapr.exe;
# The desire here is to download this test dependency without polluting go.mod
# In golang >=1.16 there is a new way to do this with `go install gotest.tools/gotestsum@latest`
# But this doesn't work with <=1.15, so we do it the old way for now
# (see: https://golang.org/ref/mod#go-install)
GO111MODULE=off go get gotest.tools/gotestsum
gotestsum --jsonfile test_output.json --format standard-verbose -- -count=1 -tags=e2e ./tests/e2e/standalone/...
=== RUN   TestStandaloneInstall
=== RUN   TestStandaloneInstall/test_install
    standalone_test.go:170: {"time":"2021-05-07T20:56:04.5943049Z","status":"pending","msg":"Making the jump to hyperspace..."}
        {"time":"2021-05-07T20:56:04.6100291Z","status":"pending","msg":"Downloading binaries and setting up components..."}
        {"time":"2021-05-07T20:56:13.3019072Z","status":"success","msg":"Downloading binaries and setting up components..."}
        {"time":"2021-05-07T20:56:13.3019535Z","status":"success","msg":"Downloaded binaries and completed components set up."}
        {"time":"2021-05-07T20:56:13.3019535Z","status":"info","msg":"daprd binary has been installed to C:\\Users\\Charlie\\.dapr\\bin."}
        {"time":"2021-05-07T20:56:14.1285943Z","status":"info","msg":"dapr_placement container is running."}
        {"time":"2021-05-07T20:56:14.9331928Z","status":"info","msg":"dapr_redis container is running."}
        {"time":"2021-05-07T20:56:15.7328786Z","status":"info","msg":"dapr_zipkin container is running."}
        {"time":"2021-05-07T20:56:15.7328786Z","status":"info","msg":"Use `docker ps` to check running containers."}
        {"time":"2021-05-07T20:56:15.7328786Z","status":"success","msg":"Success! Dapr is up and running. To get started, go here: https://aka.ms/dapr-getting-started"}

    standalone_test.go:188: [/dapr_zipkin] openzipkin/zipkin running
    standalone_test.go:188: [/dapr_redis] redis running
    standalone_test.go:188: [/dapr_placement] daprio/dapr:1.1.1 running
    standalone_test.go:188: [/boring_wing] alpine running
    standalone_test.go:188: [/eloquent_blackwell] jongallant/ubuntu-docker-client running
=== RUN   TestStandaloneInstall/test_install/daprd
=== RUN   TestStandaloneInstall/test_install/dashboard
=== RUN   TestStandaloneInstall/test_install/config.yaml
=== RUN   TestStandaloneInstall/test_install/components\statestore.yaml
=== RUN   TestStandaloneInstall/test_install/components\pubsub.yaml
=== RUN   TestStandaloneInstall/test_run
=== RUN   TestStandaloneInstall/test_run/Normal_exit
←[31mERROR ←[0mfailed to interrupt 'go test': not supported by windows
FAIL    github.com/dapr/cli/tests/e2e/standalone        53.771s
make: *** [Makefile:171: test-e2e-sh] Interrupt

It looks like the parent process is getting the interrupt as well.

I suspect that you need to set the CREATE_NEW_PROCESS_GROUP flag on the processes that the CLI spawns in order to safely send a CTRL_BREAK to them. See https://docs.microsoft.com/en-us/windows/win32/procthread/process-creation-flags for more info.

cmd.SysProcAttr = &syscall.SysProcAttr{
	CreationFlags: syscall.CREATE_NEW_PROCESS_GROUP,
}

…them immediately. This allows the processes to clean up appropriately.
@pkedy pkedy force-pushed the app_and_dapr_signal_interrupt branch from dec9403 to b43ffa1 Compare October 28, 2021 17:46
@pkedy pkedy requested review from a team as code owners October 28, 2021 17:46
@mukundansundar
Collaborator

@pkedy Is this PR still valid?

@pkedy
Member Author

pkedy commented Nov 18, 2021

@mukundansundar Yes, I think this is still valid. Last I looked at it, the tests were hanging. Now that #835 is in, I'll take another look.

@dapr-bot
Collaborator

dapr-bot commented Jan 4, 2022

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@dapr-bot dapr-bot added the stale label Jan 4, 2022
@dapr-bot
Collaborator

This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@dapr-bot dapr-bot closed this Jan 11, 2022
@mukundansundar
Collaborator

@pkedy Can this PR be reopened and will you be looking into this?

Successfully merging this pull request may close these issues.

dapr run CTRL-C kills both the app and daprd
5 participants