Restore old signal handler after shutdown #353

dhood · 2017-08-08T18:08:51Z

This PR makes two main changes:

* The original signal handler is restored in an on_shutdown callback. This allows the original signal handler to be called even after rclcpp::shutdown has been called within a process.
* ~~During shutdown, ignore SIGINTs so they cannot interrupt the shutdown process; otherwise the deadlocks described in ros2/rmw_implementation#25 (Deadlock on sigint when multiple rmw impl's available) can occur.~~

This has been done to fix flaky tests caused by the deadlock described in ros2/rmw_implementation#25. For it to take effect, it has to be combined with a change such as ros2/demos@190d2f5 so the processes call shutdown. ~~If we were to instead call shutdown in the rclcpp signal handler, that change wouldn't be necessary. Is it appropriate to replace this block of code with a call to shutdown?~~

I have tried to maintain the preference for sigaction where available, following existing code, but please keep in mind that there may be subtleties of signal handlers that I'm likely to overlook
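For readers following along, the save-and-restore idea can be sketched in isolation. This is a minimal illustration, not the code in this PR: `demo::init`, `demo::shutdown`, `swap_handler`, `make_state` and the `DEMO_HAS_SIGACTION` macro are hypothetical names, and the handlers only bump counters. It assumes `sigaction` is available everywhere except Windows and falls back to `std::signal` there.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>
#if !defined(_WIN32)
#include <signal.h>  // POSIX sigaction
#define DEMO_HAS_SIGACTION 1  // hypothetical macro, for this sketch only
#endif

namespace demo {

std::atomic<int> g_library_hits{0};  // bumped by the "library" handler
std::atomic<int> g_user_hits{0};     // bumped by the handler that was installed first

void library_handler(int) { ++g_library_hits; }
void user_handler(int) { ++g_user_hits; }

// Prefer sigaction where available, otherwise fall back to std::signal.
#ifdef DEMO_HAS_SIGACTION
using handler_state = struct sigaction;
#else
using handler_state = void (*)(int);
#endif

// Wrap a plain handler function in whatever the platform needs.
handler_state make_state(void (*fn)(int)) {
#ifdef DEMO_HAS_SIGACTION
  handler_state state = {};
  sigemptyset(&state.sa_mask);
  state.sa_handler = fn;
  return state;
#else
  return fn;
#endif
}

// Install `next` for `signum` and return whatever was installed before.
handler_state swap_handler(int signum, const handler_state & next) {
#ifdef DEMO_HAS_SIGACTION
  handler_state previous = {};
  int rc = sigaction(signum, &next, &previous);
  assert(rc == 0);
  (void)rc;
  return previous;
#else
  return std::signal(signum, next);
#endif
}

handler_state g_previous_state;  // saved at init, restored at shutdown

void init() { g_previous_state = swap_handler(SIGINT, make_state(library_handler)); }
void shutdown() { swap_handler(SIGINT, g_previous_state); }

void run_demo() {
  std::signal(SIGINT, user_handler);  // the process already has its own handler
  init();
  std::raise(SIGINT);                 // handled by library_handler
  shutdown();
  std::raise(SIGINT);                 // handled by user_handler again
  std::signal(SIGINT, SIG_DFL);       // leave the process in its default state
}

}  // namespace demo
```

Calling `demo::run_demo()` sends one SIGINT while the library handler is installed and one after `shutdown()`, so each counter ends at 1. The design point is that `shutdown()` is symmetric to `init()`: it puts back exactly the state that `init()` found.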

Standard CI

Linux
Linux-aarch64
macOS
Windows

Repeating the list_parameters* tests (usually very flaky): (passed 60 times, failed on the 60th because of a startup issue that I understand to be unrelated)

dhood · 2017-08-09T01:01:37Z

@dirk-thomas has clarified that it is appropriate for the demo nodes to call shutdown themselves instead of having it called from rclcpp's signal handler. it doesn't necessarily make sense for the interrupt handler in rclcpp to call shutdown, since maybe users want to spin after sigint in some case.

ros2/system_tests#215 adds some tests

dhood · 2017-08-10T19:06:35Z

even if it's not appropriate to call shutdown from the signal handler I still think it's appropriate to ignore interrupts, because the deadlock occurs as a consequence of the guard condition triggering that happens in both places. as this PR is right now, launch_testing sending two interrupts in a row to a node can cause deadlock.

I'm going to factor out the logic of (ignoring interrupts + triggering the guard condition) and call it from both the signal handler and rclcpp::shutdown

dhood · 2017-08-10T19:34:06Z

ec03f5f factors the guard condition triggering logic out, and also changes from manually ignoring SIGINTS to just skipping responding to them if g_is_interrupted is true (but signal_value != g_signal_status might be more appropriate?).

dirk-thomas · 2017-08-10T20:00:24Z

Why should a second SIGINT not notify the guard condition? I would expect a second signal to notify the condition again. Only in shutdown the signal handler is restored (symmetric to init). E.g. consider the following use case:

init()
...
spin()  // waiting for sigint to return from wait
// do something else
spin()  // a second SIGINT while waiting here should wake up the wait again
...
shutdown()
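The use case above can be mimicked without rclcpp. The sketch below is only an illustration under stated assumptions: it is POSIX-only, `spin`, `on_sigint` and `run_two_spins` are hypothetical names, an atomic flag stands in for the guard condition, and `std::raise` stands in for the user pressing Ctrl-C while the wait is in progress.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>
#include <signal.h>  // POSIX sigaction; this sketch is POSIX-only

namespace wake_demo {

std::atomic<bool> g_triggered{false};  // stand-in for the guard condition
std::atomic<int> g_signals_seen{0};

// Every SIGINT triggers the flag; the handler stays installed until "shutdown".
void on_sigint(int) {
  ++g_signals_seen;
  g_triggered.store(true);
}

// Stand-in for a blocking wait: polls until the handler has set the flag,
// then consumes the trigger. `ctrl_c_after` simulates Ctrl-C arriving after
// that many polls. Returns the number of polls performed.
int spin(int ctrl_c_after) {
  int polls = 0;
  while (!g_triggered.exchange(false)) {
    if (++polls == ctrl_c_after) {
      std::raise(SIGINT);
    }
  }
  return polls;
}

int run_two_spins() {
  // "init": install our handler and remember the previous one.
  struct sigaction ours = {};
  struct sigaction previous = {};
  sigemptyset(&ours.sa_mask);
  ours.sa_handler = on_sigint;
  int rc = sigaction(SIGINT, &ours, &previous);
  assert(rc == 0);
  (void)rc;

  int first = spin(3);   // woken by the first SIGINT
  // do something else
  int second = spin(5);  // woken again by a second SIGINT
  assert(first == 3 && second == 5);

  // "shutdown": restore whatever handler was installed before.
  sigaction(SIGINT, &previous, nullptr);
  return g_signals_seen.load();
}

}  // namespace wake_demo
```

Here `wake_demo::run_two_spins()` returns 2, one signal per wait. If the handler skipped every SIGINT after the first, the second `spin` would never return.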

dhood · 2017-08-10T20:40:45Z

Thanks for the clear example: I had the ideas of interrupt and shutdown conflated. This is because I thought that it was a double interrupt that was causing another deadlock to occur, since the first interrupt triggered some destruction. Looking closer, the example that I was double interrupting was exiting after the first interrupt (that's where the destruction was coming from), so it was just another instance of the same issue as in ros2/rmw_implementation#25.

Just de-registering our signal handler should be sufficient to fix the deadlocks. I'll post back with CI to confirm

dhood · 2017-08-11T01:36:18Z

ok finally got to 50 test passes without the other parameter flakiness issue interfering: (this branch only)

the other flakiness issue is fixed in #356, so this job which includes commits from both branches was able to get the tests to pass 100 times in a row:

conclusion is that ignoring sigints during shutdown is not necessary; restoring the state of the old signal handler is sufficient

dirk-thomas · 2017-08-11T17:12:50Z

It would be good to separate the refactoring parts of the patch into one commit and the functional changes into a second commit.

dhood · 2017-08-11T17:36:21Z

done; the refactoring in d7b7d74 is optional, we can leave it out if it's easier

dirk-thomas

LGTM (to be merged without squashing)

dhood · 2017-08-11T21:02:43Z

CI after rebase

Linux
Linux-aarch64
macOS
Windows (known flaky tests)

dhood self-assigned this Aug 8, 2017

dhood added the in progress Actively being worked on (Kanban column) label Aug 8, 2017

dhood mentioned this pull request Aug 9, 2017

Add tests for user-defined signal handler ros2/system_tests#215

Merged

dhood added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels Aug 9, 2017

dhood changed the title ~~Restore old signal handler after shutdown and ignore sigints during shutdown~~ Restore old signal handler after shutdown Aug 11, 2017

dhood mentioned this pull request Aug 11, 2017

Call to rclcpp::shutdown in demos so rclcpp signal handler gets removed ros2/demos#162

Merged

dhood added 3 commits August 11, 2017 10:31

Factor out signal handler swapping

c15db0b

Restore old signal handler on shutdown

be985a6

Factor out guard condition triggering

d7b7d74

dhood force-pushed the restore_old_signal_handler branch from 21c595a to d7b7d74 Compare August 11, 2017 17:32

dirk-thomas approved these changes Aug 11, 2017

dhood merged commit 89c43e7 into master Aug 11, 2017

dhood deleted the restore_old_signal_handler branch August 11, 2017 21:02

dhood mentioned this pull request Feb 27, 2018

Consider renaming rclcpp::ok #3

Closed

nnmm pushed a commit to ApexAI/rclcpp that referenced this pull request Jul 9, 2022

adapt to action implicit changes (ros2#353)

66b8229

DensoADAS pushed a commit to DensoADAS/rclcpp that referenced this pull request Aug 5, 2022

QoS Profile Overrides - Player (ros2#353)

72a62ea

* QoS Profile Overrides - Player Signed-off-by: Anas Abou Allaban <aabouallaban@pm.me>
