fix: uninstall custom signal handlers before shutdown #5913
Conversation
@holmanb, I'm assuming we're looking for some confirmation that this approach works before reviewing?
We tested this patch in MAAS but it does not work. As per chat, we decided to release a fix in MAAS directly. So, if this patch was supposed to be used only for MAAS, we can close it. Thanks!
Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close. If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon. (If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.)
@@ -64,7 +64,7 @@ class TestPowerChange:
 [
 ("poweroff", "now", "10", "will execute: shutdown -P now msg"),
 ("reboot", "now", "0", "will execute: shutdown -r now msg"),
-("halt", "+1", "0", "will execute: shutdown -H +1 msg"),
+("halt", "+1", "0", re.escape("will execute: shutdown -H +1 msg")),
Did this ever pass?
Perhaps something changed in the verify ordering utility.
There's a good likelihood it never passed
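The `re.escape` change only matters if the expected text is treated as a regular expression, which this thread suggests but does not confirm. A minimal sketch, assuming the check boils down to something like `re.search` (the real verification utility may work differently):

```python
import re

# The log line the "halt" test case expects to find.
logged = "will execute: shutdown -H +1 msg"

# Unescaped, " +" is read as "one or more spaces", so the pattern then
# expects "1" where the log line actually contains a literal "+".
unescaped = re.search("will execute: shutdown -H +1 msg", logged)

# Escaped, every character is matched literally.
escaped = re.search(re.escape("will execute: shutdown -H +1 msg"), logged)

print("unescaped pattern matched:", unescaped is not None)
print("escaped pattern matched:", escaped is not None)
```

The "poweroff" and "reboot" rows contain no regex metacharacters, which would explain why only the "halt" row needed escaping.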
return None


def inspect_handler(sig: Union[int, Callable, None]) -> None:
Gather some data that would be useful for future debugging.
Note that we won't actually see these logs unless we manually call cloud-init with cloud-init --debug. This is currently getting called pre-log setup.
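The general effect can be reproduced with the standard `logging` module alone. This is only a sketch of the stdlib behaviour, not of cloud-init's own log setup; the logger name and `inspect_handler_sketch` are hypothetical stand-ins:

```python
import io
import logging

# Hypothetical logger name, used only for this illustration.
LOG = logging.getLogger("sketch.inspect_handler")


def inspect_handler_sketch(description: str) -> None:
    """Stand-in for a helper that records debugging data at DEBUG level."""
    LOG.debug("signal handler state: %s", description)


# Called before any logging setup: the effective level is still the root
# default (WARNING), so this DEBUG record is silently dropped.
inspect_handler_sketch("before log setup")

# Minimal "log setup": attach a handler and enable DEBUG on the logger.
buffer = io.StringIO()
LOG.addHandler(logging.StreamHandler(buffer))
LOG.setLevel(logging.DEBUG)
LOG.propagate = False

# Called after setup: the record now reaches the handler.
inspect_handler_sketch("after log setup")

captured = buffer.getvalue()
print(captured, end="")
```

Only the second call ends up in `captured`, which mirrors why debug output emitted before log setup stays invisible unless debug output is enabled up front.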
@@ -106,8 +109,9 @@ def handle(name: str, cfg: Config, cloud: Cloud, args: list) -> None:
 break
 if (upgrade or pkglist) and reboot_if_required and reboot_fn_exists:
 try:
-LOG.warning(
-"Rebooting after upgrade or install per %s", reboot_marker
+LOG.info(
This is expected, so the warnings log level doesn't make sense.
General approach LGTM.
Can we get some unittests of the new context manager? We obviously need to mock the exit, but I think we could use the context manager and then just call the handler directly and check the logs.
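A rough sketch of what such a test could exercise. Everything here is hypothetical: `handlers_uninstalled`, `custom_handler`, and `exercise_sketch` are illustrative names, not the context manager or handler added by this PR, and the sketch assumes it runs in the main thread (a requirement of `signal.signal`).

```python
import contextlib
import logging
import signal
import sys
from unittest import mock

LOG = logging.getLogger("sketch.signal_handler")


def custom_handler(signum, frame):
    """Hypothetical stand-in for a custom handler that logs and exits."""
    LOG.error("Received signal %d, exiting", signum)
    sys.exit(1)


@contextlib.contextmanager
def handlers_uninstalled(signals=(signal.SIGTERM,)):
    """Hypothetical helper: use default handlers inside the block."""
    # signal.signal() returns the handler it replaces.
    previous = {sig: signal.signal(sig, signal.SIG_DFL) for sig in signals}
    try:
        yield
    finally:
        for sig, handler in previous.items():
            # None means the old handler was not installed from Python.
            if handler is not None:
                signal.signal(sig, handler)


def exercise_sketch():
    """Use the context manager, then call the handler with exit mocked."""
    original = signal.signal(signal.SIGTERM, custom_handler)
    try:
        with handlers_uninstalled():
            inside = signal.getsignal(signal.SIGTERM)
        after = signal.getsignal(signal.SIGTERM)
        # Mock sys.exit so calling the handler directly does not end the run.
        with mock.patch.object(sys, "exit") as fake_exit:
            custom_handler(signal.SIGTERM, None)
        return inside, after, fake_exit.call_args
    finally:
        if original is not None:
            signal.signal(signal.SIGTERM, original)
```

In a pytest suite, the log assertion could use the `caplog` fixture in place of inspecting the logger by hand.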
Also, if we're concerned about all the places this would need to be plumbed, I'm wondering if it'd be possible to check the process tree and determine if whatever invoked the reboot was initiated by cloud-init and just ignore those. I'm not saying we should do that here, but it's an alternative to think about.
Is that possible? Does systemd expose the PID/GID of the process that set a specific boot target? I can't think of a way to do this that wouldn't be completely broken.
It appears so, but none of the options presented there seem reasonable for this use case, so...just ignore me 😄
I still don't think so. If a user passes a shutdown command through a script, cloud-init doesn't receive the signal from its own child process. The process that sends the signal on reboot/shutdown is systemd. What we would need for this to be possible is to know which process set the systemd boot target to shutdown / reboot / etc.
LGTM!
Not something I'm asking to change, but I did find the new tests a bit hard to grok. Since the test body contains an if condition for each of the 2nd parametrizations, I think that 3 separate tests might have been more appropriate, but that could just be personal preference.
Hrm...my approval didn't see the CI failures. There appear to be some legitimate ones there, so let's not merge quite yet.
comments inline
LGTM!
Proposed Commit Message
Additional Context
This might address the following bug reports:
https://bugs.launchpad.net/maas/+bug/2089185
#5849
Test Steps
I haven't been able to reproduce either bug, so testing is needed.
Here are debs for testing (funny name because GitHub is unaware of the existence of Debian files):
noble: cloud-init_24.3.1-1174-gf7bbd23d-1~bddeb_all.tar.gz
jammy: cloud-init_24.3.1-1016-ge3d0bcd4-1~bddeb_all.tar.gz
Merge type