Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add systemd notify and watchdog support #2438

Merged
merged 4 commits into from
Oct 26, 2020

Conversation

ekohl
Copy link
Contributor

@ekohl ekohl commented Oct 20, 2020

Description

This is a rebased version of #2260 with additional changes. I squashed all commits from the other PR into a single one to keep credits for @acmh while also making it easy to rebase. All my changes are on top of it.

It is more than 169 lines change, but splitting it up might be hard with credit. If desired, I can first submit the additional events as a separate patch.

Your checklist for this pull request

  • I have reviewed the guidelines for contributing to this repository.
  • I have added an entry to History.md if this PR fixes a bug or adds a feature. If it doesn't need an entry to HISTORY.md, I have added [changelog skip] or [ci skip] to the pull request title.
  • I have added appropriate tests if this PR fixes a bug or adds a feature.
  • My pull request is 100 lines added/removed or less so that it can be easily reviewed.
  • If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
  • If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
  • I have updated the documentation accordingly.
  • All new and existing tests passed, including Rubocop.

@ekohl ekohl force-pushed the systemd-integration branch from 9e8c8c3 to 389ce58 Compare October 20, 2020 21:11
@nateberkopec nateberkopec added feature systemd waiting-for-review Waiting on review from anyone labels Oct 20, 2020
@nateberkopec
Copy link
Member

It is more than 169 lines change, but splitting it up might be hard with credit.

That's fine. I appreciate the effort and desire to share the credit :)

@ekohl ekohl force-pushed the systemd-integration branch from 9b5e8ba to ef3e75f Compare October 20, 2020 21:40
@ekohl
Copy link
Contributor Author

ekohl commented Oct 20, 2020

Added a skip_on :jruby to the tests since that doesn't work. At least a WIP test passed so let's see the full set now.

@nateberkopec
Copy link
Member

Hmmm... any idea why it doesn't work with JRuby? It should in theory, right?

@ekohl
Copy link
Contributor Author

ekohl commented Oct 20, 2020

It gave an error on bind with the unix socket, I'm not sure why. https://github.com/puma/puma/blob/master/docs/systemd.md#socket-activation does state:

Note: Socket activation doesn't currently work on JRuby. This is tracked in #1367.

Following the trail of issues it looks like this is still an issue on JRuby.

private

def watchdog_sleep_time
return unless SdNotify.watchdog?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would improve the understanding here to move this up to just after the def start_watchdog line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, moved.

This takes the work by @acmh and improves on it. This is done by
squashing all commits and rebasing it. Then the following changes were
made:

* Dropped SD_NOTIFY env var. There is aleady the NOTIFY_SOCKET env var
  presented by systemd and is redundant.
* Move code is pushed in Puma::Systemd
* on_reload now emits RELOADING=1 notification to systemd
* Drop lower bound check on usec. Systemd can only be configured in
  seconds and it's hard to misconfigure. The actual code should be safe.
* Clean up integration tests and skip on JRuby
@ekohl ekohl force-pushed the systemd-integration branch from ef3e75f to f098486 Compare October 21, 2020 13:13
@nateberkopec nateberkopec removed the waiting-for-review Waiting on review from anyone label Oct 21, 2020
@nateberkopec nateberkopec added this to the 5.1.0 milestone Oct 21, 2020
@nateberkopec
Copy link
Member

LGTM but would appreciate review from others. We'll merge this in after 5.0.3 releases.

@acmh
Copy link
Contributor

acmh commented Oct 21, 2020

Congratz @ekohl , thank you for the credit! Good work!

@ekohl
Copy link
Contributor Author

ekohl commented Oct 21, 2020

Thanks for the initial work @acmh! It was a very solid base to improve on.

@nateberkopec nateberkopec merged commit 288a4cf into puma:master Oct 26, 2020
@ekohl ekohl deleted the systemd-integration branch October 26, 2020 22:10
@rromanchuk
Copy link

I think i have a weird configuration issue using type=notify, everything starts up fine, serving request and then systemd kills at the WatchdogSec interval. I'll come back with better details after i poke around some more. Maybe i missed something with sd_notify

puma.service: start operation timed out. Terminating.
puma[19721]: [19721] - Gracefully shutting down workers...
puma[20181]: On worker shutdown...
puma[20182]: On worker shutdown...
systemd[1]: puma.service: Failed with result 'timeout'.

ping_f = watchdog_sleep_time

log "Pinging systemd watchdog every #{ping_f.round(1)} sec"
Thread.new do

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... doesn't it defeat the purpose if you start the watchdog as a separate thread? How would that ever fail? It should either be integrated into the main event loop or at least somehow check whether the service is actually still working.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a fair point but I wouldn't know how to do it better. Perhaps this is a good enough implementation for now and it allows for a better implementation in the future. Maybe the thread should perform a request to see if it's served? The question then becomes: which one? Should it be configurable?

@ekohl
Copy link
Contributor Author

ekohl commented Nov 17, 2020

I think i have a weird configuration issue using type=notify, everything starts up fine, serving request and then systemd kills at the WatchdogSec interval. I'll come back with better details after i poke around some more. Maybe i missed something with sd_notify

Did you figure it out? Perhaps you should start without the watchdog to see if that does work.

@Dantemss
Copy link

@rromanchuk I just ran into a similar issue with a different daemon... if your ExecStart uses a shell, make sure you exec the puma command rather than just calling it so that systemd gets the correct process as the main process (otherwise you have both a shell and puma processes, systemd thinks the shell is the main process and for whatever reason that makes SdNotify.watchdog? return false)

@mperham
Copy link

mperham commented Nov 30, 2020

With Sidekiq I decided to vendor the code for the sd_notify gem because it's only ~50 lines of stable code. I didn't want to require the user to add the gem to their gemfile manually, that shouldn't be something they need to care about IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants