`task_mgr::spawn`'s `shutdown_process_on_error` doesn't work reliably [P:3] [S:0] #3402
Labels
c/storage/pageserver
Component: storage: pageserver
c/storage
Component: storage
t/bug
Issue Type: Bug
triaged
bugs that were already triaged
Steps to reproduce
Was discovered via #3387
A synthetic size calculation task apparently hit an error that triggered a shutdown. Synthetic size calculation shouldn't lead to a pageserver shutdown, and this is fixed in #3392. But shutdown on error should still work even if it's triggered erroneously. That is what this issue is about.
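To make the expected contract concrete, here is a minimal sketch of what "shutdown on error" means for a spawned task. It is not the pageserver's `task_mgr` code: `ShutdownSignal` and `spawn_task` are hypothetical names, it uses plain `std` threads, and a shared flag stands in for the real process shutdown.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a process-wide shutdown request.
/// A real server would act on this by exiting the process.
#[derive(Clone, Default)]
pub struct ShutdownSignal(Arc<AtomicBool>);

impl ShutdownSignal {
    pub fn request(&self) {
        self.0.store(true, Ordering::SeqCst);
    }

    pub fn is_requested(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Runs `task` on a new thread. If the task returns an error and
/// `shutdown_process_on_error` is set, a shutdown is requested.
/// (Hypothetical signature, loosely modeled on the flag named in this issue.)
pub fn spawn_task<F>(
    name: &str,
    shutdown_process_on_error: bool,
    signal: ShutdownSignal,
    task: F,
) -> thread::JoinHandle<()>
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    let name = name.to_string();
    thread::spawn(move || {
        if let Err(err) = task() {
            eprintln!("task {} failed: {}", name, err);
            if shutdown_process_on_error {
                signal.request();
            }
        }
    })
}
```

The point of the sketch is only the contract: a failing task with the flag set must lead to a shutdown request, and acting on that request should take the whole process down rather than stop tasks one by one.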
Expected result
The pageserver restarts.
Actual result
The pageserver was stuck in a semi-alive state in which some tasks had stopped and others kept running. The Postgres protocol listener was shut down, which resulted in `connection refused` errors during basebackups.
Environment
prod.
Logs, links