Add State Machine Handling to KernelManager #763
Comments
This is interesting, thanks for opening this issue @blink1073. If we were to include the triggers that initiate these transitions, it seems like we'd have the following ...
Because Restarting is literally a composition of shutdown and startup, I'm not sure I've expressed that correctly. I suppose we could solely express Restarting in terms of the Terminating and Starting entries, but, from a contextual standpoint, I think it makes sense that KernelManager has a "Restarting" state. What do others think?
About "shutdown and restart", I think we could clarify the specification. Restarting could be as simple as clearing the kernel's execution context, thus virtually removing the need for a "restarting" state. |
@kevin-bates, thank you for making that table. That's super helpful as we move into implementing this. One comment: we might not be able to conclude that ...

@davidbrochart, I think "restart" has multiple meanings that need a lot more clarification (I'll explain more on the other issue). We can iterate on the restart states in this state machine in future PRs. I want to propose a few more states here.
I think we need more visibility into the ZMQ sockets here. A kernel process might be running, but communication with the kernel is failing for some reason. This could happen when ZMQ sockets go down (not impossible in a remote kernel scenario) or when a third-party kernel implements the kernel messaging protocol incorrectly (speaking from personal experience debugging other kernels 😅).
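A minimal sketch of the gap being described, using only the existing `jupyter_client` blocking-client API (assuming `ipykernel` is installed; the printed messages are just illustrative): the process can report alive while the channels never become ready.

```python
# Sketch only: shows "process alive but kernel unreachable", which an
# explicit state in the state machine could surface directly.
from jupyter_client.manager import KernelManager

km = KernelManager(kernel_name="python3")
km.start_kernel()

kc = km.client()
kc.start_channels()
try:
    # Blocks until the kernel answers a kernel_info request over ZMQ.
    kc.wait_for_ready(timeout=10)
    print("kernel process alive and responding:", km.is_alive())
except RuntimeError:
    # km.is_alive() can still be True here: the process exists, but
    # messaging is broken (dead sockets, bad protocol implementation, ...).
    print("kernel process alive?", km.is_alive(), "- but channels never became ready")
finally:
    kc.stop_channels()
    km.shutdown_kernel(now=True)
```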
@Zsailer and I talked a bit about this yesterday. A few ideas that came out:
@blink1073 and @Zsailer - this makes sense.
This seems to imply that the provisioner would need to provide a status that it doesn't provide today, and asking implementers of provisioners to get this right may be asking too much. Just wondering if this shouldn't still be a function of the ...
Hey @kevin-bates, thanks for the feedback. I think it would make sense to not make ...
Listing out the full set of states while chatting with @blink1073:
The idea is that the "monitor" object would track all of these states. This monitor object would take the place of the ...
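A minimal sketch of what such a monitor could look like, assuming it only needs observable state traits (the class and trait names below are hypothetical, and the elided list of states above is not reproduced here):

```python
# Hypothetical sketch of a "monitor" object tracking several observable
# state dimensions; not taken from the actual proposal.
from traitlets import HasTraits, Unicode, observe


class KernelMonitor(HasTraits):
    # Separate, observable traits; the real set of states is the list
    # discussed above.
    lifecycle_state = Unicode("unknown")
    execution_state = Unicode("unknown")

    @observe("lifecycle_state", "execution_state")
    def _log_change(self, change):
        print(f"{change['name']}: {change['old']} -> {change['new']}")


monitor = KernelMonitor()
monitor.lifecycle_state = "starting"
monitor.execution_state = "busy"
```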
In #751 we explored better internal state management for the `KernelManager` class. We decided there to adopt the `transitions` library. It is MIT licensed, already on conda-forge, and lets you produce diagrams from your state machine. We can produce the state machine graph image automatically, backed by CI.
A rough sketch looks like:
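(The original sketch is not reproduced here; as a stand-in, a minimal example of the `transitions` API, with state and trigger names that are assumptions rather than the proposed set, could look like this.)

```python
# Hypothetical sketch: state and trigger names are assumptions,
# not the exact ones proposed in this issue.
from transitions import Machine


class KernelManagerStates:
    """Model object whose `state` attribute is managed by `transitions`."""

    states = ["unknown", "starting", "connected", "restarting", "terminating", "dead"]

    def __init__(self):
        self.machine = Machine(
            model=self,
            states=self.states,
            initial="unknown",
            auto_transitions=False,
        )
        # Each trigger becomes a method on the model (e.g. self.start()).
        self.machine.add_transition("start", "unknown", "starting")
        self.machine.add_transition("connect", "starting", "connected")
        self.machine.add_transition("restart", "connected", "restarting")
        self.machine.add_transition("shutdown", "*", "terminating")
        self.machine.add_transition("die", "*", "dead")
        # transitions' GraphMachine extension can render this same definition
        # as a diagram, which is what a CI-generated image would use.


km_states = KernelManagerStates()
km_states.start()
km_states.connect()
print(km_states.state)  # -> "connected"
```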
Currently the server adds an "execution_state" attribute to the manager object based on kernel status messages. I think we should eventually move that logic here, but it doesn't need to be in this PR.
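For reference, that execution state comes from `status` messages on the kernel's IOPub channel. A hedged illustration of where the value originates (not the actual server code; assumes `ipykernel` is installed):

```python
# Illustrative only: shows where "execution_state" comes from in the
# kernel messaging protocol, not how jupyter_server tracks it.
from queue import Empty

from jupyter_client.manager import KernelManager

km = KernelManager(kernel_name="python3")
km.start_kernel()
kc = km.client()
kc.start_channels()
kc.wait_for_ready(timeout=60)

kc.execute("1 + 1")
while True:
    try:
        msg = kc.get_iopub_msg(timeout=5)
    except Empty:
        break
    if msg["header"]["msg_type"] == "status":
        # "starting", "busy", or "idle" per the kernel messaging protocol
        execution_state = msg["content"]["execution_state"]
        print("execution_state:", execution_state)
        if execution_state == "idle":
            break

kc.stop_channels()
km.shutdown_kernel()
```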
The states used here could mirror the ones used in JupyterLab:

We are `Unknown` to start. State transitions are as follows:
Then we have a `state: Unicode()` trait and a convenience `wait_for_state()` method that takes an optional desired state or returns on any state change. It would also raise an exception, along with the traceback, if it reaches the `Dead` state when that wasn't the expected state. The `state` trait can also be observed directly.

We also have an `exception: Unicode()` trait that is set when an exception leads to a `Dead` status.
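A rough, hypothetical sketch of that surface, using `traitlets` and simple polling (a real implementation would hook into the state machine and likely use an `asyncio.Event` set from a trait observer):

```python
# Hypothetical sketch of the described traits and helper; not the actual
# implementation proposed in this issue.
import asyncio

from traitlets import HasTraits, Unicode


class StatefulKernelManager(HasTraits):
    state = Unicode("Unknown")   # observable: self.observe(handler, "state")
    exception = Unicode("")      # holds the traceback when we end up Dead

    async def wait_for_state(self, desired=None):
        """Return on the next state change, or once `desired` is reached."""
        start = self.state
        while True:
            current = self.state
            if current == "Dead" and desired != "Dead":
                raise RuntimeError(f"Kernel died unexpectedly:\n{self.exception}")
            if desired is not None and current == desired:
                return current
            if desired is None and current != start:
                return current
            # Polling keeps the sketch short; an asyncio.Event set from a
            # trait observer would be the more realistic approach.
            await asyncio.sleep(0.01)
```

With something like this, a test could `await manager.wait_for_state("Connected")` (state name assumed) instead of sleeping and re-checking.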