Kernel handshaking pattern proposal #66
@@ -0,0 +1,53 @@ | ||||||
# Kernel Handshaking pattern | ||||||
|
||||||
## Problem | ||||||
|
||||||
The current implementation of Jupyter client makes it responsible for finding available ports and passing them to a newly starting kernel. The issue is that another process can start using one of these ports before the kernel has started, resulting in a ZMQError when the kernel starts. This is even more problematic when spawning many kernels in a short span of time, because the client may find available ports that have already been assigned to another kernel.
|
||||||
A workaround has been implemented for the latter case, but it does not solve the former one. | ||||||
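The race described above can be reproduced with a minimal sketch. It uses plain TCP sockets from the Python standard library rather than ZeroMQ, and the helper names `find_free_port` and `demonstrate_race` are hypothetical; it only illustrates why a port that was free when the client looked may no longer be free when the kernel binds.

```python
import socket


def find_free_port():
    """Hypothetical client-side helper: ask the OS for a free port, then release it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]  # the port is released when the block exits


def demonstrate_race():
    """Return True if the 'kernel' fails to bind the port the client selected."""
    port = find_free_port()

    # Another process grabs the port between selection and kernel startup.
    intruder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    intruder.bind(("127.0.0.1", port))
    intruder.listen()

    # The kernel now tries to bind the port it was given by the client.
    kernel = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        kernel.bind(("127.0.0.1", port))
    except OSError:
        return True  # address already in use: the analogue of the ZMQError
    else:
        return False
    finally:
        kernel.close()
        intruder.close()
```

Here the intruder is simulated in the same process; in practice it would be any other program, or another kernel launched by the same client.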
|
||||||
## Proposed Enhancement | ||||||
|
||||||
We propose to implement a handshaking pattern: the client lets the kernel find free ports and communicate them back via a dedicated socket. It then connects to the kernel. More formally:
|
||||||
- When the client starts, it opens a dedicated socket A for receiving connection information from kernels (channel ports). | ||||||
What is the lifetime of this dedicated socket? Is it per-kernel launch or once per "application"? I suppose this could be considered an implementation detail, but I think the JEP should minimally state that. However, so that different implementations can succeed, I think we should minimally include the kernel's ID in the response information sent to the kernel and require the kernel to include it in the connection information it returns. This would then allow implementations to use a "dedicated socket" that spans multiple kernel launches, should they choose to do so.
I would consider it an implementation detail, but it's worth specifying that:
|
||||||
- When launching a new kernel, the client passes its address and the port of this socket to the kernel. | ||||||
SylvainCorlay marked this conversation as resolved.
|
||||||
- The kernel starts and finds free ports to bind the shell, control, stdin, heartbeat and iopub sockets. It then connects to the A socket and sends the connection information to the client.
- Upon reception of the connection information, the client connects to the kernel. | ||||||
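The four steps above can be sketched as follows. This is an illustration under stated assumptions, not the proposed implementation: it uses standard-library TCP sockets instead of ZeroMQ, runs the "kernel" as a thread instead of a separate process, and sends the ports as a JSON object. The function names and the message layout are hypothetical; the `*_port` keys simply mirror those of existing Jupyter connection files.

```python
import json
import socket
import threading

CHANNELS = ("shell", "control", "stdin", "hb", "iopub")


def kernel_main(client_host, client_port, done):
    """Hypothetical kernel side: bind the channels itself, then report the ports."""
    bound = {}
    for name in CHANNELS:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.bind(("127.0.0.1", 0))  # the OS picks a free port and the kernel keeps it
        s.listen()
        bound[name] = s
    info = {f"{name}_port": s.getsockname()[1] for name, s in bound.items()}
    # Connect to socket A and send the connection information to the client.
    with socket.create_connection((client_host, client_port)) as to_client:
        to_client.sendall(json.dumps(info).encode("utf-8"))
    done.wait()  # keep the channel sockets open until the client is finished
    for s in bound.values():
        s.close()


def launch_and_handshake():
    """Hypothetical client side: open socket A, launch the kernel, read the ports."""
    sock_a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock_a.bind(("127.0.0.1", 0))
    sock_a.listen()
    host, port = sock_a.getsockname()

    done = threading.Event()
    kernel = threading.Thread(target=kernel_main, args=(host, port, done))
    kernel.start()  # stands in for launching the kernel with A's address and port

    conn, _ = sock_a.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the kernel closed its end: the message is complete
                break
            chunks.append(data)
    info = json.loads(b"".join(chunks).decode("utf-8"))

    # Upon reception of the connection information, the client connects.
    for name in CHANNELS:
        socket.create_connection(("127.0.0.1", info[f"{name}_port"])).close()

    done.set()
    kernel.join()
    sock_a.close()
    return info
```

Because the kernel binds each port before reporting it, no other process can take a port between selection and use, which is the point of the pattern.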
|
||||||
The way the client passes its address and the port of the listening socket to the kernel should be similar to that of passing the ports of the kernel socket in the current implementation: a connection file that can be read by local kernels or sent over the network for remote kernels (although this requires a custom kernel provisioner or "nanny"). | ||||||
|
||||||
The kernel specifies whether it supports the handshake pattern via the "kernel_protocol_version" field in the kernelspec: | ||||||
- if the field is missing, or if its value is less than 5.5, the kernel supports passing ports only.
- if the field value is >=5.5 and <6, the kernel supports both mechanisms. | ||||||
- if the field value is >=6, the kernel supports the handshake pattern. Clients should not assume the kernel still supports the old mechanism. | ||||||
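A client could apply these rules along the following lines. This is a minimal sketch that assumes the kernelspec has already been loaded into a dictionary and that `kernel_protocol_version` is a dotted version string such as `"5.5"`; the function name and the returned labels are hypothetical.

```python
def supported_mechanisms(kernelspec):
    """Return the launch mechanisms a kernel supports, per the rules above."""
    version = kernelspec.get("kernel_protocol_version")
    if version is None:
        return {"ports"}  # field missing: passing ports only

    # Compare (major, minor) numerically so that "5.10" sorts after "5.5".
    parts = tuple(int(p) for p in str(version).split(".")[:2])
    major, minor = (parts + (0,))[:2]

    if (major, minor) < (5, 5):
        return {"ports"}
    if major < 6:
        return {"ports", "handshake"}
    return {"handshake"}  # >= 6: do not assume the old mechanism still works
```

Comparing integer tuples rather than strings or floats avoids misordering versions such as 5.10 and 5.5.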
|
||||||
### Impact on existing implementations | ||||||
|
||||||
Although this enhancement requires changing all the existing kernels, the impact should be limited. Indeed, most of the kernels are based on the kernel wrapper approach, or on xeus. | ||||||
|
||||||
Most of the clients are based on `jupyter_client`. Therefore, the changes should be limited to this repository and to external kernel provisioners.
|
||||||
|
||||||
A transition period where clients and kernels support both mechanisms should allow kernels to gradually migrate to the new version of the protocol. Support for the handshaking pattern is indicated in the kernelspec via `kernel_protocol_version` as stated above. | ||||||
|
||||||
## Relevant Resources (GitHub repositories, Issues, PRs) | ||||||
|
||||||
### GitHub repositories | ||||||
|
||||||
- Jupyter Client: https://github.com/jupyter/jupyter_client | ||||||
The Jupyter protocol client APIs | ||||||
- Voilà: https://github.com/voila-dashboards/voila | ||||||
Voilà turns Jupyter notebooks into standalone web applications | ||||||
- IPyKernel: https://github.com/ipython/ipykernel | ||||||
IPython kernel for Jupyter | ||||||
- Xeus: https://github.com/jupyter-xeus/xeus | ||||||
The C++ implementation of the Jupyter kernel protocol | ||||||
|
||||||
### GitHub Issues | ||||||
|
||||||
- Spawning many kernels may result in ZMQError (https://github.com/jupyter/jupyter_client/issues/487) | ||||||
- Spawning ~20 requests at a time results in a ZMQError (https://github.com/voila-dashboards/voila/issues/408#issuecomment-539968325) | ||||||
|
||||||
### GitHub Pull Requests | ||||||
|
||||||
- Prevent two kernels to have the same ports (https://github.com/jupyter/jupyter_client/pull/490) |