-
Notifications
You must be signed in to change notification settings - Fork 287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prototype new kernel discovery machinery #261
Conversation
Cool, I definitely want to be tracking this. |
Me too 😉 |
👍 |
Would be very helpful for education folks if things are simplified. |
Would love to make sure that |
hooray |
I can't really claim that this will simplify things, unfortunately - it's probably more complex, at least in the default case. My aim is to make it easier to combine different ways of finding kernels. |
@takluyver Oddly, I think that will simplify things on a higher level ;-) |
# TODO: get full language info | ||
'language': {'name': spec.language}, | ||
'display_name': spec.display_name, | ||
'argv': spec.argv, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK env
is also possible in the json file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It certainly is, but the aim of this is not necessarily to expose all the information from the kernelspec file; it's to provide a kernel picker with the information it may need to select a kernel. Maybe env is part of that, though. One of the big parts I haven't yet addressed is what information is required in this dictionary, and what standardised but optional fields we'll provide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it should be the whole spec? Just so that we don't have to be responsible for what info the decider uses to select which kernel to run?
@@ -250,6 +240,8 @@ def start_kernel(self, **kw): | |||
# If kernel_cmd has been set manually, don't refer to a kernel spec | |||
# Environment variables from kernel spec are added to os.environ | |||
env.update(self.kernel_spec.env or {}) | |||
elif self.extra_env: | |||
env.update(self.extra_env) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
Or better, why not:
if self.extra_env:
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
KernelManager is rather complex because of the evolution of how we use it.
- Originally it had a configurable kernel command line, which you could use to make it start a different kernel, before Jupyter (then IPython) was really aware of other kernels. But this changed it for every KernelManager in a process, so you couldn't mix kernels in one notebook server, or one Qt console.
- Then with kernelspecs, it was meant to be given the name of a kernelspec, and it would retrieve argv and env from that. But we left the machinery for 1. in place so that people configuring
kernel_cmd
wouldn't find it broken as soon as they upgraded. - In this PR, I've reused the leftover machinery for 1. to allow kernel_cmd as an input, but not as a config option, and added
extra_env
to fit with this way of doing it. But I don't want to immediately break things for code using 2. Thiselif
branch is for 3., and theif
branch above is for 2. They shouldn't be mixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So If I would want to supply an env I would use
def make_manager(self, name):
[...]
env = get_env_for_name(name) # runs activate and would need caching to not do it again the next time
return KernelManager(.., extra_env=env)
Sounds good to me
For https://github.com/Cadair/jupyter_environment_kernels/ the most important feature (after an entry point :-) ) would be that it is possible to add env variables which are lazily loaded on kernel creation (in this case by running In our case, we currently accomplish that by supplying a loader function to a subclassed KernelSpec class: As far as I understand the new API, the Or did I misunderstod the API and that could be accomplished in the make_manager method? |
The jupyter environment kernel manager has some kind of caching nowadays: https://github.com/Cadair/jupyter_environment_kernels/blob/master/environment_kernels/core.py#L110-L124 It currently caches information where kernels are, but not the env variables, which are lazily loaded. Cached values are updated every 3 minutes. |
Is it really necessary to load all environment variables into the kernel spec? If you're going to start a kernel in a conda env, why not start it through a wrapper script that calls |
Yes, I think that this should be done in The idea is that the dictionary contains the information needed to select which kernel to use; once you have made a choice, it calls
For kernel specs, I think we will need to cache the language info in a persistent manner; getting the language info requires starting a kernel, making a kernel_info request, waiting for the reply, and shutting the kernel down again. We could save this as e.g. I think the kernel finders should have some kind of way to control cached information. Questions I have:
|
Unfortunately, that has the problem that (at least on windows, which has no exec) you get a process tree like this: Kernel starter (python) -> batch file (cmd) -> Kernel (python) This has it's own class of problems, one of them is that the current notebook can't interrupt/kill such kernels when they are busy (#104). Getting the env variables wasn't actually the problem, the code is battle tested in xonsh. Getting the lazy loading was more work... |
At least the environment manager will put the language tag into the json: https://github.com/Cadair/jupyter_environment_kernels/blob/master/environment_kernels/envs_common.py#L46 Why should that not work anymore? |
That's just the language name. We'll probably want to include some more of the details from the kernel_info_reply, like the language version and the kernel implementation. |
As long as that is computed when the kernel is started and not when the notebook/lab server starts to know about the kernel, as otherwise the lazy loading is again in vain. Anyway: we currently do a full import of the kernel classes/library, when we probe for a kernel in the env. So these two informations could already be gotten at that stage. |
That's why I was talking about persistent caching - that information probably requires starting kernel processes, but we'll probably want to know it when picking the kernel (e.g. distinguishing Python 3 vs 2). |
I've added a couple of tests, plus support for entry points (using the group name These are, I think, the big remaining questions:
Information:
What else might be relevant? Also, using entry points ties this to Python, which would make it hard for implementations in other languages (e.g. nteract) to replicate this framework. Do we:
|
I think the best option for other languages is some variant of 1 (interface with Python & jupyter_client to discover kernels). If frontends can still start their native kernel directly (e.g. a JS kernel for nteract), but require Python to find other kernels, perhaps this is not too onerous. The variants of this: 1a. As the proposal currently stands, kernels discovered through this mechanism would have to be started through Python as well. @rgbkrk I'd welcome your input from the nteract perspective (& that goes for any other nteract devs watching this). How would you like to interface with an extensible kernel discovery mechanism? Can you see some option I'm missing? |
/cc @lgeiger and @BenRussert for Hydrogen I'll get up to date on the state of this and add new comments. |
👍 to including language name
In practice, I've found this information not helpful. I don't generally want to distinguish Python 2 from 3 in my notebooks. Separating py2 and py3 may have made sense when we made that decision, but I don't think it does anymore. A Python notebook is a Python notebook, and I think language version belongs more at the same level as env/dependency-specification, and not at the same level as language name. I think we should even consider dropping the python2/python3 default kernel names for IPython, and just use
👍 to pushing env info one level down into the Finder spec for the given env type
👍 to a place for Finders to put their own extra info. As for where these things reside on Finders being part of the spec, my inclination would be option 3: rely on native implementations. Kernel Finders can choose to define their own specs for implementation in any language. This should be quite doable for the common cases: (base kernelspecs, envs), and I suspect of less interest for the cases where it would be more difficult. I think requiring Finders to emit argv/env misses much of the point of custom kernel finders, which can do things like launch kernels via remote APIs. These shouldn't involve subprocesses at any point. We could have a middle ground, where Finders may emit an argv/env which is accessible via CLI (e.g. Yet another CLI option is something like |
"kernel definition", as in, here is how to launch it and its associated metadata? |
Thanks for not using MetaKernel 😸 ... the naming is already a sensitive breakdown of the functionality without confusing it with that other project. |
Wait, haven't we already called this the "kernelspec"? |
Good point re Metakernel, I hadn't even thought of that. "kernelspec" is also used for the kernel.json file (and the directory containing it) which is just an advertising mechanism for a kernel (sense 1) installed elsewhere. And the kernelspec machinery is somewhat tied to launching a process locally, as it relies on argv. The concept of a type of kernel can extend to kernels that start on a remote machine. I think so far "kernel type" is the term I can most imagine using. "Kernel class" incorrectly implies a Python class. "Kernel launcher" sounds like the parent process machinery to start a kernel. "Kernel definition" is possible, but suggests static metadata rather than a program you can launch. E.g. "There are 7 kernel types available on your system" or "You need to install the C# kernel type to run this notebook" |
Perhaps we should do a draft doc page as part of this PR that summarizes this PR discussion and complements |
That's a good idea. I'll work on it this evening or tomorrow. |
I wasn't thinking of anything fancy, @takluyver. More of a recap and nomenclature summary. I'll give the code a more thorough review later today or tomorrow. Thanks. |
I tend to think of the running kernel as a kernel instance |
Have some docs. It looks much nicer when run through Sphinx. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@takluyver Thanks for the excellent documentation. It goes a long way to clarifying usage and nomenclature. I've made some suggestions. The suggestions could be addressed in a second PR too.
docs/kernel_providers.rst
Outdated
================ | ||
|
||
.. note:: | ||
This is a new interface under development. Not all Jupyter applications |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps mention that the interface is subject to change.
docs/kernel_providers.rst
Outdated
kernel types. | ||
|
||
By writing a kernel provider, you can extend how Jupyter applications discover | ||
and start kernels. To do so, subclass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps add one short sentence of why one might wish to create a kernel provider.
Add a subheading: Create a kernel provider
docs/kernel_providers.rst
Outdated
return # Check it's available | ||
|
||
# Two variants - for a real kernel, these could be different | ||
# environments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
environment variables? pip environments? conda environments? Or implementations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably good to specify since environment is a pretty overloaded term.
extra_env={'ROUNDED': '1'}) | ||
else: | ||
raise ValueError("Unknown kernel %s" % name) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subheading: Registering a Kernel Provider
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opted not to add a subheading here because it feels like part of the previous section to me. I've implemented all your other suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure no problem.
docs/kernel_providers.rst
Outdated
Its state includes a namespace and an execution counter. | ||
|
||
Kernel type | ||
Allows starting multiple, initially similar kernel instances. The kernel type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps move "Allows ... instances." to end of the paragraph so definition is first.
for a kernel, such as ``IRkernel``, would also be a different kernel type. | ||
|
||
Kernel provider | ||
A Python class to discover kernel types and allow a client to start instances |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIce!
docs/kernel_providers.rst
Outdated
==================== | ||
|
||
To find and start kernels in client code, use | ||
:class:`jupyter_client.discovery.KernelFinder`. This has a similar API to kernel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"This...providers." Let's be a bit more detailed and concrete here before comparing to similar API. Perhaps recap at start of section what is a kernel type.
docs/kernel_providers.rst
Outdated
|
||
.. automethod:: make_manager | ||
|
||
Included kernel providers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Basic Kernel Providers included in Jupyter Client"
I'm not totally happy with my suggestion above but I think we need a bigger distinction between prior example section and this reference section.
], | ||
'jupyter_client.kernel_providers' : [ | ||
'spec = jupyter_client.discovery:KernelSpecProvider', | ||
'pyimport = jupyter_client.discovery:IPykernelProvider', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we being consistent with the id before = as described in entrypoints doc section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep:
I can't work out when Github does inline code previews and when it doesn't!
Rendered doc page on my test build. For the convenience of others. |
Thanks @willingc ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @takluyver. Looks great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely done @takluyver and @willingc!
Does anyone else want to review this? If not, I'll push the button in another day or two. |
Button being pressed. |
Is the discovery mechanism already used in any of the jupyter projects? As far as I see the notebook still uses |
No, you're right, it's not used anywhere yet. We're still hashing out the details of how it works in different situations. #308 is a follow up to this PR. |
Damn, I just had some time and I wanted to port https://github.com/Cadair/jupyter_environment_kernels to it :-/ Ok, waiting for this to land in notebook land... Is there a realistic chance that a PR which "just" converts |
If you've got time to experiment with making that work on top of PR #308, that could be useful - we're currently trying to figure out if we've got a useful API design by making prototype things on top of it (I've done an ssh provider and a docker provider). Kernel specs become one default provider for the new discovery system, so no, we wouldn't convert KernelSpecManager to use the new system. |
Following jupyter/nbformat#81, this PR introduces extensible kernel finder machinery, to pave the way for different kernel pickers.
The kernel finder machinery would work like this:
find_kernels()
yields pairs of id (str) and kernel attributes (dict). The attributes contain information about the kernel. At present this has no required fields, but we'll probably want to require things similar to the language info inkernel_info_reply
.make_manager(kernel_id)
returns an instance ofKernelManager
, or a subclass, ready to start a kernel of the specified kind.ipykernel
in the current Python.spec/python3
means thepython3
kernel from thespec
finder. Other qualified kernel IDs might look like:conda/myenv
ordocker/ipython/ipykernel-nightly
.jupyter console --kernel spec/python3
. I propose that unqualified names (e.g.python3
) are treated as if they were prefixed withspec/
, ensuring backwards compatibility.pyimport/kernel
(or whatever we change that name to) will always get a kernel running on the same Python executable as the process starting it.find_kernels()
and use it to select the most appropriate one.Implementation notes: