Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Multiple ContentsManager #3233

Open
mlucool opened this issue Jan 18, 2018 · 5 comments
Open

Allow Multiple ContentsManager #3233

mlucool opened this issue Jan 18, 2018 · 5 comments

Comments

@mlucool
Copy link
Contributor

mlucool commented Jan 18, 2018

Can we make it possible to allow for multiple content managers in notebook server?

In a complex environment there may be a more than one type of notebook source so it is better to compose them from the right managers. For example, some notebooks may be from disk, others from a database, while others that purely are read only and exist in memory. Which content manager you call should be driven by the path (/file/full/path/to/file, /db/table/entry).

I was able to write something that works as a POC for this by taking a ContextManager and then dispatching all calls to a list of internally managed ContextMangers. It simply goes in order of those managers and calls an additional function on each, manages_path(path), which returns true if it wants to manage this path. It then passes on the call to this manager and returns its output. This is crude and would be better if there was a fully supported way to make this happen as I imagine there will likely be similar issues with things like the FileCheckpoints etc.

Would this be something this project would be open to taking up?

@takluyver
Copy link
Member

@Carreau and @KesterTong already built something like this called MixedContentsManager as part of a project to integrate Jupyter with Google Drive. It hasn't been touched for a while, but maybe there's something useful to work with there.

From the notebook's point of view, I think it's fine to do this entirely within the ContentsManager, which is already pluggable. So I'd suggest building it as a separate thing for now, and then proposing it for inclusion in notebook once it's working nicely.

@Carreau
Copy link
Member

Carreau commented Jan 18, 2018

Yep there was some plan to make the mixed content manager the default, and just allow to add/remove extra content managers. @ian-r-rose may have also worked on that with his github and google drive plugins for JupyterLab.

@ian-r-rose
Copy link
Member

Yes, I have worked on a client-side approach to this in JupyterLab, which dispatches requests to different storage backends, each of which implements the Contents API. This is currently used for the JupyterLab Google Drive and GitHub plugins.

@mlucool
Copy link
Contributor Author

mlucool commented Jan 18, 2018

This code is a WIP but does work:

A ContentsManager that passes to other ContentsManagers who also have a manages_path function:

from notebook.services.contents.filemanager import ContentsManager

def make_dispatched(self, function):
    """
    This function inspects a function and dispatches it out by finding which
    arg is the path, then finding a manager to manage that path and forwarding
    this on
    """
    # functions passed here have self in it, when it's called its bound so -1 for no self
    path_idx = inspect.getargspec(function)[0].index('path') - 1

    def wrapper(*args, **kwargs):
        path = kwargs['path'] if 'path' in kwargs else args[path_idx]
        for manager in self.managers:
            if manager.manages_path(path):
                fn = getattr(manager, function.__name__)
                return fn(*args, **kwargs)
        # This self is jupyter's
        self.log.error("Could not find manager for path %s", path)
        raise Exception('Could not find manager for path "%s"' % path)

    return wrapper


_managers = None
def set_managers(managers):
    """
    We need a global list because I dont see where to pass this data to juptyer->manager
    """
    global _managers
    _managers = managers


class DispatchedContentsManager(ContentsManager):
    def __init__(self, **kwargs):
        global _managers
        self.managers = []
        for manager in _managers:
            # Init each manger with the same params
            inst = manager(**kwargs)
            if not hasattr(inst, 'manages_path'):
                raise RuntimeError('Count not find method manages_path in manager!')
            self.managers.append(inst)

        # FIXME: add support for rename_file! This does not follow the pattern below
        for name, method in inspect.getmembers(self,
                                               predicate=inspect.ismethod):
            if (not name.startswith('_') and
                        'path' in inspect.getargspec(method)[0]):
                setattr(self, name, make_dispatched(self, method))

Example manager (typically manages_path has a regexp in it):

from notebook.services.contents.filemanager import FileContentsManager

class DefaultFileManager(FileContentsManager):
    """
    A fallback manager to be used for all files
    """

    def manages_path(self, path):
        return True

Now in a jupyter config:

set_managers([DefaultFileManager]) # Pass in the class

Any suggestions?

@jessie-ross
Copy link

This seems to be a ready made solution:
https://github.com/viaduct-ai/hybridcontents

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants