Proposal for moving JupyterHealth to be a subproject of JupyterHub #752

Open
minrk opened this issue Dec 4, 2024 · 6 comments

Comments

@minrk
Member

minrk commented Dec 4, 2024

Proposal and context

We propose that the JupyterHub council approve adopting JupyterHealth as an official subproject under the umbrella of JupyterHub. For background, JupyterHealth is an open source development effort led by a group of members of the Jupyter community, with funding support from a research effort titled Agile Metabolic Health (AMH), led by Fernando Pérez and Ida Sim from UC Berkeley and UCSF. The AMH project is a philanthropically funded effort with partners that include 2i2c.org, Simula Research Lab, The Commons Project (TCP), and the BIG IDEAs Lab at Duke University. It aims to bring open solutions to healthcare that connect personalized data managed by patients with their clinical team and offer clinicians access to the modern data analytics stack through Jupyter, with the goal of improving patient care and the ability to deliver results at the point of care.

Motivation

The basic idea is that AMH is a clinical research effort undertaken by these teams, but its technology component is meant to be 100% open source and based on Jupyter, and that is what we refer to as JupyterHealth. In 2023, the team secured from the Jupyter Executive Council an authorization to use the JupyterHealth name, and originally we thought it would be better to develop this as a standalone effort, while maintaining all of the practices and workflow of the Jupyter community (most developers in the project already work on other parts of Jupyter), with an eye towards formal integration later.

In hindsight, we have come to realize that this is not ideal, as it presents the potential for confusion to outsiders: the name, team members and workflow all point to it as part of Jupyter, but it is currently a separate, unaffiliated project. In practice, we expect JupyterHealth to produce:

  • Most of its outputs as documentation, configuration examples, deployment practices, demonstration deployments, community building, and integration development for JupyterHub to be able to operate in a clinical environment, with all the regulatory constraints that entails, and interfacing with the vast acronym soup prevalent in that world. A draft of the project vision and documentation website is here.
  • Refinements and improvements to various aspects of the Jupyter ecosystem, the majority of which (if not all) will go directly upstream in JupyterHub.

The primary outputs of JupyterHealth will not be new software projects, which means we do not anticipate a substantial expansion of maintenance expectations from the existing JupyterHub team. In this regard, perhaps a good example/precedent can be found in Pangeo, a project that has been enormously impactful in earth sciences but focused the bulk of its software contributions on upstream technologies (Jupyter, Dask, Xarray, etc.) as well as on developing documentation and community for using this stack to tackle challenges in earth science in the cloud. We hope JupyterHealth will have a similar effect, fostering a community of technologists in healthcare who want to use the open and interoperable tools of Jupyter in clinical workflows. This does not preclude package development, which we expect to mostly take the form of small extensions and plugins to smooth the connections between components as needed. An example is jupyter-smart-on-fhir, where we are working on a Jupyter Server Extension for authenticating to FHIR servers for data access. One new software project, the JupyterHealth Exchange (name still TBD), is currently owned by AMH member TCP, and may be moved under JupyterHealth in the future, which we would handle as any other repo adoption. Being clearly labeled JupyterHealth (a separate GitHub org) will help delineate maintenance and maturity of any new projects as the responsibility of the JupyterHealth team, distinct from the broader JupyterHub team, while abiding by JupyterHub governance.

Considering the above, and the confusion generated by our original approach, we think that moving JupyterHealth to be officially part of JupyterHub, therefore governed by the JupyterHub council and practices, is a better path forward.

Logistics - people and repos

Currently, the JupyterHealth/AMH funds directly support the work of @minrk, @yuvipanda and @fperez, all members of the JupyterHub council, as well as @colliand (co-founder of 2i2c.org), @maryamv (managing director of AMH/JupyterHealth) and @ryanlovett, Berkeley engineer and Jupyter Distinguished Contributor. They also fund other members of the organizations listed above who are not directly involved with Jupyter. Considering this, the three council members named above (@minrk, @yuvipanda and @fperez) recuse themselves from the vote.

Today, JupyterHealth lives in its own GitHub Organization. Historically, if a proposal like this were accepted, we might have moved all of its repositories under the JupyterHub org. However, as of May 7, 2024, Jupyter is a GitHub Enterprise, and this should make it viable to have a single project (JupyterHub) that manages more than one GitHub Organization. This proposal requests only the governance adoption of JupyterHealth under the JupyterHub subproject, and leaves the question of whether to move the repos or leave them as-is (while granting the JupyterHub council all administrative privileges) for later, once we better understand the practical benefits and implications of these new Enterprise features.

Feel free to reach out to @minrk, @fperez, or @maryamv for any questions and clarifications.

@fperez
Collaborator

fperez commented Dec 4, 2024

Thanks @minrk for posting this! I'm happy to answer any questions that might arise (for reference, I co-authored the above document so it has my approval).

@maryamv

maryamv commented Dec 4, 2024

Thanks, @minrk, for sharing our proposal. It's been a great collaborative effort to bring JupyterHealth to this stage. I'm happy to answer any questions about JupyterHealth or provide additional context as needed. Excited to see how we can continue to build on this within the Jupyter community.

@manics
Member

manics commented Dec 5, 2024

Sounds good to me! My preference is to keep it in its own GitHub org controlled by JupyterHub since it's a collection of components, and having it in a separate org helps with navigating our many repositories. I think it's also good for those components to have their own identity.

@choldgraf
Member

I think that I'm in favor of this given the points above that this isn't going to introduce a lot of extra technical burden on the JupyterHub team. A few quick thoughts:

Sounds like maintenance burden and extra responsibility would be very little?

My main concern here is that JupyterHub is already over-exposed in the amount of infrastructure it must maintain, given the capacity of the team. That said, if this will largely serve as a "community of practice" and resources to help that community organize around JupyterHub workflows, then I think it could be a helpful partnership.

Practical implications?

What are the practical implications of making this a "sub-project of JupyterHub"? Would that mean that all jupyterhealth members become part of the jupyterhub team? Or that we'd create a different sub-team with totally separate membership rules / governance etc? Would JupyterHub have governing authority over JupyterHealth? (e.g., what if JupyterHub decided it wanted to rename JupyterHealth, would it have the authority to do so without requiring approval from Jupyter Health?)

Suggestion to check-in after a year

If we accept this proposal, I'd recommend that we set a time to check in on how this sub-organization relationship is going. We might run into unexpected friction or confusion as a result of this, and it'd help to have a moment to reflect and agree on whether and how the relationship should change or continue.

@fperez
Collaborator

fperez commented Dec 5, 2024

To @choldgraf's points:

Sounds like maintenance burden and extra responsibility would be very little?

I think that's the case - we intend most of this to simply be a motivating use case to strengthen JupyterHub (and uses of Jupyter in general) itself for this class of use cases (with more complex/delicate security constraints), which I think is an important space with benefits for many.

Practical implications?

I think it should follow the standard practices of existing teams and projects, where we have long-standing practices of consensus seeking but with, if necessary, votes from the council.

The key point is that by not requesting the creation of a new top-level Jupyter subproject, we're not asking to add a new council with a rep to the SSC, etc. But we have a long history of projects that manage more than one repo, with potentially multiple teams of maintainers for different areas (e.g. jupyterlab has different maintainers for jupyter-ai than for lumino).

Suggestion to check-in after a year

👍, great idea.

Thanks for the feedback/input! I'll keep monitoring and will try to address questions as best I can.

@minrk
Member Author

minrk commented Jan 13, 2025

My main concern here is that JupyterHub is already over-exposed in the amount of infrastructure it must maintain, given the capacity of the team.

Absolutely, mine, too! I want to do this in a way that minimizes these things. For example, I don't want to create the expectation that JupyterHealth maintenance tasks will fall on Hub team members not already participating in Health activities while welcoming anyone to participate as desired (same as I believe it should be on a per-repo basis in the JupyterHub org). And in a worst-case scenario where JupyterHealth withers on the vine, I want it to be as easy as possible for JupyterHub to continue e.g. by archiving the whole jupyterhealth GitHub org. That's why I think a dedicated github org communicates best: clearer delineation of expectations of team responsibility than per-repo, which we have struggled with.

Would that mean that all jupyterhealth members become part of the jupyterhub team? Or that we'd create a different sub-team with totally separate membership rules / governance etc?

We have a list of teams here; I think it would be appropriate to define jupyter-health as a "team" there, much like mybinder.org operators, and add current members. I don't think it would have any of its own rules, other than specifying these are the folks focusing on the jupyterhealth org. As it is, I think @rylo and @maryamv would be added to folks already on the jupyterhub team.

Would JupyterHub have governing authority over JupyterHealth?

Yes, that's the idea, in the same way that it has governing authority over binder.

what if JupyterHub decided it wanted to rename JupyterHealth, would it have the authority to do so without requiring approval from Jupyter Health?

Yes, technically, in that such a process should follow JupyterHub governance. Though with the JupyterHealth team already well represented on the JupyterHub Council, such an action can't really take place without the participation of Jupyter Health members (not that it should, in any case, since that wouldn't follow our contributor/team-inclusive practices). The already-existing significant overlap of people is part of why we are proposing JupyterHub as the logical home.
