Adjust Idle culler settings and add internal culling #1133

Merged · 10 commits · Mar 3, 2022
qhub/template/stages/07-kubernetes-services/jupyterhub.tf (2 changes: 1 addition & 1 deletion)

@@ -107,7 +107,7 @@ module "jupyterhub" {
      name      = "dask-etc"
      namespace = var.environment
      kind      = "configmap"
-    }
+    },
@costrouc (Member):
I'd prefer if extra-mounts were instead added within the module so that we can directly reference the config map name. There is a Terraform concat function that can be used.

@viniciusdc (Contributor, author) · Mar 2, 2022:

Hi @costrouc, just to be sure... you are saying something like this:

        extra-mounts      = merge(
          var.extra-mounts,
          {
            "/etc/jupyter" = {
              name = "server-idle-culling"
              namespace = var.namespace
              kind = "configmap"
            }
          }
        )

in main.tf? I changed over to merge, as concat only works for lists.
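For reference, a minimal sketch of the distinction (hypothetical values, not from the PR): merge operates on maps, concat on lists.

    # merge combines maps, with later arguments winning on duplicate keys
    merge({ a = "x" }, { b = "y" })   # => { a = "x", b = "y" }

    # concat only accepts lists, joining them end to end
    concat(["a"], ["b", "c"])         # => ["a", "b", "c"]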

  }

  services = concat([
files/04-idle-culler.py (new file, 38 additions)

@@ -0,0 +1,38 @@
# To help jupyterhub-idle-culler cull user servers, we configure the kernel manager to cull
# idle kernels that would otherwise make the user servers report themselves as active, which
# is part of what jupyterhub-idle-culler considers.

# Extra config available at:
# https://zero-to-jupyterhub.readthedocs.io/en/1.x/jupyterhub/customizing/user-management.html#culling-user-pods

# Timeout (in seconds) after which an inactive terminal is considered
# ready to be culled.
c.TerminalManager.cull_inactive_timeout = 15 * 60

# The interval (in seconds) on which to check for terminals exceeding the
# inactive timeout value.
c.TerminalManager.cull_interval = 5 * 60

# cull_idle_timeout: timeout (in seconds) after which an idle kernel is
# considered ready to be culled
c.MappingKernelManager.cull_idle_timeout = 15 * 60

# cull_interval: the interval (in seconds) on which to check for idle
# kernels exceeding the cull timeout value
c.MappingKernelManager.cull_interval = 5 * 60

# cull_connected: whether to consider culling kernels which have one
# or more connections
c.MappingKernelManager.cull_connected = True

# cull_busy: whether to consider culling kernels which are currently
# busy running some code
c.MappingKernelManager.cull_busy = False

# Shut down the server after N seconds with no kernels or terminals
# running and no activity.
c.NotebookApp.shutdown_no_activity_timeout = 15 * 60

###############################################################################
# JupyterHub idle culler total timeout corresponds (approximately) to:
# max(cull_idle_timeout, cull_inactive_timeout) + shutdown_no_activity_timeout
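With the values set in this file, that works out to max(900, 900) + 900 = 1800 seconds (30 minutes), which lines up with the hub-side cull.timeout of 1800 configured in the Helm values below.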
main.tf (jupyterhub module)

@@ -10,6 +10,17 @@ resource "random_password" "proxy_secret_token" {
  special = false
}

resource "kubernetes_config_map" "server-idle-culling" {
metadata {
name = "server-idle-culling"
namespace = var.namespace
}

data = {
"jupyter_notebook_config.py" = file("${path.module}/files/04-idle-culler.py")
}
}

resource "helm_release" "jupyterhub" {
name = "jupyterhub"
namespace = var.namespace
Expand All @@ -30,7 +41,16 @@ resource "helm_release" "jupyterhub" {
  shared-pvc        = var.shared-pvc
  conda-store-pvc   = var.conda-store-pvc
  conda-store-mount = var.conda-store-mount
- extra-mounts      = var.extra-mounts
+ extra-mounts      = merge(
Member:
I think this should be concat, no? merge will combine the dicts: https://www.terraform.io/language/functions/merge

@viniciusdc (Contributor, author):

[image: screenshot attachment]

@viniciusdc (Contributor, author) · Mar 3, 2022:

As var.extra-mounts is already a dict of the form

var.extra-mounts = {
   "/dask-etc" = {}
}

and extra-mounts expects a dict for jsonencode, we will indeed be merging those dicts; only the new idle-culler entry is added, and the existing values are kept the same. Which results in

extra-mounts = {
   "/dask-etc" = {},
   "/etc/jupyter" = {}
}

+   var.extra-mounts,
+   {
+     "/etc/jupyter" = {
+       name      = "server-idle-culling"
+       namespace = var.namespace
+       kind      = "configmap"
+     }
+   }
+ )
  environments = var.conda-store-environments
}

@@ -41,9 +41,9 @@
}

  extraConfig = {
-   "01-theme.py" = file("${path.module}/files/01-theme.py")
-   "02-spawner.py" = file("${path.module}/files/02-spawner.py")
-   "03-profiles.py" = file("${path.module}/files/03-profiles.py")
+   "01-theme.py"    = file("${path.module}/files/01-theme.py")
+   "02-spawner.py"  = file("${path.module}/files/02-spawner.py")
+   "03-profiles.py" = file("${path.module}/files/03-profiles.py")
  }

  services = {

@@ -43,3 +43,19 @@ singleuser:
guarantee: "1G"
networkPolicy:
enabled: false

# cull relates to the jupyterhub-idle-culler service, responsible for evicting
# inactive singleuser pods.
#
# The configuration below, except for enabled, corresponds to command-line flags
# for jupyterhub-idle-culler as documented here:
# https://github.com/jupyterhub/jupyterhub-idle-culler#as-a-standalone-script
#
cull:
  enabled: true
  users: false
  removeNamedServers: false
  timeout: 1800
  every: 600
  concurrency: 10
  maxAge: 0
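For reference, these settings roughly correspond to running the culler standalone as documented in the README linked above (a sketch; flag names taken from that README, with users and removeNamedServers disabled so their flags are omitted):

    python3 -m jupyterhub_idle_culler \
      --timeout=1800 \
      --cull-every=600 \
      --concurrency=10 \
      --max-age=0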