
Document the chart's non-HA status #1951

Open
consideRatio opened this issue Dec 20, 2020 · 0 comments

High Availability (HA), as I understand it, is the idea of running software in a way that makes it resilient to downtime caused by individual server failures, perhaps further enhanced by features like a PodDisruptionBudget (PDB) that helps avoid simultaneous downtime of multiple replicas.

I raised an action point in the December JupyterHub team meeting that we document this better. The topic came up in the context of changing the PDB defaults so that PDBs are not enabled by default for our non-HA deployments.

We expose replica configuration for various pods, but it's mostly a remnant of the helm create scaffolding rather than a reflection of our actual ability to support multiple replicas.

  • This Helm chart does not support HA for its hub, proxy, or autohttps pods. It supports HA only for the user-scheduler pod.
  • We have PDBs disabled by default for our non-HA pods, and enabled by default for the HA-capable user-scheduler (see the sketch below).
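As a rough illustration, the defaults could look like this in values.yaml. Note this is only a sketch: the exact key names here are my assumptions, not verified against the chart.

```yaml
# Sketch of values.yaml defaults; key names are assumed for illustration.
hub:
  pdb:
    enabled: false   # hub is non-HA: a PDB would only block node drains
proxy:
  pdb:
    enabled: false   # proxy (CHP) is non-HA as well
scheduling:
  userScheduler:
    replicas: 2      # user-scheduler supports HA, so run multiple replicas...
    pdb:
      enabled: true  # ...with a PDB to avoid draining both at once
```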

Action points

  1. Write about the HA status and what holds us back.
  2. Add warning comments to the replicas configuration of the hub/proxy/autohttps pods in values.yaml (a sketch of such a comment follows below).
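For action point 2, the warning could look something like the following. This is a sketch: the location of the replicas key and the wording are placeholders.

```yaml
hub:
  # WARNING: running more than one hub replica is not supported. This
  # option is a remnant of the `helm create` scaffolding, and increasing
  # it will break the deployment.
  replicas: 1
```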

Current status as far as I know it

About HA for JupyterHub itself

jupyterhub/jupyterhub#1932 (comment)

About HA for autohttps

We run Traefik, which supports HA, but not for automatic TLS certificate acquisition. Traefik supports that in its enterprise version, which we can't use.

About HA for the proxy pod

The proxy pod runs jupyterhub/configurable-http-proxy (CHP), a NodeJS server that is configured dynamically by JupyterHub's proxy_class of the same name, implemented in Python. The problem is that JupyterHub sends each configuring REST API request to a single CHP server chosen at random behind the k8s Service exposing them, not to all of them. So, if we ran multiple replicas, JupyterHub would only configure one of them with how to route traffic.
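To make the failure mode concrete, here is a minimal sketch of the k8s Service fronting CHP's REST API (the resource names and port are assumptions). A Service load balances each request across all ready endpoints, so JupyterHub's route-configuring request reaches only one randomly chosen CHP pod, leaving any other replicas unconfigured.

```yaml
# Hypothetical Service in front of CHP's REST API. With replicas > 1,
# the selector matches several pods, and each request from JupyterHub's
# proxy_class is delivered to just one of them.
apiVersion: v1
kind: Service
metadata:
  name: proxy-api
spec:
  selector:
    component: proxy   # matches every CHP replica
  ports:
    - port: 8001       # CHP's default REST API port
      targetPort: 8001
```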

#1673 is open to support using KubeIngressProxy, a standalone Python class defined in the jupyterhub/kubespawner project. It creates k8s Ingress resources that describe how to route traffic to pods, which an external ingress controller can then use to route traffic. That would resolve the limitation of the CHP-based setup.
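For reference, my understanding is that adopting KubeIngressProxy would boil down to overriding JupyterHub's proxy_class, roughly like below via the chart's hub.extraConfig hook. This is only a sketch of what the configuration might eventually look like: #1673 tracks the work to actually support it, and the chart would also need to stop routing through CHP.

```yaml
hub:
  extraConfig:
    # Sketch only: swap CHP for kubespawner's Ingress-based proxy.
    useKubeIngressProxy: |
      c.JupyterHub.proxy_class = "kubespawner.proxy.KubeIngressProxy"
```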
