Doing a Kubernetes upgrade if JupyterHub is running will stall the Kubernetes upgrade #1575
This sounds like a valid use case. I've transferred this issue to the zero-to-jupyterhub repo, which is where development happens. https://github.com/jupyterhub/helm-chart is for storing and publishing the charts.
@manics Thank you!
If you like, I could create a pull request with the change.
Right now you cannot run more than one JupyterHub pod at a time because of shared state. Several people are interested in changing this, but it is a significant project, so it will take some time. In the meantime, I think the thing to do is to document that automatic upgrades, or other operations that require automatic pod relocation, won't work, and what to do in this situation. I think it is better to document it and tell users that they need to explicitly delete the pod. That way the admin has full control over when the brief interruption to users happens, by choosing the moment to delete the pod.
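As a sketch of that manual workaround, assuming the standard kubectl CLI and the chart's usual `component=hub` pod label (the namespace below is illustrative; substitute your own):

```shell
# Find the hub pod (namespace "jhub" is an example)
kubectl get pod -n jhub -l component=hub

# Delete it at a moment of your choosing; the Deployment recreates it
# on another node, and the stalled drain can then complete
kubectl delete pod -n jhub -l component=hub
```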
OK, good to know. I may disable the pod disruption budget for now. Definitely an important thing to document; I waited for over an hour for that node to drain before I figured out something was wrong. I haven't upgraded the cluster that often, so I didn't fully know whether there would be a lot of variability in draining. Once JupyterHub was down, the nodes were pretty much clockwork.
Agreed on the documentation as it is a pitfall/source of frustration/confusion. Do you want to open a PR to add this?
Let me go through the documentation generator and get an initial stab at it. I have not done ReadTheDocs before, but it will probably be good to learn.
I'm starting to think that we should disable our PDBs by default unless we have two replicas, as for our user-scheduler pods. I don't see when a PDB does more good than harm for the hub, proxy, or autohttps pod atm. @betatim what do you think at this point?
I don't see a clear win in making the number of hub replicas configurable: exposing it may suggest that increasing it makes sense, while running multiple replicas would typically lead to non-obvious runtime errors, and to temporarily make it zero one can do … Unless I see a strong benefit of a PDB for the hub/proxy/autohttps pods, I think they should be allowed to be disrupted during upgrades etc.; I find that easier than needing to manually delete the pods. Anyone making a k8s version upgrade of a JupyterHub deployment should be aware that it will cause disruptions, since we are not an HA helm chart, so I'd say it's better to just disrupt quickly without fuss.
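For reference, a values snippet along these lines should disable the PDBs being discussed (the `pdb.enabled` keys match later versions of the zero-to-jupyterhub chart; check the configuration reference for the chart version you actually run):

```yaml
# config.yaml passed to `helm upgrade --values config.yaml ...`
hub:
  pdb:
    enabled: false   # let the single hub pod be evicted during node drains
proxy:
  pdb:
    enabled: false   # likewise for the proxy pod
```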
Referencing a quote from #1649 (comment):
I have installed the JupyterHub chart 0.8.2.
I was doing an automatic upgrade of my Kubernetes cluster on GKE. Eventually the upgrade stalled. After some research I found that the node running the JupyterHub hub pod was not updating: the Pod Disruption Budget for the hub pod requires a minimum of 1 instance of the pod, so the single instance of the hub could not be shut down, because that would leave 0 instances.
The deployment.yaml template for the hub only allows a single instance (replicas: 1), and there is no way in the values.yaml file to specify the desired number of replicas.
This means that if you enable a pod disruption budget, you cannot drain the node running the hub pod unless the budget allows a minimum of 0 pods.
Please make it possible to specify the number of replicas of the hub pod in values.yaml. You might also want to mention that enabling a pod disruption budget with only 1 replica will lead to this problem.
I have not checked if the other pods have similar issues.
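The deadlock described above can be sketched with two simplified manifests (names, labels, and the image are illustrative, not the chart's exact templates). With both objects in place, `kubectl drain` can never evict the hub pod, because evicting the only replica would drop the number of available hub pods below `minAvailable`:

```yaml
apiVersion: policy/v1            # policy/v1beta1 on older clusters
kind: PodDisruptionBudget
metadata:
  name: hub
spec:
  minAvailable: 1                # eviction must keep at least one hub pod running
  selector:
    matchLabels:
      component: hub
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hub
spec:
  replicas: 1                    # but only one hub pod exists in total
  selector:
    matchLabels:
      component: hub
  template:
    metadata:
      labels:
        component: hub
    spec:
      containers:
        - name: hub
          image: jupyterhub/k8s-hub   # image name is illustrative
```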