
Extend KubeSpawner and its UI to handle Persistent Volume Claims #34

Closed
jlewi opened this issue Dec 11, 2017 · 22 comments
jlewi commented Dec 11, 2017

In the JupyterHub spawner you can specify a Docker image and resource requirements. It would be nice if you could also specify volume claims and mount points so that users can easily attach shared volumes to their notebooks.

According to jupyterhub/jupyterhub/issues/1580 we can do this just by adding some properties to our KubeSpawner.
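For illustration, attaching a shared volume through KubeSpawner's existing `volumes`/`volume_mounts` traitlets might look like the following hypothetical `jupyterhub_config.py` fragment (the claim name, volume name, and mount path are made up):

```python
# Hypothetical jupyterhub_config.py fragment: mount an existing PVC
# ("shared-nfs-pvc" is an assumed claim name) into every notebook.
c.KubeSpawner.volumes = [
    {
        "name": "shared-data",
        "persistentVolumeClaim": {"claimName": "shared-nfs-pvc"},
    }
]
c.KubeSpawner.volume_mounts = [
    # Mount the volume declared above inside the notebook container.
    {"name": "shared-data", "mountPath": "/home/jovyan/shared"},
]
```

This covers the static, admin-configured case; exposing the same knobs per user is what the rest of this issue discusses.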

foxish commented Dec 11, 2017

I'm not sure if it's intuitive to have the user specify PVCs and mount points, as opposed to just requesting resources (as they currently do). It runs a bit afoul of the broad goal of making infrastructure "disappear" IMO. Maybe we can have a single place to do advanced config, which can be enabled by jupyterhub/kubespawner#111

jlewi added the area/jupyter label Feb 27, 2018
jlewi commented Mar 20, 2018

@foxish I'm not sure I understand the distinction you are drawing between resources and PVCs. In this case, a PVC seems like a resource in the same regard as CPU/RAM/GPU.

foxish commented Mar 20, 2018

My point was that PVCs are a much more K8s-specific construct than CPU/RAM/GPU. As I understood it, we wanted to hide the complexity of k8s concepts from the user - but this is still probably useful for advanced users. IIRC, JupyterHub in Kubeflow already uses PVCs under the hood for notebook storage right now. Is there a use-case driving exposing arbitrary mounts and PVCs?

jlewi commented Mar 26, 2018

So I think the use case is as follows.

People have a bunch of existing NFS servers or other types of storage systems that they want to bind to their Jupyter notebooks so that they can access data on those shares.

There are generally two approaches to this:

  1. We could make it possible for people to modify their JupyterHub/KubeSpawner config so that certain PVCs get attached to every notebook.

  2. We can modify the spawner UI so that users can specify which PVCs to attach to their notebooks.

I think I lean towards #2. The advantage of #2 is that you don't have to redeploy to attach a newly created PVC.

Furthermore, #2 seems better suited to the case where we allow users to spawn multiple Jupyter notebooks which is a direction I think we want to move in.

My philosophy for Kubeflow is that we should expose K8s concepts/abstractions when it makes sense rather than introducing new ones.

PV/PVC is one of those abstractions that I think we should expose; I don't think we can avoid dealing with storage and PV/PVC is a great abstraction.

@pdmack any interest in trying to tackle it?

pdmack commented Mar 26, 2018

@jlewi I'll have a go. Probably have more questions as I begin the journey.

jlewi commented Apr 3, 2018

One idea I had was to generate the jupyterhub_spawner.py file using an init container. This would allow us to be more dynamic; e.g. we could make a K8s API request to see if there is a default storage class defined or not and adjust the config accordingly.
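As a sketch of that idea, the init container's check for a default StorageClass could hinge on the standard `storageclass.kubernetes.io/is-default-class` annotation. The cluster query itself would go through the official `kubernetes` Python client (e.g. iterating `client.StorageV1Api().list_storage_class().items`); only the pure annotation check is shown here:

```python
# Standard annotation K8s uses to mark the default StorageClass.
DEFAULT_SC_ANNOTATION = "storageclass.kubernetes.io/is-default-class"

def is_default_storage_class(annotations):
    """True if a StorageClass's metadata.annotations mark it as default.

    `annotations` is the dict from sc.metadata.annotations (may be None).
    """
    return (annotations or {}).get(DEFAULT_SC_ANNOTATION) == "true"
```

The init container could then branch on this result when emitting `jupyterhub_spawner.py`, e.g. skipping dynamic-provisioning options when no default class exists.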

pdmack commented Apr 3, 2018

Yeah, I mean the KubeSpawner embeds the Python k8s client, so I was assuming that we could do some introspection from the hub pod itself. But an init container might be worthwhile as a separation of concerns and a place to generally store "outward"-facing logic. The hub UI will still need to reflect what comes back to it, though.

pdmack commented Apr 5, 2018

/assign @pdmack

ioandr commented Sep 25, 2018

Hi all,

We would like to contribute to this issue and extend the current JupyterHub Spawner UI with the option of specifying PVCs for a Jupyter Notebook. We suggest that the corresponding UI element described above should be an extra entry in the 'Advanced Options' form, just like the rest of the resources.

A user should be able to attach 0-N new, empty volumes to their notebook, so we will implement a dynamic UI component that will allow the addition and removal of volumes prior to the spawning of the Notebook. These volumes will be able to back both a workspace (/home/jovyan) and arbitrary datasets. For each volume added, the new UI element will allow the user to specify:

  1. the size of the volume
  2. the mount point of the volume
  3. the access mode of the PVC

Regarding the default behavior, we suggest that only the default workspace (/home/jovyan) is created and no additional volumes are mounted to the Notebook, as some users might not want to work with external or shared volumes. This will also make things completely backwards compatible.

We can also have an option for the user to specify an existing PVC, referenced by name, rather than a new, empty one. Please comment if this is something that would be of interest to the community; we believe it would be very helpful, as @pdmack also suggested in a previous community meeting. If this is the case, the user will need to define:

  1. the name of the existing PVC
  2. the mount point of the volume
  3. the access mode of the PVC
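To make the "new, empty volume" path concrete, the form fields above map directly onto a PVC manifest. A minimal sketch (the function name and defaults are hypothetical, not part of the proposal):

```python
def pvc_manifest(name, size, access_mode, storage_class=None):
    """Build a PersistentVolumeClaim manifest from the proposed form fields.

    name        -> metadata.name (e.g. "workspace-jovyan")
    size        -> spec.resources.requests.storage (e.g. "10Gi")
    access_mode -> spec.accessModes[0] (e.g. "ReadWriteOnce")
    """
    spec = {
        "accessModes": [access_mode],
        "resources": {"requests": {"storage": size}},
    }
    if storage_class is not None:
        # Omitting storageClassName lets the cluster default apply.
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }
```

The existing-PVC path skips this creation step entirely and only needs the claim name and mount point at pod-spec time.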

We also plan to extend the aforementioned solution and concurrently deal with two additional open issues that share similar context:

  1. JupyterHub spawner - Highlight PV requirement #541
  2. Spawner UI for GPU type selection #1457

Finally, the solution will allow for 3rd party extensions to be implemented, inheriting from the proposed UI, so custom/advanced options provided by 3rd parties to K8s can be exposed to the end users of Kubeflow/JupyterHub.

All the above are related to #56 and aim at making the default JupyterHub Spawner UI more intuitive in terms of configuring persistent storage, which as stated in the Kubeflow 0.4 Release Planning Document is a major feature for v0.4.

We are looking forward to your comments and feedback on the proposed solution. @jlewi @pdmack @inc0

jlewi commented Sep 26, 2018

In general the plan sounds good to me.

At the Kubeflow contributor summit yesterday, a couple of key points emerged:

  1. A richer Jupyter spawner UI would be useful for users

    • @pdmack mentioned he was working on some related skins
    • Lyft showed their notebook UI and talked about how giving users precanned machine specs (high-CPU, high-RAM) was useful for users (while still providing an escape hatch)
  2. Long term it might make more sense to replace JupyterHub with a simple WebApp that uses the K8s API server to manage notebooks.

This raises the following question: can we continue to iterate and improve the user-facing UI without blocking on replacing JupyterHub, which might take some time?

@ioandr Any thoughts about this? e.g.

  • How much effort would be required to implement the proposal?
  • Would the work be reusable if we switched to using our own web app on top of K8s?

/cc @pdmack

ioandr commented Sep 26, 2018

@vkoukis @iliastsi

vkoukis commented Sep 28, 2018

This raises the following question: can we continue to iterate and improve the user-facing UI without blocking on replacing JupyterHub, which might take some time?

Yes, we believe we can.

We agree completely that iterating on the user-facing UI and evolving the user experience should not block on replacing JupyterHub, especially before we have a crystal-clear view of what the shortcomings of the current JupyterHub-based approach are.

Having said that, it makes sense to ensure that when we do have a clear view of where we want to go, we don’t just have to rewrite everything, but can instead continue making piece-wise improvements to the code base.

So, a proposed plan of action is:

  • Step 1: Continue using JupyterHub for the time being. We may have to make extensive improvements to the functionality that KubeSpawner currently offers (as @pdmack also mentioned in the breakout session) to provide all desired functionality, but we will continue using the current JupyterHub -> Spawner class interface. At this point JupyterHub will just be used to trigger invocations of KubeSpawner. More on this below.

  • Step 2: Replace JupyterHub with a new, much thinner webapp, but keep the same Spawner interface, and continue using KubeSpawner. At this point we can have authentication be an external microservice, the webapp will be completely stateless, and no functionality should depend on the presence of JupyterHub.

We can focus on implementing Step 1 for now, which should be enough to improve the offered functionality in an end-to-end workflow considerably and to close this issue.

In the following, I try to describe what the goal of our PR will be:

  • [K8s-native approach] All changes to the KubeSpawner class will take advantage of being dependent on K8s underneath. Examples include spawning pods directly, using labels on pods and PVCs to filter them and determine which notebook servers a user is currently running, and using K8s PVCs to handle storage resources.
  • [PVC handling via the UI] The user will be able to specify PVCs to attach to the notebook pod, and optionally create them. For each volume to be attached, a generic UI will support either specifying an existing PVC by name or asking for a new PVC to be created. To keep things simple, a first iteration can ask the user for PVC name, size, and access mode only.
  • [Admin-settable defaults] The admin specifies the StorageClass, allowed access modes, and arbitrary vendor-specific annotations to use for PVCs created by the generic UI, when deploying Kubeflow. We can discuss whether we would like to expose more esoteric functionality [storage class names, access modes, annotations] via the UI in a subsequent iteration.
  • [K8s-based resource isolation] KubeSpawner currently creates all pods in the namespace in which it runs, i.e., the namespace in which JupyterHub runs, so all user pods share a single namespace. This has significant shortcomings, because K8s cannot differentiate between pods belonging to different users when making authorization decisions. We propose handling this with a new kubespawner_namespace parameter. If it is unset, KubeSpawner will fall back to its current behavior. If it is set, KubeSpawner will attempt to create pods in the specified namespace. If KubeSpawner treats this parameter as a template to be rendered at pod spawning time, the admin can apply arbitrary policies. One example would be setting it to {{ user }}, so pods are created in the namespace that corresponds to the JupyterHub user.
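The proposed fallback behavior for that parameter can be sketched in a few lines (the parameter is hypothetical, and Python's `str.format` stands in here for the `{{ user }}` templating mentioned above):

```python
def resolve_namespace(template, user, hub_namespace="jupyterhub"):
    """Pick the namespace for a user's notebook pod.

    template      -> the proposed kubespawner_namespace value (or None)
    user          -> the JupyterHub username
    hub_namespace -> where the hub itself runs (illustrative default)
    """
    # Unset -> current KubeSpawner behavior: spawn next to the hub.
    if not template:
        return hub_namespace
    # Set -> render per user at spawn time, e.g. "{user}" -> "alice".
    return template.format(user=user)
```

An admin could thus choose per-user namespaces (`"{user}"`), a shared pool, or any naming policy without changes to KubeSpawner itself.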

@ioandr Any thoughts about this? E.g.

  • How much effort would be required to implement the proposal?

We implemented the biggest part of this proposal for our recent demo. We plan to have a PR submitted for this issue within the next two weeks, I'll let @ioandr elaborate more on this as he makes progress with the implementation.

  • Would the work be reusable if we switched to using our own web app on top of K8s?
    /cc @pdmack

This is what we are aiming for in the plan laid out above. It is best to treat JupyterHub as a small wrapper around KubeSpawner, focus most of the changes in KubeSpawner, then switch to using a Kubeflow-specific web app which will call into the same KubeSpawner class, ideally with minimal or no changes to its interface.

Finally, does it make sense to change the title of the current issue to "Extend KubeSpawner and its UI to handle Persistent Volume Claims", so it better reflects its scope?

@jlewi jlewi changed the title Add a UI element in JupyterHub Spawner to specify volume claims Extend KubeSpawner and its UI to handle Persistent Volume Claims Sep 28, 2018
jlewi commented Sep 28, 2018

This sounds like a great plan to me!

@pdmack any thoughts?

kkasravi commented

@vkoukis @jlewi @pdmack @yuvipanda

A k8s-native approach will likely need to scrap KubeSpawner entirely, based on my own attempt to work within the KubeSpawner framework to replace KubeSpawner's proxy with Ambassador, as suggested by @yuvipanda in issue #239. I don't make this statement out of hand, and the remainder of my comment substantiates this claim. It's entirely possible that JupyterHub/KubeSpawner have evolved since 0.8.1, or that I missed an obvious approach; the JupyterHub/KubeSpawner community may have contributed some clean solutions since 0.8.1. Hopefully Yuvi has some info in this regard.

The solution for #239 (which was merged) was to keep configurable-http-proxy, which is the default proxy used by KubeSpawner, as I documented in the PR. However, this solution falls short of what we're after, and authentication with JupyterHub remains an issue, since it basically means that client_id and client_secret values must be propagated downstream from Envoy or Ambassador at the point in time when the notebook is spawned by the JupyterHub pod (e.g. when the user selects 'Spawn').

My attempt to extend KubeSpawner by replacing ConfigurableHttpProxy with one that uses Ambassador is documented. I followed the documentation on 'Writing a custom proxy implementation', and the work is preserved in the unmerged PR #817. The custom proxy code can be found here. What made this intractable, IMO, is that JupyterHub and KubeSpawner use a number of flags to track and mutate the pending state of launching the notebook, which I've noted here. The code is a complex state machine whose transitions are arbitrated by these state flags - mostly in Spawner - including Spawner.pending, Spawner._start_pending, Spawner._stop_pending, Spawner._proxy_pending, and Spawner._spawn_future. These flags may also be mutated by other classes - for example in User.py - that drive key state transitions. In the end, one has to weigh the complexity of JupyterHub's state machine against the complexity noted by @yuvipanda.

kkasravi commented

BTW my note above agrees with @vkoukis' plan of action (retain the KubeSpawner API but bypass the implementation with a new one).

pdmack commented Oct 1, 2018

Agree with @kkasravi. The KubeSpawner implementation is where we need to place much of the new work (PVC, GPU type) and rip out some other stuff (auth, reverse proxy).


inc0 commented Oct 1, 2018

how about we rip out Kubespawner?:)

vkoukis commented Oct 1, 2018

@kkasravi Your observations are very useful; thanks for taking the effort to document meticulously all the problems you encountered. I understand replacing JupyterHub with a K8s-native implementation is not going to be a light task, and the main problem is that JupyterHub/KubeSpawner maintain a lot of mutable state, which is used to track the launch of a new notebook server.

That being said, our initial goal is to get a first set of changes into the KubeSpawner implementation and its UI so that it can handle PVCs uniformly.

Then, before we can replace JupyterHub with anything else, we will need a much better understanding of the way JupyterHub interacts with KubeSpawner; that will take a deeper discussion, and we will definitely keep you in the loop on any proposed changes.

If you agree, it would be best if we can focus the discussion in this issue on the "KubeSpawner + support for PVCs" part of the problem, and continue the more open-ended discussion of how JupyterHub maintains local state and how to potentially replace it, either at #1630 or in a new issue that we can create just for this purpose.

vkoukis commented Oct 2, 2018

/area 0.4.0

ioandr added a commit to arrikto/kubeflow that referenced this issue Nov 7, 2018
* Extend default Spawner UI with two new elements: Workspace and Data Volumes
* Extend `config.yaml` of the default UI with default values and
  options for the new Spawner form fields
* Add asynchronous logic for handling K8s-native API requests in
  `spawner.py` (e.g. provision new PVCs, retrieve existing PVCs)
* Add `storageClass` ksonnet parameter to easily configure which PVC
  provisioner will be used - defaults to 'none'
* Remove `disks` and `notebookPVCMount` ksonnet parameters, along with
  related assignments in `jupyterhub_config.py`
* Remove buggy code for attaching multiple PVCs in `jupyterhub_config.py`
* Fix flake8 linting errors in `spawner.py`

Closes kubeflow#34
Closes kubeflow#541

Signed-off-by: Ioannis Androulidakis <[email protected]>
Signed-off-by: Ilias Tsitsimpis <[email protected]>
ioandr added a commit to arrikto/kubeflow that referenced this issue Nov 16, 2018
ioandr added a commit to arrikto/kubeflow that referenced this issue Nov 16, 2018
ioandr added a commit to arrikto/kubeflow that referenced this issue Nov 18, 2018
jlewi commented Nov 26, 2018

/assign @ioandr
/unassign @pdmack

@ioandr has #1918 which I believe will fix this issue.

k8s-ci-robot assigned ioandr and unassigned pdmack Nov 26, 2018
kimwnasptd pushed a commit to arrikto/kubeflow that referenced this issue Mar 5, 2019
* Chunchsiang is going to be a Google intern.
* Give him access to kubeflow-dev to try things out.
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue May 27, 2021
Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Yannis Zarkadas <[email protected]>
Github-PR: kubeflow#34
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue May 27, 2021
The Spec will contain only one field, specifying the PVC whose contents to edit.

The Status will have two fields: `Conditions`, which will mirror the
PodConditions of the underlying Pod, and `Ready`, a boolean that will be
true only if the Pod has the Ready and ContainersReady conditions.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Yannis Zarkadas <[email protected]>
Github-PR: kubeflow#34
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue May 27, 2021
The PVCViewers controller will only create a Pod, a Service, and a
VirtualService for each PVCViewer CR.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Yannis Zarkadas <[email protected]>
Github-PR: kubeflow#34
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue May 27, 2021
The manager will also have a culling controller. This controller will
periodically reconcile PVCViewers and, if they are idle, delete them
from the API Server.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Yannis Zarkadas <[email protected]>
Github-PR: kubeflow#34
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue May 27, 2021
Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Yannis Zarkadas <[email protected]>
Github-PR: kubeflow#34
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue May 27, 2021
Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Yannis Zarkadas <[email protected]>
Github-PR: kubeflow#34
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue May 27, 2021
The controller will also use an ENV variable to check:

1. what image to use for the PVC viewer pod
2. what the endpoint for a PVCViewer will be
3. what image pull policy to use (we want Always when in dev mode)

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Yannis Zarkadas <[email protected]>
Github-PR: kubeflow#34
kimwnasptd pushed a commit to arrikto/kubeflow that referenced this issue May 27, 2021
This commit modifies the Makefile to download controller-gen to a folder
inside the pvcviewer-controller.
The goal is to make sure the specific version of controller-gen is used
to generate manifests for the pvcviewer-controller.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Yannis Zarkadas <[email protected]>
Github-PR: kubeflow#34
Related-to: arrikto/dev#77
Related-to: arrikto/dev#333
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue Oct 6, 2021
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue Oct 6, 2021
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue Oct 6, 2021
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue Oct 6, 2021
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue Oct 6, 2021
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue Oct 6, 2021
kimwnasptd added a commit to arrikto/kubeflow that referenced this issue Oct 6, 2021
kimwnasptd pushed a commit to arrikto/kubeflow that referenced this issue Oct 6, 2021
VaishnaviHire pushed a commit to VaishnaviHire/kubeflow that referenced this issue Sep 19, 2022

10 participants