[Request deployment] New Hub: demo-drakkar #2049

Closed
3 of 8 tasks
auraoupa opened this issue Jan 13, 2023 · 19 comments · Fixed by #2074

@auraoupa

auraoupa commented Jan 13, 2023

Important dates

  • Target start date: 2023-01-23
  • Required start date: 2023-01-27
  • Any important dates for usage: 2023-01-30

Hub Authentication Type

GitHub (e.g., @MyGitHubHandle)

First Hub Administrators

Aurélie Albert, [email protected], @auraoupa
Takaya Uchida, [email protected], @roxyboy

[GitHub Auth only] How would you like to manage your users?

Manually, by adding specific GitHub handles in the JupyterHub Admin panel

[GitHub Teams Auth only] Profile restriction based on team membership

No response

Hub logo image URL

https://drakkar2023.sciencesconf.org/data/header/DrakkarOcean.png

Hub logo website URL

https://drakkar2023.sciencesconf.org/

Hub user image GitHub repository

https://github.com/auraoupa/hub-user-image-drakkar-demo

Hub user image tag and name

quay.io/auraoupa/drakkar-demo:62fa0b23aca5

Extra features you'd like to enable

  • Specific cloud provider or datacenter (otherwise GCP)
  • Dedicated Kubernetes cluster
  • Scalable Dask Cluster

(Optional) Preferred cloud provider

GCP (preferred)

(Optional) Billing and Cloud account

None

Other relevant information to the features above

No response

Tasks to deploy the hub

  • 1. Deploy information filled in above
  • 2. Engineer who will deploy the hub is assigned
  • 3. If using GitHub Orgs/Teams Auth, Engineer is given Owner rights to the org to set this up.
  • 4. Initial Hub deployment PR
  • 5. Administrators able to log on -> Hub now in steady-state
@jmunroe
Contributor

jmunroe commented Jan 16, 2023

See https://github.com/2i2c-org/leads/issues/105 for related lead information.

@damianavila assigned consideRatio and unassigned colliand Jan 18, 2023
@consideRatio
Member

Hi @auraoupa and @jmunroe!

I'm looking into deploying this hub and would like to verify the following points first. Does this look right to you?

  • I understand this as a request to add a dedicated JupyterHub to run in the pre-existing GCP based meom-ige cluster.
  • We need a domain name for this hub, so would drakkar.meom-ige.2i2c.cloud be okay or should we choose something else?
  • The JupyterHub should be a standard jupyterhub without dask-gateway integration, so even though meom-ige.2i2c.cloud is what we call a "daskhub" where users can start and manage their own dask clusters with a dask_gateway client, this would be what we call a "basehub" where users can't use dask_gateway.

/ Erik

@auraoupa
Author

Hi @consideRatio,
Thanks for taking care of setting up our demonstration hub!
My answers:

  • I understand this as a request to add a dedicated JupyterHub to run in the pre-existing GCP based meom-ige cluster.

Yes, it will only be used for a 3-day demonstration at the Drakkar meeting, by around 40 people at a time.

  • We need a domain name for this hub, so would drakkar.meom-ige.2i2c.cloud be okay or should we choose something else?

I was suggesting drakkar-demo.2i2c.cloud, but yours is fine too.

  • The JupyterHub should be a standard jupyterhub without dask-gateway integration, so even though meom-ige.2i2c.cloud is what we call a "daskhub" where users can start and manage their own dask clusters with a dask_gateway client, this would be what we call a "basehub" where users can't use dask_gateway.

I am not sure about this one. I want the users to be able to launch a cluster in their notebook and do parallel computation, but they do not necessarily need to scale it or to choose the size of the server when logging in.

@consideRatio
Member

Thank you @auraoupa!

Does the code you use to "launch a cluster" involve "import dask_gateway"?

The concept of a cluster is vague: you can have a local cluster that uses many processes on the same server, or you can use dask_gateway to start external servers that your notebook communicates with.
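
To illustrate the distinction, here is a minimal sketch of a local cluster built with dask.distributed alone. It assumes the dask and distributed packages are present in the user image; the function name run_local_demo and the worker counts are hypothetical. A dask_gateway cluster would instead be requested through a Gateway client and would run its workers on separate servers.

```python
from dask.distributed import Client, LocalCluster


def square(x):
    return x * x


def run_local_demo(n_workers=2, processes=True):
    """Sum the squares of 0..9 on a cluster that lives on the user's own server."""
    # LocalCluster starts a scheduler and workers on the current machine only;
    # no extra servers are requested from the cloud provider.
    with LocalCluster(
        n_workers=n_workers,
        threads_per_worker=1,
        processes=processes,     # False runs the workers as threads instead
        dashboard_address=None,  # skip the diagnostics dashboard in this sketch
    ) as cluster, Client(cluster) as client:
        futures = client.map(square, range(10))
        return sum(client.gather(futures))


# By contrast, the dask_gateway route (not run here) starts with
# "from dask_gateway import Gateway" and asks the gateway for remote workers.
```

In a notebook, calling run_local_demo() uses worker processes on the user's own server, which matches the "only dask.distributed" case.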

@auraoupa
Author

No dask_gateway indeed, only dask.distributed

@consideRatio
Member

@auraoupa the hub is now available at https://drakkar-demo.meom-ige.2i2c.cloud/

This hub is not exactly like https://meom-ige.2i2c.cloud/, with some differences I want to highlight:

  1. Anything stored in the "shared" folder at drakkar-demo is separate from what you put in the shared folder at the other hub
  2. Users are not set up with credentials to access a "scratch bucket"
  3. Users are not able to start dedicated dask workers with dask_gateway via python code involving import dask_gateway

Does the hub at https://drakkar-demo.meom-ige.2i2c.cloud/ meet your needs for the drakkar-demo event?

@auraoupa
Author

Nice, the hub looks fine! I already have to change the Docker image, as I forgot some libraries in the previous one... But I will do it via the configuration in the control panel. Can I already use it to test my demo, or is there still work on your side before it is operational? Thanks @consideRatio, that was fast!

@consideRatio
Member

Excellent @auraoupa! You can absolutely use the configurator at https://drakkar-demo.meom-ige.2i2c.cloud/services/configurator/ to choose a new image to use, and you can use the hub as you wish already.

@auraoupa
Author

Hi @consideRatio, sorry to reopen this thread: I have a small issue when opening a dataset from the Pangeo catalog that I do not have on the meom-ige cloud deployment (even with the same Docker image).

I try to open the data with:

from intake import open_catalog
cat = open_catalog("https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/ocean.yaml")
ds  = cat["sea_surface_height"].to_dask()

and after a long wait I get:

ValueError: Bad Request: b/pangeo-cmems-duacs
User project specified in the request is invalid.

Is there anything different between the two hubs that could explain this behaviour?

@consideRatio
Member

@auraoupa phew, I managed to resolve this on my third attempt! Thank you so much for providing both a well-supported diagnosis and a way for me to reproduce the issue and check whether I had resolved it!

@auraoupa
Author

Thank you so much for fixing this @consideRatio, I don't have to modify my demo for Monday then!
One last request: is it possible to limit the server spawning options to only Large (16 CPU, 64 GB RAM) so that users don't choose the other ones?

@consideRatio
Member

Yes, I will get it done!
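
One way such a restriction could be expressed is sketched below, assuming the hub's server choices are defined as a KubeSpawner `profile_list`. The profile names, the sizes of the non-Large entries, and the helper only_profile are hypothetical; the actual change made for this hub is not shown in this thread.

```python
# Hypothetical profiles in the shape KubeSpawner's `profile_list` expects.
PROFILES = [
    {"display_name": "Small", "kubespawner_override": {"cpu_limit": 2, "mem_limit": "8G"}},
    {"display_name": "Medium", "kubespawner_override": {"cpu_limit": 8, "mem_limit": "32G"}},
    {"display_name": "Large", "kubespawner_override": {"cpu_limit": 16, "mem_limit": "64G"}},
]


def only_profile(profiles, display_name):
    """Return a new list holding just the named profile, marked as the default."""
    kept = [dict(p, default=True) for p in profiles if p["display_name"] == display_name]
    if not kept:
        raise ValueError(f"no profile named {display_name!r}")
    return kept


# In a jupyterhub_config.py this could be applied as (assumption):
# c.KubeSpawner.profile_list = only_profile(PROFILES, "Large")
```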

@auraoupa
Author

Perfect! Thank you for all your help deploying this hub, I guess all is ready for it to run this Monday at 9am (CET) 😃
Since it would be very early for you, is there someone based in Europe that I can contact in case something goes wrong?

@consideRatio
Member

@auraoupa yes, contact us via https://2i2c.freshdesk.com/support/home (or email [email protected]).

I want to make sure that we do what we can to accommodate the users. While cloud providers can provide many machines, we must request the ability to start many servers, and they must approve it.

How many users will start and run servers at the same time? If they all are to be granted a 16 CPU machine, I suspect I must also request an increase in the allocated quota granted by the cloud provider. Using the same quota, you can fit twice as many users with 8 CPU nodes btw.
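
The quota arithmetic can be sketched as follows, under the simplifying assumption that each user occupies a whole node and that the quota is counted in CPUs. The quota value of 640 is hypothetical and is not this project's real quota.

```python
def max_concurrent_users(cpu_quota, cpus_per_user):
    """Number of single-user nodes that fit within a CPU quota."""
    if cpus_per_user <= 0:
        raise ValueError("cpus_per_user must be positive")
    return cpu_quota // cpus_per_user


CPU_QUOTA = 640  # hypothetical value, chosen only for illustration

users_on_16_cpu = max_concurrent_users(CPU_QUOTA, 16)  # 40
users_on_8_cpu = max_concurrent_users(CPU_QUOTA, 8)    # 80
```

With the same quota, halving the node size doubles the number of users that fit, which is the trade-off described above.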

@auraoupa
Author

OK, it will be 40 people maximum, so we can make do with 8 CPU nodes for sure. I can always ask them to pair up if that is too many at the same time...

@consideRatio
Member

@auraoupa I've checked the quotas, and it seems it should be fine even if you end up with 100 people!

I just learned that Google's cloud, which we use here, is far more generous with the quotas it provides than Amazon's cloud. So this wasn't really an issue.

@colliand
Contributor

Hi @auraoupa and @roxyboy! Should 2i2c decommission this hub since the event is now over? Please let us know.

@colliand reopened this May 22, 2023
@colliand self-assigned this May 22, 2023
@roxyboy

roxyboy commented May 22, 2023

Yes, I think this should be decommissioned.

@damianavila
Contributor

Decommissioned via #2571.
