Prevent server from getting stopped during long simulations #105

Closed
JordiBolibar opened this issue Jan 31, 2022 · 12 comments · Fixed by #155
Labels
bug Something isn't working 🏷️ JupyterHub Something related to JupyterHub

Comments

@JordiBolibar

So far I am able to run long simulations as long as I keep my session more or less active (i.e. during my working hours). However, as soon as I leave it running during the night, my server is eventually stopped and the simulations get interrupted. I have configured my SSH file .ssh/ssh_config with keep alive as follows:

TCPKeepAlive yes
ClientAliveInterval 30
ClientAliveCountMax 240

X11Forwarding yes
X11UseLocalhost no

Is there anything else I should do in order to avoid that? What am I missing? Thanks in advance!

@consideRatio
Member

consideRatio commented Jan 31, 2022

Ah thanks for reporting this @JordiBolibar, this is a novel situation for me. The automated logic to terminate an "inactive" server is quite intricate, but it won't check for open SSH connections as a sign of activity.

  1. I've opened an issue about this in yuvipanda/jupyterhub-ssh, the project that has enabled us to have SSH connections to our Jupyter servers, so we can potentially get help solving this more thoroughly.
  2. We need to have a workaround for now.
    I'm quite confident you can accomplish a workaround if you visit your server via the web UI (hub.jupytearth.org), start any notebook, and keep it running like...
    import time
    # to avoid a mistake keeping a server running for months on end
    # adding costs to the cloud bill, avoid putting in 999999999999 etc.
    time.sleep(3600*24)

@JordiBolibar try the workaround for now and let me know if it works. I'm 99% confident it will prevent the server from being stopped automatically.
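
A slightly more defensive variant of the same workaround, offered as a sketch under assumptions rather than something tested on this hub: sleep in short intervals up to a hard upper bound, so the cell stays busy but can never run longer than intended. The name `keep_busy` and its parameters are hypothetical.

```python
import time


def keep_busy(total_seconds, interval=60, sleep=time.sleep):
    """Block for roughly total_seconds, sleeping in short intervals.

    Returns the number of intervals slept. The `sleep` argument exists
    only so the function can be exercised without actually waiting.
    """
    # Hard cap of 24 hours, mirroring the advice above about cloud costs.
    total_seconds = min(total_seconds, 3600 * 24)
    ticks = 0
    remaining = total_seconds
    while remaining > 0:
        step = min(interval, remaining)
        sleep(step)
        remaining -= step
        ticks += 1
    return ticks
```

In a notebook cell, `keep_busy(3600 * 8)` would keep the kernel busy for about eight hours; interrupting the kernel ends it early.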

@consideRatio consideRatio added 🏷️ JupyterHub Something related to JupyterHub bug Something isn't working labels Jan 31, 2022
@JordiBolibar
Author

Hi Erik, thanks for the quick reply. I am in fact connecting through the web UI. I forgot that I'm no longer connecting from VSCode over SSH, since VSCode is now integrated in the browser. So indeed, the SSH configuration I posted is useless.

And thanks for the workaround. I will implement the same thing in Julia and see if that does the trick.

@consideRatio
Member

Oh hmmm

  1. Do you think the bug I reported in "Server connected to via SSH culled by inactivity - can we avoid it?" (yuvipanda/jupyterhub-ssh#67) is correct or incorrect?
  2. Is it correct that you are using visual studio code via the web interface (hub.jupytearth.org)?
  3. Can you clarify what "running a simulation" implies? Are you running something from a terminal opened in the hub.jupytearth.org based visual studio code UI?

> And thanks for the workaround. I will implement the same thing in Julia and see if that does the trick.

The goal is to keep the server from being stopped, and for that you need to be seen as active. This workaround relies on having a "jupyter kernel" running at all times. What matters isn't whether you run it in Julia or Python, but that an actual registered kernel is running. In practice, you can visit https://hub.jupytearth.org/hub/user-redirect/lab, start a Python notebook and associated kernel running this sleep command, and then go to https://hub.jupytearth.org/hub/user-redirect/vscode and keep working.

I think you can list the running kernels from a terminal with jupyter server list.
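
As far as I know, `jupyter server list` prints the running servers with their URLs and tokens. To look at kernels specifically, one option is the Jupyter Server REST API, whose `/api/kernels` endpoint returns a JSON list of kernel models including `execution_state` and `last_activity`. A minimal sketch, assuming a reachable server; `base_url` and `token` are placeholders you would take from your own session, and the helper names are hypothetical:

```python
import json
import urllib.request


def summarize_kernels(kernels):
    """Reduce the kernel models returned by /api/kernels to a few fields."""
    return [
        (k.get("name"), k.get("execution_state"), k.get("last_activity"))
        for k in kernels
    ]


def fetch_kernels(base_url, token):
    """Query a Jupyter server for its running kernels.

    base_url and token are assumptions, e.g. the values printed by
    `jupyter server list` for a server reachable from this machine.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/kernels",
        headers={"Authorization": f"token {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `summarize_kernels(fetch_kernels(base_url, token))` would then show whether the sleeping kernel is still reported as busy.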

@JordiBolibar
Author

JordiBolibar commented Jan 31, 2022

  1. It depends if code-server is connecting via SSH or not. I'm not sure if it needs to do so or if it is run in the same server.
  2. Yes, I'm using VSCode via code-server directly in the browser, the one present in the launcher.
  3. Yes, I'm running a Julia simulation from the VSCode browser version from the hub UI.

OK, I will stick with the JupyterNotebook trick for now.

@facusapienza21
Collaborator

I want to come back to this issue that @JordiBolibar opened some time ago.

I have been using the time.sleep() trick for some time now, and I've noticed that the kernel of that notebook also dies without warning before the established time. I wonder if there is a more stable way to keep a server running for several hours, especially for long and computationally expensive simulations. @consideRatio do you have any idea how to accomplish this? Would running the sleep command from a terminal have the same effect?

Thank you!

@yuvipanda
Member

I think the way to do this is to:

  1. Write a jupyter server extension that has a UI that says 'keep this server alive'
  2. When it is enabled, it'll keep reporting that it is active to the server's API
  3. This will ensure that the multiple killers we have (idle and in-server) don't get to it.

So users can go to this page, and say 'keep alive for 8h, 24h, until turned off' etc
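
The bookkeeping behind such a toggle could look roughly like the sketch below. It is purely illustrative and not the code of any existing extension: the `KeepAlive` class and the `report_activity` callback are hypothetical, and how activity is actually reported to the server is left out.

```python
import time


class KeepAlive:
    """Tracks a keep-alive deadline; hypothetical, not a real extension."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._deadline = None  # None means keep-alive is switched off

    def enable(self, seconds):
        """Keep the server alive for `seconds` from now."""
        self._deadline = self._clock() + seconds

    def disable(self):
        """Switch keep-alive off immediately."""
        self._deadline = None

    def remaining(self):
        """Seconds left, or 0 when switched off or expired."""
        if self._deadline is None:
            return 0
        return max(0, self._deadline - self._clock())

    def tick(self, report_activity):
        """Call periodically; reports activity only while enabled."""
        if self.remaining() > 0:
            report_activity()
            return True
        return False
```

With this shape, "keep alive for 8h" is `enable(8 * 3600)`, "until turned off" could be `enable(float("inf"))`, and a periodic task would call `tick(...)` with whatever function marks the server as active.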

@consideRatio
Member

@minrk's work on https://github.com/minrk/jupyter-keepalive addresses this quite well I think.

keepalive.mov

With #155 I'll add it to the base image.

@JordiBolibar
Author

Hi @consideRatio, I'm still having issues with this. I cannot find the Keep server alive option, and running the notebook with a sleep command still doesn't work. I cannot run long simulations since my server gets disconnected after a short while.

Is it normal that I cannot access the option you displayed above in the video? Thanks a lot in advance!

@consideRatio
Member

consideRatio commented Jan 12, 2023

Ah, I ended up disabling it when trying to resolve a very challenging upgrade of other packages with coupled dependencies.

# https://github.com/minrk/jupyter-keepalive/archive/main.zip \
# This is a jupyter_server extension that is controllable via a
# JupyterLab plugin to keep a server running.
#
# ref: https://github.com/minrk/jupyter-keepalive
#
# NOTE: Disabled as we don't have nodejs installed, making us
# require a pre-built wheel or installation of nodejs.

I'll see if I can upstream a resolution to this by getting the package built and published so that nodejs isn't required.

@consideRatio
Member

I opened minrk/jupyter-keepalive#4 @JordiBolibar.

I'll see if I can re-configure something to help you avoid getting shut down.

@JordiBolibar
Author

Awesome, thanks a lot for your help!

@consideRatio
Member

@JordiBolibar I've not dropped the ball on this, but I'm swamped with work items. There is progress to getting jupyter-keepalive to help us here, so I'm currently aiming for that as a resolution.
