Restoring computation output after disconnect in Notebook #641

hadim · 2015-10-22T13:31:54Z

Following this discussion https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/jupyter/8hyGKoBY6O0/RyEDDyOZAQAJ, I'd like to open an issue to track down progress on this feature.

The idea would be to add the capability to restore output of a running kernel after client reconnection.

Feel free to ask me to move this issue in a more appropriate jupyter project if needed.

rshpeley · 2016-10-19T04:30:49Z

Hey Carreau. I noticed on the google forum you said the code would need a significant refactor, and MinRK said it involved storing outputs on the server. In my case I'm ok with losing the outputs during the disconnect. I just need to be able to reconnect the browser session to the kernel to get new outputs and to get at the data and code I've loaded. Kernel/Reconnect doesn't do it. I've also tried closing the tab and opening a new one from the Jupyter base page.

This is fairly important for me. I live in a rural area with a poor internet connection and my notebook takes about 8 hours to load. I'm pretty well guaranteed to have an interruption during that time.

rshpeley · 2016-10-23T16:01:31Z

Still looking into this. This is currently just a problem for me on Jupyter, but I'm more interested in getting it solved for JupyterHub. I'm planning to use it to instruct a class at the university and students losing access to their assignment work through a disrupted connection is not an option.

mosh have solved this problem on terminals and it seems to me that using mosh's SSP (State Synchronization Protocol) with Speculation adapted to Jupyter Notebook requirements would be the best solution to this problem (see the mosh paper).

If you want to hand this problem off to JupyterHub that's ok with me, but others may like to see a solution for Jupyter as well.

minrk · 2016-10-25T11:39:40Z

This isn't really a JupyterHub problem. It should be fixed in the notebook (specifically JupyterLab), hopefully before too long.

post2web · 2017-03-01T15:11:49Z

This feature would be great!

manuelsh · 2017-03-27T08:46:14Z

Loosing the output of a Jupyter notebook if you are not connected to the hosting computer is a problem, although you can do some hacks. Having the capacity of recovering it would be great.

ghost · 2017-05-19T16:47:43Z

Just a side note: Apache Zeppelin notebook with Python interpreter doesn't have this problem, as it handles disconnects or multiple connects during tasks executions transparently. But it has its own problems: it loses interactive output for a running cell after a disconnect, although when a task is done it eventually shows all its output.

louismartin · 2020-03-02T20:01:08Z

Is there any update on this issue?

jasongrout · 2020-03-03T21:40:06Z

Not a huge amount beyond the discussion and issues linked above, that I know of.

There is (slow, contributors welcome!) work going on in JupyterLab around real-time collaboration, which also involves moving notebook state to the server, which solves this issue. @rgbkrk, @captainsafia and others have also experimented with moving state to the server in the nteract project. And @williamstein's cocalc implementation of a notebook frontend does have this feature - it saves state on the server, so you can reconnect whenever you want and get the full state.

So there have been some initial experiments (jlab, jupyter_server, nteract), and at least one full implementation tied to a platform (cocalc).

cossio · 2020-09-11T14:15:46Z

Any news on this? Is there an issue / PR that one can check to track progress on this?

rgbkrk · 2020-10-15T20:22:55Z

Follow this repository for the related effort: https://github.com/jupyterlab/rtc

apophenist · 2020-11-21T13:55:19Z

Hi friends,

I have this problem often, when the output buffer stops (I use Atom/Hydrogen), and I lose visibility of long running processes.

I have found a bit of a workaround which may be helpful to others (but it requires this to be done upfront, i.e. there is no way that I have found to resume output from existing processes).

The solution requires SSH access, and involves redirecting logging to a file as follows:

    import sys
    log_file = 'results.log'
    sys.stdout=(open(log_file,"w"))
    clf = RandomizedSearchCV(RandomForestRegressor(random_state = 0), param_distributions = param_grid, n_iter = 200, cv=3, random_state = 42, verbose = 51, n_jobs=8)
    sys.stdout.close()

My example is arbitrarily for a RandomizedSearchCV.

However, buffering in std.out proved to be an issue, so Magnus Lycka's referenced answer proved helpful in overriding this:

class Unbuffered(object):
   def __init__(self, stream):
       self.stream = stream
   def write(self, data):
       self.stream.write(data)
       self.stream.flush()
   def writelines(self, datas):
       self.stream.writelines(datas)
       self.stream.flush()
   def __getattr__(self, attr):
       return getattr(self.stream, attr)

and replace above sys.stdout=(open(log_file,"w")) with:
sys.stdout=Unbuffered(open(log_file,"w"))

Now, you can ssh into the machine, run docker exec -it image_name bash and tail -f the log file.

Hope this helps, it seems to be the best solution for now.

banderlog · 2021-07-01T18:20:01Z

Just met this issue with losing 5h computations on GPU. Used AI notebook in GCP. Thought that it will do plots with a closed browser, but I thought wrong :(

astrojuanlu · 2021-07-01T18:44:13Z

There is (slow, contributors welcome!) work going on in JupyterLab around real-time collaboration, which also involves moving notebook state to the server, which solves this issue.

To my understanding, Real Time Collaboration is now supported in JupyterLab jupyterlab/jupyterlab#5382

@jasongrout as I read in https://blog.jupyter.org/how-we-made-jupyter-notebooks-collaborative-with-yjs-b8dff6a9d8af

Currently, we still send HTTP PUT requests to save a document to the file system via the Jupyter Server. Once we have the Yjs CRDT working in Python, we will create a Python implementation of the shared notebook model which will allow the Jupyter Server to directly access the collaborative state and synchronize that with the file system.

It is not clear to me if the notebook state is now handled by the server and therefore this issue is fixed, or if it is still present.

jasongrout · 2021-07-01T19:04:10Z

To my understanding, Real Time Collaboration is now supported in JupyterLab jupyterlab/jupyterlab#5382

This will be released in the upcoming 3.1 release, behind a command-line flag.

It is not clear to me if the notebook state is now handled by the server and therefore this issue is fixed, or if it is still present.

This is still present for now, but the infrastructure needed to move things to the server is being put in place by the RTC work. Moving state to the server is another big project that builds on the RTC work being done now.

cossio · 2021-08-31T21:05:57Z

What issue / repo / PR should one follow to get relevant updates on this issue? Will it be resolved on this repo or on jupyterlab?

tgpfeiffer · 2022-04-13T02:01:34Z

Related to #641 (comment): I have tried the Jupyter collaboration feature in the latest release (3.3.3) and I found out that if user A starts a long-running computation, then the output is synced across browsers to all collaborators and they can see it in real-time, reload their browser windows etc., but if A does it, then we are back at the place of this issue, the output goes into the void and cannot be accessed later.

So as written by @jasongrout above this still seems to be a WIP. I'd be interested to contribute some work here, where is it happening? Any main issue, meetings etc. where this feature is driven forwards?

Wh1isper · 2022-04-22T07:02:45Z

Related to #641 (comment): I have tried the Jupyter collaboration feature in the latest release (3.3.3) and I found out that if user A starts a long-running computation, then the output is synced across browsers to all collaborators and they can see it in real-time, reload their browser windows etc., but if A does it, then we are back at the place of this issue, the output goes into the void and cannot be accessed later.

So as written by @jasongrout above this still seems to be a WIP. I'd be interested to contribute some work here, where is it happening? Any main issue, meetings etc. where this feature is driven forwards?

I am also interested in this issue, which is a problem of the overall Jupyter architecture. Jupyter puts all file-related operations on the front end, which disappears after closing the tab page, naturally leading to the problem of missing output

Looking at the behavior of JupyterLab, I found that JupyterLab itself has some problems with the kernel state loop. For example, when reconnecting to a kernel that is performing a computation task, JupyterLab will lose the ability to track its output, such as the following code.

import time 
while True:
  time.sleep(1)
  print(1)

I think the reconnection problem should be solved first, and the loss of output during front-end shutdown is probably tolerable.

Second, the best solution is to leave the output of the computation task to the backend and return it to the frontend appropriately.

davidbrochart · 2022-05-17T15:50:24Z

But it already works with Jupyter 3.4.2, so this is not related to jupyterlab/jupyterlab#12360.

cossio · 2022-05-17T15:57:02Z

@davidbrochart Wowwww had no idea! So if I just upgrade to latest Jupyterlab I will get this fix?

EDIT: Does it work with other languages besides Python?

davidbrochart · 2022-05-17T16:04:25Z

To be honest I'm as surprised as you, nothing has been done to fix this issue, and it's not related to RTC at all. Maybe it has always been working in JupyterLab?
It's not specific to Python, it should work with other languages too.

cossio · 2022-05-17T16:10:30Z

Hmm doesn't seem to be working for me (I'm using Julia).

What version of the Jupyter server you are using?

davidbrochart · 2022-05-17T16:13:38Z

Could you post a screencast?

cossio · 2022-05-17T16:27:36Z

Screencast.from.17-05-22.18_25_55.mp4

@davidbrochart I tried to replicate your video, with Python. Notice here that the outputs keep coming in even after the Throttling setting is set to Offline, so I'm not sure this is a good experiment?

davidbrochart · 2022-05-17T16:29:49Z

Outputs are updating event when you're offline 😄
I saw that too, maybe Google Chrome's dev tool is broken?

cossio · 2022-05-17T16:30:37Z

Maybe. If I try to refresh the notebook browser window (hit F5), however, I lose all output 😞

davidbrochart · 2022-05-17T16:32:22Z

Ah, but that's another issue. You might want to continue the discussion in jupyterlab/jupyterlab#12455.

williamstein · 2022-05-17T18:48:31Z

Maybe. If I try to refresh the notebook browser window (hit F5), however, I lose all output 😞

IIRC @rgbkrk (?) told me he added a hack long ago to Jupyter so that it would save up and push out messages that were missed when the user's network connection is down temporarily. As you note, it was just a small hack, and doesn't work when you refresh your browser. Maybe @rgbkrk has further thoughts...

davidbrochart · 2022-05-17T19:14:18Z

But the long-term fix is to move all kernel management (low-level kernel protocol) from the front-end to the back-end. In the end JupyterLab would merely be a (live) document viewer, and wouldn't talk to kernels directly.

Wh1isper · 2022-05-20T06:18:18Z

Screencast.from.17-05-22.18_25_55.mp4
@davidbrochart I tried to replicate your video, with Python. Notice here that the outputs keep coming in even after the Throttling setting is set to Offline, so I'm not sure this is a good experiment?

This is probably because Google Chrome's dev tool does not disconnect websockets that are already connected

Kesanov · 2022-08-03T07:03:42Z

This issue is my biggest pain with jupyter by far.
However, judging by the progress made since 2015,
it looks like this won't get ever implemented
despite this being probably the most upvoted issue here -.-

Since it seems upvotes fail as means of motivation,
would it be possible to put a bounty on this issue
(where people could donate small amount of money
as easily as hitting the upvote button)
that would be paid to the person who fixes it?

astrojuanlu · 2022-08-03T07:27:44Z

The current status of this issue is a bit unclear to me. jupyterlab/jupyterlab#12360 was merged, which in theory addressed it at least partially, but then there's also jupyterlab/jupyterlab#12455, which "would probably need a complete rework of JupyterLab's architecture". Additionally, some videos were posted in the latest comments but as @Wh1isper points out, they were using Chrome Dev Tools and therefore are not representative.

Would the devs add some clarification here? i.e. what changed after jupyterlab/jupyterlab#12360 and what remains.

davidbrochart · 2022-08-03T07:32:25Z

Load/auto-save document from the back-end using y-py jupyterlab/jupyterlab#12360 doesn't address this issue.
Add POST execute cell jupyter-server/jupyverse#191 is a step in that direction, but it's very experimental.

Wh1isper · 2022-08-03T14:48:41Z

@Kesanov @astrojuanlu Thank you for your continued interest in this issue

The biggest problem is that the storage of the files is on the back end (Server), but the saving of the cell output relies on the front end (Lab). This results in a situation where we lose the output forever after the front-end (Lab) is disconnected for some reason. The simplest is to implement a circuit breaker: when the front-end session is all disconnected, Server begins to write the cell output directly to the file.

I am often busy with my own work, and my vision for this problem is to build an OT-based intermediate service for connecting Jupyter Server and Lab, so that the two functions of multi-person collaboration and back-end preservation can be done simultaneously, with this intermediate service controlling when to write to Jupyter Server. the drawback of this solution is that the performance of the intermediate service may be challenged.

I haven't had time to practice my idea yet, so this is only on paper and is not compatible with the current Jupyter Lab implementation of multiplayer collaboration.

@davidbrochart Thanks to the Jupyter Server Team for their contribution on this issue. Currently Jupyter Lab has implemented CRDT-based multi-person collaboration, and it is a good option for the back-end to use itself as a front-end to actively push messages, but this requires some modification of the Server's Kernel module to make it fully accessible to the CRDT model. I may follow up on this project when I have time, it will be an exciting feature!

Wh1isper · 2022-09-14T10:23:49Z

I found a discussion here that in this issue, the cache of the notebook does not work because the browser loses the previous msg_id after refreshing, so I propose here that jupyterlab should store the msg_id in ipynb to ensure message recovery after refreshing

here is my issue jupyterlab/jupyterlab#12455

So I think the difficulty of implementing the jupyter messaging mechanism on the client side of the browser environment is one of the root causes of this problem

Wh1isper · 2022-12-07T16:38:30Z

JupyterLab 4.0 already has the ability to push file changes from local to the front end, and I'm trying to write a plugin that executes cell code and writes to files

https://github.com/Wh1isper/jupyter_kernel_executor

Welcome discussion here: jupyter-server/jupyter_server#1123

TinoDerb · 2024-02-26T12:05:11Z

I came across this problem not so long ago and I went to stackoverflow. Eventually I figured out that there was not straight-forward to solve this problem. So I went with a run-around instead. I saved all the values that I wanted in a file, which I then open in another notebook on the same IP. From there, I can open, plot, and display summaries without disturbing the measurement/calculation notebook.

Wanted to post here in case this would be a suggestion for others who are seeking an answer to this question

jtpio · 2024-02-26T12:33:03Z

Note there is some work happening upstream in JupyterLab to be able to customize the executor used when running cells: jupyterlab/jupyterlab#15830

This will allow for addressing issues like jupyterlab/jupyterlab#2833. If this lands in JupyterLab 4.2, it will also be available in Notebook 7.2.

drf5n · 2024-04-17T22:27:38Z

If you know you are going to lose the connection, you can capture the output for use later:

%%capture CELLOUT
for i in range(5):
    print(i)
    time.sleep(5)

next cell:

print(CELLOUT.stdout)

Carreau added this to the 5.0 milestone Oct 22, 2015

minrk modified the milestone: 5.0 Jan 13, 2017

AdrienLemaire mentioned this issue Feb 23, 2017

Reconnect to running session #1150

Closed

daikema mentioned this issue Jul 14, 2017

Missing output when temporarily disconnected while running jupyterhub/jupyterhub#1211

Closed

oersted mentioned this issue Aug 15, 2017

Reconnect to running session: keeping output jupyterlab/jupyterlab#2833

Closed

jasongrout mentioned this issue Sep 27, 2018

keep notebook running after the browser tab closed #1647

Closed

rgbkrk mentioned this issue Feb 22, 2019

Currently being generated cell output disappears once you close/reload the page jupyterlab/jupyterlab#6003

Open

telamonian mentioned this issue Jul 31, 2020

JupyterLab vision for the next few years jupyterlab/frontends-team-compass#80

Open

kevin-bates mentioned this issue Jul 7, 2021

I am training a deep learning model in Jupyter using Keras. When I close out of the window and open it again, the window no longer updates my training progress in real time. #6103

Closed

Zsailer added the enhancement label Feb 3, 2022

Zsailer added this to the Future milestone Feb 3, 2022

JasonWeill mentioned this issue Feb 24, 2022

Weekly Triage meetings: Feb-Jun 2022 jupyterlab/frontends-team-compass#137

Closed

lumbric mentioned this issue Aug 23, 2022

Get output after closing and reopening the browser tab with notebook jupyterlab/jupyterlab#12422

Closed

Wh1isper mentioned this issue Dec 7, 2022

Extension for POST Execute cell jupyter-server/jupyter_server#1123

Closed

Wh1isper mentioned this issue Dec 7, 2022

About this project Wh1isper/jupyter_kernel_executor#1

Open

kevin-bates mentioned this issue Feb 1, 2023

Notebook not reconnecting to cell output in progress on refresh jupyter/nbclassic#213

Open

kevin-bates mentioned this issue Jun 8, 2023

Meeting Notes 2023 jupyter-server/team-compass#45

Closed

ivanov mentioned this issue Dec 12, 2023

Output do not update after close and reopen the notebook page jupyter/jupyter#83

Closed

jtpio modified the milestones: Future, Backlog Feb 27, 2024

krassowski mentioned this issue Mar 6, 2024

Feature request: Be able to receive cell output from a remote kernel after reconnecting ipython/ipython#4140

Closed

Restoring computation output after disconnect in Notebook #641

Restoring computation output after disconnect in Notebook #641

Comments

hadim commented Oct 22, 2015

rshpeley commented Oct 19, 2016

rshpeley commented Oct 23, 2016

minrk commented Oct 25, 2016

post2web commented Mar 1, 2017

manuelsh commented Mar 27, 2017 • edited Loading

ghost commented May 19, 2017 • edited by ghost Loading

louismartin commented Mar 2, 2020

jasongrout commented Mar 3, 2020 • edited Loading

cossio commented Sep 11, 2020

rgbkrk commented Oct 15, 2020

apophenist commented Nov 21, 2020 • edited Loading

banderlog commented Jul 1, 2021

astrojuanlu commented Jul 1, 2021

jasongrout commented Jul 1, 2021

cossio commented Aug 31, 2021

tgpfeiffer commented Apr 13, 2022

Wh1isper commented Apr 22, 2022 • edited Loading

davidbrochart commented May 17, 2022

cossio commented May 17, 2022 • edited Loading

davidbrochart commented May 17, 2022

cossio commented May 17, 2022 • edited Loading

davidbrochart commented May 17, 2022

cossio commented May 17, 2022 • edited Loading

davidbrochart commented May 17, 2022

cossio commented May 17, 2022

davidbrochart commented May 17, 2022

williamstein commented May 17, 2022

davidbrochart commented May 17, 2022

Wh1isper commented May 20, 2022

Kesanov commented Aug 3, 2022

astrojuanlu commented Aug 3, 2022 • edited Loading

davidbrochart commented Aug 3, 2022

Wh1isper commented Aug 3, 2022

Wh1isper commented Sep 14, 2022 • edited Loading

Wh1isper commented Dec 7, 2022 • edited Loading

TinoDerb commented Feb 26, 2024

jtpio commented Feb 26, 2024

drf5n commented Apr 17, 2024

manuelsh commented Mar 27, 2017 •

edited

Loading

ghost commented May 19, 2017 •

edited by ghost

Loading

jasongrout commented Mar 3, 2020 •

edited

Loading

apophenist commented Nov 21, 2020 •

edited

Loading

Wh1isper commented Apr 22, 2022 •

edited

Loading

cossio commented May 17, 2022 •

edited

Loading

cossio commented May 17, 2022 •

edited

Loading

cossio commented May 17, 2022 •

edited

Loading

astrojuanlu commented Aug 3, 2022 •

edited

Loading

Wh1isper commented Sep 14, 2022 •

edited

Loading

Wh1isper commented Dec 7, 2022 •

edited

Loading