Import deadlock with streams #193

Closed
rkern opened this issue Sep 23, 2016 · 3 comments · Fixed by jupyter/jupyter_client#206

@rkern (Contributor) commented Sep 23, 2016

Write this text to the file minimal.py:

import sys

# Write directly to the real stdout, bypassing IPython's capture.
sys.__stdout__.write('About to do\n')
sys.__stdout__.flush()
# Write through IPython's replacement sys.stdout (an OutputStream).
sys.stdout.write('Doing\n')
sys.stdout.flush()
sys.__stdout__.write('Done\n')
sys.__stdout__.flush()

Start up an IPython notebook under Python 2 (the issue may not exist in Python 3). Execute import minimal. The kernel will become unresponsive.

The problem is that sys.stdout.flush() calls the OutputStream.flush() method, which adds a callback to the event loop that sends a zmq message to the notebook. That callback is executed in another thread. To create the message to send, it creates a UUID using uuid.uuid4(), which has a local import of os in it. Python 2 has a global import lock, and it was acquired by the main thread, which is executing our import minimal. That import never completes because OutputStream.flush() is synchronous and is waiting for the event-loop callback to run. Deadlock.
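Here is a distilled sketch of the same deadlock pattern with the zmq machinery removed (a hypothetical module, not ipykernel code; it only deadlocks under Python 2, where the single global import lock exists):

# deadlocker.py -- importing this module under Python 2 reproduces the hang.
import threading

done = threading.Event()

def worker():
    # Runs on a second thread while the main thread is still inside
    # "import deadlocker" and therefore holds the global import lock.
    import os  # blocks acquiring the import lock, even though os is cached
    done.set()

threading.Thread(target=worker).start()
done.wait()  # the main thread never releases the import lock: deadlock

The key detail is that in Python 2 even an import of an already-loaded module acquires the global import lock before consulting sys.modules, so the worker's import os blocks despite os having been imported long ago.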

As far as I can tell, this is the only place where something is imported at runtime in the message-sending thread, so reimplementing uuid.uuid4() to avoid the local import would be a minimal fix for the issue.
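One possible replacement (a sketch only, not necessarily the fix that landed in jupyter/jupyter_client#206): build the UUID from os.urandom directly, with os imported at module level:

import os
import uuid

def random_uuid4():
    # Equivalent to uuid.uuid4(), but without the function-local
    # "import os" that takes Python 2's global import lock.
    return uuid.UUID(bytes=os.urandom(16), version=4)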

@rkern (Contributor, Author) commented Sep 23, 2016

Oh, the offending uuid.uuid4() calls are in jupyter_client/session.py, FWIW. Let me know if this issue should be moved over to that repo.

@minrk (Member) commented Sep 24, 2016

This is a fine place for the issue for now. We can consider removing the event-loop wait in flush. I'll need to investigate what that is there for (I think it might have to do with forked subprocesses exiting before sending completes).

@rkern (Contributor, Author) commented Sep 26, 2016

It's there for the semantics, I think. flush() is supposed to block until the buffer is actually, you know, flushed.
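A sketch of the blocking-flush pattern under discussion (hypothetical names, not ipykernel's actual implementation): the caller schedules the real send on the IO thread and waits for it, which gives flush() its blocking semantics but also opens the window for the deadlock above:

import threading

class EventLoopStream(object):
    def __init__(self, schedule):
        # schedule(fn) runs fn later on the IO/event-loop thread.
        self._schedule = schedule
        self._buffer = []

    def write(self, data):
        self._buffer.append(data)

    def flush(self):
        flushed = threading.Event()
        def _send():
            # ... serialize self._buffer into a zmq message and send it ...
            del self._buffer[:]
            flushed.set()
        self._schedule(_send)
        flushed.wait()  # block until the IO thread has actually sent it

If _send itself blocks (here, on the import lock held by the caller), flushed.wait() never returns.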

minrk added this to the no action milestone Nov 16, 2016
Carreau added a commit to Carreau/ipykernel that referenced this issue Feb 17, 2022
Fixes ipython#193.

This should make sure we properly cull all subprocesses at shutdown.
It changes one of the private methods from sync to async in order to
avoid using time.sleep or a thread, so this may affect subclasses,
though I doubt it.

It's also not completely clear to me whether this works on Windows, as
SIGINT, I believe, is not a thing there.

Regardless, as this affects things like dask and others that mostly run
on Unix, it should be an improvement.

It does the following, stopping as soon as it no longer finds any
children of the current process:

 - Send SIGINT to everything.
 - Immediately start sending SIGTERM in a loop with an exponential
   backoff from 0.01 to 1 second, roughly multiplying the delay until
   the next send by 3 each time.
 - Switch to sending SIGKILL with the same backoff.

There is no delay after SIGINT, as it is just a courtesy. The backoff
delays are not configurable; I can imagine that on slow systems it may
make sense to make them configurable.
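A synchronous sketch of the escalation loop described above (the commit itself does this asynchronously to avoid time.sleep; psutil is an assumption used here only to enumerate children, not necessarily what the PR uses):

import signal
import time

import psutil  # assumption: used only to list child processes

def cull_children():
    # Courtesy signal first; no delay afterwards.
    for child in psutil.Process().children(recursive=True):
        child.send_signal(signal.SIGINT)
    # Escalate to SIGTERM, then SIGKILL, each with exponential backoff.
    # Note: signal.SIGKILL does not exist on Windows.
    for sig in (signal.SIGTERM, signal.SIGKILL):
        delay = 0.01
        while delay <= 1.0:
            children = psutil.Process().children(recursive=True)
            if not children:
                return  # stop as soon as no children remain
            for child in children:
                try:
                    child.send_signal(sig)
                except psutil.NoSuchProcess:
                    pass
            time.sleep(delay)
            delay *= 3  # roughly triple each time: 0.01, 0.03, ..., 0.81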