IPKernelApp configs capture_fd_output and quiet conflict and break ipykernel #859

Closed
MrBago opened this issue Feb 7, 2022 · 15 comments · Fixed by #1111

@MrBago

MrBago commented Feb 7, 2022

When c.IPKernelApp.capture_fd_output is True (default True) and c.IPKernelApp.quiet is False (default True), printing anything causes infinite output in the cell. quiet = False causes outputs to be echoed to the captured file descriptor, which then gets picked up by the capture_fd_output logic and re-printed. This loop continues until the process is killed.
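
For reference, a minimal kernel config that reproduces the combination described above might look like this (a sketch assuming the standard `ipython_kernel_config.py` mechanism; only the two options named in this report are set):

```python
# ipython_kernel_config.py -- minimal reproduction sketch (assumed filename)
c = get_config()  # noqa -- `c` is provided by traitlets when this file is loaded

# Default value: raw file-descriptor output is captured and forwarded.
c.IPKernelApp.capture_fd_output = True

# Non-default: disabling quiet turns on echoing of captured output, which
# (per this report) feeds the captured text back into the capture loop.
c.IPKernelApp.quiet = False
```

With both options set this way, a single `print("hello")` in a cell reportedly repeats until the kernel is killed.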

@jasongrout-db

jasongrout-db commented Feb 7, 2022

Adding context to link PRs together: the infinite output problem when IPKernelApp.quiet is False seems to have started in the initial commit of #630, and the capture_fd_output option was introduced in #752.

@garlandz-db
Contributor

garlandz-db commented Mar 25, 2023

From my findings, when capture_fd_output=True and quiet=False, we echo the text back to sys.__stdout__, which was overwritten (in addition to sys.stdout) because we change the underlying file descriptor. Therefore we are echoing text by writing it back to the pipe instead of the terminal. The watcher thread then reads that same text from the pipe again, writes it to the terminal, and echoes it once more: read, write to terminal, echo, and so on.

Not that this is a solution, but setting echo=None prevents the infinite loop, although that might be effectively the same as just setting quiet=True.
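
A self-contained toy sketch of the loop described above (illustrative only, not ipykernel's code): once fd 1 is redirected into a pipe, echoing captured text via `sys.__stdout__` writes it straight back into that pipe, so a watcher reading the pipe sees the same text again.

```python
# Toy illustration of the echo loop; the loop is bounded here, whereas in
# the reported bug it runs until the kernel process is killed.
import os
import sys

read_fd, write_fd = os.pipe()
saved_fd = os.dup(1)       # keep a copy of the real terminal fd
os.dup2(write_fd, 1)       # fd 1 (and therefore sys.__stdout__) now feeds the pipe

os.write(1, b"hello\n")    # a single write from "user code"

for _ in range(3):         # bounded stand-in for the infinite loop
    captured = os.read(read_fd, 1024)
    # The bug: echoing via the replaced descriptor writes back into the pipe,
    # so the next read sees the very same text again.
    os.write(sys.__stdout__.fileno(), captured)

os.dup2(saved_fd, 1)       # restore the terminal
print("the same text was re-captured on every iteration")
```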

@garlandz-db
Contributor

garlandz-db commented Apr 3, 2023

One proposed solution: in the case that quiet=False, capture=True, the watcher threads are only able to send zmq messages. That is, directly calling write() like sys.stdout.write() would still echo the data onto the stdout pipe, which would then be picked up by the watcher thread and written to zmq, but this time with echo disabled.
[image]
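
A rough sketch of that idea with made-up names (`CapturingStream`, `publish`, `from_watcher` are all illustrative, not ipykernel's API): writes coming from the watcher thread are published only over ZMQ and are never echoed back onto the captured pipe.

```python
# Illustrative sketch only -- not ipykernel's OutStream implementation.

class CapturingStream:
    """Stand-in for a kernel output stream with an optional echo target."""

    def __init__(self, publish, echo=None):
        self.publish = publish   # e.g. a callable that sends a ZMQ "stream" message
        self.echo = echo         # e.g. the captured stdout, which ends up in the pipe

    def write(self, text, *, from_watcher=False):
        self.publish(text)
        # Key point of the proposal: never echo data that came from the
        # watcher thread, so forwarded pipe contents cannot re-enter the pipe.
        if self.echo is not None and not from_watcher:
            self.echo.write(text)


def forward_from_watcher(chunk, stream):
    # Called by the fd-watcher thread for each chunk it reads off the pipe.
    stream.write(chunk, from_watcher=True)
```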

appreciate any feedback!

@garlandz-db
Contributor

Another alternative solution:

[image]

@Zsailer
Member

Zsailer commented Apr 6, 2023

Hi @Carreau or @minrk, do you mind taking a look at these proposed solutions?

@Carreau
Member

Carreau commented Apr 12, 2023

I haven't touched this file in a long time, and it's complicated enough that I don't think I can properly judge this.

In general I believe the watcher thread is complex, and a real fix that would address this and other things would be to actually implement the kernel nanny.

The kernel nanny would have a proper way to capture output, and would allow processing signals and metrics without having to worry about the kernel being stuck in a compiled library.

@garlandz-db
Contributor

How much effort would it take to implement the kernel nanny? This also seems non-trivial and would probably require a major release to ship.

@Carreau
Member

Carreau commented Apr 17, 2023

Another experiment: https://github.com/carreau/inplace_restarter, which intercepts commands and, when it sees "%restart", just restarts the subprocess.

It's not hard to implement; it's just an occasion to update the Jupyter protocol and document it.

Technically we can implement it in a completely transparent manner, but I believe it would be nice to agree on a way to tell the kernel "you are under nanny supervision, you do not need to do X, Y, Z". It would also be a good way to implement the handshake protocol, as the nanny could do the handshake transparently for older kernels.

@garlandz-db
Contributor

@Carreau thanks for your insight! I'm not too sure how long the process of shipping new code in ipykernel takes. If you had to give an estimate, how long would the kernel nanny take to ship in ipykernel? It also seems that settling on the handshake pattern is then a blocker to implementing the nanny (using the handshake protocol), since that JEP is still in discussion.

@Carreau
Member

Carreau commented Apr 21, 2023

The longest part will be implementing it. Making a new release, assuming a pure nanny that does not change the protocol, should not take long. We can even make it opt-in or keep it in a beta state. The hard part is finding time to write the code.

@minrk
Member

minrk commented Apr 21, 2023

While I still think the nanny is a good idea (I ended up implementing a version of it in IPython Parallel, though notably it does not include output capture), I think this specific self-looping issue can be solved in a much smaller way, since we already have a solution for that for the logger. We can do the same thing for the echo and solve this specific bug (i.e., the echo FD should be _original_stdstream_copy, not __stdout__).
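
A sketch of what that change amounts to (helper name and structure are made up; only `_original_stdstream_copy` comes from the comment above): the echo target should wrap the saved duplicate of the original stdout fd, not `sys.__stdout__`, whose fd 1 now points at the capture pipe.

```python
# Hedged sketch of the fix idea, not the actual #1111 diff.
import os

def make_echo_stream(quiet, original_stdstream_copy):
    """Return the stream that captured fd output should be echoed to."""
    if quiet:
        return None
    # Buggy behaviour: echoing to sys.__stdout__ re-enters the capture pipe,
    # because its underlying descriptor was redirected.
    # Fixed behaviour: echo through the duplicated pre-capture descriptor
    # (what the comment above calls _original_stdstream_copy).
    return os.fdopen(os.dup(original_stdstream_copy), "w")
```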

@garlandz-db
Contributor

garlandz-db commented Apr 21, 2023

@minrk just to double-check my understanding, is this mental model correct, that we are going from
[image]

to this (aka move the arrow to the terminal output):

[image]

If so, then that's a really great idea and simple to implement. I was trying to come up with solutions to break the cycle in different ways 😅

@minrk
Member

minrk commented Apr 24, 2023

Yes, I think that's right. I believe #1111 fixes this, but I haven't quite got the tests to not mess with pytest's output capture.

@garlandz-db
Contributor

Can confirm this fix works as expected. I copied your line change and the loop issue is gone. Thanks for doing this @minrk!! Let me know if there is something I can help with to drive this to completion.
