Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reimplement entrypoint in Python #1014

Merged
merged 4 commits into from
Mar 23, 2021
Merged

Conversation

minrk
Copy link
Member

@minrk minrk commented Feb 17, 2021

Per @manics idea, this reimplements the entrypoint in Python, mainly so we can better apply our expertise and understanding to it.

  • repo2docker-entrypoint is now in Python
  • shebang points to new python3-login to preserve login-shell behavior (source profiles, etc.) because we can't nest interpreters in a single shebang line. First attempt extracted login env from a subprocess, but I think this is a little more natural, maybe, as it matches the usual process hierarchy (bash -l > Python > 'real' entrypoint.
  • make sure PYTHONUNBUFFERED env is set, which is ~always needed in containers
  • tee is implemented in Python with nonblocking io via fcntl
  • all available signals except a select few that cannot be caught (SIGKILL, SIGSTOP) and shouldn't (SIGCHLD) are relayed to the child

It seems to work! Honestly, I cannot say if this fixes the issue because I know how to do unbuffered output in Python and couldn't figure it out with other tools, or if it's because Python is just enough slower that the previously lost output always loses the race 🤷 .

Original description: use stdbuf to make sure everything is unbuffered

  • apply stdbuf to everything in repo2docker-entrypoint
  • join stdout, stderr into log so we don't miss anything in the log file (from Investigating the missing logs #1008)
  • use capfd to capture output in test_env (not relevant, should be a simplification)

A bit of a shot in the dark at #1008

@minrk
Copy link
Member Author

minrk commented Feb 17, 2021

Didn't work :/

@manics
Copy link
Member

manics commented Feb 17, 2021

Do you think it's worth converting the entrypoint to Python and using subprocess with it's myriad options, since we might have a better understanding of how it works?

@minrk
Copy link
Member Author

minrk commented Feb 17, 2021

Do you think it's worth converting the entrypoint to Python and using subprocess with it's myriad options, since we might have a better understanding of how it works?

Maybe. I'll give it a go and see if that changes anything.

- make sure to set PYTHONUNBUFFERED
- implement tee with subprocess readlines
- forward all relevant signals from entrypoint to 'true' command
- add monitor process to ensure child shuts down if parent does (maybe not useful in docker?)
@minrk minrk changed the title unbuffer entrypoint pipes with stdbuf reimplement entrypoint in Python Feb 19, 2021
"/bin/bash",
"-c",
# Docker exports all passed env variables, so we can
# just look at exported variables.
"export; sleep 1",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this sleep is what verifies that this is (probably) actually fixing the problem.

- remove redundant monitor sibling process
- use python3-login executable instead of login shell subprocess
  (same effect, but in more natural order)
- use non-blocking binary IO in tee instead of readline
  (switch to binary mode, as text wrappers don't support non-blocking mode
   see https://bugs.python.org/issue13322)
@minrk
Copy link
Member Author

minrk commented Mar 9, 2021

This does seem to work. I'm not sure how folks feel about doing things this way - it is basically a Python reimplementation of a subset of tini with the tee functionality we want in repo2docker.

I do feel a lot more comfortable with parsing out the logic of the pipes, processes, and flushing in Python than named fds in a shell, though the signals and reaping I am a bit less confident about.

@minrk minrk requested a review from betatim March 9, 2021 09:48
@manics
Copy link
Member

manics commented Mar 9, 2021

I haven't had time to test this yet, but I'm in favour of switching to Python.

@betatim betatim merged commit 8731ecf into jupyterhub:master Mar 23, 2021
@betatim
Copy link
Member

betatim commented Mar 23, 2021

Let's see what this does :D

It feels like some comment from Inception is appropriate as we add another layer of Python :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants