Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix launch on windows #34

Merged
merged 5 commits into from
Nov 26, 2019
Merged

Fix launch on windows #34

merged 5 commits into from
Nov 26, 2019

Conversation

kevin-bates
Copy link
Collaborator

Due to event loop implementation differences (see #31), we need to "pin"
the SelectorEventLoop for Windows platforms and use synchronous popen/wait
constructs. Although non-windows platforms will use async constructs,
that decision is maintained in a variable so we can easily change behavior.

@takluyver - I've also added appveyor.yml. It would be great if you could configure appveyor for the repo.

Fixes #31

Due to event loop implementation differences (see #31), we need to "pin"
the SelectorEventLoop for Windows platforms and use synchronous popen/wait
constructs.  Although non-windows platforms will use async constructs,
that decision is maintained in a variable so we can easily change behavior.

Fixes #31
@codecov-io
Copy link

codecov-io commented Nov 16, 2019

Codecov Report

Merging #34 into master will increase coverage by 0.07%.
The diff coverage is 57.14%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #34      +/-   ##
==========================================
+ Coverage   69.02%   69.09%   +0.07%     
==========================================
  Files          28       28              
  Lines        1995     2074      +79     
==========================================
+ Hits         1377     1433      +56     
- Misses        618      641      +23
Impacted Files Coverage Δ
jupyter_kernel_mgmt/tests/test_kernelapp.py 97.05% <100%> (ø) ⬆️
jupyter_kernel_mgmt/tests/test_localinterfaces.py 100% <100%> (ø) ⬆️
jupyter_kernel_mgmt/managerabc.py 77.27% <100%> (+6.68%) ⬆️
jupyter_kernel_mgmt/tests/test_async_manager.py 100% <100%> (ø) ⬆️
jupyter_kernel_mgmt/tests/test_kernelspec.py 100% <100%> (+0.81%) ⬆️
jupyter_kernel_mgmt/tests/test_restarter.py 100% <100%> (ø) ⬆️
jupyter_kernel_mgmt/tests/conftest.py 100% <100%> (ø) ⬆️
jupyter_kernel_mgmt/tests/test_discovery.py 98.05% <100%> (+0.03%) ⬆️
jupyter_kernel_mgmt/util.py 63.15% <30%> (-36.85%) ⬇️
jupyter_kernel_mgmt/subproc/manager.py 68.49% <43.47%> (-11.14%) ⬇️
... and 13 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 90002cd...174f10e. Read the comment docs.

@kevin-bates
Copy link
Collaborator Author

Hi @takluyver - could you please review this when you get a chance? I've added appveyor support to the source branch of the PR and the tests are passing on Windows: https://ci.appveyor.com/project/kevin-bates/jupyter-kernel-mgmt

What I can't figure out is why the code coverage stuff is marked as failing? It seems like the coverage may have decreased, at least that's the only thing I see, but I'm surprised that would cause a failure. If that's the case, is there any way to reset the baseline?

I'd really like to get 0.5 out so that the jupyter_server work can more easily move forward. However, my PR in that repo is failing on Windows but I also have to use a temporary modification to pull my dev branch in order to test. (sigh)

I suppose we should configure appveyor for this repo, although that brings up the discussion of moving jupyter_kernel_mgmt (and jupyter_protocol) into the jupyter (or equivalent) org.

try:
self.kernel.wait(sync_wait_timeout)
except subprocess.TimeoutExpired:
self.log.warning("Timeout expired waiting for process '{}' to terminate. Continuing...".
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems very ugly - it's impossible to wait for longer than 2 seconds, but the event loop could be blocked for that long. I think the simplest improvement is to implement a polling loop with something like a 0.1 second async sleep in here - then it works basically as expected, just with a tiny extra delay after the process has really terminated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - I'm disappointed I left it like that, but I did - sorry.

It sounds like you're talking about implementing something close to the previous finish_shutdown() method. I'll go ahead with this, leaving things essentially as they are in jupyter_client (i.e., 5 second max).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar implementation, but we don't need any timeout here (because you can impose it from outside for async code). Just something like:

while self.kernel.poll() is None:
    await asyncio.sleep(0.1)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(And no need to apologise - I think I had also thought this was harder to work around before.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the responses. I guess I’m not following how the timeout would be imposed. Could you provide an example?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

await asyncio.wait_for(manager.wait(), timeout=3.0)

The asyncio cancellation mechanism should be able to interrupt asyncio.sleep() perfectly fine, but it can't interrupt the blocking Popen.wait() call.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but manager.wait() is likely only called by manager.kill() so are you suggesting that this be used from kill() and is 3 seconds sufficient?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kill could use something like that - I've no particular preference on what the timeout should be. It's quite unusual for a process to hang so it can't be killed, but it is possible.

But other things could also call wait - e.g. the restarter machinery, or code outside this repo.

if type(asyncio.get_event_loop_policy()) is WindowsProactorEventLoopPolicy:
# WindowsProactorEventLoopPolicy is not compatible with tornado 6
# fallback to the pre-3.8 default of Selector
asyncio.set_event_loop_policy(WindowsSelectorEventLoopPolicy())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not wild on setting this on import. Generally, I think it should be up to application code, not library code, to set global config like this.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Application code or test code, of course)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was going to talk to you about this since it's potentially too far down stream in most cases. I'll move it into the util directory as a method - where tests and more focused (i.e., thiner than server) applications can leverage its existence.

appveyor.yml Outdated
CONDA_PY_SPEC: ">=3.8,<3.9.0a0"
CONDA_INSTALL_LOCN: "C:\\Miniconda36-x64"
- CONDA_PY: 37
CONDA_PY_SPEC: ">=3.7,<3.8.0a0"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can just ask conda for python=3.7, and it knows that means the highest available 3.7 version. This is how I'm used to using it, but maybe that's not recommended, or maybe I'm missing some corner case where specifying it this way makes a difference.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're probably right, I was following some recent changes in the notebook repo. That said, I can't get the variables to work at all. Very frustrating to "debug" Windows solely by way of appveyor results! 😞
I'll have to play with the settings some more.

@takluyver
Copy link
Owner

I'm not too concerned about the coverage - it shows one failure if the coverage for the new code is lower than the average, and one if the change causes overall coverage to drop. Ignoring it and merging the pull request anyway resets the baseline for future pull requests.

(It would perhaps be nice if this could be represented as less serious than a failing test, but that's how it is)

@kevin-bates
Copy link
Collaborator Author

@takluyver - I believe I've addressed the review comments. In addition, I've been able to get all Windows builds passing. The primary issue was that on python 3.8, despite us using the correct event loop, launched ipykernels were not, which prevented their termination, holding onto resources, etc. Once I installed Min's ipykernel patch, all the tests started passing. I think I'd prefer to hold off on merging this until Min's patch is merged, per this comment, but I don't think we need to wait for the next ipykernel release - since this is only comes into play on Windows 3.8 systems.

@kevin-bates
Copy link
Collaborator Author

@takluyver - all Windows tests are passing as well: https://ci.appveyor.com/project/kevin-bates/jupyter-kernel-mgmt

@takluyver takluyver merged commit 0eaf82c into takluyver:master Nov 26, 2019
@takluyver
Copy link
Owner

Thanks Kevin!

@kevin-bates kevin-bates deleted the fix-windows branch November 26, 2019 15:54
@kevin-bates
Copy link
Collaborator Author

Thank you for the review and merge @takluyver. I think this is a good time to cut 0.5.0 as it provides a good place to let the jupyter_server pin to this version so that other efforts can take place. Are you okay with that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Async kernel launch fails on Windows
3 participants