Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nursery: don't act as a checkpoint when not running anything #1696

Merged
merged 9 commits into from
Oct 21, 2023

Conversation

catern
Copy link
Contributor

@catern catern commented Aug 26, 2020

Previously, a Nursery would checkpoint on exit, regardless of whether
it was running anything. This is a bit annoying, see #1457, so let's
change it to not checkpoint on exit. Now it won't be a checkpoint at
all if there's no tasks running inside.

However, this doesn't fully solve #1457, because we'll still inject a
cancel even if none of the child tasks raise Cancelled.

This seems to break
trio/_core/tests/test_asyncgen.py::test_last_minute_gc_edge_case
such that it no longer ever hits the edge case.

That seems like a pretty complicated test, so maybe @oremanj can say
better why exactly it's breaking with this change.

@njsmith
Copy link
Member

njsmith commented Aug 26, 2020

However, this doesn't fully solve #1457, because we'll still inject a cancel even if none of the child tasks raise Cancelled.

Here's the code that does that, at line 923:

            def aborted(raise_cancel):
                self._add_exc(capture(raise_cancel).error)

...I was going to say we could delete this line and simply ignore cancellation here, but that's not quite right. We do want to ignore Cancelled exceptions here, but if Trio is using cancellation to deliver a KeyboardInterrupt to the main task while it's blocked in __aexit__, then we still need that to be injected so that the nursery will cancel the other tasks and then raise KeyboardInterrupt. That should be fixed at some point Soon™, but in the mean time I guess we could work around this by adding a check for whether capture(raise_cancel).error is a Cancelled, and if so then ignore it.

@oremanj
Copy link
Member

oremanj commented Aug 26, 2020

I left a comment on #1457: I think we shouldn't drop the schedule point in either branch, only the cancellation point; and we shouldn't drop the cancellation point if the nursery didn't run any tasks, so that async with open_nursery(): pass still executes a full checkpoint.

The test is failing because it relies on nursery exit being a schedule point.

Previously, a Nursery could raise Cancelled even if all its children
had completed successfully.

This is undesirable, as discussed in python-trio#1457, so now a Nursery will
shield itself from cancels in __aexit__.

A Nursery with children can still be cancelled fine: if any of its
children raise Cancelled, then the whole Nursery will raise Cancelled.
If none of the Nursery's children raise Cancelled, then the Nursery
will never raise Cancelled.

As another consequence, a Nursery which never had any child tasks will
never raise Cancelled.

As yet another consequence, since an internal Nursery is used in the
implementation of Nursery.start(), if start() is cancelled, but the
task it's starting does not raise Cancelled, start() won't raise
Cancelled either.
Comment on lines 1692 to 1701
# but if the task does not execute any checkpoints, and exits, then start()
# doesn't raise Cancelled, since the task completed successfully.
async def started_with_no_checkpoint(task_status=_core.TASK_STATUS_IGNORED):
task_status.started("hi")

async with _core.open_nursery() as nursery:
with _core.CancelScope() as cs:
cs.cancel()
await nursery.start(started_with_no_checkpoint)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more implications here. This is because a Nursery is used in the implementation of start().

I'd argue that this is correct.

@catern
Copy link
Contributor Author

catern commented Sep 1, 2020

Also, what do you think about making assert_yields_or_not public? I just needed it for the test that a nursery is a schedule point but not a cancel point. Maybe that's something we should discourage?

@codecov
Copy link

codecov bot commented Dec 13, 2020

Codecov Report

Merging #1696 (06c176c) into master (3288cfc) will increase coverage by 0.10%.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1696      +/-   ##
==========================================
+ Coverage   99.03%   99.14%   +0.10%     
==========================================
  Files         115      115              
  Lines       17433    17456      +23     
  Branches     3107     3110       +3     
==========================================
+ Hits        17265    17306      +41     
+ Misses        122      104      -18     
  Partials       46       46              
Files Coverage Δ
trio/_core/_run.py 100.00% <100.00%> (ø)
trio/_core/_tests/test_run.py 100.00% <100.00%> (ø)

... and 5 files with indirect coverage changes

@njsmith
Copy link
Member

njsmith commented Dec 13, 2020

Also, what do you think about making assert_yields_or_not public? I just needed it for the test that a nursery is a schedule point but not a cancel point. Maybe that's something we should discourage?

I actually just ran into this issue again while working on #1805 on stream, so I did some tweaks to land this quicker. And one of those tweaks was to revert the assert_yields_or_not changes, because they seemed overly complex for a single weird internal test :-). So I guess that answers the question...

There's no easy way to work around it, and it's not worth it anyway
since almost no-one uses ancient 3.6.x point releases. (It was fixed
in 3.6.2.)
Copy link
Member

@goodboy goodboy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code all looks great (as usual) but some minor clarifications to an onlooker (newb) such as myself might be handy 😸

When exiting a nursery block, the parent task always waits for child
tasks to exit. This wait cannot be cancelled. However, previously, if
you tried to cancel it, it *would* inject a `Cancelled` exception,
even though it wasn't cancelled. Most users probably never noticed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wait cannot be cancelled. However, previously, if
you tried to cancel it, it would inject a Cancelled exception,
even though it wasn't cancelled.

So this means the Cancelled is still injected but now is ignored via the new isinstance check?

Also if the following is true:

This wait cannot be cancelled. However, previously, if
you tried to cancel it, it would inject a Cancelled exception,
even though it wasn't cancelled.

Then why the change to cancel_shielded_checkpoint()?

summary = {}
for exc in multi_exc.exceptions:
summary.setdefault(type(exc), 0)
summary[type(exc)] += 1
assert summary == {_core.Cancelled: 4, KeyError: 1}
assert summary == {_core.Cancelled: 3, KeyError: 1}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarifying the cancelled count in a comment might be clearer here for those unfamiliar with the recent change?

Something like we only receive cancelleds from subtasks not from the nursery itself?

with _core.CancelScope() as cs:
cs.cancel()
async with _core.open_nursery() as nursery:
nursery.start_soon(noop_with_no_checkpoint)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume it's out of scope (and already covered in other tests) but normally doing this cancellation before opening the nursery results in cancellation at the first checkpoint right?


async with _core.open_nursery() as nursery:
with _core.CancelScope() as cs:
cs.cancel()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case where you cancel from inside the nursery after having started tasks with checkpoints, won't counts of expected cancels change as well or is that entirely covered by the one test change in test_run.py?

@catern
Copy link
Contributor Author

catern commented Dec 17, 2020

The changes look fine to me. Excited for this to be fixed!

@richardsheridan
Copy link
Contributor

The merge was not as painful as one might expect. The two new commits are nearly trivial fixes responsive to recent updates to the test suite.

@richardsheridan richardsheridan merged commit b073287 into python-trio:master Oct 21, 2023
28 of 30 checks passed
@trio-bot
Copy link

trio-bot bot commented Oct 21, 2023

Hey @catern, it looks like that was the first time we merged one of your PRs! Thanks so much! 🎉 🎂

If you want to keep contributing, we'd love to have you. So, I just sent you an invitation to join the python-trio organization on Github! If you accept, then here's what will happen:

  • Github will automatically subscribe you to notifications on all our repositories. (But you can unsubscribe again if you don't want the spam.)

  • You'll be able to help us manage issues (add labels, close them, etc.)

  • You'll be able to review and merge other people's pull requests

  • You'll get a [member] badge next to your name when participating in the Trio repos, and you'll have the option of adding your name to our member's page and putting our icon on your Github profile (details)

If you want to read more, here's the relevant section in our contributing guide.

Alternatively, you're free to decline or ignore the invitation. You'll still be able to contribute as much or as little as you like, and I won't hassle you about joining again. But if you ever change your mind, just let us know and we'll send another invitation. We'd love to have you, but more importantly we want you to do whatever's best for you.

If you have any questions, well... I am just a humble Python script, so I probably can't help. But please do post a comment here, or in our chat, or on our forum, whatever's easiest, and someone will help you out!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants