Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add eager task creation API to asyncio #97696

Closed
jbower-fb opened this issue Sep 30, 2022 · 38 comments · Fixed by #102853
Closed

Add eager task creation API to asyncio #97696

jbower-fb opened this issue Sep 30, 2022 · 38 comments · Fixed by #102853
Labels
3.12 bugs and security fixes performance Performance or resource usage release-blocker topic-asyncio type-feature A feature request or enhancement

Comments

@jbower-fb
Copy link
Contributor

jbower-fb commented Sep 30, 2022

Feature or enhancement

We propose adding “eager” coroutine execution support to asyncio.TaskGroup via a new method enqueue() [1].

TaskGroup.enqueue() would have the same signature as TaskGroup.create_task() but eagerly perform the first step of the passed coroutine’s execution immediately. If the coroutine completes without yielding, the result of enqueue() would be an object which behaves like a completed asyncio.Task. Otherwise, enqueue() behaves the same as TaskGroup.create_task(), returning a pending asyncio.Task.

The reason for a new method, rather than changing the implementation of TaskGroup.create_task() is this new method introduces a small semantic difference. For example in:

async def coro(): ...

async with TaskGroup() as tg:
  tg.enqueue(coro())
  raise Exception

The exception will cancel everthing scheduled in tg, but if some or all of coro() completes eagerly any side-effects of this will be observable in further execution. If tg.create_task() is used instead no part of coro() will be executed.

Pitch

At Instagram we’ve observed ~70% of coroutine instances passed to asyncio.gather() can run fully synchronously i.e. without performing any I/O which would suspend execution. This typically happens when there is a local cache which can elide actual I/O. We exploit this in Cinder with a modified asyncio.gather() that eagerly executes coroutine args and skips scheduling a asyncio.Task object to an event loop if no yield occurs. Overall this optimization saved ~4% CPU on our Django webservers.

In a prototype implementation of this proposed feature [2] the overhead when scheduling TaskGroups with all fully-synchronous coroutines was decreased by ~8x. When scheduling a mixture of synchronous and asynchronous coroutines, performance is improved by ~1.4x, and when no coroutines can complete synchronously there is still a small improvement.

We anticipate code relying on any semantics which change between TaskGroup.create_task() and TaskGroup.enqueue() will be rare. So, as the TaskGroup interface is new in 3.11, we hope enqueue() and its performance benefits can be promoted as the preferred method for scheduling coroutines in 3.12+.

Previous discussion

This new API was discussed informally at PyCon 2022, with at least some of this being between @gvanrossum, @DinoV, @markshannon, and /or @jbower-fb.

[1] The name "enqueue" came out of a discussion between @gvanrossum and @DinoV.

[2] Prototype implementation (some features missing, e.g. specifying Context), and benchmark.

Linked PRs

@jbower-fb jbower-fb added the type-feature A feature request or enhancement label Sep 30, 2022
@kumaraditya303 kumaraditya303 added 3.12 bugs and security fixes performance Performance or resource usage labels Oct 1, 2022
@gvanrossum
Copy link
Member

If cond is True and some or all of coro() completes eagerly, any side-effects of this will be observable in further execution. If tg.create_task() is used instead, no part of coro() will be executed if cond is True.

If I didn't already understand the changes to async functions you are proposing I wouldn't understand what this means (or the code fragment above it). What does cond refer to? I think what you are trying to say here is that, if the body of the with TaskGroup() block raises before the first await, it will cancel all created tasks before they have started executing. Right?

Separately -- Can you remind me of the reason to make this a TaskGroup method instead of a new flag on create_task()? And the enqueue() name feels odd, especially since it may immediately execute some of the coroutine, which seems the opposite of putting something in a queue.

To be clear, I have no objection to the feature, and I think making this an opt-in part of task creation avoids worries of changing semantics for something as simple as await coro().

@itamaro
Copy link
Contributor

itamaro commented Oct 1, 2022

Separately -- Can you remind me of the reason to make this a TaskGroup method instead of a new flag on create_task()? And the enqueue() name feels odd, especially since it may immediately execute some of the coroutine, which seems the opposite of putting something in a queue.

This came up in the conversation we had with @DinoV during the Language Summit. We started out with a flag on create_task (e.g. eager=True), but we didn't like the fact that when setting this flag, the name of the method may now be misleading (because, in fact, it may not create a task).
enqueue was the alternative (I don't remember who suggested it and why), but we can definitely bikeshed the naming :)
If we take inspiration from Trio's nurseries, we can use start_soon.
Other options are execute or run (although these names may be misleading too, since users may assume these APIs never create a task).

@jbower-fb
Copy link
Contributor Author

If I didn't already understand the changes to async functions you are proposing I wouldn't understand what this means (or the code fragment above it). What does cond refer to? I think what you are trying to say here is that, if the body of the with TaskGroup() block raises before the first await, it will cancel all created tasks before they have started executing. Right?

Right. Yeah, this is more contrived than needed, I'll fix up the text.

This came up in the conversation we had with @DinoV during the Language Summit. We started out with a flag on create_task (e.g. eager=True), but we didn't like the fact that when setting this flag, the name of the method may now be misleading (because, in fact, it may not create a task).

I wasn't in this conversation but when writing this issue it occurred to me that even with eager execution we could still return something Task-like. This would mitigate any issues with knowing the interface of the returned object (e.g. is cancel() available). This would also mean the name create_task() would still make sense even with a keyword-argument to alter the behavior. Unless, I'm missing something which doesn't make this viable?

Having said this, I think eager execution should be the default behavior users reach for going forward. So it might still be better to have a new method.

@gvanrossum
Copy link
Member

Thanks for fixing up the description, it's clear now.

I wasn't in this conversation but when writing this issue it occurred to me that even with eager execution we could still return something Task-like. This would mitigate any issues with knowing the interface of the returned object (e.g. is cancel() available). This would also mean the name create_task() would still make sense even with a keyword-argument to alter the behavior. Unless, I'm missing something which doesn't make this viable?

It could return a Future, which has many of the same methods. If the coroutine returns an immediate result that could be the Future's result.

However, if the result is generally not used though the need to return something with appropriate methods would just reduce the performance benefit.

Having said this, I think eager execution should be the default behavior users reach for going forward. So it might still be better to have a new method.

Yeah, that is definitely an advantage of a new method name. I find enqueue() rather easy to mistype though.

Have the performance results been repeated with TaskGroup.enqueue(), or are the quoted measurements based on the Cinder implementation using gather()? There could be surprises here.

@jbower-fb
Copy link
Contributor Author

Have the performance results been repeated with TaskGroup.enqueue(), or are the quoted measurements based on the Cinder implementation using gather()? There could be surprises here.

The benchmark showing an 8x improvement (etc.) is with Dino's prototype of TaskGroup.enqueue() against some revision of 3.11. This is functional, but still needs some work. Notably it does not yet implement management of the current Context when eagerly executing. There will be some extra overhead from that but my hope is it'll be negligible.

It could return a Future, which has many of the same methods. If the coroutine returns an immediate result that could be the Future's result.

However, if the result is generally not used though the need to return something with appropriate methods would just reduce the performance benefit.

Dino's prototype currently returns a newly constructed Future if the eager execution fully completes, but returns a Task otherwise. While I can imagine it'll be less common to use the extra fields on Task, it might be annoying to have to think about whether a Future or a Task is returned. Again, unless I missed something, I don't think there should be extra overhead from returning some kind of eagerly-completed-Task value vs. a completed Future.

I find enqueue() rather easy to mistype though.

Indeed, I have misspelled it a number of times already. I like start_soon or start_immediate which to me imply execution start time may not be bound by the async with scope.

@gvanrossum
Copy link
Member

For a new API, returning either a Future or a Task is totally fine -- Task is a subclass of Future, so we can just document it as returning a Future.

I hadn't realized the prototype doesn't even need C changes -- I had assumed it depended on the changes to vectorcall to pass through a bit indicating it's being awaited immediately. Is that change even needed once we have this? (Probably we resolved that already during discussions at PyCon, it's just so long ago I can't remember anything.)

I think we need to bikeshed some more on the name, but you're right about the impression the name ought to give. Maybe start_task()?

@jbower-fb
Copy link
Contributor Author

We can just document it as returning a Future.

I think it'd be better to document it as returning a Task? That way it's clear cancellation is still generally an option.

I had assumed it depended on the changes to vectorcall to pass through a bit indicating it's being awaited immediately. Is that change even needed once we have this? (Probably we resolved that already during discussions at PyCon, it's just so long ago I can't remember anything.)

That is an interesting question, and again I don't think I was around if that was discussed in detail. For this feature, that flag is not needed. If we want to add an optimization for the large body of existing code that uses asyncio.gather(), then it would help significantly. We are probably going to carry this feature forward in Cinder for a while at least as it'll likely take quite some time (years?) before we can fully migrate to the new TaskGroup API. I'd definitely like to know what your, or other members of the community feel about adding an optimization to help with the older/existing API.

Maybe start_task()?

Fine with me. Maybe we should discuss further once a PR is up.

@gvanrossum
Copy link
Member

I think it'd be better to document it as returning a Task? That way it's clear cancellation is still generally an option.

From a static type POV it's always a Future, not always a Task. If you want to cancel it you take your chances -- cancel() will return a bool telling you whether it worked. The docs should just explain the situation without lying.

I will wait for the PR (that you're hopefully working on?). I recommend using start_task() as the method name until we come up with something better.

@jbower-fb
Copy link
Contributor Author

jbower-fb commented Oct 6, 2022

From a static type POV it's always a Future, not always a Task.

Whoops, I just realized Future has a cancel() method. I mistakenly thought that was a feature of Task. Sorry for the confusion.

I will wait for the PR (that you're hopefully working on?)

We should have something up soon. Really wanted to get an issue open first for early feedback on the plan.

@gvanrossum
Copy link
Member

We had a fruitful discussion on this topic at the core dev sprint. @1st1 and @carljm were present (and others).

Yury recommends not adding a new API like proposed here, but instead solving this globally (per event loop). We can have an optional task factory that people can install using loop.set_task_factory(), and which does effectively what is being proposed here for all create_task() calls (not just for the TaskGroup method).

The new create_task() would do what you propose here (call coro.send(None) and analyze the outcome), and either create a Future and set its result (or exception), or create a Task from a coroutine and a yield_result. The latter operation could be done by an extension to the create_task() API and the Task constructor API -- if an optional keyword parameter yield_result=result is passed, the task constructor doesn't end by calling loop.call_soon(self.__step, context=self._context), but instead treats result as is done by __step, interpreting it as a future to be blocked on.

This requires an extension of the create_task() API, but that should be okay -- if you are using a loop whose create_task() doesn't yet support the yield_result keyword parameter, you just can't benefit from the new functionality yet. (I'm guessing the loop should advertise whether it supports this somehow.)

@gvanrossum
Copy link
Member

I'm working on a prototype implementation in #98137.

@gvanrossum
Copy link
Member

I'm hoping that one of the Instagram folks can work on this contribution? I don't have the bandwidth to finish up that PR -- it was just meant as a quick proof of concept of what we discussed at the sprint.

@jbower-fb
Copy link
Contributor Author

Yep, we'll pick that up! Just need to find a moment to look in more detail.

@gvanrossum gvanrossum changed the title Add eager evaluation API to TaskGroups Add eager task creation API to asyncio Oct 19, 2022
@gvanrossum
Copy link
Member

@jbower-fb Have you ever found the time to look into this more?

@jbower-fb
Copy link
Contributor Author

@gvanrossum I have not but I think @itamaro is going to pick this up. Sorry for the confusion, it was a bit unclear last year who was going to have time to do what.

@gvanrossum
Copy link
Member

Cool. Do I need to keep this PR open?

@itamaro
Copy link
Contributor

itamaro commented Feb 1, 2023

Cool. Do I need to keep this PR open?

already pulled that PR locally, so you can close it!

probably going to take me a bit to learn enough asyncio internals to get this working and tested

@gvanrossum
Copy link
Member

Let me know (e.g. by posting questions here) if you need any help!

@itamaro
Copy link
Contributor

itamaro commented Feb 6, 2023

GH-101613 has a working prototype. still much left to do. TODO items and additional discussion items on the draft PR (let me know if you prefer to keep the discussion on the issue).

@graingert
Copy link
Contributor

graingert commented Mar 22, 2023

isn't this going to cause issues with structured concurrency objects (those that use asyncio.current_task(), eg timeout and TaskGroup)?

eg in

async def some_task():
    async with asyncio.timeout(1):
        await asyncio.Event().wait() # never gets cancelled

here asyncio.timeout grabs the asyncio.current_task(), which would actually be some previous task

We anticipate code relying on any semantics which change between TaskGroup.create_task() and TaskGroup.enqueue() will be rare

opening a timeout or TaskGroup in the first step of an async function doesn't seem like a rare occurrence

@itamaro
Copy link
Contributor

itamaro commented Mar 24, 2023

isn't this going to cause issues with structured concurrency objects (those that use asyncio.current_task(), eg timeout and TaskGroup)?

yes, this is a real problem.

one possible solution is to modify asyncio.current_task to materialize a task on-demand for the current coroutine if it detects it's being called in the context of an eagerly executing coroutine (would require the eager task factory to set something on the loop to communicate this to current_task).

would that resolve the issue and be an acceptable solution?

itamaro added a commit to itamaro/cpython that referenced this issue May 7, 2023
- Document `eager_start` argument to Task constructor
- Document usage of `create_eager_task_factory`
- Fix docs nit from https://github.com/python/cpython/pull/104140/files#r1186714354
kumaraditya303 pushed a commit that referenced this issue May 7, 2023
itamaro added a commit to itamaro/cpython that referenced this issue May 8, 2023
jbower-fb added a commit to jbower-fb/cpython-jbowerfb that referenced this issue May 8, 2023
Remove "#include cpython/context.h"` from `_asynciomodule.c`.

It's already included in `Python.h`.
jbower-fb pushed a commit to jbower-fb/cpython-jbowerfb that referenced this issue May 8, 2023
jbower-fb added a commit to jbower-fb/cpython-jbowerfb that referenced this issue May 8, 2023
gvanrossum pushed a commit that referenced this issue May 9, 2023
Instead, add docstring to create_eager_task_factory.
carljm added a commit to carljm/cpython that referenced this issue May 9, 2023
* main: (47 commits)
  pythongh-97696 Remove unnecessary check for eager_start kwarg (python#104188)
  pythonGH-104308: socket.getnameinfo should release the GIL (python#104307)
  pythongh-104310: Add importlib.util.allowing_all_extensions() (pythongh-104311)
  pythongh-99113: A Per-Interpreter GIL! (pythongh-104210)
  pythonGH-104284: Fix documentation gettext build (python#104296)
  pythongh-89550: Buffer GzipFile.write to reduce execution time by ~15% (python#101251)
  pythongh-104223: Fix issues with inheriting from buffer classes (python#104227)
  pythongh-99108: fix typo in Modules/Setup (python#104293)
  pythonGH-104145: Use fully-qualified cross reference types for the bisect module (python#104172)
  pythongh-103193: Improve `getattr_static` test coverage (python#104286)
  Trim trailing whitespace and test on CI (python#104275)
  pythongh-102500: Remove mention of bytes shorthand (python#104281)
  pythongh-97696: Improve and fix documentation for asyncio eager tasks (python#104256)
  pythongh-99108: Replace SHA3 implementation HACL* version (python#103597)
  pythongh-104273: Remove redundant len() calls in argparse function (python#104274)
  pythongh-64660: Don't hardcode Argument Clinic return converter result variable name (python#104200)
  pythongh-104265 Disallow instantiation of `_csv.Reader` and `_csv.Writer` (python#104266)
  pythonGH-102613: Improve performance of `pathlib.Path.rglob()` (pythonGH-104244)
  pythongh-103650: Fix perf maps address format (python#103651)
  pythonGH-89812: Churn `pathlib.Path` methods (pythonGH-104243)
  ...
carljm added a commit to carljm/cpython that referenced this issue May 9, 2023
* main:
  pythongh-104276: Make `_struct.unpack_iterator` type use type flag instead of custom constructor (python#104277)
  pythongh-97696: Move around and update the whatsnew entry for asyncio eager task factory (python#104298)
  pythongh-103193: Fix refleaks in `test_inspect` and `test_typing` (python#104320)
  require-pr-label.yml: Add missing "permissions:" (python#104309)
  pythongh-90656: Add platform triplets for 64-bit LoongArch (LA64) (python#30939)
  pythongh-104180: Read SOCKS proxies from macOS System Configuration (python#104181)
@gvanrossum
Copy link
Member

What’s left? Is there a what’s new entry?

@itamaro
Copy link
Contributor

itamaro commented May 9, 2023

What’s left? Is there a what’s new entry?

I think this is done now!
Yes, we added a what's new entry.

@carljm carljm closed this as completed May 9, 2023
carljm added a commit to carljm/cpython that referenced this issue May 9, 2023
* main: (29 commits)
  pythongh-104276: Make `_struct.unpack_iterator` type use type flag instead of custom constructor (python#104277)
  pythongh-97696: Move around and update the whatsnew entry for asyncio eager task factory (python#104298)
  pythongh-103193: Fix refleaks in `test_inspect` and `test_typing` (python#104320)
  require-pr-label.yml: Add missing "permissions:" (python#104309)
  pythongh-90656: Add platform triplets for 64-bit LoongArch (LA64) (python#30939)
  pythongh-104180: Read SOCKS proxies from macOS System Configuration (python#104181)
  pythongh-97696 Remove unnecessary check for eager_start kwarg (python#104188)
  pythonGH-104308: socket.getnameinfo should release the GIL (python#104307)
  pythongh-104310: Add importlib.util.allowing_all_extensions() (pythongh-104311)
  pythongh-99113: A Per-Interpreter GIL! (pythongh-104210)
  pythonGH-104284: Fix documentation gettext build (python#104296)
  pythongh-89550: Buffer GzipFile.write to reduce execution time by ~15% (python#101251)
  pythongh-104223: Fix issues with inheriting from buffer classes (python#104227)
  pythongh-99108: fix typo in Modules/Setup (python#104293)
  pythonGH-104145: Use fully-qualified cross reference types for the bisect module (python#104172)
  pythongh-103193: Improve `getattr_static` test coverage (python#104286)
  Trim trailing whitespace and test on CI (python#104275)
  pythongh-102500: Remove mention of bytes shorthand (python#104281)
  pythongh-97696: Improve and fix documentation for asyncio eager tasks (python#104256)
  pythongh-99108: Replace SHA3 implementation HACL* version (python#103597)
  ...
carljm added a commit to carljm/cpython that referenced this issue May 9, 2023
* main:
  pythongh-97696 Add documentation for get_coro() behavior with eager tasks (python#104304)
  pythongh-97933: (PEP 709) inline list/dict/set comprehensions (python#101441)
  pythongh-99889: Fix directory traversal security flaw in uu.decode() (python#104096)
  pythongh-104184: fix building --with-pydebug --enable-pystats (python#104217)
  pythongh-104139: Add itms-services to uses_netloc urllib.parse. (python#104312)
  pythongh-104240: return code unit metadata from codegen (python#104300)
carljm added a commit to carljm/cpython that referenced this issue May 9, 2023
* main: (156 commits)
  pythongh-97696 Add documentation for get_coro() behavior with eager tasks (python#104304)
  pythongh-97933: (PEP 709) inline list/dict/set comprehensions (python#101441)
  pythongh-99889: Fix directory traversal security flaw in uu.decode() (python#104096)
  pythongh-104184: fix building --with-pydebug --enable-pystats (python#104217)
  pythongh-104139: Add itms-services to uses_netloc urllib.parse. (python#104312)
  pythongh-104240: return code unit metadata from codegen (python#104300)
  pythongh-104276: Make `_struct.unpack_iterator` type use type flag instead of custom constructor (python#104277)
  pythongh-97696: Move around and update the whatsnew entry for asyncio eager task factory (python#104298)
  pythongh-103193: Fix refleaks in `test_inspect` and `test_typing` (python#104320)
  require-pr-label.yml: Add missing "permissions:" (python#104309)
  pythongh-90656: Add platform triplets for 64-bit LoongArch (LA64) (python#30939)
  pythongh-104180: Read SOCKS proxies from macOS System Configuration (python#104181)
  pythongh-97696 Remove unnecessary check for eager_start kwarg (python#104188)
  pythonGH-104308: socket.getnameinfo should release the GIL (python#104307)
  pythongh-104310: Add importlib.util.allowing_all_extensions() (pythongh-104311)
  pythongh-99113: A Per-Interpreter GIL! (pythongh-104210)
  pythonGH-104284: Fix documentation gettext build (python#104296)
  pythongh-89550: Buffer GzipFile.write to reduce execution time by ~15% (python#101251)
  pythongh-104223: Fix issues with inheriting from buffer classes (python#104227)
  pythongh-99108: fix typo in Modules/Setup (python#104293)
  ...
carljm added a commit to carljm/cpython that referenced this issue May 9, 2023
* main: (35 commits)
  pythongh-97696 Add documentation for get_coro() behavior with eager tasks (python#104304)
  pythongh-97933: (PEP 709) inline list/dict/set comprehensions (python#101441)
  pythongh-99889: Fix directory traversal security flaw in uu.decode() (python#104096)
  pythongh-104184: fix building --with-pydebug --enable-pystats (python#104217)
  pythongh-104139: Add itms-services to uses_netloc urllib.parse. (python#104312)
  pythongh-104240: return code unit metadata from codegen (python#104300)
  pythongh-104276: Make `_struct.unpack_iterator` type use type flag instead of custom constructor (python#104277)
  pythongh-97696: Move around and update the whatsnew entry for asyncio eager task factory (python#104298)
  pythongh-103193: Fix refleaks in `test_inspect` and `test_typing` (python#104320)
  require-pr-label.yml: Add missing "permissions:" (python#104309)
  pythongh-90656: Add platform triplets for 64-bit LoongArch (LA64) (python#30939)
  pythongh-104180: Read SOCKS proxies from macOS System Configuration (python#104181)
  pythongh-97696 Remove unnecessary check for eager_start kwarg (python#104188)
  pythonGH-104308: socket.getnameinfo should release the GIL (python#104307)
  pythongh-104310: Add importlib.util.allowing_all_extensions() (pythongh-104311)
  pythongh-99113: A Per-Interpreter GIL! (pythongh-104210)
  pythonGH-104284: Fix documentation gettext build (python#104296)
  pythongh-89550: Buffer GzipFile.write to reduce execution time by ~15% (python#101251)
  pythongh-104223: Fix issues with inheriting from buffer classes (python#104227)
  pythongh-99108: fix typo in Modules/Setup (python#104293)
  ...
bdraco added a commit to home-assistant/core that referenced this issue Feb 25, 2024
python 3.12 supports eager tasks

reading:
https://docs.python.org/3/library/asyncio-task.html#eager-task-factory
python/cpython#97696

There are lots of places were we are unlikely to suspend, but we might
suspend so creating a task makes sense
balloob pushed a commit to home-assistant/core that referenced this issue Feb 26, 2024
* Add support for eager tasks

python 3.12 supports eager tasks

reading:
https://docs.python.org/3/library/asyncio-task.html#eager-task-factory
python/cpython#97696

There are lots of places were we are unlikely to suspend, but we might
suspend so creating a task makes sense

* reduce

* revert entity

* revert

* coverage

* coverage

* coverage

* coverage

* fix test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes performance Performance or resource usage release-blocker topic-asyncio type-feature A feature request or enhancement
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

7 participants