[WIP] Async stream handlers #4474

Closed
Wants to merge 6 commits

Conversation

ian-r-rose (Collaborator)

Fixes #4469.

I haven't done much profiling yet to see what kind of performance improvements this has (if any); I'll take a look at that next and post some results here.

import weakref

from .core import CommClosedError
from .metrics import time
from .utils import sync, TimeoutError, parse_timedelta
from .protocol.serialize import to_serialize

if typing.TYPE_CHECKING:
Collaborator Author

I found adding type annotations to the stream_handlers object was useful in implementing this (cf #2803), allowing type checkers to catch which handlers were async or sync. However, I needed this pretty unsightly hack to avoid a circular import, just to get the type name in the module scope. I'm not happy about it, and it could be removed. On the other hand, it's kind of nice for refactoring.
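For illustration, here is a minimal sketch of the TYPE_CHECKING pattern being described (not the exact code from this PR; the imported names below are hypothetical):

```python
import typing

if typing.TYPE_CHECKING:
    # Only evaluated by static type checkers (e.g. mypy), never at runtime,
    # so it cannot trigger a circular import.
    from .scheduler import Scheduler  # hypothetical import, for illustration only


def handle(scheduler: "Scheduler") -> None:
    # The annotation is quoted, so it is not evaluated at import time.
    ...
```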

Member

Does from __future__ import annotations help at all here? Admittedly that is Python 3.7+ only. Though we are planning to drop Python 3.6 soon ( #4390 )

Collaborator Author

Yeah, I had to remove annotations in 4f1944b for 3.6 support. It didn't help that much, though; it mostly meant I didn't have to quote the type names below. The circular import problem remained.

Collaborator

@jakirkham it doesn't help - you still need those imports. from __future__ import annotations helps when you have a class accepting itself or returning itself in its methods, or when class A in a module accepts or returns class B which is declared afterwards in the same module.
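An illustrative example of the cases where the `__future__` import does help, i.e. self-referencing or forward-referencing annotations within a single module:

```python
from __future__ import annotations


class A:
    # Without the __future__ import these annotations would need quoting,
    # since neither A nor B is fully defined when this method is parsed.
    def combine(self, other: A) -> B:
        ...


class B:
    ...
```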

T = TypeVar("T")


def asyncify(func: Callable[..., T]) -> Callable[..., Awaitable[T]]:
Collaborator Author

In some places I've used an async wrapper to have a lighter footprint on the API, but more functions could just be made async in the first place.
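A minimal sketch of the wrapper idea described here; the actual body in the PR may differ:

```python
import functools
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


def asyncify(func: Callable[..., T]) -> Callable[..., Awaitable[T]]:
    """Wrap a synchronous callable so it can be awaited like an async handler."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs) -> T:
        return func(*args, **kwargs)

    return wrapper
```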

Member

Yeah, I almost raised a question about this until I looked at how these methods were used. I agree that making them sync by default is probably the right choice. This seemed like a good solution to me.

@mrocklin (Member) commented Feb 1, 2021

I haven't done much profiling yet to see what kind of performance improvements this has (if any); I'll take a look at that next and post some results here.

Ooh, fun. I look forward to seeing which of the many different ways of profiling you choose :)

@mrocklin (Member) commented Feb 3, 2021 via email

@@ -3,20 +3,25 @@
from contextlib import suppress
import logging
import threading
import typing
Collaborator

As a personal aesthetic preference, I find that using from typing import ... is always a nicer option. It does lead to collisions with from collections.abc import ..., which can, however, be worked around once you drop Python 3.6 support, add from __future__ import annotations everywhere, and run a Python 3.9-compatible version of mypy (on Python 3.7+).

Collaborator Author

I generally agree with you here @crusaderky. In this case I kept it this way because the thing I'm using is typing.TYPE_CHECKING, and I didn't really want that floating around the module namespace. No strong preference either way, though.
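For reference, the two styles being compared behave identically; the only difference is what ends up in the module namespace:

```python
# Style used in this diff: keeps TYPE_CHECKING namespaced under typing.
import typing

if typing.TYPE_CHECKING:
    ...

# Alternative style: imports the flag directly into the module namespace.
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    ...
```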

@mrocklin (Member) commented Feb 3, 2021

So, a thought just came to me. I wonder if all of this could be solved by lru cache.

Expanding on this a bit more (that thought came to me last night and I wanted to get it out quickly): I think that we actually prefer being able to use both sync and async handlers, but we're choosing not to do this for performance reasons, mostly because the inspect.is_coroutine_function function is surprisingly slow. An alternative that might make it fast is to cache the result of this function. This might be a one-line fix:

is_coroutine_function = functools.lru_cache(inspect.is_coroutine_function)

This does open up questions about how expensive lru_cache is, but I suspect that it's significantly faster.
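A rough, self-contained sketch of that caching idea (the helper distributed actually uses may differ, e.g. it may also recognize Tornado-style coroutines; note that passing a function directly to functools.lru_cache requires Python 3.8+, so the decorator form is shown here):

```python
import functools
import inspect


@functools.lru_cache(maxsize=None)
def is_coroutine_function(func) -> bool:
    # Cache the (relatively slow) reflection check; handlers are a small,
    # fixed set of callables, so the cache stays tiny.
    return inspect.iscoroutinefunction(func)
```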

@ian-r-rose (Collaborator Author)

So, a thought just came to me. I wonder if all of this could be solved by lru cache.

Expanding on this a bit more (that thought came to me last night and I wanted to get it out quickly). I think that we actually prefer being able to use both sync and async handlers, but we're choosing not to do this for performance reasons, mostly because the inspect.is_coroutine_function function is surprisingly slow. An alternative that might make it fast is to cache the result of this function. This might be a one-line fix

This is a good point, I'll test it out

@ian-r-rose (Collaborator Author)

Okay, here is a look at some profiling results. I've mostly followed the same strategy detailed by @mrocklin in #4443 (comment) using viztracer and shuffling the same large timeseries. The specific durations for the functions in question are small, but they are typically called thousands of times a second, so it does add up.

I've looked at three cases:

  1. The current main branch
  2. The async-only stream handlers implemented here
  3. Using an LRU cache for is_coroutine_function

Specifically, I'm looking at the profile of Server.handle_stream and is_coroutine_function. The latter only applies in cases (1) and (3).

is_coroutine_function:

Here is a histogram of durations for all sampled calls of is_coroutine_function:
[figure: histogram of is_coroutine_function call durations]

As expected, using the LRU cache is significantly faster, bringing the average duration from 15-20 ns down to 2-3 ns. This is a one-line solution and keeps the API unchanged.

handle_stream:

Here is a histogram of durations for all sampled calls of handle_stream:

[figure: histogram of handle_stream call durations]

A few observations about this:

  • The main and LRU cases both have pretty wide distributions, depending on which synchronous op comes in. They average about 500 ns.
  • It's hard to tell from the image alone, but the average duration of the LRU case is in fact shorter than that of the main case, by some tens of ns. It also looks peakier, maybe because of the more consistent duration of the cached is_coroutine_function
  • The fully async case has a completely different distribution. In this case all handlers are pushed onto the async stack, so we aren't blocking on anything. It has a shorter average duration than the other two (~60 ns). Now, that time difference should be paid for later by the ioloop scheduling the previously-synchronous handlers at some other point.

Overall, it seems to me like the LRU version is indeed a quick win if we don't want to change any of the API, and continue to allow for synchronous handlers. Going fully async may be worthwhile, but involves a breaking API change, and maybe some pain for downstream plugin authors.
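To make the tradeoff concrete, here is a simplified, hypothetical sketch of the two dispatch styles being compared (not the actual Server.handle_stream code):

```python
import inspect


async def dispatch_mixed(handler, msg):
    # Mixed sync/async API: inspect each handler per message
    # (this is the check being cached above).
    if inspect.iscoroutinefunction(handler):
        await handler(msg)  # async handler: scheduled on the event loop
    else:
        handler(msg)        # sync handler: runs inline, briefly blocking the loop


async def dispatch_async_only(handler, msg):
    # Async-only API: every handler is a coroutine function, no inspection needed.
    await handler(msg)
```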

@jakirkham (Member)

Thanks for the detailed results Ian 😄

Do we know how many plugins are developed for Distributed?

@jakirkham (Member)

Oh sorry, one more question: was the asyncio case using Tornado or uvloop? Wouldn't expect a difference, but it would be good to know if there happened to be one (as we are using the latter in performance-sensitive cases atm).

@ian-r-rose (Collaborator Author)

Do we know how many plugins are developed for Distributed?

Good question. I'm not aware of any third-party ones myself, but I don't see any reason why someone couldn't do something similar to this one.

Oh sorry one more question, was the asyncio case using Tornado or uvloop? Wouldn't expect a difference, but would be good to know if there happened to be one (as we are using the latter in performance sensitive cases atm)

This was with uvloop; I haven't done extensive profiling with Tornado yet.

@mrocklin (Member) commented Feb 3, 2021

Thank you for the detailed analysis @ian-r-rose

It sounds like we're going with lru?

@ian-r-rose (Collaborator Author)

It sounds like we're going with lru?

I'm happy to go with that, it's a simpler change for the time being (though I generally like more predictable async vs sync APIs where possible).

@mrocklin (Member) commented Feb 3, 2021 via email

ian-r-rose mentioned this pull request Feb 3, 2021
@jakirkham (Member)

Went ahead and merged Ian's LRU cache PR ( #4481 ). It's a pretty light PR. This coroutine check also only happens in a handful of places.

That said, I don't think it rules out this async approach. There may be other reasons (as noted above) for going this route.

@ian-r-rose (Collaborator Author)

I'll close this now that the LRU cache PR ( #4481 ) is in, but will keep the branch around in case we want to revisit it later.
