Currently, in TCP and UCX comms, we offload serialization to a separate thread for large messages:
distributed/distributed/comm/utils.py (lines 72 to 75 in d0f6aec)
The `sizeof` computation can be a little expensive, particularly because we run it on every message; under some benchmarks it takes around 10% of our time.

In the case of workers, this is probably fine (and maybe even a good idea). For the scheduler, however, it is probably unnecessary: the scheduler tends to only store pre-serialized data, so its serialization process is just unpacking some Python objects and doesn't need to be done in a separate thread.
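To make the cost concrete, the offload decision looks roughly like the following. This is a simplified sketch, not the exact code at the lines referenced above: `FRAME_OFFLOAD_THRESHOLD` and `sizeof` are stand-ins for distributed's config threshold and `dask.sizeof.sizeof`, and the recursion here is only illustrative.

```python
import sys

# Hypothetical threshold; distributed reads this from configuration.
# The value here is purely illustrative.
FRAME_OFFLOAD_THRESHOLD = 10_000_000  # bytes

def sizeof(obj):
    # Stand-in for dask.sizeof.sizeof: recursively estimates memory use.
    # Walking every element of each message is where the ~10% cost comes from.
    if isinstance(obj, (list, tuple)):
        return sys.getsizeof(obj) + sum(sizeof(o) for o in obj)
    if isinstance(obj, dict):
        return sys.getsizeof(obj) + sum(
            sizeof(k) + sizeof(v) for k, v in obj.items()
        )
    return sys.getsizeof(obj)

def should_offload(msg):
    # The per-message check under discussion: estimate the message's size,
    # and only move serialization to a separate thread if it is large.
    return sizeof(msg) > FRAME_OFFLOAD_THRESHOLD

# Tiny scheduler-style control messages never exceed the threshold,
# yet they still pay for the sizeof traversal on every send.
small = {"op": "task-finished", "key": "x-123"}
print(should_offload(small))
```

The point of the issue is that for the scheduler this check almost always returns `False`, so the traversal is pure overhead.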
It would be good to skip offloading in the scheduler, but keep it in the workers.
Probably the place to specify this is in the Scheduler's `ConnectionPool`. However, we'll want to be careful because not every Comm serializes and offloads. This maybe requires some sort of `kwargs` option? I'm not sure.