Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardize how we fix serialization problems for Protobufs #29

Closed
pabloem opened this issue Jun 15, 2022 · 6 comments
Closed

Standardize how we fix serialization problems for Protobufs #29

pabloem opened this issue Jun 15, 2022 · 6 comments
Assignees
Labels
good first issue Good for newcomers

Comments

@pabloem
Copy link
Collaborator

pabloem commented Jun 15, 2022

From review:

Nit / general remark: It's a bit unfortunate that we keep running into the serialization issue, and sometimes solve it by using a custom reduce, sometimes by registering a custom serializer (ray.util.register_serializer), and sometimes manually (SerializeToString / FromString).
It would be good if we could enable serialization for all the protobuf components in one central place - I'm not sure how that could be done though, as ray.util.register_serializer would have to be called on every ray worker that transmits protobuf objects. Maybe something to discuss?

@wilsonwang371
Copy link
Contributor

Hi Pablo,

Can you point out some of the places of using different serialization/deserialization ?

@pabloem pabloem added the good first issue Good for newcomers label Nov 30, 2022
@rkenmi rkenmi self-assigned this Nov 30, 2022
@rkenmi
Copy link
Member

rkenmi commented Dec 5, 2022

Hi @iasoon ,

Just wanted to understand a bit more about the issue 🙂

It seems like there are some pros and cons with each serialization approach:

  • __reduce__: These definitions make classes a little more bloaty but allows us to customize how objects can be serialized/pickled. One downside is that for every Protobuf message we use, a wrapper class is probably needed to explicitly define how to serialize and deserialize. This has some advantages too, for some classes like RayRunnerExecutionContext we can skip serializing some of the instance attributes and lazily reconstruct the instance - the memory footprint would be smaller that way.
  • ray.util.register_serializer: According to the Ray docs, the API needs to be called on every Ray worker since the serializers are managed locally per Python process and it can be called idempotently. It seems like this is one of the more simpler approaches; we just simply call this at the start of Ray tasks, register serializers for every Protobuf message we use, and remove most of the __reduce__ code since they aren't needed anymore.

Curious for your opinion on keeping a mix of them or stick to one convention of using __reduce__ or ray.util.register_serializer for consistency?

@iasoon
Copy link
Member

iasoon commented Dec 6, 2022

@rkenmi Personally, I definitely prefer the register_serializer approach, as it is more universal - we don't always serialize protobuf objects as part of an owning object, but also just as regular task arguments.

A while back, I took a stab at implementing this for all protobuf types we depend on:
pabloem@3445540#diff-7507ff302fc9e22524c6fea6648354d4dc21c5076957b063efc296acf929f742
I'm not really sure what the best way would be to include this in our code. I guess the most straightforward way would be to indeed add a manual call to each task that needs to serialize protobuf messages. I imagine this could get tedious though.

There is also this effort that would help us, but I don't know where that went.
ray-project/ray#21383

I definitely agree that having custom __reduce__s is a good idea for some of the structs we have, but I think that's orthogonal to the serialization of the protobuf messages themselves. We could definitely do both.

Hope that helps! Happy to discuss more.

@rkenmi
Copy link
Member

rkenmi commented Dec 6, 2022

Nice, thanks for sharing the snippet!

I'm not really sure of a good way either for propagating register_serializer to all workers. The manual call seems okay with only a handful of Ray tasks/actors right now.

I did find this API which looks useful, but they recently marked it as deprecated due to reliability issues.

@iasoon
Copy link
Member

iasoon commented Dec 6, 2022

I'm not sure either. Too bad that that API was deprecated! Maybe we can ask the ray team whether there is a recommended way to do this?

I think technically we don't need it on any worker either, but only the ones that serialize protobuf messages. If my memory serves well, I think that's currently only the main task and the run_bundle tasks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

4 participants