Redesign `DeserializerAdapter` trait to support deserializing using `DeserializeSeed` #313

phil-opp · 2023-12-19T15:00:12Z

Since the CDR format is not self-describing, the target type needs to be known for deserialization. The current RustDDS implementation ensures this by requiring a serde::Deserialize implementation for all target types. Unfortunately this makes it impossible to deserialize when the type information is only available at runtime. For example, it is not possible to base the deserialization on type information parsed from ROS2 msg files at runtime.

This pull request solves this limitation by adding basic support for deserialization based on serde's DeserializeSeed trait, which allows to use additional state for deserialization.

The main changes are:

Add a new Decode trait for deserializing a byte slice with access to the RepresentationIdentifier. The trait is implemented for the following new types:
- PlCdrDeserializer is an empty newtype that implements Decode for all types that implement PlCdrDeserialize.
- CdrDeserializeDecoder is an empty newtype that implements Decode for all types that implement serde::Deserialize.
- CdrDeserializeSeedDecoder wraps a DeserializeSeed implementation and implements Decode using the seed for deserialization.
Redesign the DeserializerAdapter trait
- Add a from_bytes_with method that takes an additional decoder: impl Decode argument. The method has a default implementation that calls the decoder.
- Add a new associated type named Deserialized, which describes the type that the decoder will deserialize to.
  - Most of the time, the type is equal to the generic type D.
  - Some adapters, mainly DAWrapper, need to do an additional transformation after deserializing. This transformation is possible by implementing a new transform_deserialized trait method. For example, DAWrapper wraps the deserialized value into a NoKeyWrapper.
- Create a new DefaultDecoder sub-trait, which describes adapters that can be used without supplying a decoder (since they provide a default one).
  - Gate the from_bytes and from_vec_bytes methods on the DefaultDecoder trait and provide a default implementation using the *_with variants and the default decoder.
  - The DefaultDecoder trait is implemented:
    - for CDRDeserializerAdapter if D implements serde::Deserialize.
    - for PlCdrDeserializerAdapter if D implements PlCdrDeserialize.
Use the DefaultDecoder trait to support both stateful and stateless deserialization in most functions, without duplicating the code
Create new try_take_one_with and as_async_stream_with API methods to allow ros2-client and others to plug in custom stateful deserializers.

I tried my best to limit the scope of this PR as much as possible and to document all the changes. Unfortunately, I failed to make this a non-breaking change, mainly because of the new DeserializerAdapter design. I hope you still consider this PR, as this feature is critical for our use case in the dora project.

Please let me know if you have any questions or ideas for improvements!

(edit: The commit list got a bit messy, sorry. I can squash if you like.)

Discovery messages don't implement the serde Deserialize traits because they need an additional encoding value for Deserialization. By using a custom DefaultSeed trait, we can at least work around this issue by providing a dummy implementation.

…oding` trait

Enables us to provide a default implementation for `from_bytes_seed`.

Based on cloning the seed.

phil-opp · 2023-12-19T16:10:51Z

With these changes, we can update ros2-client to provide take_seed and async_stream_seed functions on Subscription. I opened a draft PR for this at jhelovuo/ros2-client#20 to show the required ros2-client changes.

phil-opp · 2023-12-20T09:40:31Z

I'm not sure why the "Code Coverage" CI job fails, but it appears to be broken on the master branch as well: https://github.com/jhelovuo/RustDDS/actions/workflows/coverage.yml . So I don't think that the failure is related to this PR.

jhelovuo · 2023-12-21T09:27:19Z

Hello,

Thank you for the interesting idea and PR. I was not aware of the DeserializeSeed trait before. I can see the use case here.

If the deserialization would be changed as you propose, how would the serialization go then?

And no need to worry about the code coverage CI job.

phil-opp · 2023-12-21T10:00:21Z

Serialization doesn't need to be changed, since the serde::Serialize trait gives access to the runtime value that should be serialized. So we can use the type information parsed from ROS2 message files in a custom Serialize implementation.

For example:

pub struct TypedPythonValue<'a> {
    pub value: &'a pyo3::Any,              // given from Python
    pub type_info: &'a Ros2TypeInfo,       // parsed from ROS2 message files
}


impl serde::Serialize for TypedValue<'_> {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        match &self.type_info.data_type {
            DataType::UInt8 => {
                // the ROS2 topic expects an `u8` value, so try to convert the Python value
                let value: u8 = self.value.extract().map_err(serde::ser::Error::custom)?;
                serializer.serialize_u8(value)
            }
            .... // handle other ROS2 data types
            _ => todo!(),
        }
    }
}

This approach doesn't work with the standard serde::Deserialize trait because it is completely stateless, i.e. there is no self argument. So for deserialization based on runtime type information, we need to use DeserializeSeed.

codecov-commenter · 2024-04-05T15:33:04Z

Codecov Report

Attention: Patch coverage is 52.17391% with 66 lines in your changes are missing coverage. Please review.

Project coverage is 73.26%. Comparing base (28ed26f) to head (4e8e37d).
Report is 130 commits behind head on master.

Files	Patch %	Lines
src/dds/no_key/simpledatareader.rs	0.00%	25 Missing ⚠️
src/dds/with_key/simpledatareader.rs	48.64%	19 Missing ⚠️
src/dds/adapters.rs	52.94%	16 Missing ⚠️
src/serialization/cdr_deserializer.rs	77.77%	6 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #313      +/-   ##
==========================================
- Coverage   73.53%   73.26%   -0.28%     
==========================================
  Files          94       95       +1     
  Lines       19845    20192     +347     
==========================================
+ Hits        14594    14793     +199     
- Misses       5251     5399     +148

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

jhelovuo · 2024-04-10T13:00:57Z

Thanks for keeping your PR up to date. I assume this means you are using it for yourself.

We are considering accepting this PR, even though it adds extra complication. We may have another use case where it will be useful. This is just to let you know that the PR is not dead.

phil-opp · 2024-04-10T16:56:59Z

Thanks for the update! Yes, we're using this PR for the https://github.com/dora-rs/dora project. We need it because we precompile Python bindings on a different machine, which need to be able to deserialize ROS2 messages based on type information parsed at runtime.

jhelovuo · 2024-04-11T06:08:24Z

Looking at the central part of the new design, namely, new code to adapters.rs , terms "decode" and "transform" are used. Are these interchangeable? If yes, perhaps only one term should be used for clarity?

jhelovuo · 2024-04-11T10:18:59Z

Looking at the central part of the new design, namely, new code to adapters.rs , terms "decode" and "transform" are used. Are these interchangeable? If yes, perhaps only one term should be used for clarity?

This question makes no sense. Apparently, "decode" and "deserialize" are more like synonyms. So the pipeline is

byte slice → deserialize/decode → value of type Self::Deserialized → call
[Self::transform_deserialized] → value of type D

And it makes sense to use "decode", because the term "deserialize" is already taken.

Am I now on the correct track?

jhelovuo · 2024-04-11T10:32:59Z

Another question: What is the use case for transform_deserialized() ?
In this PR and the corresponding ros2-client PR it always seems to be the identity function.

jhelovuo · 2024-04-11T11:11:37Z

I just created a new branch deserialize-with-seed for developing this feature. Please redirect the PR there.

jhelovuo · 2024-04-11T11:43:10Z

Is there a specific reason in

fn from_bytes_with<S>(
      input_bytes: &[u8],
      encoding: RepresentationIdentifier,
      decoder: S,
    ) -> Result<D, S::Error>
    where
      S: Decode<Self::Decoded>,
    {
...
    }

for the decoder to be passed by value rather than reference? Correspondingly, Decode::decode_bytes() takes self rather than &self. (Or maybe &mut self?)

If using a reference is possible, then we could get rid of the Clone requirement in DefaultDecoder::Decoder. Requiring Clone is a bit awkward, because it would really do the cloning at every sample processed.

phil-opp · 2024-04-11T13:06:58Z

So the pipeline is

byte slice → deserialize/decode → value of type Self::Deserialized → call [Self::transform_deserialized] → value of type D

Yes, exactly!

Another question: What is the use case for transform_deserialized() ?

There are some adapters that add a wrapper around the deserialized values. With the normal Deserialize trait, this could happen behind the scenes because the adapters could just call deserialize on the desired type. With DeserializeSeed, the adapters need to explicitly specify the type that the DeserializeSeed instance should create, which might be different than the output type D.

In this PR and the corresponding ros2-client PR it always seems to be the identity function.

Yes, it's almost always the identity function. The one exception is no_key::DAWrapper, which wraps a DeserializerAdapter<D>, but returns a DeserializerAdapter<NoKeyWrapper<D>>. So it's transform_deserialized call wraps the result in a new NoKeyWrapper instance:

NoKeyWrapper::<D> {
    d: DA::transform_deserialized(deserialized),
}

(This implementation is at src/dds/no_key/wrappers.rs#L94-L96.)

phil-opp · 2024-04-11T13:09:11Z

I just created a new branch deserialize-with-seed for developing this feature. Please redirect the PR there.

I just tried, but it looks like all the commits of this PR are already part of the branch.

phil-opp · 2024-04-11T13:24:51Z

Is there a specific reason in
fn from_bytes_with<S>(
      input_bytes: &[u8],
      encoding: RepresentationIdentifier,
      decoder: S,
    ) -> Result<D, S::Error>
    where
      S: Decode<Self::Decoded>,
    {
...
    }
for the decoder to be passed by value rather than reference? Correspondingly, Decode::decode_bytes() takes self rather than &self. (Or maybe &mut self?)

If using a reference is possible, then we could get rid of the Clone requirement in DefaultDecoder::Decoder. Requiring Clone is a bit awkward, because it would really do the cloning at every sample processed.

I just added that code last week to get things working again with the latest changes from master. I'm not satisfied with that part of the implementation either, so it would be great if we could find a better solution.

The reason for 4e8e37d was the loop introduced in 6908f40, which calls deserialize_with multiple times.

I agree that taking a (mutable?) reference to the decoder would be the cleaner solution and would be fine for our use case. Unfortunately, the DeserializeSeed::deserialize method takes self by value, so I'm not sure how we could avoid a Clone bound...

(Side note: If your DeserializeSeed type T only needs &self, you could implement DeserializeSeed for &T as well. Then the .clone() call would be just a pointer copy.)

jhelovuo · 2024-04-14T09:30:55Z

So the pipeline is
byte slice → deserialize/decode → value of type Self::Deserialized → call [Self::transform_deserialized] → value of type D

Yes, exactly!

Ok. Your technical implementation is brilliant. This is quite complicated abstract metaprogramming.

In order to make it more accessible to others, I decided to rename some things. The basic renaming idea is

deserialize = decode + transform.

In your original code, there was confusion between deserialize and decode, as these were used as synonyms. Now it should be more clear that the whole two-step process is "deserialize" and the first step is called "decode".

I am migrating this PR to PR #334 , so we can continue the work there.

jhelovuo · 2024-04-14T09:48:42Z

Is there a specific reason in

...code...

for the decoder to be passed by value rather than reference? Correspondingly, Decode::decode_bytes() takes self rather than &self. (Or maybe &mut self?)
If using a reference is possible, then we could get rid of the Clone requirement in DefaultDecoder::Decoder. Requiring Clone is a bit awkward, because it would really do the cloning at every sample processed.

I just added that code last week to get things working again with the latest changes from master. I'm not satisfied with that part of the implementation either, so it would be great if we could find a better solution.

The reason for 4e8e37d was the loop introduced in 6908f40, which calls deserialize_with multiple times.

The new loop was logically necessary, because it can happen that in a with_key DDS topic, e.g. most of the Discovery topics, we could receive a Dispose instead of data update in a DATA submessage. The Dispose is indicated by sending a Key instead of Data, or alternatively, sending an MD5 hash of the Key.

In case we, as a DataReader, join an already long-running Topic, it may have some past data values published that we can no longer receive, because the History buffers are not deep enough. But we can at some point receive Dispose notifications of these samples we have never heard of. In case the Dispose was via a hash only, there is no way we can tell the application about the Dispose, because we do not know the actual key, so it is best to just ignore it, and move on to the next incoming sample. This is why the looping is needed.

This is something we could perhaps work around to avoid the need for cloning.

I agree that taking a (mutable?) reference to the decoder would be the cleaner solution and would be fine for our use case. Unfortunately, the DeserializeSeed::deserialize method takes self by value, so I'm not sure how we could avoid a Clone bound...

(Side note: If your DeserializeSeed type T only needs &self, you could implement DeserializeSeed for &T as well. Then the .clone() call would be just a pointer copy.)

Ok, the pass-by-value of self in DeserializeSeed::deserialize is something we cannot work around.

Then we must keep the Clone bound, and document that there will be quite a lot of cloning, about one .clone() per deserialized sample, so the application designer knows to invest in making it cheap, like a pointer copy.

phil-opp · 2024-04-15T08:13:18Z

Thanks a lot!

phil-opp added 20 commits December 13, 2023 20:38

Add DeserializeSeed support

e2fd400

Start replacing DeserializeSeed trait with custom `FromBytesWithEnc…

b01ca98

…oding` trait

Implement FromBytesWithEncoding for Cdr deserializer

85d55a4

Implement FromBytesWithEncoding for PlCdr deserializer

70a9bf4

Remove unused imports

4273dc5

Run cargo fmt

f1bbddc

Remove unneeded lifetime arguments

ad15237

Rename trait: FromBytesWithEncoding to DecodeWithEncoding

e3b1b6f

Remove more unneeded lifetimes

24cc552

Add DeserializerAdapter::transform_deserialized function

1bb6c14

Enables us to provide a default implementation for `from_bytes_seed`.

Add CDR deserializer variant with seed

6c4dd82

Don't require specific error type for seed

829df14

Implement try_take_one_seed

ae4f75a

Implement an async stream with seed support

a601fdc

Based on cloning the seed.

Rename DecodeWithEncoding trait to Decode

052a609

Improve docs and rename: Use 'decoder' instead of 'seed'

dd6645a

Fix clippy warning

93ef8e6

Format using nightly rustfmt

cbf0d4f

Fix one more clippy suggestion

fae9fcf

phil-opp mentioned this pull request Dec 19, 2023

Update to RustDDS with DeserializeSeed support jhelovuo/ros2-client#20

Merged

phil-opp mentioned this pull request Dec 20, 2023

Update ros2-client and rustdds dependencies to latest fork version dora-rs/dora#397

Merged

phil-opp added 2 commits April 5, 2024 17:09

Merge branch 'main' into deserialize-seed-2

d543845

Add Clone requirement to decoder for try_take_one

4e8e37d

jhelovuo mentioned this pull request Apr 14, 2024

Redesign DeserializerAdapter trait to support deserializing using DeserializeSeed #334

Merged

jhelovuo closed this Apr 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Redesign `DeserializerAdapter` trait to support deserializing using `DeserializeSeed` #313

Redesign `DeserializerAdapter` trait to support deserializing using `DeserializeSeed` #313

phil-opp commented Dec 19, 2023 •

edited

Loading

phil-opp commented Dec 19, 2023

phil-opp commented Dec 20, 2023

jhelovuo commented Dec 21, 2023 •

edited

Loading

phil-opp commented Dec 21, 2023

codecov-commenter commented Apr 5, 2024

jhelovuo commented Apr 10, 2024

phil-opp commented Apr 10, 2024

jhelovuo commented Apr 11, 2024

jhelovuo commented Apr 11, 2024

jhelovuo commented Apr 11, 2024

jhelovuo commented Apr 11, 2024

jhelovuo commented Apr 11, 2024

phil-opp commented Apr 11, 2024

phil-opp commented Apr 11, 2024

phil-opp commented Apr 11, 2024

jhelovuo commented Apr 14, 2024

jhelovuo commented Apr 14, 2024

phil-opp commented Apr 15, 2024

Redesign DeserializerAdapter trait to support deserializing using DeserializeSeed #313

Redesign DeserializerAdapter trait to support deserializing using DeserializeSeed #313

Conversation

phil-opp commented Dec 19, 2023 • edited Loading

phil-opp commented Dec 19, 2023

phil-opp commented Dec 20, 2023

jhelovuo commented Dec 21, 2023 • edited Loading

phil-opp commented Dec 21, 2023

codecov-commenter commented Apr 5, 2024

Codecov Report

jhelovuo commented Apr 10, 2024

phil-opp commented Apr 10, 2024

jhelovuo commented Apr 11, 2024

jhelovuo commented Apr 11, 2024

jhelovuo commented Apr 11, 2024

jhelovuo commented Apr 11, 2024

jhelovuo commented Apr 11, 2024

phil-opp commented Apr 11, 2024

phil-opp commented Apr 11, 2024

phil-opp commented Apr 11, 2024

jhelovuo commented Apr 14, 2024

jhelovuo commented Apr 14, 2024

phil-opp commented Apr 15, 2024

Redesign `DeserializerAdapter` trait to support deserializing using `DeserializeSeed` #313

Redesign `DeserializerAdapter` trait to support deserializing using `DeserializeSeed` #313

phil-opp commented Dec 19, 2023 •

edited

Loading

jhelovuo commented Dec 21, 2023 •

edited

Loading