Consider a Serializer::serialize_byte_array method #2120
Comments
This isn't necessarily limited to byte arrays, it could be useful for any type of Formats which do not care would just ignore that option.
@Marwes You’d still need to hint to the deserializer that it’s expecting a fixed-size array, which in So I thought it might be better not to try to open that can of worms and just look at fixed-size byte arrays as a complement to byte slices. They can be treated as orthogonal to arrays vs tuples, because they’re just another kind of value like a
Updated to pass arrays by reference when serializing, and by value when deserializing. That seems like a more natural fit.
Also thinking defaulting to unsupported rather than using a fallback is probably a better approach. Basically the same as 128-bit numbers, but with an expectation that any format should be able to support it. That way formats can choose an appropriate implementation in their own time.
I can see how it may be appropriate to uuid, but my feeling is that this case isn't going to be broadly applicable enough to justify dedicated Serializer and Deserializer and Visitor methods, and the amount of benefit when applicable is also quite small. Thanks anyway for the suggestion and the writeup!
Here (1, 2) is another use case where having specific support for fixed-size arrays would be great, especially fixed-size binary blobs. The issue is not really that the serde data model lacks the ability to represent fixed-size arrays (Tuples do this nicely), but that there is no way to tell the serializer that all objects in the Tuple/Sequence are of the same 'type', which prevents serializers from optimizing how they are encoded. In this case, Bincode users will prefer serializing [u8;N] as a Tuple because Bincode doesn't have any type information, so Tuples end up as binary blobs ( This proposal would enable both Serializers to encode data optimally. A similar addition would likely be required for the Deserializing side.
This use case also appears for cryptographic things. Signatures, hashes, etc. usually have a fixed size that is known up-front.
I can confirm this is very painful for cryptographic use cases. We were previously using We are going to move to See also: rozbb/rust-hpke#53
This also comes with performance penalties. We saw a larger than 5x speedup by handrolling our serialization instead of using
Summary
Add new methods to Serializer, Deserializer, and Visitor to support cases where a binary format can take advantage of the fact that a byte slice has a fixed size.

Background
serde [u8; 16] binary representation uuid-rs/uuid#557

Motivations
In uuid we've been looking at the trade-offs of various representations for a 128-bit value for binary formats. The current options are:

- Serializer::serialize_bytes using &[u8]. This is a natural fit, but requires a redundant length field, even though the value is guaranteed to always be 16 bytes. That redundancy may result in anywhere from 1 to 8 additional bytes of overhead.
- Serializer::serialize_tuple using [T; N]. This can avoid the need for a redundant field, but the lazy serialization may impact performance. It can also introduce more overhead in other formats that don't encode tuples as sequences.
- Serializer::serialize_u128 using u128. This can avoid the drawbacks of the above approaches, but support is still spotty. Some formats that need to interoperate with other languages simply won't or can't support 128-bit numbers.

Each of these approaches comes with drawbacks that are problematic for different groups of end-users. They can be mitigated with the introduction of a new serialize_byte_array method.

Proposal
Add the following method to Serializer:

and the following methods to Deserializer and Visitor:

This is a complementary API to Serializer::serialize_bytes that allows datatypes to communicate to a format that the byte buffer has a fixed length. The format may choose to optimize that case by treating the byte buffer as a tuple instead of as a slice.

serde considers [T; N] to be equivalent to (T, ..N). This proposal doesn't attempt to change that.

A datatype that serializes using serialize_byte_array will need to deserialize using deserialize_byte_array with its length supplied. The underlying format may treat that as a request for a byte buffer, a sequence, or a tuple, depending on its implementation.

Formats like bincode will need to be updated to make use of these new methods in coordination with a serde release that enables them, so they have a chance to decide what semantics they want before inheriting the default.

Drawbacks
This is an arguably niche case that increases the burden on formats. It requires coordination and consideration to support.
It may also not be possible for serde's MSRV to parse the const N: usize syntax.

Adding default methods to the Deserializer that forward to others might be an undesirable approach, because it's been a source of bugs in the past.

Alternatives
- Avoid const generics in favor of something like:

  and

  where the length is implicitly fixed by the length of the passed-in slice. This might work better for the Deserializer than trying to propagate consts. The downside of that approach is that it doesn't clearly distinguish itself from serialize_bytes. Besides the name, they look the same.
- Pass byte arrays by reference instead of by value.
- Error in the default implementations instead of using a fallback. This would let formats opt in to support in their own time, but as a widely used type, Uuid would create churn while they do.
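
To make the trade-off in the Motivations and Proposal sections concrete, here is a minimal sketch on a toy trait. It is not serde's real Serializer, and it is not the signature proposed in this issue. ToySerializer, PrefixedWriter, PrefixedReader, and the two encoded_len_* helpers are hypothetical names invented for illustration, and the fixed 8-byte little-endian length prefix is an assumption of the sketch rather than a claim about any particular format. The default serialize_byte_array falls back to serialize_bytes, which is only one of the options discussed above.

```rust
// Toy stand-in for a binary format. NOT serde's `Serializer` trait;
// every name in this block is hypothetical.

/// Minimal serializer interface with two byte-oriented entry points.
trait ToySerializer {
    /// Variable-length bytes: the format has to record the length.
    fn serialize_bytes(&mut self, v: &[u8]);

    /// Fixed-length bytes: the length is part of the type.
    /// Default: fall back to `serialize_bytes` (erroring instead is the
    /// alternative discussed in the issue).
    fn serialize_byte_array<const N: usize>(&mut self, v: &[u8; N]) {
        self.serialize_bytes(v);
    }
}

/// Writer that prefixes slices with a u64 length (assumption of this sketch).
#[derive(Default)]
struct PrefixedWriter {
    out: Vec<u8>,
}

impl ToySerializer for PrefixedWriter {
    fn serialize_bytes(&mut self, v: &[u8]) {
        // 8-byte little-endian length, then the payload.
        self.out.extend_from_slice(&(v.len() as u64).to_le_bytes());
        self.out.extend_from_slice(v);
    }

    fn serialize_byte_array<const N: usize>(&mut self, v: &[u8; N]) {
        // The reader will be told N, so no prefix is written.
        self.out.extend_from_slice(v);
    }
}

/// Matching reader: the caller supplies the length through `N`.
struct PrefixedReader<'a> {
    input: &'a [u8],
}

impl<'a> PrefixedReader<'a> {
    fn deserialize_byte_array<const N: usize>(&mut self) -> Option<[u8; N]> {
        if self.input.len() < N {
            return None; // not enough bytes left
        }
        let (head, rest) = self.input.split_at(N);
        self.input = rest;
        let mut arr = [0u8; N];
        arr.copy_from_slice(head);
        Some(arr)
    }
}

/// Encoded size of a 16-byte id written as a length-prefixed slice.
fn encoded_len_as_bytes(id: &[u8; 16]) -> usize {
    let mut w = PrefixedWriter::default();
    w.serialize_bytes(id);
    w.out.len()
}

/// Encoded size of the same id written as a fixed-size byte array.
fn encoded_len_as_array(id: &[u8; 16]) -> usize {
    let mut w = PrefixedWriter::default();
    w.serialize_byte_array(id);
    w.out.len()
}
```

With this toy writer, a 16-byte value costs 24 bytes through serialize_bytes and 16 bytes through serialize_byte_array. The reader only works because the caller passes the same N on the way back in, which mirrors the requirement that a type serialized with serialize_byte_array be deserialized with deserialize_byte_array and its length.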