Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Const generics support #1937

Open
donkeyteethUX opened this issue Dec 24, 2020 · 19 comments
Open

Const generics support #1937

donkeyteethUX opened this issue Dec 24, 2020 · 19 comments

Comments

@donkeyteethUX
Copy link

donkeyteethUX commented Dec 24, 2020

I am working on a library for creating a bag-of-words representation of a corpus of images, for use with image/place recognition or SLAM algorithms. There are various types of image keypoints/descriptors, but generally a descriptor is a bit array of 32, 64, or 128 bytes. It would be nice to use the new min-const-generics to cleanly support descriptors of any size, and use serde to save/load the resulting descriptor vocabulary.

Are there plans to support (de)serialization of generic sized arrays? Is there a decent workaround in the meantime?

Edit: I saw the discussion here: #1860 Is this being worked on?

@MikailBag
Copy link

AFAIK there is no way to directly support arrays of arbitrary ways without breaking backcompat.

As a workaround, one can make following:

struct Arr<T, const N: usize>(pub [T;N]);
impl<T: Serialize, const N: usize> Serialize for Arr<T, N> {}
impl<'de, T: Deserialize, const N: usize> Deserialize for Arr<T, N> {}

and also provide some with helpers.

@sdd
Copy link

sdd commented Apr 1, 2021

This is also something that I am having problems with. I'm trying to work with both const generic arrays and Vecs of const generic arrays.

@MikailBag can you point me in the right place for how to do what you suggest above in a little more detail? It would be much appreciated. Thanks!

@jonasbb
Copy link
Contributor

jonasbb commented Apr 1, 2021

@sdd If you need support for nesting you either need wrapper types or you try serde_with. This will work for any number, you just need to make sure the value in the type and the annotation is equal.

#[serde_with::serde_as]
#[derive(Serialize)]
struct Foo {
    #[serde_as(as = "[_; 2]")]
    bar: [String; 2],
    #[serde_as(as = "Vec<[_; 5]>")]
    foobar: Vec<[u8; 5]>,
}

@MikailBag
Copy link

here is more verbose (but untested) code

mod arrays {
    use std::{convert::TryInto, marker::PhantomData};

    use serde::{
        de::{SeqAccess, Visitor},
        ser::SerializeTuple,
        Deserialize, Deserializer, Serialize, Serializer,
    };
    pub fn serialize<S: Serializer, T: Serialize, const N: usize>(
        data: &[T; N],
        ser: S,
    ) -> Result<S::Ok, S::Error> {
        let mut s = ser.serialize_tuple(N)?;
        for item in data {
            s.serialize_element(item)?;
        }
        s.end()
    }

    struct ArrayVisitor<T, const N: usize>(PhantomData<T>);

    impl<'de, T, const N: usize> Visitor<'de> for ArrayVisitor<T, N>
    where
        T: Deserialize<'de>,
    {
        type Value = [T; N];

        fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
            formatter.write_str(&format!("an array of length {}", N))
        }

        #[inline]
        fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
        where
            A: SeqAccess<'de>,
        {
            // can be optimized using MaybeUninit
            let mut data = Vec::with_capacity(N);
            for _ in 0..N {
                match (seq.next_element())? {
                    Some(val) => data.push(val),
                    None => return Err(serde::de::Error::invalid_length(N, &self)),
                }
            }
            match data.try_into() {
                Ok(arr) => Ok(arr),
                Err(_) => unreachable!(),
            }
        }
    }
    pub fn deserialize<'de, D, T, const N: usize>(deserializer: D) -> Result<[T; N], D::Error>
    where
        D: Deserializer<'de>,
        T: Deserialize<'de>,
    {
        deserializer.deserialize_tuple(N, ArrayVisitor::<T, N>(PhantomData))
    }
}

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct TypeWithArray {
    #[serde(with = "arrays")]
    arr: [u8; 64],
}

@sdd
Copy link

sdd commented Apr 2, 2021

I wasn't sure if the above was going to work for a type defined like this:

#[derive(Serialize, Deserialize, Debug)]
struct TypeWithConstGenericArray<const N: usize> {
    #[serde(with = "arrays")]
    arr: [u8; N],
}

But it does! See here: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8b514073821e558a5ce862f64361492e

Thanks so much for your help @MikailBag and @jonasbb

@simias
Copy link

simias commented Apr 5, 2021

I apologize in advance if this is not the right place to voice this opinion but I really find that making it a pain to serialize arbitrarily large arrays in order to support the (in my experience) rare [T; 0] case was a questionable decision.

Now, hindsight is 20/20 and I understand that this issue was probably not obvious at the time, and now it would break backcompat, but IMO it's important enough that I think a serde 2.0 wouldn't necessarily be unwarranted here.

I don't know if it's because of the way I code but I encounter this limitation all the time and it's always frustrating to work around, especially now that Rust officially supports const generics.

I realize that you shouldn't break backcompat willy-nilly and I appreciate the commitment to stability, but that introduces a pretty huge usability issue for the sake of preserving some (IMO) very niche [T; 0] behavior.

Anyway, that's my two cents, I enjoy using serde a whole lot so I hope I don't come off as too entitled, I complain because I care!

@noonien
Copy link

noonien commented Apr 11, 2021

I keep bumping into this issue on almost every project that I need to use serde on.

I agree with @simias. I don't think support for [T; 0], which makes no sense to (de)serialize, is worth not supporting arrays of any size.

@fadedbee
Copy link

fadedbee commented Apr 12, 2021

Likewise, I also keep hitting this issue. Upon searching, I find my own Stack Overflow question.

@Kromey
Copy link

Kromey commented May 7, 2021

Since I needed this too, I began working on my own solution before being pointed to this issue. I wound up with something similar to @MikailBag 's code above, which I've just finished adding tests and documentation to. If anyone wants to take a look at my code I'd appreciate any feedback. If you're just looking for a fix, you can grab it from Crates.io.

I also happened to stumble upon another crate, serde-big-array, which can provide similar functionality; it relies on unsafe code, which frankly is beyond my ability to fully understand its safety or possible repercussions, hence continuing on to publish my own (though mine was basically done by the time I found it anyway).

@vadixidav
Copy link

Ran into this for (almost) the same reason as @donkeyteethUX. Working on computer vision stuff (binary feature descriptors from AKAZE, and other algorithms) and I have things like this and this (this one actually had a bug in it that broke bincode but not serde_json in my testing!). Official implementations would help a lot, as I made a subtle error here (in the second example).

@vbakc
Copy link

vbakc commented Jun 30, 2023

The only blocker for now is the empty array [de]serialization.
Since current implementation does not require Serialize/Deserialize traits to be implemented for empty array.

@tarcieri
Copy link

tarcieri commented Jul 2, 2023

I think it'd be nice to have a method like Serializer::serialize_array (or failing that Serializer::serialize_byte_array) which uses const generics to inform the array size.

As noted in this comment, the problem with serialize_tuple is that it can't inform the serializer impl that the tuple is homogeneously typed, which in the case of formats like CBOR and MessagePack would allow for a more compact representation. There's serialize_bytes to handle the case of [u8], but it always adds a length prefix, where the length of a byte array may be known at compile time, in which case a length prefix is redundant. However, it could be provided and default to serialize_tuple underneath (which is what the impls on array types do already, I believe).

serialize_tuple is more of a product type in Serde's data model, leaving fixed-sized homogeneously typed arrays notably absent.

@ycscaly
Copy link

ycscaly commented Aug 28, 2023

Any update on this? would be really beneficial to be able to serialize and deserialize [T; N] types

@JosiahParry
Copy link

Noting that I would like this as well :) But it looks like @MikailBag provided a working solution

@JosiahParry
Copy link

Is there an example of using this with a Vec<[T; N]>?

@hak8or
Copy link

hak8or commented Dec 24, 2023

I also echo the need for this, and couldn't care any less about a 0 element array not being supported.

@MikailBag thank you so much for that snippet. I don't see what sort of license you are releasing that under, would it be fair to assume some variant of USA public domain? Or is it under MIT, or GPL V2, AGPL, etc?

@Pzixel
Copy link

Pzixel commented Feb 11, 2024

I did my small research on github and I didn't actually find any use cases where ability to serialize [T; 0] would be beneficial. A couple of points:

  1. If you use it in any generic way (const generics, macro generation, anything), you are guaranteed to have T: Serialize, because you must provide implementations for N > 0 sizes.
  2. If you're using it as a standalone [T; 0] then it's equal to () and can be just replaced with any ZST newtype.

While serde provide ways to override default serialization and use custom function (as one you provided above) it still makes sense to introduce a breaking change and support arrays in a generic way. Again, I checked a lot of code as for now and didn't find any use cases that actually utilize this behavior nor I could craft some artificial example to show its value. In all my attempts I either have a non-empty array or T is known to be serializable or I can just replace it.

So, the question if there are actually any usages of this, and if not we can remove it without breaking anything, and if they exist we can evaluate if it's worth it to still change behavior with a major patch or not.

cc @dtolnay

@tanriol
Copy link

tanriol commented Feb 11, 2024

I wonder whether it'd be acceptable from the serde semver POV to move the [T; 0] where T: !Serialize case under a separate non-default feature zero-length-array-magic and add a new non-default feature arbitrary-length-array using const generics (compile error for now if both are active at the same time).

zero-length-array-magic's documentation should clearly state that this feature is incompatible with arbitrary length array support and one should not enable it in any crate that can compile without it, especially library crates.
arbitrary-length-array's documentation should state that this feature is incompatible with the zero length array magic and one should not enable it in library crates if it's reasonable to implement the desired functionality without that.

Both features' documentation should mention that library crates should, if possible, use where clauses to ensure the crate automatically works in all cases of none, either or both (if that becomes possible in the future) being active.

@soqb
Copy link
Contributor

soqb commented Feb 25, 2024

i think that solution is just too brittle for a crate as pervasive as serde.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests