Improve WriteBytesExt API #27

nwin · 2015-04-22T07:31:01Z

At the moment the byteorder crate uses a enum to generalize over the endianness and the method name to select the type. Both is not very ergonomic to use. I propose the following interface:

trait WriteBytesExt<T> {
    fn write_le(&mut self, n: T) -> io::Result<()>;
    fn write_be(&mut self, n: T) -> io::Result<()>;
}

impl<W> WriteBytesExt<u8> for W where W: Write {
    fn write_le(&mut self, n: u8) -> io::Result<()> {
        ....
    }
    ....
}

First of all it gets rid of the enum. Since the enum is purely a compile time parameter it cannot be used for dynamic dispatch. This is as good or bad as having it directly the method name. Thus I do not see the point of having it. Secondly it gets rid of the redundant type name in the signature.

This shortens the method call significantly

w.write_u16::<LittleEndian>(42u16)

becomes

w.write_le(42u16)

My two points are:

The type in the method signature carries redundant information.
The enum type parameter does not provide any benefit for the user.
Enums are most useful for runtime polymorphism. Thus, as the enum variants are no types these *BytesExt traits cannot be use to write generic code that abstracts over endianness. Again no benefit for the user.

The text was updated successfully, but these errors were encountered:

nwin · 2015-04-22T07:58:44Z

Alternatively one could elevate the enum variants to real types and use them as parameters which allows a more ergonomic use (as it should make it easier to write generic code for it) at the cost of being as long as before.

trait ByteOrder<T> {}

struct LittleEndian<T>(pub T);
struct BigEndian<T>(pub T);

impl<T> ByteOrder<T> for LittleEndian<T> {}
impl<T> ByteOrder<T> for BigEndian<T> {}

trait WriteBytesExt<T, B> where B: ByteOrder<T> {
    fn write_bytes(&mut self, n: B) -> io::Result<usize>;
}

impl<W> WriteBytesExt<u8, LittleEndian<u8>> for W where W: Write {
    fn write_bytes(&mut self, n: LittleEndian<u8>) -> io::Result<usize> {
        self.write(&[n.0])
    }
}

Then

w.write_u16::<LittleEndian>(42u16)

would become

w.write_bytes(LittleEndian(42u16))

BurntSushi · 2015-04-23T00:47:38Z

I've heard a few people moan about the current API, but this is the best alternative I've seen so far. A few comments:

You will need one more method, probably write_native to replicate existing functionality. Currently, a type alias is used instead.
If WriteBytesExt changes, then so to must ReadBytesExt. I assume we'll need to rely on return value polymorphism? So e.g., instead of rdr.read_u32::<BigEngian>() you'd have rdr.read_be::<u32>() or preferably, let n: u32 = rdr.read_be().

I want to noodle on this some more. I'd also love to get feedback from others. I don't have a good handle on who is using byteorder, but here are some names I've picked off from: https://crates.io/crates/byteorder/reverse_dependencies

cc @TyOverby @jamesrhurst @blackbeam @sfackler @mitsuhiko @cyderize

TyOverby · 2015-04-23T01:55:45Z

I use byteorder in bincode and while I don't use the functionality yet, once default-types for functions land I plan on using the current implementation to allow my users to write code like this:

bincode::encode(&my_struct); // default to network byte order
bincode::encode::<LittleEndian>(&my_struct); // specify endianness
// Use other endianness like above

The benefit to the way that @BurntSushi currently implements it is that having these functions called with different endianness imposes no runtime lookup, which is very important to bincode.

BurntSushi · 2015-04-23T01:58:07Z

@TyOverby This is the first I'm hearing about default types for functions. Cool! Is there an RFC?

sfackler · 2015-04-23T01:59:02Z

My use case is pretty simple - I never need to generically vary over either endianness or the type. Either the fn <T> write_be(t: T) design or the old school write_be_i32(t: i32) design would be a bit less verbose than the current setup for me.

TyOverby · 2015-04-23T02:04:20Z

@BurntSushi Yeah, the main default types RFC.

Their example is

pub fn range<A:Enumerable=uint>(start: A, stop: A) -> Range<A> {
    Range{state: start, stop: stop, one: One::one()}
}

nwin · 2015-04-23T05:21:53Z

I must redact my criticism a little bit. :( I do not know how I missed that but BigEndian and LittleEndian are of course not enum variants.

That means my strong arguments shrink to the return type polymorphism. Still I think a shortcut like write_le would be nice.

nwin · 2015-06-08T08:48:03Z

Have you spend some more thought about it? If one would use a generic solution, third party types would be able to hook into the same infrastructure by simply implementing ByteOrder.

Something like this:

#![feature(associated_consts)]
use std::io;

trait ByteOrder<T> {
    const Size: usize;
    fn read_bytes(buf: &[u8]) -> T;
}

enum LittleEndian {}
enum BigEndian {}

impl ByteOrder<u8> for LittleEndian {
    const Size: usize = 1;
    fn read_bytes(buf: &[u8]) -> u8 {
        buf[0]
    }
}
impl ByteOrder<u16> for LittleEndian {
    const Size: usize = 2;
    fn read_bytes(buf: &[u8]) -> u16 {
        ((buf[1] as u16) << 8) | buf[0] as u16
    }
}
trait ReadBytesExt<T> {
    fn read<B: ByteOrder<T>>(&mut self) -> io::Result<T>;
}
impl<T, R> ReadBytesExt<T> for R
where R: io::Read {
    fn read<B: ByteOrder<T>>(&mut self) -> io::Result<T> {
        let mut buf = [0; B::Size];
        assert!(try!(self.read(&mut buf)) == B::Size);
        Ok(B::read_bytes(&buf))
    }
}

fn main() {
    let mut v = &[1, 2, 3][..];
    let a: u8 = v.read::<LittleEndian>().unwrap();
    let b: u16 = v.read::<LittleEndian>().unwrap();
    println!("{} {}", a, b);
}

(Doesn’t compile thought. Probably one has to wait for non-type type parameters)

Edit: Seems like it is just a temporary restriction that the code gets rejected.

nwin · 2015-06-08T17:04:56Z

Ok I just wrote the whole thing to illustrate my point: nwin/byteio.

It has almost the same API as byteorder:

use byteio::{LittleEndian, ReadBytesExt};
let mut reader = &[1, 2][..];
let val: u16 = reader.read_as::<LittleEndian>().unwrap();
assert_eq!(val, 513u16);

Benchmarks indicate that it may be faster than byteorder but I don’t trust them. In theory it should, since I’m using byte arrays instead of slices.

BurntSushi · 2015-07-10T12:27:26Z

@nwin Sorry for the late reply. I do actually like your approach. Not only is the implementation smaller/simpler, but the API surface is smaller too.

What is the best way to proceed? Lots of people are using byteorder, so I think we should:

Continue to maintain the current API at 0.3.x, but deprecate it. (So put it in a branch, I guess.)
Remove the old API and move over to your API and bump the version to 0.4.x.
Replace current docs with 0.4.x docs and put a link/notice to the old 0.3.x docs.

nwin · 2015-07-10T19:51:23Z

I agree, that seems to be a good approach.

What about the method name of Read/WriteBytesExt? as is cost-free by convention. Maybe one should use read/write_into?

BurntSushi · 2015-07-10T21:30:43Z

I really like how read_as/write_as sounds, but erring on the side of convention seems like a good idea. I'm fine with read_into/write_into.

nwin · 2015-07-11T06:06:06Z

I’ll file a PR this weekend then we can discuss further issues with the new implementation there.

cesarb · 2015-07-11T19:27:55Z

I'd like to present a different opinion: having the type in the method signature is a good thing, even if it's redundant.

I've followed for a while the cleanups the LibreOffice team were applying to the legacy OpenOffice codebase. One of these cleanups was to change the implicit SvStream operators, where writing a 16-bit quantity would be something like:

int16_t var = ...;
stream << var;

to an explicit method call, like:

int16_t var = ...
stream.writeInt16(var);

This prevents a kind of error they often encountered, where changing a variable type somewhere accidentally changed a binary serialization format elsewhere. Since a byteorder crate is going to be used for binary serialization, where the type width is as important as its endianness, that's important.

The API I'd prefer is something like:

w.write_le::<i16>(var);

(same for _be/_ne), where the type can be elided if the variable is declared nearby, but can also be given explicitly to prevent accidental serialization format changes (in this example, the compiler would give an error if var isn't an i16). But anything works, as long as the endianness and the type can both be specified explicitly at the same time in the method call.

nwin · 2015-07-11T19:57:39Z

That is a good point. In that case

w.write_as::<i16, LittleEndian>(var);

would be a good API. Unfortunately the type can’t be elided in that call (yet?).

cesarb · 2015-07-11T20:42:19Z

If you want to ever be able to elide it, it must be in the opposite order:

w.write_as::<LittleEndian, i16>(var);

So the declaration would be something like:

fn write_as<B, T>(&mut self, n: T) -> Result<()>
    where B: ByteOrder<T> { ... }

The question would then be: how to default the type parameter?

fn write_as<B, T = ???>(&mut self, n: T) -> Result<()>
    where B: ByteOrder<T> { ... }

The obvious answer would be to add a type Type = T to the trait ByteOrder<T> and use it:

fn write_as<B, T = B::Type>(&mut self, n: T) -> Result<()>
    where B: ByteOrder<T> { ... }

but that looks awfully circular.

cesarb · 2015-07-11T21:23:26Z

In the worst case, the explicit version could be written on top of the implicit version, as a separate method:

w.write_as::<LittleEndian>(var);
w.write_as_explicit::<LittleEndian, i16>(var);

but of course having a single method would be much cleaner. And there's always the following, which should already work with your current code:

w.write_as::<LittleEndian as ByteOrder<i16>>(var);

I could also imagine a type LittleEndianT<T> which implements only ByteOrder<T> (passing everything through to LittleEndian) allowing the following:

w.write_as::<LittleEndian>(var);
w.write_as::<LittleEndianT<i16>>(var);

LukasKalbertodt · 2015-11-02T21:23:31Z

Any updates on this? Many ideas in the comments above look a lot better than the current design...

BurntSushi · 2015-11-02T22:20:02Z

@LukasKalbertodt Not yet, just haven't gotten to it yet, but I haven't forgotten.

One excuse I have is that there are many reverse dependencies of byteorder that use *. Pulling this in will break all of those crates instantly. Once this RFC takes effect, then it will be a better time to come back to this.

TyOverby · 2015-11-03T00:30:58Z

@BurntSushi Do you plan on rewriting byteorder from the ground up with the new api, or just adding these on top?

I still plan on using the generic functions heavily; will I be forced to use an abandoned version?

BurntSushi · 2015-11-03T00:53:09Z

@TyOverby I forgot about that. There's no way I'm maintaining two distinct APIs. If the new API loses a real use case, then that kind of seems like a serious mark against it.

TyOverby · 2015-11-03T02:31:33Z

Yeah, I think that once default-types for functions land, having trait-based (compile-time) dispatch will be a lot more attractive.

With the same function you could express all of these some library API.

// call regularly
function_that_uses_byteorder(&my_obj, &mut writer);
// call with specified order 
function_that_uses_byteorder::<LittleEndian>(&my_obj, &mut writer);
// all I care about is perf!
function_that_uses_byteorder::<NativeEndian>(&my_obj, &mut writer);

This is exactly what I plan on doing for Bincode. Right now I use big-endian, but for projects like Servo that are just using Bincode for cross-process communication, I could let them choose the endianness without convoluting the Bincode API.

BurntSushi · 2015-11-03T02:40:23Z

@TyOverby So it sounds like we might want to table API redesign (if any) until at least default-types for functions lands.

LukasKalbertodt · 2015-11-03T09:58:18Z

One excuse I have is that there are many reverse dependencies of byteorder that use *.

I count 5 crates with a total of about 1800 Downloads (~1.5% of byteorder's downloads). That's not too much IMO, "it's their own fault" and the RFC will not magically fix those crates. I wouldn't really care about that too much actually...

I did not read all comments above, so I just have to agree that waiting for Rust features that allow a better API design is probably a good idea. And I also agree that maintaining two different APIs is not a good idea. Thanks for the status update :)

BurntSushi · 2015-11-03T12:08:17Z

I count 5 crates with a total of about 1800 Downloads (~1.5% of byteorder's downloads). That's not too much IMO

I count 50 crates with a * dependency on byteorder. However, only a few have more than a thousand downloads: sha1, tz and websocket.

"it's their own fault"

I don't really care if it's their "own fault" or not. The fact is, there are lots of crates that will break immediately for all downstream users. If there's an opportunity to mitigate that, then I'm going to take it.

And also, the README for this project recommended a * dependency for quite some time. So I'm not sure it really is their own fault anyway.

the RFC will not magically fix those crates

It will mitigate breakage on active crates.

LukasKalbertodt · 2015-11-03T12:22:00Z

Oops -- shame on me. Nevermind then.

pravic · 2016-06-26T09:29:39Z

It seems this issue is abandoned for now.

So, there are two ways out of here: change api with backward compatibility (via semver) or fork this crate with proposed changes.

nwin · 2016-06-26T09:36:01Z

It is not abandoned, as far as I understand we're just waiting for default type parameters to happen (rust-lang/rust#27336). At least the *-issue should be resolved by now as it is now disallowed on crates.io.

pravic · 2016-06-26T10:03:33Z

1.2 years of waiting? Really perfect issue resolving way for software engineers.

BurntSushi · 2016-06-26T18:22:37Z

I'm planning to take this crate to 1.0 soon with the current API.

If and when default type parameters is a thing, then we can revisit.

nwin changed the title ~~WriteBytesExt API is not ergonomic to use~~ Improve WriteBytesExt API Apr 22, 2015

nwin mentioned this issue Jul 11, 2015

Switch to new api and bump version to 0.4. #34

Closed

BurntSushi closed this as completed Jun 26, 2016

BurntSushi mentioned this issue Aug 1, 2016

approaching 1.0 #54

Closed

BurntSushi mentioned this issue Oct 18, 2016

Add iterators #58

Closed

BurntSushi mentioned this issue Jan 31, 2017

Add wrapper for ReadBytesExt/WriteBytesExt which stores the endianness #67

Closed

BurntSushi mentioned this issue Feb 28, 2019

[Feature Request] ByteOrder::read<T> #143

Closed

BurntSushi mentioned this issue Dec 24, 2021

Endianness-managing Read/Write wrapper? #187

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve WriteBytesExt API #27

Improve WriteBytesExt API #27

nwin commented Apr 22, 2015

nwin commented Apr 22, 2015

BurntSushi commented Apr 23, 2015

TyOverby commented Apr 23, 2015

BurntSushi commented Apr 23, 2015

sfackler commented Apr 23, 2015

TyOverby commented Apr 23, 2015

nwin commented Apr 23, 2015

nwin commented Jun 8, 2015

nwin commented Jun 8, 2015

BurntSushi commented Jul 10, 2015

nwin commented Jul 10, 2015

BurntSushi commented Jul 10, 2015

nwin commented Jul 11, 2015

cesarb commented Jul 11, 2015

nwin commented Jul 11, 2015

cesarb commented Jul 11, 2015

cesarb commented Jul 11, 2015

LukasKalbertodt commented Nov 2, 2015

BurntSushi commented Nov 2, 2015

TyOverby commented Nov 3, 2015

BurntSushi commented Nov 3, 2015

TyOverby commented Nov 3, 2015

BurntSushi commented Nov 3, 2015

LukasKalbertodt commented Nov 3, 2015

BurntSushi commented Nov 3, 2015

LukasKalbertodt commented Nov 3, 2015

pravic commented Jun 26, 2016

nwin commented Jun 26, 2016

pravic commented Jun 26, 2016

BurntSushi commented Jun 26, 2016

Improve WriteBytesExt API #27

Improve WriteBytesExt API #27

Comments

nwin commented Apr 22, 2015

nwin commented Apr 22, 2015

BurntSushi commented Apr 23, 2015

TyOverby commented Apr 23, 2015

BurntSushi commented Apr 23, 2015

sfackler commented Apr 23, 2015

TyOverby commented Apr 23, 2015

nwin commented Apr 23, 2015

nwin commented Jun 8, 2015

nwin commented Jun 8, 2015

BurntSushi commented Jul 10, 2015

nwin commented Jul 10, 2015

BurntSushi commented Jul 10, 2015

nwin commented Jul 11, 2015

cesarb commented Jul 11, 2015

nwin commented Jul 11, 2015

cesarb commented Jul 11, 2015

cesarb commented Jul 11, 2015

LukasKalbertodt commented Nov 2, 2015

BurntSushi commented Nov 2, 2015

TyOverby commented Nov 3, 2015

BurntSushi commented Nov 3, 2015

TyOverby commented Nov 3, 2015

BurntSushi commented Nov 3, 2015

LukasKalbertodt commented Nov 3, 2015

BurntSushi commented Nov 3, 2015

LukasKalbertodt commented Nov 3, 2015

pravic commented Jun 26, 2016

nwin commented Jun 26, 2016

pravic commented Jun 26, 2016

BurntSushi commented Jun 26, 2016