Tracking Issue for `io::Cursor::{split, split_mut}` #86369

soerenmeier · 2021-06-16T15:20:28Z

Feature gate: #![feature(cursor_split)]

impl<T> Cursor<T>
where
    T: AsRef<[u8]>,
{
    pub fn split(&self) -> (&[u8], &[u8]);
}

impl<T> Cursor<T>
where
    T: AsMut<[u8]>,
{
    pub fn split_mut(&mut self) -> (&mut [u8], &mut [u8]);
}

Steps / History

Add io::Cursor::{remaining, remaining_slice, is_empty} #86037
Replace io::Cursor::{remaining_slice, is_empty} #109174
Final commenting period (FCP)
Stabilization PR

Unresolved Questions/Issues

~~name remaining Add io::Cursor::{remaining, remaining_slice, is_empty} #86037 (comment)~~
~~change signature of remaining to return usize. Tracking Issue for io::Cursor::{split, split_mut} #86369 (comment)~~
~~conflict with <Cursor as bytes::Buf>::remaining Tracking Issue for io::Cursor::{split, split_mut} #86369 (comment)~~

The text was updated successfully, but these errors were encountered:

yaahc · 2021-06-16T22:41:49Z

One additional concern that @mystor pointed out is that stabilizing these APIs might prevent us from adding support to non-contiguous buffers/sets of buffers to Cursor in the future. The remaining method wouldn't translate, since it returns a single slice.

We should be absolutely certain that we don't intend to extend Cursor this way in the future or we may wish to just stick with the usize returning version for now, since that one would have solved the original motivating example and it can be extended to work for multiple non-contiguous buffers.

soerenmeier · 2021-06-16T23:27:38Z

Wouldn't that still work though? Since those contiguous buffers won't implement AsRef<[u8]>.
To make it clear that this function only returns a slice we could rename it as suggested to remaining_slice and then the implementation for contiguous buffers could have a method named for example remaining_slices.

yaahc · 2021-06-16T23:46:40Z

Wouldn't that still work though? Since those contiguous buffers won't implement AsRef<[u8]>.

Oh, I had forgotten that remaining also had the as_ref bound. Yea, that should work fine.

To make it clear that this function only returns a slice we could rename it as suggested to remaining_slice and then the implementation for contiguous buffers could have a method named for example remaining_slices.

Sounds like a plan!

soerenmeier · 2021-06-17T00:18:11Z

I renamed remaining to remaining_slice and added:

fn remaining(&self) -> u64;

Is that ok or should i remove it again?

yaahc · 2021-06-17T00:28:03Z

Is that ok or should i remove it again?

sounds great

Add `io::Cursor::{remaining, remaining_slice, is_empty}` Tracking issue: rust-lang#86369 I came across an inconvenience when answering the following [Stack Overflow](https://stackoverflow.com/questions/67831170) question. To get the remaining slice you have to call `buff.fill_buf().unwrap()`. Which in my opinion doesn't really tell you what is returned (in the context of Cursor). To improve readability and convenience when using Cursor i propose adding the method `remaining`. The next thing i found inconvenient (unnecessary long) was detecting if the cursor reached the end. There are a few ways this can be achieved right now: - `buff.fill_buf().unwrap().is_empty()` - `buff.position() >= buff.get_ref().len()` - `buff.bytes().next().is_none()` Which all seem a bit unintuitive, hidden in trait documentations or just a bit long for such a simple task. Therefor i propose another method called `is_empty`, maybe with another name, since this one may leave room for interpretation on what really is empty (the underlying slice, the remaining slice or maybe the position). Since it seemed easier to create this PR instead of an RFC i did that, if an RFC is wanted, i can close this PR and write an RFC first.

seanmonstar · 2021-06-24T15:16:56Z

I'll just note that this change conflicts with bytes::Buf that is implemented for Cursor<impl AsRef<[u8]>>. The proposed remaining method here has a different signature, so shadowing won't work. While adding this method may be desired and ultimately the right decision (I haven't thought enough to say myself), it will have an impact on the ecosystem using bytes::Buf.

I've also noticed an inconvenience that won't affect stable: since the method is "unstable" on nightly, it triggers an error, even though the trait is in scope.

soerenmeier · 2021-06-24T18:36:42Z

That seems unfortunate. I'd be okay with changing the signature to return a usize, I chose u64 since position is u64 but since the underlying slice cannot be bigger than usize it would be possible to change the signature. The only point to consider is if support for non-contiguous buffers is added the length could be bigger than usize, but that also could be a problem with u64.

About the error in nightly, I don't know how this is handled normally in std. But the impact should be rather small, since Cursor is only scarcely used, mostly in tests. Am I wrong?

seanmonstar · 2021-06-25T00:19:24Z

I don't mean an error in libstd or it's test suite. But rather the greater ecosystem, anywhere using bytes::Buf and also running tests on nightly. For example, this PR for h2: hyperium/h2#547

soerenmeier · 2021-06-25T01:19:14Z

I didn't either. I meant the use of Cursor in combination with bytes::Buf on nightly calling remaining. After your first comment i made a few short searches concluding with the statement above, interpreting your reaction, that was probably the wrong assessment. Imho failing nightly tests should be acceptable but addressed (if stable is the target).
Do you have a suggestion to solve this?

seanmonstar · 2021-06-25T01:38:03Z

Yea, I see two solutions (so far, maybe there's a better): either libstd changes to so that nightly doesn't include the conflict, or the ecosystem that depends on bytes::Buf and runs tests on nightly all need to change.

(I agree failing nightly tests is acceptable to a degree, but at the same time, if they're all set to allow failure, it's very easy to miss that nightly is broken, and if they block the PR, then it disrupts library developers. 🤷 )

soerenmeier · 2021-06-26T14:28:47Z

@yaahc can remaining be removed from nightly without removing it from this proposal? Or can we rename it just for nightly (ex. remaining_len)?

yaahc · 2021-07-01T22:36:36Z

@yaahc can remaining be removed from nightly without removing it from this proposal? Or can we rename it just for nightly (ex. remaining_len)?

Yes, to either option, whichever you prefer. I'm not confident the remaining_len ident will pass muster during stabilization, so if I had to lean one way I'd say just remove it since you only really needed the slice method afaik, and it's not a long jump from getting a slice to getting the len.

Remove unstable `io::Cursor::remaining` Adding `io::Cursor::remaining` in rust-lang#86037 caused a conflict with the implementation of `bytes::Buf` for `io::Cursor`, leading to an error in nightly, see rust-lang#86369 (comment). This fixes the error by temporarily removing the `remaining` function. r? `@yaahc`

jkugelman · 2021-09-25T13:49:23Z

It would be nice if remaining_slice could return a slice with the lifetime of the underlying buffer, rather than of the cursor.

Sometimes I want to avoid copying a large chunk of data and get a slice of the byte buffer instead. remaining_slice would be great for that, except the slice it returns doesn't live long enough.

Let's say I've read a file into memory, and now I'm pulling out individual assets:

#![feature(cursor_remaining)]

use std::io::{self, BufRead, Cursor, Read};

struct Asset<'file> {
    data: &'file [u8],
}

impl<'file> Asset<'file> {
    fn load(cursor: &mut Cursor<&'file [u8]>) -> io::Result<Self> {
        let mut len = [0u8; 2];
        cursor.read_exact(&mut len[..])?;
        let len = u16::from_le_bytes(len) as usize;

        let data = cursor.remaining_slice().get(..len)
            .ok_or(io::ErrorKind::UnexpectedEof)?;
        cursor.consume(len);

        Ok(Self { data })
    }
}

This fails to compile because the slice from remaining_slice has the lifetime of the cursor and the cursor doesn't live long enough:

error[E0621]: explicit lifetime required in the type of `cursor`
  --> src/lib.rs:19:19
   |
10 |     fn load(cursor: &mut Cursor<&'file [u8]>) -> io::Result<Self> {
   |                     ------------------------ help: add explicit lifetime `'file` to the type of `cursor`: `&'file mut std::io::Cursor<&'file [u8]>`
...
19 |         Ok(Self { data })
   |                   ^^^^ lifetime `'file` required

Playground

soerenmeier · 2021-09-27T16:17:37Z

That would be nice, but sadly i see no way to do this. I tried to annotate all lifetimes that exist when T = &'a [u8] to explain my reasoning:

impl<T: AsRef<[u8]>> Cursor<&'a [u8]> {
    pub fn remaining_slice<'b>(&'b self) -> &'b [u8] {

        let slice: &'b &'a [u8] = AsRef::as_ref(&'b self.inner);

        let slice: &'a [u8] = *slice;
        // where 'a: 'b

        slice
    }
}

It is not possible to change the returned lifetime to 'a since 'a needs to outlive 'b because &'a is borrowed with &'b.
i hope this is understandable.

soerenmeier · 2021-10-05T16:35:49Z

@yaahc do you think we can start an FCP? Its been about 3 month since the initial PR.

yaahc · 2021-10-05T21:52:03Z

sure.

@rfcbot merge

rfcbot · 2021-10-05T21:52:04Z

Team member @yaahc has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

is_empty (Tracking Issue for io::Cursor::{split, split_mut} #86369 (comment))

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

BurntSushi · 2021-10-06T11:50:10Z

Hmmm, so I have a couple small thoughts here:

If we have an is_empty method, we probably should have a len method. (And it looks like that's what remaining was before it was removed. So I wonder whether that should be added back as len.)
Alternatively, we might just remove is_empty altogether. Folks can do cursor.remaining_slice().is_empty() (and cursor.remaining_slice().len() for the length).

If we went with (2), I like the name remaining better than remaining_slice, but it sounds like that would cause ecosystem issues with bytes. But remaining_slice is not too bad...

soerenmeier · 2021-10-07T21:33:45Z

My initial PR was with remaining returning a slice, so I'm in favor of that, but it's probably not worth it.
About len, I don't know if that could be ambiguous since it could mean the len of the underlying slice or the remaining len.

What about adding remaining with the same signature as bytes::Buf and making it insta-stable (to not trigger unstable warnings)?

fn remaining(&self) -> usize;

seanmonstar · 2021-10-07T21:45:37Z

I personally think cursor.remaining_slice().is_empty() is clearer than cursor.is_empty(). With a Cursor<Vec<u8>>, with a bunch of bytes in the vector, it's not immediately clear to me that the cursor itself could be empty...

m-ou-se · 2021-10-27T16:23:17Z

@rfcbot concern is_empty

As pointed out above, the meaning of is_empty can be confusing:

#![feature(cursor_remaining)]

use std::io::Write;

fn main() {
    let mut buf = [0u8; 10];
    let mut cursor = std::io::Cursor::new(&mut buf[..]);
    dbg!(cursor.is_empty()); // false
    cursor.write_all(b"1234567").unwrap();
    dbg!(cursor.is_empty()); // false
    cursor.write_all(b"890").unwrap();
    dbg!(cursor.is_empty()); // true
}

Here, is_empty returns false when nothing is written yet, and true when the buffer is completely full.

m-ou-se · 2021-10-27T16:28:10Z

This API seems focussed on the Read side of things. When Seeking around and Writeing data, is_empty becomes confusing and remaining_slice is not very useful. When writing, the part of the slice before the cursor is much more interesting. Maybe we should have functions for both? Or a single function that returns both slices? (If we also have a mut version, a single function would be useful to allow mutable access to both sides at once.)

m-ou-se · 2021-10-27T16:31:30Z

If we have an is_empty method, we probably should have a len method. (And it looks like that's what remaining was before it was removed. So I wonder whether that should be added back as len.)

Note that the length of the underlying buffer doesn't change when reading or writing or seeking through it. The only thing that changes is the position of the cursor, and the bytes in the buffer (when writing). (E.g. seeking to the beginning will 'unempty' it and produce the already read data again.)

jkugelman · 2021-10-28T00:18:21Z

I really like that, looking at the cursor symmetrically. You're talking about something like this?

pub struct Cursor<T> {
    pub fn remaining(&self) -> &[u8];
    pub fn remaining_mut(&mut self) -> &mut [u8];
    pub fn preceding(&self) -> &[u8];
    pub fn preceding_mut(&mut self) -> &mut [u8];
    pub fn split_at(&self) -> (&[u8], &[u8]);
    pub fn split_at_mut(&mut self) -> (&mut [u8], &mut [u8]);
}

m-ou-se · 2022-01-12T14:53:12Z

It doesn't look like this will go through in its current form, so I'm cancelling the proposed FCP.

@rfcbot cancel

@jkugelman Yeah, something like that. Or before_cursor, after_cursor and split_at_cursor or something in that direction. As long as it makes sense from both a Read and Write perspective (+Seek).

rfcbot · 2022-01-12T14:53:13Z

@m-ou-se proposal cancelled.

glandium · 2022-07-08T02:24:25Z

Stumbled upon these functions today. Considering the concerns, would it be worth removing the current implementation?

soerenmeier · 2022-07-08T13:23:56Z

I once made a commit with possible alternatives but wasn't sure about if the implementation should not be generic over T: AsRef<[u8]>. That would allow to return different lifetimes for example for Cursor<&'a [u8]> so the function could look like fn split(&self) -> (&'a [u8], &'a [u8]);. That was one concern raised in #86369 (comment).

My Implementation was based on these functions:

fn split(&self) -> (&[u8], &[u8]);
fn before(&self) -> &[u8];
fn after(&self) -> &[u8];
fn split_mut(&mut self) -> (&mut [u8], &mut [u8]);
fn before_mut(&mut self) -> &mut [u8];
fn after_mut(&mut self) -> &mut [u8];

Commit: soerenmeier@eab83e3

Replace `io::Cursor::{remaining_slice, is_empty}` This is a late follow up to the concerns raised in rust-lang#86369. rust-lang#86369 (comment) > This API seems focussed on the `Read` side of things. When `Seek`ing around and `Write`ing data, `is_empty` becomes confusing and `remaining_slice` is not very useful. When writing, the part of the slice before the cursor is much more interesting. Maybe we should have functions for both? Or a single function that returns both slices? (If we also have a `mut` version, a single function would be useful to allow mutable access to both sides at once.) New feature name: `cursor_remaining` > `cursor_split`. Added functions: ```rust fn split(&self) -> (&[u8], &[u8]); // fn before(&self) -> &[u8]; // fn after(&self) -> &[u8]; fn split_mut(&mut self) -> (&mut [u8], &mut [u8]); // fn before_mut(&mut self) -> &mut [u8]; // fn after_mut(&mut self) -> &mut [u8]; ``` A question was raised in rust-lang#86369 (comment) about whether to return a lifetime that would reflect the lifetime of the underlying bytes (`impl Cursor<&'a [u8]> { fn after(&self) -> &'a [u8] }`). The downside of doing this would be that it would not be possible to implement these functions generically over `T: AsRef<[u8]>`. ## Update Based on the review, before* and after* methods where removed.

Rollup merge of rust-lang#109174 - soerenmeier:cursor_fns, r=dtolnay Replace `io::Cursor::{remaining_slice, is_empty}` This is a late follow up to the concerns raised in rust-lang#86369. rust-lang#86369 (comment) > This API seems focussed on the `Read` side of things. When `Seek`ing around and `Write`ing data, `is_empty` becomes confusing and `remaining_slice` is not very useful. When writing, the part of the slice before the cursor is much more interesting. Maybe we should have functions for both? Or a single function that returns both slices? (If we also have a `mut` version, a single function would be useful to allow mutable access to both sides at once.) New feature name: `cursor_remaining` > `cursor_split`. Added functions: ```rust fn split(&self) -> (&[u8], &[u8]); // fn before(&self) -> &[u8]; // fn after(&self) -> &[u8]; fn split_mut(&mut self) -> (&mut [u8], &mut [u8]); // fn before_mut(&mut self) -> &mut [u8]; // fn after_mut(&mut self) -> &mut [u8]; ``` A question was raised in rust-lang#86369 (comment) about whether to return a lifetime that would reflect the lifetime of the underlying bytes (`impl Cursor<&'a [u8]> { fn after(&self) -> &'a [u8] }`). The downside of doing this would be that it would not be possible to implement these functions generically over `T: AsRef<[u8]>`. ## Update Based on the review, before* and after* methods where removed.

soerenmeier added C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Jun 16, 2021

soerenmeier mentioned this issue Jun 16, 2021

Add io::Cursor::{remaining, remaining_slice, is_empty} #86037

Merged

jonas-schievink added the A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` label Jun 16, 2021

soerenmeier changed the title ~~Tracking Issue for io::Cursor::{remaining, is_empty}~~ Tracking Issue for io::Cursor::{remaining, remaining_slice, is_empty} Jun 17, 2021

seanmonstar mentioned this issue Jun 24, 2021

fuzz: fix e2e MockIo reading too much for the buffer hyperium/h2#547

Merged

soerenmeier mentioned this issue Jul 2, 2021

Remove unstable io::Cursor::remaining #86811

Merged

soerenmeier changed the title ~~Tracking Issue for io::Cursor::{remaining, remaining_slice, is_empty}~~ Tracking Issue for io::Cursor::{remaining_slice, is_empty} Sep 27, 2021

rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Oct 5, 2021

rfcbot removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jan 12, 2022

soerenmeier mentioned this issue Mar 15, 2023

Replace io::Cursor::{remaining_slice, is_empty} #109174

Merged

dtolnay changed the title ~~Tracking Issue for io::Cursor::{remaining_slice, is_empty}~~ Tracking Issue for io::Cursor::{split, split_mut} Jul 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tracking Issue for `io::Cursor::{split, split_mut}` #86369

Tracking Issue for `io::Cursor::{split, split_mut}` #86369

soerenmeier commented Jun 16, 2021 •

edited by dtolnay

Loading

yaahc commented Jun 16, 2021

soerenmeier commented Jun 16, 2021

yaahc commented Jun 16, 2021

soerenmeier commented Jun 17, 2021

yaahc commented Jun 17, 2021

seanmonstar commented Jun 24, 2021

soerenmeier commented Jun 24, 2021

seanmonstar commented Jun 25, 2021

soerenmeier commented Jun 25, 2021

seanmonstar commented Jun 25, 2021

soerenmeier commented Jun 26, 2021

yaahc commented Jul 1, 2021

jkugelman commented Sep 25, 2021 •

edited

Loading

soerenmeier commented Sep 27, 2021

soerenmeier commented Oct 5, 2021

yaahc commented Oct 5, 2021

rfcbot commented Oct 5, 2021 •

edited by Amanieu

Loading

BurntSushi commented Oct 6, 2021

soerenmeier commented Oct 7, 2021

seanmonstar commented Oct 7, 2021

m-ou-se commented Oct 27, 2021

m-ou-se commented Oct 27, 2021

m-ou-se commented Oct 27, 2021

jkugelman commented Oct 28, 2021 •

edited

Loading

m-ou-se commented Jan 12, 2022

rfcbot commented Jan 12, 2022

glandium commented Jul 8, 2022

soerenmeier commented Jul 8, 2022

Tracking Issue for io::Cursor::{split, split_mut} #86369

Tracking Issue for io::Cursor::{split, split_mut} #86369

Comments

soerenmeier commented Jun 16, 2021 • edited by dtolnay Loading

Steps / History

Unresolved Questions/Issues

yaahc commented Jun 16, 2021

soerenmeier commented Jun 16, 2021

yaahc commented Jun 16, 2021

soerenmeier commented Jun 17, 2021

yaahc commented Jun 17, 2021

seanmonstar commented Jun 24, 2021

soerenmeier commented Jun 24, 2021

seanmonstar commented Jun 25, 2021

soerenmeier commented Jun 25, 2021

seanmonstar commented Jun 25, 2021

soerenmeier commented Jun 26, 2021

yaahc commented Jul 1, 2021

jkugelman commented Sep 25, 2021 • edited Loading

soerenmeier commented Sep 27, 2021

soerenmeier commented Oct 5, 2021

yaahc commented Oct 5, 2021

rfcbot commented Oct 5, 2021 • edited by Amanieu Loading

BurntSushi commented Oct 6, 2021

soerenmeier commented Oct 7, 2021

seanmonstar commented Oct 7, 2021

m-ou-se commented Oct 27, 2021

m-ou-se commented Oct 27, 2021

m-ou-se commented Oct 27, 2021

jkugelman commented Oct 28, 2021 • edited Loading

m-ou-se commented Jan 12, 2022

rfcbot commented Jan 12, 2022

glandium commented Jul 8, 2022

soerenmeier commented Jul 8, 2022

Tracking Issue for `io::Cursor::{split, split_mut}` #86369

Tracking Issue for `io::Cursor::{split, split_mut}` #86369

soerenmeier commented Jun 16, 2021 •

edited by dtolnay

Loading

jkugelman commented Sep 25, 2021 •

edited

Loading

rfcbot commented Oct 5, 2021 •

edited by Amanieu

Loading

jkugelman commented Oct 28, 2021 •

edited

Loading