Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

io: add Ready::ERROR and report error readiness #5781

Merged
merged 9 commits into from
Aug 16, 2023

Conversation

folkertdev
Copy link
Contributor

@folkertdev folkertdev commented Jun 8, 2023

Motivation

fixes #5716

The motivation is to await messages arriving in a UdpSocket's error queue. The error queue is a mechanism to asynchronously process errors on a socket. By separating them from the normal queue, error information does not interfere with standard IO, but at the same time better error messages can be given than just a single integer as the result of send/receive.

This potentially has many use cases. My particular one is awaiting when a send timestamp is available.

Solution

The solution is to provide Ready::ERROR. This is already supported in mio, but so far was not exposed in tokio. In combination with AsyncFd::ready with some arbitrary interest, this can be used to await error readiness.

One potentially controversial change is that error readiness is reported regardless of interest. That is in practice how error readiness works on operating systems: errors are always reported, no matter what the configured interest is.

            // error readiness is reported regardless of interest
            ready |= Ready::ERROR;

Testing

Send timestamping, my primary use case, only works on linux. I've provided a test for it, but it is only compiled/executed on linux.

I've also included a test that sends a UDP packet to a local socket where nobody is listening. Normally this send operation would just succeed, but by setting the IP_RECVERR socket option, the sending socket is notified via the error queue that there is nobody listening on the other side. Turns out this test also only works on linux.

I'm not sure what to do about other operating systems. because we're just re-exposing mio functionality here maybe it's fine to just have the linux tests on CI?

@folkertdev folkertdev force-pushed the ready-error branch 2 times, most recently from 67f9465 to 5cd4706 Compare June 9, 2023 08:32
@Darksonn Darksonn added A-tokio Area: The main tokio crate M-io Module: tokio/io labels Jun 9, 2023
@folkertdev folkertdev changed the title Report error readiness io: add Ready::ERROR and report error readiness Jun 22, 2023
@folkertdev folkertdev force-pushed the ready-error branch 2 times, most recently from bcbb35e to 2f6b71f Compare July 3, 2023 16:35
@folkertdev
Copy link
Contributor Author

Is this blocked on anything? anything I can do to help it move along?

@Darksonn
Copy link
Contributor

Darksonn commented Jul 3, 2023

Over the past month, I've been busy with other things, but I should be able to look at it soon now that we have entered July.

Comment on lines 257 to 260
// error readiness is reported regardless of interest
ready |= Ready::ERROR;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Does this mean that the fn ready methods on socket types always return Error unconditionally even if there are no Ready events?

We should probably have a test that Ready is not set if there have been no error events.

Copy link
Contributor Author

@folkertdev folkertdev Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the phrasing is misleading I think, I'll improve it. What is happening here is that with this change, an event with the error bit set (i.e. with the error readiness) matches any interest.

so no matter what interest a user configures, an event with error readiness is not discarded. That means the corresponding waker is woken up, and then can deal with the event as they see fit.

We already have some tests that perform an equality check on Ready values, e.g.

#[tokio::test]
async fn clear_ready_matching_clears_ready() {
    use tokio::io::{Interest, Ready};

    let (a, mut b) = socketpair();

    let afd_a = AsyncFd::new(a).unwrap();
    b.write_all(b"0").unwrap();

    let mut guard = afd_a
        .ready(Interest::READABLE | Interest::WRITABLE)
        .await
        .unwrap();

    // the readiness is just readable and writable, not error 
    assert_eq!(guard.ready(), Ready::READABLE | Ready::WRITABLE);

    guard.clear_ready_matching(Ready::READABLE);
    assert_eq!(guard.ready(), Ready::WRITABLE);

    guard.clear_ready_matching(Ready::WRITABLE);
    assert_eq!(guard.ready(), Ready::EMPTY);
}

This test passes, so the error readiness is not set in this case.

@folkertdev folkertdev requested a review from Darksonn July 8, 2023 15:31
@Darksonn
Copy link
Contributor

Darksonn commented Jul 8, 2023

The implementation looks fine, but my main concern is how the error readiness is always exposed. Are we sure we want to do that? What about portability? Linux with epoll is one thing, but what about kqueue? What about Windows?

@carllerche
Copy link
Member

I am out of town and reviewing on a mobile device so I may be reading it wrong.

I think the biggest question is whether we should always unconditionally report error readiness without the user asking for it on all platforms. I know that this is what epoll does (error interest is always implicit) but I don’t know if we can maintain that behavior across all platforms (windows, future wasi based platforms, etc…)

The easiest option is to conditionally define Interest::Error on platforms where we can support it and require the user to specify it when awaiting for readiness.

@folkertdev
Copy link
Contributor Author

This is a fair concern. I don't think always reporting the error readiness has real downsides, but may be missing something.

I like the idea of Interest::ERROR as a tokio user, I'd really prefer it, but ran into issues with the implementation. The core issue is that mio does not have a Interest::ERROR because it does not make sense there (this comment explains why tokio-rs/mio#1672 (comment)).

Tokio wraps the mio::Interest, but must be able to convert from the tokio to the mio version to tell mio what to do. If a user would specify just a tokio Interest::ERROR, what interest is mio supposed to register with the OS?

I've written more about it in the related issue #5716 (comment)

@Darksonn
Copy link
Contributor

Darksonn commented Jul 8, 2023

Tokio already supports cases where you are waiting for a different set than registered with mio. For example, if two threads are waiting for read and write readiness respectively, then we register with both on mio, but the read waiter is only notified if mio gives us a read readiness.

Tokio wraps the mio::Interest, but must be able to convert from the tokio to the mio version to tell mio what to do. If a user would specify just a tokio Interest::ERROR, what interest is mio supposed to register with the OS?

If you get the error readiness no matter what, then we can just remove it from the readiness set given to mio.

@folkertdev
Copy link
Contributor Author

ah, interesting. So if I use Interest::ERROR in my user code, behind the scenes this could turn into an arbitrary interest for mio (say Interest::READABLE) but then tokio will filter what it reports (so clearing the readable flag from the event) so that the user-visible event only shows the error readiness.

am I understanding the idea correctly?

@Darksonn
Copy link
Contributor

Darksonn commented Jul 8, 2023

You bring up a good point. How does clearing the error readiness work? How does epoll handle it?

@folkertdev
Copy link
Contributor Author

I've added Interest::ERROR. I've picked this representation

pub struct Interest {
    mio: Option<mio::Interest>,
    has_error_interest: bool,
}

In practice this will take 2 bytes in memory, up from 1 before. This representation makes some existing functions a bit longer, especially because some are const fn and not all helper functions on Option are const stable (or were not const stable in tokio's msrv).

Then, mio has this note on the is_error function on event:

The table below shows what flags are checked on what OS.

OS selector Flag(s) checked
epoll EPOLLERR
kqueue EV_ERROR and EV_EOF with fflags set to 0.

Based on https://docs.rs/mio/latest/mio/struct.Poll.html#implementation-notes only windows would never use the error field (but the function is still available on that platform).

Always providing the error interest and related functions can be convenient to cut down on conditional compilation. On the other hand it can be confusing. I went with just doing what mio does for now, but don't have a strong opinion here.


I've verified that this code works as I'd expect with the current code of this PR

let mut guard1 = async_fd_socket.ready(Interest::ERROR).await.unwrap();
guard1.clear_ready_matching(Ready::ERROR);

// blocks!
let mut guard2 = async_fd_socket.ready(Interest::ERROR).await.unwrap();

but this is all tokio of course. So I ran another test where I send some bytes to the async_fd_socket

let mut guard1 = async_fd_socket.ready(Interest::ERROR).await.unwrap();
guard1.clear_ready_matching(Ready::ERROR);

// .. send bytes to `async_fd_socket`

// blocks!
let mut guard2 = async_fd_socket.ready(Interest::ERROR).await.unwrap();

and that still blocks. So the error readiness does not persist between different returns of epoll_wait.

Is that what you meant with your comment about clearing error readiness @Darksonn?

@Darksonn
Copy link
Contributor

That answers how error readiness is cleared for AsyncFd. How about TcpStream? It doesn't have the same clear_ready_matching api.

@folkertdev
Copy link
Contributor Author

At the moment TcpStream is not a problem, because it does not have a public api for registering with a custom interest. #5796 wants to add custom interest functionality, and would run into the same issue for Interest::PRIORITY. So I don't think it's a problem for this PR specifically, but in general it will need a solution.

That solution could be to not allow a custom interest, or some equivalent of clear_ready_matching

@Darksonn
Copy link
Contributor

Does that mean that even if an error event occurs, the current implementation will not surface it via the TcpStream api?

@folkertdev
Copy link
Contributor Author

As far as I can see, yes. Because it is not possible to register a tokio TcpStream with Interest::ERROR, the Ready::ERROR readiness would never be visible to the user. it is filtered out by

        if interest.is_error() {
            ready |= Ready::ERROR;
        }

in Ready::from_interest

Copy link
Member

@carllerche carllerche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, it looks mostly good to me. I left a comment on the Interest struct.

I think, for me, the primary question is, from a portability POV, what are the guarantees Tokio provides when one sets Error interest? The fact that error is overloaded makes me uncomfortable with providing this as a portable flag.

e.g. it seems to me that error interest should mean that if one receives error readiness, one can just drop the socket w/o performing further operations on it because the socket is in an error state. This doesn't hold for the case you want to use it for.

One option is, we include it for all platforms and document it as "this passes error interest to the underlying OS selector. Behavior is platform specific, read your platform's documentation"

@@ -11,43 +11,59 @@ use std::ops;
/// I/O resource readiness states.
#[cfg_attr(docsrs, doc(cfg(feature = "net")))]
#[derive(Clone, Copy, Eq, PartialEq)]
pub struct Interest(mio::Interest);
pub struct Interest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you change this to a usize and define the bits yourself, similar to how Ready does it: https://github.com/tokio-rs/tokio/blob/master/tokio/src/io/ready.rs.

I would prefer not growing the struct size and avoid branches in the is_* checks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a particular reason to choose for usize? Maybe because the interest value doesn't live long and is likely in a register, usize is no worse than u8 (which would currently fit, with 2 bits to spare)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usize doesn't matter really, it is just what mio::Interest uses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mio uses a NonZeroU8 https://github.com/tokio-rs/mio/blob/master/src/interest.rs#L17

#[derive(Copy, PartialEq, Eq, Clone, PartialOrd, Ord)]
pub struct Interest(NonZeroU8);

@folkertdev folkertdev force-pushed the ready-error branch 2 times, most recently from b1f4652 to e4e4a03 Compare July 10, 2023 21:16
@folkertdev
Copy link
Contributor Author

allright, I made the changes and CI is happy again.

For AsyncFd a note about behavior being platform-specific (which I've added) seems fine to me. You're likely looking at those docs because you are doing some lowlevel platform-specific stuff.

If eventually the tokio UdpSocket and TcpStream also get custom interest registeration support, then the situation might get a bit more complicated.

@folkertdev
Copy link
Contributor Author

e.g. it seems to me that error interest should mean that if one receives error readiness, one can just drop the socket w/o performing further operations on it because the socket is in an error state. This doesn't hold for the case you want to use it for.

That's right.

  • On linux/epoll the error mechanism has been hijacked to provide a sidechannel. The socket can continue to function while error information can be received and processed concurrently. Then the sidechannel is also used for non-error information like the timestamps.
  • I have very little experience with kqueue, but it looks like EV_ERROR does mean a fatal error. (no memory, invalid file descriptor, etc) https://man.freebsd.org/cgi/man.cgi?query=kqueue&sektion=2#end
  • On windows, this interest has no effect whatsoever.

Given that, I'd be totally ok with a #cfg(any(target_os = "linux", target_os = "android")] on the error interest in tokio. Linux/epoll is the only tested platform right now, and it is where awaiting error readiness makes most sense. It also means that the documentation can be more targeted.

Lifting the conditional compilation restrictions can always happen at a later time, e.g. if a real usecase for error readiness with kqueue comes up.

does that sound like a good way forward @carllerche ?

@folkertdev
Copy link
Contributor Author

so, how do we make progress here?

tokio/src/io/interest.rs Outdated Show resolved Hide resolved
tokio/src/io/interest.rs Outdated Show resolved Hide resolved
Comment on lines 768 to 774
// Set the destination address. This address is invalid in this context. the OS will notice
// that nobody is listening on port 1234. Normally this is ignored (UDP is "fire and forget"),
// but because IP_RECVERR is enabled, the error will actually be reported to the sending socket
let mut dest_addr =
unsafe { std::mem::MaybeUninit::<libc::sockaddr_in>::zeroed().assume_init() };
dest_addr.sin_family = libc::AF_INET as _;
dest_addr.sin_port = 1234u16.to_be(); // Destination port
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this test will fail if some other test decides to use port 1234 for something?

Perhaps we should use a port number less than 1024 to avoid the case where tests using port 0 to pick a port will pick the port used by this test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, testing this is kind of tricky. I've changed the port to 512 and added an explanation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Comment on lines 708 to 716
let fd = AsyncFd::new(socket).unwrap();

let buf = b"hello there";
fd.get_ref().send(buf).unwrap();

// the send timestamp should now be in the error queue
let guard = fd.ready(Interest::ERROR).await.unwrap();
assert_eq!(guard.ready(), Ready::ERROR);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we assert that the error interest is not set before we send the buffer? Currently all of our tests would pass if ERROR is always set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I added one of those select! with a timer and panic branch that is used in a bunch of the other tests.

Copy link
Member

@carllerche carllerche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks for sticking w/ it!

@carllerche carllerche merged commit 10e141d into tokio-rs:master Aug 16, 2023
71 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-io Module: tokio/io
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wait for messages in the error queue
3 participants