io_uring for linux 5.1+ #923
@rvolgers Thanks for pointing this out to us. Over the last couple of days I've been (slowly) reading about this, but from what I understand it is mostly aimed at disk I/O, with possible support for sockets in the future (although what I read could be outdated and sockets could already be supported). This is great, because the current solutions for disk I/O are far from perfect, but mio would still need a low-overhead cross-platform way to do this. So I don't think we can use this to offer disk I/O (yet). For sockets, as an epoll replacement, the only benefit I see is performance, as you mentioned, but I have yet to see any "real world" benchmarks for this. I've read the commit message and it shows a ~2ns improvement, which I don't know is worth it. Because some people might use features of epoll outside of what mio provides, switching would be a serious breaking change. So this is something to keep track of, but I think it will take a while for it to become useful for mio.
Here is a benchmark from libuv: libuv/libuv#1947 (comment)
The PDF claims support for socket IO (page 8). Timeouts are supported via timerfd according to a tweet by @deweerdt. Linux 5.2 also adds eventfd and fsync support. So this should have all the building blocks needed for a complete event loop.
I think this will be a bit hard to force into mio's readiness-based model, and into Rust's ownership model in general. The buffers passed into the kernel must remain valid until the IO completes, which doesn't give them any statically knowable lifetime. My best thought at the moment is that this would require an extra copy in the library to get around the lifetime issues: the buffers passed to the kernel are owned by mio, and on completion mio marks those handles as ready. Later, mio copies into the requestor's buffers when requested. I think this is what miow does for IOCP on Windows, but I'm not 100% sure. At that point, we'd be trading off syscall overhead against the extra copy. Benchmarking this would be super interesting though!
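A minimal stdlib-only sketch of the "library owns the buffers, caller copies on readiness" model described above. All the types here (`Ring`, `submit_read`, etc.) are hypothetical, not mio's real API, and the kernel side is simulated:

```rust
use std::collections::HashMap;

/// Hypothetical completion ring: buffers are owned here, never by the caller,
/// so their lifetimes are decoupled from any in-flight kernel operation.
struct Ring {
    /// token -> buffer owned by the ring, filled in by "the kernel"
    buffers: HashMap<u64, Vec<u8>>,
    /// tokens whose IO has completed
    ready: Vec<u64>,
}

impl Ring {
    fn new() -> Self {
        Ring { buffers: HashMap::new(), ready: Vec::new() }
    }

    /// Submit a read: the ring allocates and owns the buffer.
    fn submit_read(&mut self, token: u64, len: usize) {
        self.buffers.insert(token, vec![0u8; len]);
    }

    /// Stand-in for a completion arriving from the kernel.
    fn complete(&mut self, token: u64, data: &[u8]) {
        if let Some(buf) = self.buffers.get_mut(&token) {
            buf[..data.len()].copy_from_slice(data);
            buf.truncate(data.len());
            self.ready.push(token);
        }
    }

    /// Readiness-style API: which tokens can now be read without blocking?
    fn poll(&mut self) -> Vec<u64> {
        std::mem::take(&mut self.ready)
    }

    /// The extra copy being discussed: move completed data into the
    /// caller's own buffer, which can have any lifetime it likes.
    fn read(&mut self, token: u64, out: &mut [u8]) -> usize {
        match self.buffers.remove(&token) {
            Some(buf) => {
                let n = out.len().min(buf.len());
                out[..n].copy_from_slice(&buf[..n]);
                n
            }
            None => 0,
        }
    }
}

fn main() {
    let mut ring = Ring::new();
    ring.submit_read(1, 16);
    ring.complete(1, b"hello");
    assert_eq!(ring.poll(), vec![1]);
    let mut out = [0u8; 16];
    let n = ring.read(1, &mut out);
    assert_eq!(&out[..n], b"hello");
}
```

The `read` call is where the copy cost lands; the trade-off against saved syscalls is exactly what would need benchmarking.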
It also allows pre-registering buffers so the kernel can skip the step of mapping them into kernel space (or maybe this is a future direction they wanted to move in, I forget), in which case it's even clearer that the ring owns the buffers. It also makes sense in a lot of other IO models that involve memory mapping or even DMA (although we are getting really far from mio now). One downside of having the ring own the buffers is that you still have to copy the data if you cannot do anything with it immediately, or you are blocking other tasks / threads from reusing that buffer.
Oh, I think I remember reading that. That does make the ownership clearer. But even if you don't pre-register buffers, I think the ring has to own the buffers, as it's the only thing that can know when they are safe to drop. Does that seem right?
This is where I'm a bit hung up: I think you have to copy no matter what, at least in mio's model. You'd get a set of events back from … To avoid that copy, the ring could hand out …
I implemented a simple POC. It uses …
FYI: linux-io-uring
Is io_uring for TCP, or mostly just files?
From what I've read about it, it was mostly designed for files, but it can be used for TCP (or UDP, or probably any other file descriptor). There are likely to be some performance gains under high load, as it allows doing multiple recvs/sends per single syscall (or in some extreme cases with no syscalls at all). But without any experiments/measurements, my guess is the gains won't be as big as for files.
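The "multiple operations per syscall" point can be pictured with a toy stdlib-only model of io_uring's two rings (the names `ToyUring`, `Sqe`, `Cqe` are simplifications for illustration, not the real kernel structures): many operations are queued on the submission side, then handed over in one batch, and completions are drained with no further kernel transitions.

```rust
use std::collections::VecDeque;

/// Simplified submission queue entry.
struct Sqe { user_data: u64 }
/// Simplified completion queue entry.
struct Cqe { user_data: u64 }

/// Toy model of the submission ring (SQ) and completion ring (CQ).
struct ToyUring {
    sq: VecDeque<Sqe>,
    cq: VecDeque<Cqe>,
    /// Count of simulated kernel transitions.
    syscalls: u32,
}

impl ToyUring {
    fn new() -> Self {
        ToyUring { sq: VecDeque::new(), cq: VecDeque::new(), syscalls: 0 }
    }

    /// Queue an operation; no syscall happens here, only a ring write.
    fn push(&mut self, user_data: u64) {
        self.sq.push_back(Sqe { user_data });
    }

    /// One "enter the kernel" call submits everything queued so far,
    /// so N operations cost one syscall instead of N.
    fn submit(&mut self) -> usize {
        self.syscalls += 1;
        let n = self.sq.len();
        while let Some(sqe) = self.sq.pop_front() {
            // Pretend the kernel finished each op instantly.
            self.cq.push_back(Cqe { user_data: sqe.user_data });
        }
        n
    }

    /// Reaping a completion reads shared memory; no syscall at all.
    fn next_completion(&mut self) -> Option<Cqe> {
        self.cq.pop_front()
    }
}

fn main() {
    let mut ring = ToyUring::new();
    ring.push(1);
    ring.push(2);
    ring.push(3);
    assert_eq!(ring.submit(), 3);
    assert_eq!(ring.syscalls, 1); // three ops, one "syscall"
    while let Some(cqe) = ring.next_completion() {
        println!("completed op {}", cqe.user_data);
    }
}
```

With epoll, by contrast, each recv/send is its own syscall after readiness is reported, which is where the per-operation overhead difference comes from.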
For files there was basically only the … It's rapidly moving towards a general-purpose async IO API though. Basic socket IO in particular has always worked fine; it's just that some of the more esoteric things you can do with sockets aren't available yet.

It does seem to offer a nice performance boost for socket operations, especially if you use the functionality to pre-register buffers and file descriptors so the kernel doesn't have to grab a reference to them on every call. So that could be a reason to use it for socket operations, despite it only being available on bleeding-edge Linux so far.

There are some benchmark numbers in the Linux commits, and the main Linux dev working on it also posts updates / benchmarks etc. on his twitter: https://twitter.com/axboe

(Disclaimer: I've been keeping up with the kernel patches for …)
@rvolgers Did runtimes like Go and Node (I think using libuv) open up a second thread to read from files in the past? Whereas with this it's now a lot easier to read files using async IO instead of a background thread? I think that for disk IO, using something like epoll would not be that useful on local files, right?
Kernels 5.5+ support TCP.
I suppose you are aware of it; anyway, this seems interesting: https://github.com/spacejam/rio
Here's a benchmark for a TCP echo server: https://twitter.com/hielkedv/status/1218891982636027905.
io_uring is slowly losing its file IO focus. It's pretty much evolving into "call any system call, but asynchronously".
New set of benchmarks: https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.6-IO-uring-Tests. Last one from me, because I don't want to spam this issue.
io_uring will not be supported in Mio v1.0, which will keep the current kqueue/epoll model. But I aim to use it in Mio v2.0.
Lending the ring's IO buffers to the user only blocks the ring if the ring plans on reusing the buffers immediately. If you want a permanent copy of the data, there's no need for copying: just don't reuse that particular IO buffer, and the user can keep it for as long as they want. If that requires allocating a new IO buffer just for that user, fine; I don't see a problem with that. For dataflows that reuse IO buffers, it seems to me like a different API should be exposed to the user anyway, reflecting this idea.
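A small stdlib-only sketch of that lending idea (the `LendingRing` type and its methods are hypothetical, and the kernel side is simulated): on completion, ownership of the filled buffer moves to the caller, and the ring simply takes a different buffer from its pool, allocating a fresh one if none is free, so a caller holding onto data never blocks the ring.

```rust
/// Hypothetical zero-copy hand-off: completed buffers are given away
/// by value instead of being copied out.
struct LendingRing {
    /// Spare IO buffers available for the next operation.
    pool: Vec<Vec<u8>>,
    buf_len: usize,
}

impl LendingRing {
    fn new(buf_len: usize, count: usize) -> Self {
        LendingRing {
            pool: (0..count).map(|_| vec![0u8; buf_len]).collect(),
            buf_len,
        }
    }

    /// Take a buffer for an IO op and simulate the kernel filling it.
    /// Ownership of the filled buffer moves to the caller: no copy.
    fn recv(&mut self, data: &[u8]) -> Vec<u8> {
        // Reuse a pooled buffer if one is free, else allocate a new one,
        // so lent-out buffers never stall other operations.
        let mut buf = self.pool.pop().unwrap_or_else(|| vec![0u8; self.buf_len]);
        buf.clear();
        buf.extend_from_slice(data);
        buf
    }

    /// A caller that is done with a buffer can return it to the pool.
    fn give_back(&mut self, mut buf: Vec<u8>) {
        buf.clear();
        buf.resize(self.buf_len, 0);
        self.pool.push(buf);
    }
}

fn main() {
    let mut ring = LendingRing::new(8, 1);
    let first = ring.recv(b"one");          // takes the pooled buffer
    let second = ring.recv(b"two");         // pool empty: fresh allocation
    assert_eq!(&first[..], b"one");
    assert_eq!(&second[..], b"two");
    ring.give_back(first);                  // replenishes the pool
    drop(second);                           // or the caller keeps it forever
}
```

This matches the "different API for dataflows that reuse buffers" point: a copy-out API and a buffer-handoff API make different promises about who owns memory and for how long.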
I don't think Mio (v1) is going to support io_uring; the design for it is just too different. For people using Tokio, please take a look at https://github.com/tokio-rs/tokio-uring. For people not using Tokio, I'm working on an io_uring library, but progress is slow and it won't be part of Mio.
Any chance to test your io_uring lib?
@serzhiio I've just made it public at https://github.com/Thomasdezeeuw/a10; you'll need a fairly recent kernel, I'm using 6.1 myself.
Did you find it faster or more efficient?
I haven't had time to do proper performance testing, but with a high amount of I/O I think it will be faster than epoll.
The new io_uring API for generic asynchronous IO was merged for (currently unreleased) linux 5.1.
An overview of the API can be found here: http://kernel.dk/io_uring.pdf
While the overall API is designed with completion-based async IO in mind, it also has IORING_OP_POLL_ADD, which I think allows you to use it as an "epoll, but more efficient" API?

As I said, Linux 5.1 isn't even out yet, but it might be interesting to start thinking about how/if mio can use this.
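To make the "epoll, but more efficient" idea concrete, here is a hedged sketch of how poll-op completions could be translated into readiness events. The `Event` struct and `completions_to_events` function are hypothetical (not mio's API); the flag values mirror POLLIN/POLLOUT from poll(2), and each completion is assumed to carry the registered user token plus the ready event mask:

```rust
/// Hypothetical readiness event, roughly what an epoll-style
/// backend would hand to the user.
#[derive(Debug, PartialEq)]
struct Event {
    token: u64,
    readable: bool,
    writable: bool,
}

// Event mask bits, matching POLLIN / POLLOUT from poll(2) on Linux.
const POLLIN: u32 = 0x001;
const POLLOUT: u32 = 0x004;

/// Translate a batch of (user_data, ready mask) poll completions
/// into readiness events, the way an IORING_OP_POLL_ADD-based
/// backend might surface them.
fn completions_to_events(cqes: &[(u64, u32)]) -> Vec<Event> {
    cqes.iter()
        .map(|&(token, mask)| Event {
            token,
            readable: mask & POLLIN != 0,
            writable: mask & POLLOUT != 0,
        })
        .collect()
}

fn main() {
    let events = completions_to_events(&[(1, POLLIN), (2, POLLIN | POLLOUT)]);
    assert_eq!(
        events,
        vec![
            Event { token: 1, readable: true, writable: false },
            Event { token: 2, readable: true, writable: true },
        ]
    );
}
```

Since no buffers are passed to the kernel for a pure poll operation, this usage sidesteps the buffer-lifetime problems discussed later in the thread; only the completion-vs-readiness plumbing differs from epoll.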