Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

specialize io::copy to use copy_file_range, splice or sendfile #75272

Merged
merged 13 commits into from
Nov 14, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion library/std/src/fs.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1656,7 +1656,7 @@ pub fn rename<P: AsRef<Path>, Q: AsRef<Path>>(from: P, to: Q) -> io::Result<()>
/// the length of the `to` file as reported by `metadata`.
///
/// If you’re wanting to copy the contents of one file to another and you’re
/// working with [`File`]s, see the [`io::copy`] function.
/// working with [`File`]s, see the [`io::copy()`] function.
///
/// # Platform-specific behavior
///
Expand Down
88 changes: 88 additions & 0 deletions library/std/src/io/copy.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,88 @@
use crate::io::{self, ErrorKind, Read, Write};
use crate::mem::MaybeUninit;

/// Copies the entire contents of a reader into a writer.
///
/// This function will continuously read data from `reader` and then
/// write it into `writer` in a streaming fashion until `reader`
/// returns EOF.
///
/// On success, the total number of bytes that were copied from
/// `reader` to `writer` is returned.
///
/// If you’re wanting to copy the contents of one file to another and you’re
/// working with filesystem paths, see the [`fs::copy`] function.
///
/// [`fs::copy`]: crate::fs::copy
///
/// # Errors
///
/// This function will return an error immediately if any call to [`read`] or
/// [`write`] returns an error. All instances of [`ErrorKind::Interrupted`] are
/// handled by this function and the underlying operation is retried.
///
/// [`read`]: Read::read
/// [`write`]: Write::write
///
/// # Examples
///
/// ```
/// use std::io;
///
/// fn main() -> io::Result<()> {
/// let mut reader: &[u8] = b"hello";
/// let mut writer: Vec<u8> = vec![];
///
/// io::copy(&mut reader, &mut writer)?;
///
/// assert_eq!(&b"hello"[..], &writer[..]);
/// Ok(())
/// }
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
pub fn copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W) -> io::Result<u64>
where
R: Read,
W: Write,
{
cfg_if::cfg_if! {
if #[cfg(any(target_os = "linux", target_os = "android"))] {
crate::sys::kernel_copy::copy_spec(reader, writer)
} else {
generic_copy(reader, writer)
}
}
}

/// The general read-write-loop implementation of
/// `io::copy` that is used when specializations are not available or not applicable.
pub(crate) fn generic_copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W) -> io::Result<u64>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a more descriptive name for this might be copy_fallback since we only use it when specialized implementations aren't available.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if fallback is the right word. From the perspective of the specializations it indeed is the fallback. But the specializations only apply in some narrow scenarios. In the more general case of arbitrary Read/Write pairs this is the default strategy.

where
R: Read,
W: Write,
{
let mut buf = MaybeUninit::<[u8; super::DEFAULT_BUF_SIZE]>::uninit();
// FIXME: #42788
//
// - This creates a (mut) reference to a slice of
// _uninitialized_ integers, which is **undefined behavior**
//
// - Only the standard library gets to soundly "ignore" this,
// based on its privileged knowledge of unstable rustc
// internals;
unsafe {
reader.initializer().initialize(buf.assume_init_mut());
}

let mut written = 0;
loop {
let len = match reader.read(unsafe { buf.assume_init_mut() }) {
Ok(0) => return Ok(written),
Ok(len) => len,
Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
Err(e) => return Err(e),
};
writer.write_all(unsafe { &buf.assume_init_ref()[..len] })?;
written += len as u64;
}
}
5 changes: 4 additions & 1 deletion library/std/src/io/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -266,6 +266,8 @@ pub use self::buffered::IntoInnerError;
#[stable(feature = "rust1", since = "1.0.0")]
pub use self::buffered::{BufReader, BufWriter, LineWriter};
#[stable(feature = "rust1", since = "1.0.0")]
pub use self::copy::copy;
#[stable(feature = "rust1", since = "1.0.0")]
pub use self::cursor::Cursor;
#[stable(feature = "rust1", since = "1.0.0")]
pub use self::error::{Error, ErrorKind, Result};
Expand All @@ -279,9 +281,10 @@ pub use self::stdio::{_eprint, _print};
#[doc(no_inline, hidden)]
pub use self::stdio::{set_panic, set_print};
#[stable(feature = "rust1", since = "1.0.0")]
pub use self::util::{copy, empty, repeat, sink, Empty, Repeat, Sink};
pub use self::util::{empty, repeat, sink, Empty, Repeat, Sink};

mod buffered;
pub(crate) mod copy;
mod cursor;
mod error;
mod impls;
Expand Down
8 changes: 8 additions & 0 deletions library/std/src/io/stdio.rs
Original file line number Diff line number Diff line change
Expand Up @@ -409,6 +409,14 @@ impl Read for Stdin {
}
}

// only used by platform-dependent io::copy specializations, i.e. unused on some platforms
#[cfg(any(target_os = "linux", target_os = "android"))]
impl StdinLock<'_> {
pub(crate) fn as_mut_buf(&mut self) -> &mut BufReader<impl Read> {
&mut self.inner
}
}

#[stable(feature = "rust1", since = "1.0.0")]
impl Read for StdinLock<'_> {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
Expand Down
2 changes: 1 addition & 1 deletion library/std/src/io/tests.rs
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
use super::{repeat, Cursor, SeekFrom};
use crate::cmp::{self, min};
use crate::io::prelude::*;
use crate::io::{self, IoSlice, IoSliceMut};
use crate::io::{BufRead, Read, Seek, Write};
use crate::ops::Deref;

#[test]
Expand Down
73 changes: 1 addition & 72 deletions library/std/src/io/util.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,78 +4,7 @@
mod tests;

use crate::fmt;
use crate::io::{self, BufRead, ErrorKind, Initializer, IoSlice, IoSliceMut, Read, Write};
use crate::mem::MaybeUninit;

/// Copies the entire contents of a reader into a writer.
///
/// This function will continuously read data from `reader` and then
/// write it into `writer` in a streaming fashion until `reader`
/// returns EOF.
///
/// On success, the total number of bytes that were copied from
/// `reader` to `writer` is returned.
///
/// If you’re wanting to copy the contents of one file to another and you’re
/// working with filesystem paths, see the [`fs::copy`] function.
///
/// [`fs::copy`]: crate::fs::copy
///
/// # Errors
///
/// This function will return an error immediately if any call to [`read`] or
/// [`write`] returns an error. All instances of [`ErrorKind::Interrupted`] are
/// handled by this function and the underlying operation is retried.
///
/// [`read`]: Read::read
/// [`write`]: Write::write
///
/// # Examples
///
/// ```
/// use std::io;
///
/// fn main() -> io::Result<()> {
/// let mut reader: &[u8] = b"hello";
/// let mut writer: Vec<u8> = vec![];
///
/// io::copy(&mut reader, &mut writer)?;
///
/// assert_eq!(&b"hello"[..], &writer[..]);
/// Ok(())
/// }
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
pub fn copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W) -> io::Result<u64>
where
R: Read,
W: Write,
{
let mut buf = MaybeUninit::<[u8; super::DEFAULT_BUF_SIZE]>::uninit();
// FIXME: #42788
//
// - This creates a (mut) reference to a slice of
// _uninitialized_ integers, which is **undefined behavior**
//
// - Only the standard library gets to soundly "ignore" this,
// based on its privileged knowledge of unstable rustc
// internals;
unsafe {
reader.initializer().initialize(buf.assume_init_mut());
}

let mut written = 0;
loop {
let len = match reader.read(unsafe { buf.assume_init_mut() }) {
Ok(0) => return Ok(written),
Ok(len) => len,
Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
Err(e) => return Err(e),
};
writer.write_all(unsafe { &buf.assume_init_ref()[..len] })?;
written += len as u64;
}
}
use crate::io::{self, BufRead, Initializer, IoSlice, IoSliceMut, Read, Write};

/// A reader which is always at EOF.
///
Expand Down
1 change: 1 addition & 0 deletions library/std/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -316,6 +316,7 @@
#![feature(toowned_clone_into)]
#![feature(total_cmp)]
#![feature(trace_macros)]
#![feature(try_blocks)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little surprised this wasn't here already... As far as I know the feature is somewhat blocked, but it looks like we do use it elsewhere in the repo so I'm ok with adding it here too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only used it for cleanup in tests. But that was before I learned that other IO tests don't clean up the files they create either. So removing it and adding to the clutter instead would also be an option.

#![feature(try_reserve)]
#![feature(unboxed_closures)]
#![feature(unsafe_block_in_unsafe_fn)]
Expand Down
85 changes: 8 additions & 77 deletions library/std/src/sys/unix/fs.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1191,88 +1191,19 @@ pub fn copy(from: &Path, to: &Path) -> io::Result<u64> {

#[cfg(any(target_os = "linux", target_os = "android"))]
pub fn copy(from: &Path, to: &Path) -> io::Result<u64> {
use crate::cmp;
use crate::sync::atomic::{AtomicBool, Ordering};

// Kernel prior to 4.5 don't have copy_file_range
// We store the availability in a global to avoid unnecessary syscalls
static HAS_COPY_FILE_RANGE: AtomicBool = AtomicBool::new(true);

unsafe fn copy_file_range(
fd_in: libc::c_int,
off_in: *mut libc::loff_t,
fd_out: libc::c_int,
off_out: *mut libc::loff_t,
len: libc::size_t,
flags: libc::c_uint,
) -> libc::c_long {
libc::syscall(libc::SYS_copy_file_range, fd_in, off_in, fd_out, off_out, len, flags)
}

let (mut reader, reader_metadata) = open_from(from)?;
let max_len = u64::MAX;
let (mut writer, _) = open_to_and_set_permissions(to, reader_metadata)?;

let has_copy_file_range = HAS_COPY_FILE_RANGE.load(Ordering::Relaxed);
let mut written = 0u64;
while written < max_len {
let copy_result = if has_copy_file_range {
let bytes_to_copy = cmp::min(max_len - written, usize::MAX as u64) as usize;
let copy_result = unsafe {
// We actually don't have to adjust the offsets,
// because copy_file_range adjusts the file offset automatically
cvt(copy_file_range(
reader.as_raw_fd(),
ptr::null_mut(),
writer.as_raw_fd(),
ptr::null_mut(),
bytes_to_copy,
0,
))
};
if let Err(ref copy_err) = copy_result {
match copy_err.raw_os_error() {
Some(libc::ENOSYS | libc::EPERM | libc::EOPNOTSUPP) => {
HAS_COPY_FILE_RANGE.store(false, Ordering::Relaxed);
}
_ => {}
}
}
copy_result
} else {
Err(io::Error::from_raw_os_error(libc::ENOSYS))
};
match copy_result {
Ok(0) if written == 0 => {
// fallback to work around several kernel bugs where copy_file_range will fail to
// copy any bytes and return 0 instead of an error if
// - reading virtual files from the proc filesystem which appear to have 0 size
// but are not empty. noted in coreutils to affect kernels at least up to 5.6.19.
// - copying from an overlay filesystem in docker. reported to occur on fedora 32.
return io::copy(&mut reader, &mut writer);
}
Ok(0) => return Ok(written), // reached EOF
Ok(ret) => written += ret as u64,
Err(err) => {
match err.raw_os_error() {
Some(
libc::ENOSYS | libc::EXDEV | libc::EINVAL | libc::EPERM | libc::EOPNOTSUPP,
) => {
// Try fallback io::copy if either:
// - Kernel version is < 4.5 (ENOSYS)
// - Files are mounted on different fs (EXDEV)
// - copy_file_range is broken in various ways on RHEL/CentOS 7 (EOPNOTSUPP)
// - copy_file_range is disallowed, for example by seccomp (EPERM)
// - copy_file_range cannot be used with pipes or device nodes (EINVAL)
assert_eq!(written, 0);
return io::copy(&mut reader, &mut writer);
}
_ => return Err(err),
}
}
}
use super::kernel_copy::{copy_regular_files, CopyResult};

match copy_regular_files(reader.as_raw_fd(), writer.as_raw_fd(), max_len) {
CopyResult::Ended(result) => result,
CopyResult::Fallback(written) => match io::copy::generic_copy(&mut reader, &mut writer) {
Ok(bytes) => Ok(bytes + written),
Err(e) => Err(e),
},
}
Ok(written)
}

#[cfg(any(target_os = "macos", target_os = "ios"))]
Expand Down
Loading