Rollup merge of rust-lang#60971 - rbtcollins:docs-perf, r=rbtcollins,…
…GuillaumeGomez

Add DocFS layer to rustdoc

* Move fs::create_dir_all calls into DocFS to provide a clean
  extension point if an async extension is needed there.
* Convert callsites of create_dir_all to ensure_dir to reduce syscalls.
* Convert fs::write usage to DocFS.write
  (which also removes a lot of try_err! usage for easier reading)
* Convert File::create calls to use Vec buffers and then DocFS.write
  in order to both consistently reduce syscalls as well as make
  deferring to threads cleaner.
* Convert OpenOptions usage similarly. I could find no discussion on
  the use of create_new for that one output file vs all the other
  files render creates; if link redirection attacks are a concern,
  DocFS will provide a good central point to introduce systematic
  create_new usage.
* DocFS::write defers to rayon for IO on Windows, producing a modest
  speedup. Before this patch, on my development workstation:

$ time cargo +mystg1 doc -p winapi:0.3.7
 Documenting winapi v0.3.7
    Finished dev [unoptimized + debuginfo] target(s) in 6m 11s

real    6m11.734s
user    0m0.015s
sys     0m0.000s

Afterwards:
$ time cargo +mystg1 doc -p winapi:0.3.7
   Compiling winapi v0.3.7
 Documenting winapi v0.3.7
    Finished dev [unoptimized + debuginfo] target(s) in 49.53s

real    0m49.643s
user    0m0.000s
sys     0m0.015s

I haven't measured how much of the time is spent in the compilation logic
vs in IO and output, but this takes it from frustrating to tolerable
for me, at least for now.
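
As an illustration of the ensure_dir and Vec-buffer points in the list above, here is a minimal sketch using only std. It is not code from this patch: `DirCache`, `ensure_dir` as written here, and `write_page` are hypothetical names, and the caching set is an assumption about how repeated directory creation could be avoided.

```rust
use std::collections::HashSet;
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of the patch): remembers which directories
/// have already been created so repeated requests cost no syscalls.
pub struct DirCache {
    created: HashSet<PathBuf>,
}

impl DirCache {
    pub fn new() -> DirCache {
        DirCache { created: HashSet::new() }
    }

    /// Creates `path` (and its parents) the first time it is requested.
    /// Returns `Ok(true)` if the filesystem was touched and `Ok(false)` if
    /// the path was already recorded in the cache.
    pub fn ensure_dir(&mut self, path: &Path) -> io::Result<bool> {
        if self.created.contains(path) {
            return Ok(false);
        }
        fs::create_dir_all(path)?;
        self.created.insert(path.to_path_buf());
        Ok(true)
    }
}

/// Renders into an in-memory buffer first, then hands the finished bytes to
/// a single `fs::write` call instead of issuing many small writes.
pub fn write_page(
    cache: &mut DirCache,
    dir: &Path,
    name: &str,
    lines: &[&str],
) -> io::Result<()> {
    cache.ensure_dir(dir)?;
    let mut buf = Vec::new();
    for line in lines {
        writeln!(buf, "{}", line)?;
    }
    fs::write(dir.join(name), &buf)
}
```

Because the whole page exists as one byte buffer before any IO happens, it can also be moved to another thread as a unit, which is what makes deferring the write straightforward.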
Centril authored Jun 21, 2019
2 parents 929b48e + 65f1295 commit c199f4f
Showing 5 changed files with 286 additions and 137 deletions.
1 change: 1 addition & 0 deletions Cargo.lock
@@ -3254,6 +3254,7 @@ dependencies = [
"minifier 0.0.30 (registry+https://github.com/rust-lang/crates.io-index)",
"parking_lot 0.7.1 (registry+https://github.com/rust-lang/crates.io-index)",
"pulldown-cmark 0.5.2 (registry+https://github.com/rust-lang/crates.io-index)",
"rustc-rayon 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
"tempfile 3.0.5 (registry+https://github.com/rust-lang/crates.io-index)",
]

1 change: 1 addition & 0 deletions src/librustdoc/Cargo.toml
@@ -11,5 +11,6 @@ path = "lib.rs"
[dependencies]
pulldown-cmark = { version = "0.5.2", default-features = false }
minifier = "0.0.30"
rayon = { version = "0.2.0", package = "rustc-rayon" }
tempfile = "3"
parking_lot = "0.7"
116 changes: 116 additions & 0 deletions src/librustdoc/docfs.rs
@@ -0,0 +1,116 @@
//! Rustdoc's FileSystem abstraction module.
//!
//! On Windows this indirects IO into threads to work around performance issues
//! with Defender (and other similar virus scanners that do blocking operations).
//! On other platforms this is a thin shim to fs.
//!
//! Only calls needed to permit this workaround have been abstracted: thus
//! fs::read is still done directly via the fs module; if in future rustdoc
//! needs to read-after-write from a file, then it would be added to this
//! abstraction.

use errors;

use std::fs;
use std::io;
use std::path::Path;
use std::sync::Arc;
use std::sync::mpsc::{channel, Receiver, Sender};

macro_rules! try_err {
    ($e:expr, $file:expr) => {{
        match $e {
            Ok(e) => e,
            Err(e) => return Err(E::new(e, $file)),
        }
    }};
}

pub trait PathError {
    fn new<P: AsRef<Path>>(e: io::Error, path: P) -> Self;
}

pub struct ErrorStorage {
    sender: Option<Sender<Option<String>>>,
    receiver: Receiver<Option<String>>,
}

impl ErrorStorage {
    pub fn new() -> ErrorStorage {
        let (sender, receiver) = channel();
        ErrorStorage {
            sender: Some(sender),
            receiver,
        }
    }

    /// Prints all stored errors. Returns the number of printed errors.
    pub fn write_errors(&mut self, diag: &errors::Handler) -> usize {
        let mut printed = 0;
        // In order to drop the sender part of the channel.
        self.sender = None;

        for msg in self.receiver.iter() {
            if let Some(ref error) = msg {
                diag.struct_err(&error).emit();
                printed += 1;
            }
        }
        printed
    }
}

pub struct DocFS {
    sync_only: bool,
    errors: Arc<ErrorStorage>,
}

impl DocFS {
    pub fn new(errors: &Arc<ErrorStorage>) -> DocFS {
        DocFS {
            sync_only: false,
            errors: Arc::clone(errors),
        }
    }

    pub fn set_sync_only(&mut self, sync_only: bool) {
        self.sync_only = sync_only;
    }

    pub fn create_dir_all<P: AsRef<Path>>(&self, path: P) -> io::Result<()> {
        // For now, dir creation isn't a huge time consideration, do it
        // synchronously, which avoids needing ordering between write() actions
        // and directory creation.
        fs::create_dir_all(path)
    }

    pub fn write<P, C, E>(&self, path: P, contents: C) -> Result<(), E>
    where
        P: AsRef<Path>,
        C: AsRef<[u8]>,
        E: PathError,
    {
        if !self.sync_only && cfg!(windows) {
            // A possible future enhancement after more detailed profiling would
            // be to create the file sync so errors are reported eagerly.
            let contents = contents.as_ref().to_vec();
            let path = path.as_ref().to_path_buf();
            let sender = self.errors.sender.clone().unwrap();
            rayon::spawn(move || {
                match fs::write(&path, &contents) {
                    Ok(_) => {
                        // `None` signals that this write completed without error.
                        sender.send(None)
                            .expect(&format!("failed to send non-error on \"{}\"", path.display()));
                    }
                    Err(e) => {
                        sender.send(Some(format!("\"{}\": {}", path.display(), e)))
                            .expect(&format!("failed to send error on \"{}\"", path.display()));
                    }
                }
            });
            Ok(())
        } else {
            Ok(try_err!(fs::write(&path, contents), path))
        }
    }
}
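
The deferred-write path above reports each outcome over a channel and collects errors later. The following is a minimal sketch of that pattern, not the patch's code: it uses `std::thread::spawn` in place of `rayon::spawn` so that it needs only std, and `deferred_write` and `write_all` are hypothetical names introduced for illustration.

```rust
use std::fs;
use std::path::PathBuf;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Queues one write on a background thread and reports the outcome on
/// `sender`: `None` on success, `Some(message)` on failure.
fn deferred_write(sender: Sender<Option<String>>, path: PathBuf, contents: Vec<u8>) {
    thread::spawn(move || {
        let outcome = match fs::write(&path, &contents) {
            Ok(()) => None,
            Err(e) => Some(format!("\"{}\": {}", path.display(), e)),
        };
        // The receiver may already be gone; this sketch ignores that case.
        let _ = sender.send(outcome);
    });
}

/// Issues all writes, then drains the channel and returns the error messages.
pub fn write_all(jobs: Vec<(PathBuf, Vec<u8>)>) -> Vec<String> {
    let (sender, receiver) = channel();
    for (path, contents) in jobs {
        deferred_write(sender.clone(), path, contents);
    }
    // Drop the original sender so that `receiver.iter()` ends once every
    // worker has finished and dropped its own clone.
    drop(sender);
    receiver.iter().flatten().collect()
}
```

Dropping the last sender before draining plays the same role as `self.sender = None` in `write_errors`: without it, iterating over the receiver would block forever.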