Optimizing docs generation #420

Closed
jcotton42 opened this issue Jan 21, 2021 · 32 comments
Labels
enhancement New feature or request

Comments

@jcotton42

jcotton42 commented Jan 21, 2021

I brought up the size of the generated rustdoc for windows-rs on the Rust Community Discord, and rustdoc team member @jyn514 mentioned that adding #![doc(html_no_source)] to the crate should suppress the HTML rendering of the crate source.

https://doc.rust-lang.org/stable/rustdoc/the-doc-attribute.html#html_no_source

@kennykerr
Collaborator

Thanks!

@kennykerr added the enhancement label Jan 21, 2021
@jyn514

jyn514 commented Jan 21, 2021

I can't build the docs on linux:

  = note: /usr/bin/ld: cannot find -loleaut32
          /usr/bin/ld: cannot find -lkernel32
          /usr/bin/ld: cannot find -loleaut32
          /usr/bin/ld: cannot find -lkernel32
          /usr/bin/ld: cannot find -lole32
          /usr/bin/ld: cannot find -lkernel32
          /usr/bin/ld: cannot find -lkernel32
          /usr/bin/ld: cannot find -lkernel32
          /usr/bin/ld: cannot find -loleaut32

I'd be interested in how long rustdoc takes to run after you add html_no_source. Could someone generate some timing info for me?

$ rustup install nightly
$ /usr/bin/time -v cargo +nightly rustdoc -- -Z time-passes

and then upload windows-PID.mm_profdata.
(time -v tells the total memory usage; if it's not on windows anything else that gives the max RSS works too)

@kennykerr
Collaborator

Hey @jyn514 thanks for the help! @rylev may have some tips for how to build on Linux - I have no idea.

Note that you need to build a separate repo. The docs are so huge that I didn't want to include them inside the windows-rs repo. Instead, you can clone the https://github.com/microsoft/windows-docs-rs repo and build the crates/bindings package. Something like this:

C:\git\windows-docs-rs\crates\bindings>cargo doc --no-deps

This takes about 30 minutes and 20GB of memory. I updated the lib.rs file to look like this:

#![doc(html_no_source)]
::windows::include_bindings!();

Note that time -v is not available on Windows. I ran the following:

cargo +nightly rustdoc -- -Z time-passes

It took about 52 minutes and its peak working set (memory) was over 21GB.

I could not find a windows-PID.mm_profdata anywhere.

@jyn514

jyn514 commented Jan 21, 2021

Hmm, it should be there somewhere - do you see any file ending in .mm_profdata? It may be in the workspace root, not the directory you ran cargo in.

If not, what version of rustdoc are you using? I think the format of the self-profile files changed a few months ago.
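
One way to check for the profile file is to search the whole workspace for anything ending in .mm_profdata. The following is a minimal sketch using only the standard library; the function name `find_profdata` and the choice to start from the current directory are assumptions for illustration, not part of rustdoc or measureme.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect files whose extension is `mm_profdata` under `root`.
/// (Hypothetical helper; symlinked directories are not followed.)
fn find_profdata(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let path = entry.path();
            if entry.file_type()?.is_dir() {
                pending.push(path);
            } else if path.extension().map_or(false, |ext| ext == "mm_profdata") {
                found.push(path);
            }
        }
    }
    found.sort();
    Ok(found)
}

fn main() -> io::Result<()> {
    // Assumption: the first argument is the workspace root; defaults to ".".
    let root = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());
    for path in find_profdata(Path::new(&root))? {
        println!("{}", path.display());
    }
    Ok(())
}
```

Run from the workspace root, this prints every matching path, including any that ended up outside the directory cargo was invoked in.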

@kennykerr
Collaborator

Not from what I can tell:

image

@kennykerr
Collaborator

C:\git\windows-docs-rs>rustdoc --version
rustdoc 1.49.0 (e1884a8e3 2020-12-29)

@jyn514

jyn514 commented Jan 21, 2021

Oh shoot, I'm sorry, it should have been -Z self-profile, not -Z time-passes. It's ok, time-passes gave some useful info too. In particular it shows most of the time is spent on get_blanket_impls, which I've been meaning to improve for a while. It doesn't show where the memory usage is coming from, though: it already uses 18 GB before looking at impls.

You posted the default version of rustdoc - can you post the nightly version instead? rustdoc +nightly --version.

@kennykerr
Collaborator

C:\git\windows-docs-rs>rustdoc +nightly --version
rustdoc 1.51.0-nightly (a4cbb44ae 2021-01-20)

OK, I'll rerun with self-profile.

@kennykerr
Collaborator

Hmm, it's rather large. 😉

image

@kennykerr
Collaborator

Not sure if it will work, but I've uploaded to OneDrive. Let me know if you can download it.

@jyn514

jyn514 commented Jan 21, 2021

That worked, thanks! Downloading it now, I'll let you know what I find :)

@jyn514

jyn514 commented Jan 21, 2021

By the way, you can analyze it yourself with measureme: https://github.com/rust-lang/measureme/

@jyn514

jyn514 commented Jan 21, 2021

A surprisingly long time is being spent not in rustdoc, but in the compiler itself:

image

In particular, resolve_crate and item_types_checking both take a lot longer than I would expect.

@kennykerr
Collaborator

The Rust compiler certainly has a lot of trouble with the resulting 300M windows.rs containing the definitions for the entire Windows API. While it is a lot of code, the Rust compiler has a long way to go before it can compete with the likes of the C++ compiler in terms of throughput.

@rylev
Contributor

rylev commented Jan 22, 2021

For those following along at home, the summarize summary can be seen here.

@kennykerr changed the title from "Surpressing source HTML in generated rustdoc" to "Optimizing docs generation" on Jan 22, 2021
@kennykerr
Collaborator

kennykerr commented Jan 22, 2021

I have applied the html_no_source update - thanks for the suggestion!

GitHub no longer refuses to accept some of the files but continues to warn about the huge search-index.js file:

remote: warning: File docs/doc/search-index.js is 57.14 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB

The size of this file also seems to hurt page load and search performance in the browser.
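
To catch files like this before pushing, the generated docs directory can be scanned for anything over the recommended size. This is a rough sketch, not an official GitHub tool: the helper name `oversized_files` is hypothetical, and treating GitHub's "50.00 MB" as 50 MiB is an assumption.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Assumption: GitHub's "50.00 MB" recommendation is interpreted as 50 MiB.
const RECOMMENDED_MAX_BYTES: u64 = 50 * 1024 * 1024;

/// Return (path, size in bytes) for every regular file under `root`
/// that is larger than `limit` bytes. (Hypothetical helper.)
fn oversized_files(root: &Path, limit: u64) -> io::Result<Vec<(PathBuf, u64)>> {
    let mut hits = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let file_type = entry.file_type()?;
            if file_type.is_dir() {
                pending.push(entry.path());
            } else if file_type.is_file() {
                let size = entry.metadata()?.len();
                if size > limit {
                    hits.push((entry.path(), size));
                }
            }
        }
    }
    hits.sort();
    Ok(hits)
}

fn main() -> io::Result<()> {
    // Assumption: the first argument is the docs output directory; defaults to ".".
    let root = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());
    for (path, size) in oversized_files(Path::new(&root), RECOMMENDED_MAX_BYTES)? {
        let mib = size as f64 / (1024.0 * 1024.0);
        println!("{} is {:.2} MiB", path.display(), mib);
    }
    Ok(())
}
```

Pointed at the docs output directory, this would list search-index.js along with any other file above the threshold.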

@jyn514

jyn514 commented Jan 22, 2021

Yeah, it's a known issue: rust-lang/rust#31387. I'll look into it after fixing rust-lang/rust#81251.

@kennykerr
Collaborator

Thanks again for all your help.

@jyn514

jyn514 commented Jan 25, 2021

FYI, what we've found is that the actual CPU time is not all that high (comparable to e.g. docs.rs). The problem is that compiling shoots up to 20 GB of memory usage, which sets almost everyone's computer thrashing. @rylev compiled this on a computer with more than that amount of memory and it took about 2.5 minutes. https://rust-lang.zulipchat.com/#narrow/stream/247081-t-compiler.2Fperformance/topic/windows-rs.20perf/near/223954279

@kennykerr
Collaborator

Was that on a Windows box? IO performance is very different between Windows and Linux. My dev box has 32GB of memory and it took around 30 minutes without much disk thrashing. It was definitely CPU bound and largely chugging away on a single thread. But maybe I missed something.

@jyn514

jyn514 commented Jan 25, 2021

Oh, that's really strange. Maybe @rylev measured the wrong thing? It definitely took him no more than 3 minutes:
bindings-18880.summarize.log

@rylev
Contributor

rylev commented Jan 26, 2021

Sorry, I should have made it clearer to @jyn514 that I was testing a subset of the full Windows API surface. So it's perfectly conceivable that it would take ~30 minutes on my machine as well. I also have 32 GB of RAM.

@rylev
Contributor

rylev commented Jan 29, 2021

FYI: The changes I made in rust-lang/rust#81419 and rust-lang/rust#81414 reduce cargo build build times of the bindings in this repo from ~43m45s to ~32m45s! I've not measured the impact on cargo doc runs but I imagine it will also have a fairly large impact.
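
For scale, the two figures quoted above work out to a saving of about 11 minutes, roughly a quarter of the original build time. A small sketch of the arithmetic follows; the helper names are made up for illustration, and the inputs are the approximate times from the comment.

```rust
/// Convert a minutes/seconds pair to total seconds.
fn to_seconds(minutes: u64, seconds: u64) -> u64 {
    minutes * 60 + seconds
}

/// Percentage reduction in wall time going from `before` to `after` (in seconds).
fn percent_reduction(before: u64, after: u64) -> f64 {
    (before as f64 - after as f64) / before as f64 * 100.0
}

fn main() {
    let before = to_seconds(43, 45); // ~43m45s before the rustc changes
    let after = to_seconds(32, 45); // ~32m45s after the rustc changes
    let saved = before - after;
    println!(
        "saved {}m{:02}s ({:.1}% less wall time)",
        saved / 60,
        saved % 60,
        percent_reduction(before, after)
    );
    // prints "saved 11m00s (25.1% less wall time)"
}
```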

@rylev
Contributor

rylev commented Jan 29, 2021

If anyone is interested in what the timings look like for a smaller subset of the bindings, here they are.

@kennykerr
Collaborator

That's fantastic. Looks like some more low-hanging fruit... 😄

@kennykerr
Collaborator

Turns out it's much slower than I first imagined. I was looking at the summary time reported by cargo but it's lying. Process Explorer reports that it's off by quite a bit. 😉

image

Also take a look at how much memory it's using.

@kennykerr
Collaborator

I've done all I can to optimize this from my end. Going to close this issue unless someone has a reason to keep it open.

@jyn514

jyn514 commented Feb 9, 2021

Hmm, I don't know if anyone has permissions to transfer this from microsoft/windows-rs to rust-lang/rust but if so that would be ideal.

@kennykerr
Collaborator

I've asked someone at GitHub to take a look.

@kennykerr
Collaborator

Unfortunately, GitHub doesn't support this. Happy to leave the issue open if you'd prefer.

@jyn514

jyn514 commented Feb 9, 2021

Hmm that's weird, there's definitely a "transfer issue" button on rustc-dev-guide for me. Maybe GitHub doesn't support transferring between organizations?
Screenshot_20210209-125532_Firefox

Anyway, I think most of the rustdoc side of this is tracked elsewhere so I'd be ok with closing this.

@kennykerr
Collaborator

Yes, you can transfer within an organization - I do that regularly - but you cannot do so across organizations (at least according to my GitHub contact).


4 participants