cargo doc walks whole project directory (not just source files) #10790

Closed
aidanhs opened this issue Jun 25, 2022 · 2 comments
Labels
A-rebuild-detection Area: rebuild detection and fingerprinting C-bug Category: bug Command-doc Performance Gotta go fast!

Comments

@aidanhs
Member

aidanhs commented Jun 25, 2022

Problem

Given a directory structure like

Cargo.toml
Cargo.lock
src/
bigdir/

(where bigdir has millions of files)

cargo build will work fine, but cargo doc will hang. Inspecting what it's doing with strace, it appears to be walking the entirety of bigdir/.

Putting bigdir/ in package.exclude fixes the issue, but given that cargo build works fine, it shouldn't be doing this in the first place.
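For anyone hitting this in the meantime, the workaround looks roughly like the fragment below. The package name, version and edition are placeholders; only the `exclude` line matters, and `bigdir/` stands in for whatever large directory sits next to `src/`.

```toml
[package]
name = "tmp"        # placeholder
version = "0.1.0"   # placeholder
edition = "2021"    # placeholder
# Workaround: keep the large directory out of the package file listing.
exclude = ["bigdir/"]
```

Note that `exclude` also affects what `cargo package` bundles, so this only suits directories that should not be published anyway.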

Steps

$ cargo new --bin tmp
[...]
$ cd tmp
$ mkdir indexes
$ cd indexes
$ git clone git@github.com:rust-lang/crates.io-index.git # big enough to demo on my spinning disk
[...]
$ cp -r crates.io-index/ crates.io-index1 # copy a few times to inflate the file count
$ cp -r crates.io-index/ crates.io-index2
$ cp -r crates.io-index/ crates.io-index3
$ cp -r crates.io-index/ crates.io-index4
$ cd ..
$ /usr/bin/time find indexes/ | wc -l # takes 1s just to list ~500k files on my spinning disk
0.68user 1.01system 0:01.71elapsed 99%CPU (0avgtext+0avgdata 3660maxresident)k
0inputs+0outputs (0major+339minor)pagefaults 0swaps
540041
$ cargo build && cargo doc # prime the caches so they don't have any work to do
[...]
$ cargo build # this is a no-op and is instant
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
$ cargo doc # this is a no-op but takes 3s
    Finished dev [unoptimized + debuginfo] target(s) in 3.01s
$ rm -rf indexes
$ cargo doc # although this should be a no-op, it seems to detect a change now indexes has gone? it's still much faster
 Documenting tmp v0.1.0 (/tmp/tmp.Vtq7sa9TrR/tmp)
    Finished dev [unoptimized + debuginfo] target(s) in 0.51s
$ cargo doc # a true no-op
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s

Possible Solution(s)

No response

Notes

Here's a stack trace of cargo while it's doing the walking. The symbols are mangled, but still readable.

#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x00005620e9a3eb06 in std::sys::unix::fs::try_statx::statx () at library/std/src/sys/unix/weak.rs:182
#2  std::sys::unix::fs::try_statx () at library/std/src/sys/unix/fs.rs:176
#3  0x00005620e9a3a0ba in std::sys::unix::fs::stat () at library/std/src/sys/unix/fs.rs:1298
#4  0x00005620e9279085 in _RNvXs6_CshI6Vgi4jARJ_7walkdirINtB5_11FilterEntryNtB5_8IntoIterNCNvMNtNtCsb7X0uq5T2Va_5cargo7sources4pathNtB13_10PathSource4walk0ENtNtNtNtCs3GHRnyuwEiJ_4core4iter6traits8iterator8Iterator4nextB17_ () at library/core/src/panicking.rs:181
#5  0x00005620e9248895 in _RNvMNtNtCsb7X0uq5T2Va_5cargo7sources4pathNtB2_10PathSource4walk () at library/core/src/panicking.rs:181
#6  0x00005620e9247ffb in _RNvMNtNtCsb7X0uq5T2Va_5cargo7sources4pathNtB2_10PathSource14list_files_git () at library/core/src/panicking.rs:181
#7  0x00005620e9246d58 in _RNvMNtNtCsb7X0uq5T2Va_5cargo7sources4pathNtB2_10PathSource10list_files () at library/core/src/panicking.rs:181
#8  0x00005620e9248d17 in _RNvMNtNtCsb7X0uq5T2Va_5cargo7sources4pathNtB2_10PathSource18last_modified_file () at library/core/src/panicking.rs:181
#9  0x00005620e92498bf in _RNvXs0_NtNtCsb7X0uq5T2Va_5cargo7sources4pathNtB5_10PathSourceNtNtNtB9_4core6source6Source11fingerprint ()
    at library/core/src/panicking.rs:181
#10 0x00005620e93a6183 in _RNvNtNtNtCsb7X0uq5T2Va_5cargo4core8compiler11fingerprint15pkg_fingerprint () at library/core/src/panicking.rs:181
#11 0x00005620e939f843 in _RNvNtNtNtCsb7X0uq5T2Va_5cargo4core8compiler11fingerprint9calculate.llvm.2113376480358362997 ()
    at library/core/src/panicking.rs:181
#12 0x00005620e939be0b in _RNvNtNtNtCsb7X0uq5T2Va_5cargo4core8compiler11fingerprint14prepare_target () at library/core/src/panicking.rs:181
#13 0x00005620e944d52e in _RNvNtNtCsb7X0uq5T2Va_5cargo4core8compiler7compile () at library/core/src/panicking.rs:181
#14 0x00005620e91d74cf in _RNvMNtNtNtCsb7X0uq5T2Va_5cargo4core8compiler7contextNtB2_7Context7compile () at library/core/src/panicking.rs:181
#15 0x00005620e9468b92 in _RNvNtNtCsb7X0uq5T2Va_5cargo3ops13cargo_compile10compile_ws () at library/core/src/panicking.rs:181
#16 0x00005620e94688f6 in _RNvNtNtCsb7X0uq5T2Va_5cargo3ops13cargo_compile7compile () at library/core/src/panicking.rs:181
#17 0x00005620e94b49cb in _RNvNtNtCsb7X0uq5T2Va_5cargo3ops9cargo_doc3doc () at library/core/src/panicking.rs:181
#18 0x00005620e901f6a0 in _RNvNtNtCsjwPD2hnFnDn_5cargo8commands3doc4exec () at library/core/src/panicking.rs:181
#19 0x00005620e903a1e6 in _RNvNtCsjwPD2hnFnDn_5cargo3cli4main () at library/core/src/panicking.rs:181
#20 0x00005620e900b19e in _RNvCsjwPD2hnFnDn_5cargo4main () at library/core/src/panicking.rs:181
#21 0x00005620e904f833 in _RINvNtNtCslaI8ac5hNbd_3std10sys_common9backtrace28___rust_begin_short_backtraceFEuuECsjwPD2hnFnDn_5cargo ()
    at library/core/src/panicking.rs:181
#22 0x00005620e9068409 in _RNCINvNtCslaI8ac5hNbd_3std2rt10lang_startuE0CsjwPD2hnFnDn_5cargo.llvm.2848788588271945103 ()
    at library/core/src/panicking.rs:181
#23 0x00005620e9a25eae in core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once () at library/core/src/ops/function.rs:280
#24 std::panicking::try::do_call () at library/std/src/panicking.rs:492
#25 std::panicking::try () at library/std/src/panicking.rs:456
#26 std::panic::catch_unwind () at library/std/src/panic.rs:137
#27 std::rt::lang_start_internal::{{closure}} () at library/std/src/rt.rs:128
#28 std::panicking::try::do_call () at library/std/src/panicking.rs:492
#29 std::panicking::try () at library/std/src/panicking.rs:456
#30 std::panic::catch_unwind () at library/std/src/panic.rs:137
#31 std::rt::lang_start_internal () at library/std/src/rt.rs:128
#32 0x00005620e900d222 in main () at library/core/src/panicking.rs:181
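Reading the trace bottom-up, the fingerprint step goes through `PathSource::last_modified_file`, which lists the package's files and stats each one. The sketch below is not Cargo's code. It is a simplified stand-in, with a hypothetical `newest_mtime` helper built only on `std::fs`, to illustrate why finding the newest mtime this way costs one stat per file under the package root, regardless of whether those files are Rust sources.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Hypothetical helper, not Cargo's implementation: visit every entry under
/// `root` and return the newest modification time seen (with its path) plus
/// the number of files whose metadata had to be read.
fn newest_mtime(root: &Path) -> io::Result<(Option<(SystemTime, PathBuf)>, usize)> {
    let mut newest: Option<(SystemTime, PathBuf)> = None;
    let mut visited = 0usize;
    // Explicit stack instead of recursion, so deep trees cannot overflow.
    let mut pending = vec![root.to_path_buf()];

    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // DirEntry::metadata does not follow symlinks, so a symlinked
            // directory is counted as a file rather than descended into.
            let meta = entry.metadata()?;
            if meta.is_dir() {
                pending.push(entry.path());
                continue;
            }
            visited += 1;
            let mtime = meta.modified()?;
            if newest.as_ref().map_or(true, |(t, _)| mtime > *t) {
                newest = Some((mtime, entry.path()));
            }
        }
    }
    Ok((newest, visited))
}

fn main() -> io::Result<()> {
    // Directory to scan; defaults to the current directory.
    let root = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());
    let (newest, visited) = newest_mtime(Path::new(&root))?;
    println!("stat'ed {visited} files under {root}");
    if let Some((_, path)) = newest {
        println!("most recently modified: {}", path.display());
    }
    Ok(())
}
```

Pointing this at the `tmp` project from the steps above, with `indexes/` still present, should report a file count in the same range as the `find indexes/ | wc -l` figure, which is the work `cargo doc` appears to repeat on every invocation.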

Version

$ cargo --version --verbose
cargo 1.63.0-nightly (a5e08c470 2022-06-23)
release: 1.63.0-nightly
commit-hash: a5e08c4703f202e30cdaf80ca3e7c00baa59c496
commit-date: 2022-06-23
host: x86_64-unknown-linux-gnu
libgit2: 1.4.2 (sys:0.14.2 vendored)
libcurl: 7.83.1-DEV (sys:0.4.55+curl-7.83.1 vendored ssl:OpenSSL/1.1.1n)
os: Ubuntu 18.04 (bionic) [64-bit]
@aidanhs aidanhs added the C-bug Category: bug label Jun 25, 2022
@aidanhs aidanhs changed the title cargo doc walks whole project directory (not just soure files) cargo doc walks whole project directory (not just source files) Jun 25, 2022
@ehuss
Contributor

ehuss commented Jun 25, 2022

Thanks for the report!

This is essentially #9931. It is currently not possible to know which source files belong to the crate when running rustdoc. This is blocked on rust-lang/rust#91982.

@ehuss ehuss added Command-doc A-rebuild-detection Area: rebuild detection and fingerprinting Performance Gotta go fast! labels Jun 25, 2022
@aidanhs
Member Author

aidanhs commented Jun 25, 2022

Aha, thanks! Will close this issue in favor of that one then.

@aidanhs aidanhs closed this as completed Jun 25, 2022