Should download and install concurrently #731
I don't know if it would be profitable to do multiple network requests in parallel, or multiple file ops in parallel, but it would definitely be profitable to be doing networking while doing file I/O.

Today rustup does installation in two phases: first it acquires all the resources off the network; then it installs them. It does this to eliminate the uncertainty of the network failing during installation. The actual file I/O in rustup does have a transactional system that is supposed to be able to roll back, and it definitely does in the test suite, and I've seen it do rollbacks live. I am not super-confident that it is bulletproof, though we could try interleaving downloading and installation and see how it goes.

Adding parallelism here would make status messages nondeterministic and more confusing. There are definitely opportunities for improvement here, though I'm not sure I'm ready to pull the trigger yet without thinking about the constraints more. If somebody wanted to give it a shot I'd be happy to review.
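The interleaving idea above can be sketched with a single downloader thread feeding an installer over a bounded channel. This is only an illustration under stated assumptions, not rustup's code: `Archive`, `download`, `install`, and `download_and_install` are hypothetical names, and the network and unpacking steps are placeholders. Because there is one producer and one consumer, components are still installed in the order they were requested, so status output stays deterministic.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a downloaded component archive.
struct Archive {
    name: String,
    bytes: Vec<u8>,
}

/// Placeholder for a network fetch (not rustup's real download code).
fn download(name: &str) -> Archive {
    Archive { name: name.to_string(), bytes: vec![0u8; 1024] }
}

/// Placeholder for unpacking an archive; returns the installed component name.
fn install(archive: Archive) -> String {
    let _size = archive.bytes.len();
    archive.name
}

/// Downloads on a background thread while installing on the calling thread.
/// The bounded channel limits how many finished downloads wait in memory.
fn download_and_install(components: &[&str]) -> Vec<String> {
    let names: Vec<String> = components.iter().map(|c| c.to_string()).collect();
    let (tx, rx) = mpsc::sync_channel::<Archive>(2);

    let downloader = thread::spawn(move || {
        for name in names {
            // `send` fails only if the installer side has gone away.
            if tx.send(download(&name)).is_err() {
                break;
            }
        }
        // `tx` is dropped here, which ends the installer loop below.
    });

    let mut installed = Vec::new();
    for archive in rx {
        installed.push(install(archive));
    }
    downloader.join().expect("downloader thread panicked");
    installed
}
```

Calling `download_and_install(&["rustc", "cargo", "rust-std"])` overlaps the download of one component with the installation of the previous one. The sketch does not address rollback if a later download fails after an earlier component is already installed.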
For reference, on my 100Mbps connection, downloading the tarball for

As for output status/messages, maybe something like indicatif could work?
One thing I think is worth noting is that the failure modes leading to a partial install go up with concurrent download-and-install: the transactional system is only an approximation of one; it has no write-ahead journal, nor any ability to recover after interruptions.

My recommendation would be to eliminate the transactional system in favour of an interrupt-safe, eventually correct system: use the manifests and installed metadata to (coarsely, not per individual file!) clean up after interrupted executions; this would permit streaming installation where the archive doesn't have to get written to disk at all.
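One possible shape for coarse, interrupt-safe cleanup is sketched below. It is a simplification of the suggestion above: instead of rustup's manifests and installed metadata, it uses a hypothetical marker file (`.install-in-progress`) per component directory, and the function names are invented for illustration. The marker is written before any files and removed last, so a marker that survives means the install was interrupted and the whole directory can be discarded.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical marker file name; rustup's real metadata layout differs.
const IN_PROGRESS: &str = ".install-in-progress";

/// Installs a component into `dir`, bracketing the work with a marker file.
fn install_component(dir: &Path, files: &[(&str, &[u8])]) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    fs::write(dir.join(IN_PROGRESS), b"")?;
    for (name, contents) in files {
        fs::write(dir.join(name), contents)?;
    }
    // Removing the marker is the commit point.
    fs::remove_file(dir.join(IN_PROGRESS))
}

/// Coarse recovery: if the marker survives, discard the whole directory.
/// Returns true when a partial install was cleaned up.
fn cleanup_if_interrupted(dir: &Path) -> io::Result<bool> {
    if dir.join(IN_PROGRESS).exists() {
        fs::remove_dir_all(dir)?;
        return Ok(true);
    }
    Ok(false)
}
```

A tool using this approach would call `cleanup_if_interrupted` on startup before touching a component and then reinstall it if needed. The sketch does not sync files or directories to disk, so it is not robust against power loss as written.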
@rbtcollins That would certainly be a better approach. Unfortunately wg-rustup doesn't have a lot of time for large architectural changes like this right now. :(

See also #2417
I did an experiment to see how much performance is being left on the table in the current architecture. See https://github.com/dtolnay/fast-rustup.

rustup:

    $ rustup toolchain remove nightly-2024-01-01
    $ time rustup toolchain install nightly-2024-01-01

17.9 seconds

fast-rustup:

    $ rustup toolchain remove nightly-2024-01-01
    $ time target/release/fast-rustup nightly-2024-01-01

5.4 seconds

This is tested on my laptop where I get 90+ MiB/s from static.rust-lang.org.

Right now it just supports the "default" profile. It installs the same contents as rustup aside from what looks like some bookkeeping differences.

    Only in rustup/lib/rustlib: components
    Only in rustup/lib/rustlib: manifest-cargo-x86_64-unknown-linux-gnu
    Only in rustup/lib/rustlib: manifest-clippy-preview-x86_64-unknown-linux-gnu
    Only in rustup/lib/rustlib: manifest-rustc-x86_64-unknown-linux-gnu
    Only in rustup/lib/rustlib: manifest-rust-docs-x86_64-unknown-linux-gnu
    Only in rustup/lib/rustlib: manifest-rustfmt-preview-x86_64-unknown-linux-gnu
    Only in rustup/lib/rustlib: manifest-rust-std-x86_64-unknown-linux-gnu
    Only in rustup/lib/rustlib: multirust-channel-manifest.toml
    Only in rustup/lib/rustlib: multirust-config.toml
    Only in rustup/lib/rustlib: rust-installer-version
    Only in fast-rustup/lib/rustlib/x86_64-unknown-linux-gnu/lib: self-contained

If someone wants to help see if we can make it even faster, I would welcome PRs. Especially the filesystem I/O: currently a single component's contents are still being written to the filesystem serially. One file must finish being written before the next file from the same component begins being written. I am not an expert in filesystem performance characteristics and I have not tried looking into whether there is a way to do this better, but maybe buffering files in memory and using an SPMC thread pool to perform the filesystem I/O would help.
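The "buffer in memory, write from an SPMC thread pool" idea could look roughly like the sketch below. It is not taken from fast-rustup or rustup: `PendingFile` and `write_files_parallel` are hypothetical names, and the single-producer multi-consumer queue is approximated with the standard library's channel shared behind a `Mutex`, since `std::sync::mpsc::Receiver` is single-consumer on its own. Whether this is actually faster depends on the filesystem and is untested here.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// A file already unpacked into memory, waiting to be written.
struct PendingFile {
    path: PathBuf,
    contents: Vec<u8>,
}

/// Writes `files` using `workers` threads fed from one shared queue.
/// Returns the number of files written, or the first error seen when joining.
fn write_files_parallel(files: Vec<PendingFile>, workers: usize) -> io::Result<usize> {
    let (tx, rx) = mpsc::channel::<PendingFile>();
    // std's Receiver is single-consumer, so workers share it behind a Mutex.
    let rx = Arc::new(Mutex::new(rx));

    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || -> io::Result<usize> {
                let mut written = 0;
                loop {
                    // The lock is held only while taking the next job.
                    let job = rx.lock().unwrap().recv();
                    match job {
                        Ok(file) => {
                            if let Some(parent) = file.path.parent() {
                                fs::create_dir_all(parent)?;
                            }
                            fs::write(&file.path, &file.contents)?;
                            written += 1;
                        }
                        // The sender was dropped and the queue is empty.
                        Err(_) => return Ok(written),
                    }
                }
            })
        })
        .collect();

    for file in files {
        // Cannot fail: this function still holds a handle to the receiver.
        tx.send(file).expect("receiver dropped unexpectedly");
    }
    drop(tx); // Signals the workers that no more files are coming.

    let mut total = 0;
    for handle in handles {
        total += handle.join().expect("worker panicked")?;
    }
    Ok(total)
}
```

The producer here would be the tar-unpacking loop, which pushes each entry's bytes onto the queue instead of writing it directly. An unbounded queue is used for brevity; a real implementation would likely bound it to cap memory use.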
Two considerations:
Currently rustup is scarily sequential. It could easily download more stuff while installing the previously downloaded stuff.