Hash the sysroot files metadata #16

Merged
merged 1 commit into from
Feb 11, 2024

saethlin
Contributor

@saethlin saethlin commented Feb 10, 2024

Worst-case (filesystem on a USB drive with the cache recently purged) time to hash the whole sysroot file contents looks like ~1 second. That seems a bit steep. Worst-case time to hash just the modification time and size of every file looks like 225 ms.

That's not perfect: saving a file in an editor without changing it updates the modification time and causes a spurious rebuild, and if a file write happens to change neither size nor modification time we may fail to rebuild. Still a huge upgrade from nothing.
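
For illustration, the approach described above can be sketched with only the standard library. This is not the code merged in this PR: the names `hash_metadata_recursive` and `metadata_fingerprint` are hypothetical, and the choice not to follow symlinks is an assumption made to keep the sketch free of cycles.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::Path;

/// Feed the modification time and size of every file under `path` into `hasher`.
/// Hypothetical std-only sketch, not the function added by this PR.
fn hash_metadata_recursive(path: &Path, hasher: &mut DefaultHasher) -> io::Result<()> {
    // Sort the entries so the result does not depend on directory iteration order.
    let mut entries = fs::read_dir(path)?.collect::<io::Result<Vec<_>>>()?;
    entries.sort_by_key(|e| e.path());
    for entry in entries {
        // DirEntry::metadata does not follow symlinks (an assumption of this sketch),
        // so a symlink contributes its own mtime and length.
        let meta = entry.metadata()?;
        if meta.is_dir() {
            hash_metadata_recursive(&entry.path(), hasher)?;
        } else {
            // Only mtime and size are hashed; file contents are never read.
            meta.modified()?.hash(hasher);
            meta.len().hash(hasher);
        }
    }
    Ok(())
}

/// Convenience wrapper that returns the finished hash for a directory tree.
fn metadata_fingerprint(path: &Path) -> io::Result<u64> {
    let mut hasher = DefaultHasher::new();
    hash_metadata_recursive(path, &mut hasher)?;
    Ok(hasher.finish())
}
```

Comparing the value of `metadata_fingerprint` from a previous build with the current one is enough to decide whether to rebuild. Both failure modes described above apply to this sketch as well: a touched but unchanged file alters the fingerprint, and a write that preserves both mtime and size does not.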

@RalfJung
Owner

Some experimentation on my own machine says it takes ~22 ms to hash the sysroot sources, and ~8 ms to stat everything in it.

I assume that was on a pretty fast SSD though. I wonder how it fares on slower storage...
Do you happen to have a spinning disk, or a USB drive, that you can test this on?

@saethlin
Contributor Author

I assume that was on a pretty fast SSD though. I wonder how it fares on slower storage...
Do you happen to have a spinning disk, or a USB drive, that you can test this on?

With a sync && echo 3 > /proc/sys/vm/drop_caches to purge all filesystem/disk caching, it's ~600 ms on fast storage, and ~1000 ms on a USB drive. After the first run, the USB drive is ~28 ms.

Trying to come up with a realistic workload, I ran MIRI_LIB_SRC=/home/ben/usb/library cargo +miri miri setup then let crater-at-home run for a few minutes. It's a rather disk-heavy workload, and I'm hoping this puts the cache into a more realistic state. That time is ~120 ms.
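
A rough way to reproduce this kind of comparison is sketched below. It is a hypothetical std-only harness, not the tool used for the numbers in this thread, and the names `list_files`, `hash_contents`, `hash_metadata`, and `compare` are made up for the example. Note one caveat: the content pass warms the filesystem cache for the metadata pass, so cold-cache numbers need each pass to be run separately after purging the cache.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// Collect every regular file under `root`, sorted for a stable order.
/// Symlinks and other special entries are skipped (an assumption of this sketch).
fn list_files(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let file_type = entry.file_type()?;
            if file_type.is_dir() {
                stack.push(entry.path());
            } else if file_type.is_file() {
                files.push(entry.path());
            }
        }
    }
    files.sort();
    Ok(files)
}

/// Hash the full contents of every file (the slow variant).
fn hash_contents(files: &[PathBuf]) -> io::Result<u64> {
    let mut hasher = DefaultHasher::new();
    for file in files {
        fs::read(file)?.hash(&mut hasher);
    }
    Ok(hasher.finish())
}

/// Hash only the modification time and size of every file (the fast variant).
fn hash_metadata(files: &[PathBuf]) -> io::Result<u64> {
    let mut hasher = DefaultHasher::new();
    for file in files {
        let meta = fs::metadata(file)?;
        meta.modified()?.hash(&mut hasher);
        meta.len().hash(&mut hasher);
    }
    Ok(hasher.finish())
}

/// Time both variants on the same tree; returns (contents, metadata) durations.
fn compare(root: &Path) -> io::Result<(Duration, Duration)> {
    let files = list_files(root)?;
    let start = Instant::now();
    hash_contents(&files)?;
    let contents = start.elapsed();
    let start = Instant::now();
    hash_metadata(&files)?;
    let metadata = start.elapsed();
    Ok((contents, metadata))
}
```

Calling `compare` on the directory passed as `MIRI_LIB_SRC` gives a warm-cache estimate of both costs; the absolute values depend heavily on the storage device and cache state, as the measurements above show.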

@RalfJung
Owner

Okay, so that is noticeable, but it's a worst-case in terms of disk performance.

Still, hashing the file sizes and mtimes should almost always be sufficient, shouldn't it?

@saethlin
Contributor Author

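Well, it'll surely be a lot better than what we have now, and probably better than what Cargo does. And the worst case for hashing only the metadata looks like 107 ms (NVMe) or 225 ms (USB), which is a clear advantage over hashing the file contents (~600 ms and ~1000 ms respectively).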

Owner

@RalfJung RalfJung left a comment

Looks good, let's just add a few more comments.

@RalfJung RalfJung merged commit 3702f0b into RalfJung:master Feb 11, 2024
13 checks passed
@RalfJung
Owner

Thanks. :-)

@saethlin saethlin deleted the hash-the-sysroot branch February 11, 2024 19:29
@saethlin saethlin changed the title Hash the sysroot contents Hash the sysroot files metadata Feb 23, 2024
@@ -84,6 +85,29 @@ fn make_writeable(p: &Path) -> Result<()> {
Ok(())
}

/// Hash the metadata and size of every file in a directory, recursively.
pub fn hash_recursive(path: &Path, hasher: &mut DefaultHasher) -> Result<()> {
Owner

Is there a specific reason you made this public? Doesn't seem to me like we should expose this function from this library.

Contributor Author

Nope. I think I just didn't notice.

bors added a commit to rust-lang/miri that referenced this pull request Mar 4, 2024
we properly rebuild the sysroot now when MIRI_LIB_SRC contents change

Thanks to RalfJung/rustc-build-sysroot#16
RalfJung pushed a commit to RalfJung/rust that referenced this pull request Mar 9, 2024
we properly rebuild the sysroot now when MIRI_LIB_SRC contents change

Thanks to RalfJung/rustc-build-sysroot#16