Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip loading of local cache data when possible #17495

Merged
merged 10 commits into from
Nov 10, 2022

Conversation

Eric-Arellano
Copy link
Contributor

Closes #17448.

@Eric-Arellano Eric-Arellano changed the title WIP: Skip loading of local cache data when possible Skip loading of local cache data when possible Nov 9, 2022
@Eric-Arellano Eric-Arellano marked this pull request as ready for review November 9, 2022 19:03
Comment on lines +235 to +236
async fn download_digest_to_local<
FRemote: Fn(Bytes) -> Result<(), String> + Send + Sync + 'static,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function was simply extracted from load_bytes_with

)
})?;

f_remote(bytes.clone())?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI this clone seems like wasted performance in some cases, but will leave it for a followup

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bytes.clone() is very cheap, because it is reference counted. But yea, I think you're right that it is no longer necessary here.

/// Ensure that a directory is locally loadable, which will download it from the Remote store as
/// a sideeffect (if one is configured).
///
pub async fn ensure_local_has_recursive_directory(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was inlined into the new function ensure_local_has_files

Comment on lines +992 to +998
let missing_file_digests = self
.local
.get_missing_digests(EntryType::File, file_digests)
.await?;
if missing_file_digests.is_empty() {
return Ok(());
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is new

Comment on lines -974 to +1012
.map(|file_digest| self.ensure_local_has_file(file_digest))
.collect::<Vec<_>>(),
.map(|file_digest| async move {
if let Err(e) = remote
.download_digest_to_local(self.local.clone(), file_digest, EntryType::File, |_| Ok(()))
.await
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the key change. We now only download, don't load.

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks!

@@ -229,6 +229,59 @@ impl RemoteStore {
.await
.map(|&()| ())
}

/// Download the digest to the local byte store from this remote store. The function `f_remote`
/// can be used to validate/transform the bytes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// can be used to validate/transform the bytes.
/// can be used to validate the bytes.

Bytes is immutable, so the function can only validate.

Comment on lines 247 to 249
// TODO: Now that we always copy from the remote store to the local store before executing
// the caller's logic against the local store, `remote::ByteStore::load_bytes_with` no
// longer needs to accept a function.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a pointer to #17065 here? I have some more ideas that I will likely put on that ticket today.

)
})?;

f_remote(bytes.clone())?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bytes.clone() is very cheap, because it is reference counted. But yea, I think you're right that it is no longer necessary here.

pub async fn ensure_local_has_recursive_directory(
/// Ensure that the files are locally loadable. This will download them from the remote store as
/// a side effect, if one is configured.
pub async fn ensure_local_has_files(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename to: ensure_local_has_digests, ensure_local_has, ensure_downloaded, etc.

Somewhat implicit in here is that we also want to guarantee that the Directory protos have been persisted locally... so this method should probably also call record_digest_trie on each loaded tree to ensure that they exist locally as well. It's likely a bug that we weren't doing that before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea that at the end of callling download_digest_to_local on every missing file digest, we then also use a future::try_join_all that calls ensure_directory_digest_persisted on every directory digest inputted?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we then also use a future::try_join_all that calls ensure_directory_digest_persisted on every directory digest inputted?

Yea, either at the beginning or the end is fine. It's also possible to use try_join! to do it concurrently.

But you should be able to use record_digest_trie rather than ensure_directory_digest_persisted, because you will have already called load_digest_trie at the beginning of the method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stuhood I want to make sure I got it right why it's safe to use record_digest_trie before we hit the remote store. That function is only about storing the directory entries, which don't need to come from the remote? We only download file entries from the remote?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That function is only about storing the directory entries, which don't need to come from the remote? We only download file entries from the remote?

Right: record_digest_trie only stores the Directory entries. Unlike files, Directory entries may be in one of three places: 1. memory, 2. local, 3. remote... record_digest_trie ensures that they are actually on disk locally, rather than potentially only in memory or remote.

}
directory::Entry::Symlink(_) | directory::Entry::Directory(_) => (),
});
self.record_digest_trie(trie, true).await?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stuhood this sets initial_lease to true. I wasn't sure what that was about

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... me neither. I'm pretty sure that we should remove that parameter, but haven't spent the time to confirm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment here to explain why we do this (or in the method signature) would be great.

Comment on lines 1028 to 1029
// let _ = future::try_join_all(
// directory_digests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, stale code from iterating

}
directory::Entry::Symlink(_) | directory::Entry::Directory(_) => (),
});
self.record_digest_trie(trie, true).await?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment here to explain why we do this (or in the method signature) would be great.

@Eric-Arellano
Copy link
Contributor Author

@stuhood should this be cherry-picked to 2.15?

@Eric-Arellano Eric-Arellano enabled auto-merge (squash) November 10, 2022 18:08
@Eric-Arellano Eric-Arellano merged commit 19de895 into pantsbuild:main Nov 10, 2022
@Eric-Arellano Eric-Arellano deleted the fix-cache-perf branch November 10, 2022 18:24
@stuhood
Copy link
Member

stuhood commented Nov 10, 2022

@stuhood should this be cherry-picked to 2.15?

I believe so, yea. A 95% performance difference for a cache lookup amounts to a bug.

@stuhood stuhood added this to the 2.15.x milestone Nov 10, 2022
Eric-Arellano added a commit to Eric-Arellano/pants that referenced this pull request Nov 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Adjust cache_content_behavior="fetch" to skip loading of data
2 participants