
trie prefetcher: cleanup blocking get #7723

Closed
Tracked by #7634
jakmeier opened this issue Sep 29, 2022 · 1 comment · Fixed by #8287
Assignees
Labels
A-storage Area: storage and databases C-housekeeping Category: Refactoring, cleanups, code quality

Comments

@jakmeier
Contributor

This busy loop was a quick hack to write simpler code under time pressure. Let's replace it with a proper design, such as a Condvar.

```rust
pub(crate) fn blocking_get(&self, key: CryptoHash) -> Option<Arc<[u8]>> {
    loop {
        match self.0.lock().expect(POISONED_LOCK_ERR).slots.get(&key) {
            Some(PrefetchSlot::Done(value)) => return Some(value.clone()),
            Some(_) => (),
            None => return None,
        }
        std::thread::sleep(std::time::Duration::from_micros(1));
    }
}
```
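For illustration, here is a minimal sketch of what a Condvar-based replacement could look like. The type and method names below are simplified stand-ins, not the actual nearcore API:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex};

// Stand-ins for nearcore types, so the sketch is self-contained.
type CryptoHash = [u8; 32];

enum PrefetchSlot {
    PendingPrefetch,
    Done(Arc<[u8]>),
}

struct PrefetchStagingArea {
    slots: Mutex<HashMap<CryptoHash, PrefetchSlot>>,
    cvar: Condvar,
}

impl PrefetchStagingArea {
    // Blocks until the slot for `key` is resolved (Done) or removed (None),
    // sleeping on the condvar instead of busy-looping.
    fn blocking_get(&self, key: CryptoHash) -> Option<Arc<[u8]>> {
        let mut guard = self.slots.lock().unwrap();
        loop {
            match guard.get(&key) {
                Some(PrefetchSlot::Done(value)) => return Some(value.clone()),
                Some(_) => {} // still pending; wait for the next map update
                None => return None,
            }
            // Atomically releases the lock and sleeps; reacquires on wake-up.
            guard = self.cvar.wait(guard).unwrap();
        }
    }

    // Every mutation of the map must wake the waiters.
    fn insert_done(&self, key: CryptoHash, value: Arc<[u8]>) {
        self.slots.lock().unwrap().insert(key, PrefetchSlot::Done(value));
        self.cvar.notify_all();
    }
}
```

A single condvar shared by all slots keeps the code simple; waiters woken for an irrelevant key just re-check their slot and go back to sleep.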

Also, when calling this from a main thread, we should check the shard cache when `blocking_get` returns `None`. Due to forks, there can be multiple main threads, and in that case the fetched value will be in the shard cache. Only if that lookup also fails should we fetch the data from the DB.

```rust
match prefetcher.prefetching.blocking_get(hash.clone()) {
    Some(value) => value,
    // Only the main thread (this one) removes values from the staging area,
    // therefore a blocking read will usually not return empty unless there
    // was a storage error, or in the case of forks and parallel chunk
    // processing where one chunk cleans up prefetched data from the other.
    // In any case, we can try again from the main thread.
    None => {
        self.metrics.prefetch_retry.inc();
        self.read_from_db(hash)?
    }
}
```
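The intended fallback order (prefetch staging area, then shard cache, then DB) can be sketched generically. The three tiers below are modeled as hypothetical closures rather than the real storage types:

```rust
// Hypothetical sketch: each storage tier is a closure returning
// Some(value) on a hit. Tiers are consulted lazily, in order, so the
// DB is only touched when both in-memory tiers miss.
fn lookup<V>(
    prefetch_staging: impl FnOnce() -> Option<V>,
    shard_cache: impl FnOnce() -> Option<V>,
    db: impl FnOnce() -> Option<V>,
) -> Option<V> {
    prefetch_staging().or_else(shard_cache).or_else(db)
}
```

Because `or_else` takes a closure, the later tiers are only evaluated when the earlier ones return `None`, which is exactly the laziness the issue asks for.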

The only other call-site is here:

```rust
PrefetcherResult::Pending => {
    // Yield once before calling `blocking_get`, which will check again
    // whether the data is present.
    std::thread::yield_now();
    self.prefetching
        .blocking_get(hash.clone())
        .or_else(|| {
            // `blocking_get` will return None if the prefetch slot has been
            // removed by the main thread and the value inserted into the
            // shard cache.
            let mut guard = self.shard_cache.0.lock().expect(POISONED_LOCK_ERR);
            guard.get(hash)
        })
        .ok_or_else(|| {
            // This could only happen if this thread started prefetching a
            // value while another thread was already prefetching it. When the
            // other thread finishes, the main thread takes the value out and
            // moves it to the shard cache. If this thread is then delayed for
            // long enough, the value can be evicted from the shard cache again
            // before this thread has a chance to read it.
            // On this rare occasion, we abort the current prefetch request and
            // move on to the next.
            StorageError::StorageInconsistentState(format!(
                "Prefetcher failed on hash {hash}"
            ))
        })
}
```

@jakmeier jakmeier added C-housekeeping Category: Refactoring, cleanups, code quality A-storage Area: storage and databases T-storage labels Sep 29, 2022
@jakmeier
Copy link
Contributor Author

jakmeier commented Dec 5, 2022

To clarify a bit more: there are actually two issues in here:

  1. remove the busy loop: refactor: clean up busy waiting in prefetcher blocking get #8215
  2. avoid an unnecessary storage lookup from the main thread in the scenario of forks: feat: try reading shard_cache if prefetcher blocking_get returns None #8287

@pugachAG pugachAG self-assigned this Dec 5, 2022
near-bulldozer bot pushed a commit that referenced this issue Dec 22, 2022
Part of #7723

This PR replaces the busy-waiting loop with update notifications: it introduces a single condition variable which is notified on every update of the underlying map. A considered alternative was to maintain a separate condition variable per slot for more granular notifications. In practice this makes no difference, since extra wake-ups for irrelevant keys cause no noticeable performance overhead, while per-slot condvars result in more complex code that is harder to reason about.
near-bulldozer bot pushed a commit that referenced this issue Jan 4, 2023
…#8287)

This is the second part of #7723: avoid an unnecessary storage lookup from the main thread in the scenario of forks.