
trie prefetcher: cleanup blocking get #7723

Closed
Tracked by #7634
jakmeier opened this issue Sep 29, 2022 · 1 comment · Fixed by #8287
Assignees
Labels
A-storage Area: storage and databases C-housekeeping Category: Refactoring, cleanups, code quality

Comments

@jakmeier
Contributor

This busy loop was a quick hack to write simpler code under time pressure. Let's replace it with a proper design, such as a Condvar.

```rust
pub(crate) fn blocking_get(&self, key: CryptoHash) -> Option<Arc<[u8]>> {
    loop {
        match self.0.lock().expect(POISONED_LOCK_ERR).slots.get(&key) {
            Some(PrefetchSlot::Done(value)) => return Some(value.clone()),
            Some(_) => (),
            None => return None,
        }
        std::thread::sleep(std::time::Duration::from_micros(1));
    }
}
```
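For illustration, here is a minimal sketch of what a Condvar-based replacement could look like. The type and method names below are simplified stand-ins, not the actual nearcore API:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex};

// Stand-ins for nearcore types, so the sketch is self-contained.
type CryptoHash = [u8; 32];

enum PrefetchSlot {
    PendingPrefetch,
    Done(Arc<[u8]>),
}

struct PrefetchStagingArea {
    slots: Mutex<HashMap<CryptoHash, PrefetchSlot>>,
    cvar: Condvar,
}

impl PrefetchStagingArea {
    // Blocks until the slot for `key` is resolved (Done) or removed (None),
    // sleeping on the condvar instead of busy-looping.
    fn blocking_get(&self, key: CryptoHash) -> Option<Arc<[u8]>> {
        let mut guard = self.slots.lock().unwrap();
        loop {
            match guard.get(&key) {
                Some(PrefetchSlot::Done(value)) => return Some(value.clone()),
                Some(_) => {} // still pending; wait for the next map update
                None => return None,
            }
            // Atomically releases the lock and sleeps; reacquires on wake-up.
            guard = self.cvar.wait(guard).unwrap();
        }
    }

    // Every mutation of the map must wake the waiters.
    fn insert_done(&self, key: CryptoHash, value: Arc<[u8]>) {
        self.slots.lock().unwrap().insert(key, PrefetchSlot::Done(value));
        self.cvar.notify_all();
    }
}
```

A single condvar shared by all slots keeps the code simple; waiters woken for an irrelevant key just re-check their slot and go back to sleep.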

Also, when calling this from a main thread, we should check the shard cache when `blocking_get` returns `None`. Due to forks, there can be multiple main threads, and in that case the fetched value will be in the shard cache. Only if that lookup also fails should we fetch the data from the DB.

```rust
match prefetcher.prefetching.blocking_get(hash.clone()) {
    Some(value) => value,
    // Only the main thread (this one) removes values from the staging area,
    // therefore a blocking read will usually not return empty unless there
    // was a storage error, or in the case of forks and parallel chunk
    // processing where one chunk cleans up prefetched data from the other.
    // In any case, we can try again from the main thread.
    None => {
        self.metrics.prefetch_retry.inc();
        self.read_from_db(hash)?
    }
}
```
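The intended fallback order (prefetch staging area, then shard cache, then DB) can be sketched generically. The three tiers below are modeled as hypothetical closures rather than the real storage types:

```rust
// Hypothetical sketch: each storage tier is a closure returning
// Some(value) on a hit. Tiers are consulted lazily, in order, so the
// DB is only touched when both in-memory tiers miss.
fn lookup<V>(
    prefetch_staging: impl FnOnce() -> Option<V>,
    shard_cache: impl FnOnce() -> Option<V>,
    db: impl FnOnce() -> Option<V>,
) -> Option<V> {
    prefetch_staging().or_else(shard_cache).or_else(db)
}
```

Because `or_else` takes a closure, the later tiers are only evaluated when the earlier ones return `None`, which is exactly the laziness the issue asks for.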

The only other call-site is here:

```rust
PrefetcherResult::Pending => {
    // Yield once before calling `blocking_get`, which will check again
    // whether the data is present.
    std::thread::yield_now();
    self.prefetching
        .blocking_get(hash.clone())
        .or_else(|| {
            // `blocking_get` will return None if the prefetch slot has been
            // removed by the main thread and the value inserted into the
            // shard cache.
            let mut guard = self.shard_cache.0.lock().expect(POISONED_LOCK_ERR);
            guard.get(hash)
        })
        .ok_or_else(|| {
            // This could only happen if this thread started prefetching a
            // value while another thread was already prefetching it. When the
            // other thread finishes, the main thread takes the value out and
            // moves it to the shard cache. If this thread is then delayed for
            // long enough, the value can be evicted from the shard cache again
            // before this thread has a chance to read it.
            // On this rare occasion, we abort the current prefetch request and
            // move on to the next.
            StorageError::StorageInconsistentState(format!(
                "Prefetcher failed on hash {hash}"
            ))
        })
}
```

@jakmeier jakmeier added C-housekeeping Category: Refactoring, cleanups, code quality A-storage Area: storage and databases T-storage labels Sep 29, 2022
@jakmeier
Copy link
Contributor Author

jakmeier commented Dec 5, 2022

To clarify a bit more: there are actually two issues in here:

  1. remove the busy loop: refactor: clean up busy waiting in prefetcher blocking get #8215
  2. avoid an unnecessary storage lookup from the main thread in the scenario of forks: feat: try reading shard_cache if prefetcher blocking_get returns None #8287

@pugachAG pugachAG self-assigned this Dec 5, 2022
near-bulldozer bot pushed a commit that referenced this issue Dec 22, 2022
Part of #7723

This PR replaces the busy-waiting loop with update notifications: it introduces a single condition variable which is notified on every update of the underlying map. A considered alternative was to maintain a separate condition variable per slot for more granular notifications. In practice this makes no difference, since extra wake-ups for irrelevant keys cause no noticeable performance overhead, while per-slot condvars result in more complex code that is harder to reason about.
near-bulldozer bot pushed a commit that referenced this issue Jan 4, 2023
…#8287)

This is the second part of #7723: avoid an unnecessary storage lookup from the main thread in the scenario of forks.