-
Notifications
You must be signed in to change notification settings - Fork 195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Size aware cache #924
Size aware cache #924
Conversation
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Signed-off-by: Cody Littley <cody@eigenlabs.org>
common/queue/queue.go
Outdated
@@ -0,0 +1,18 @@ | |||
package queue | |||
|
|||
// Queue is an interface for a generic queue. It's absurd there isn't an equivalent in the standard golang libraries. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May check out https://github.com/emirpasic/gods, prefer not to rebuild base library if it's available already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've switched over to this library.
// the weight of the key-value pairs it can hold, rather than the number of key-value pairs. | ||
// | ||
// Unless otherwise noted, Cache implementations are not required to be thread safe. | ||
type WeightCalculator[K comparable, V any] func(key K, value V) uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this generality or just give the cache a total memory limit will work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generalization is needed to support the current use case. There isn't a trivial and cheap way we can determine the size of an arbitrary data structure, so we need the ability to compute the size for different data types. For things like blobs that are just an array of bytes, it's easy. For things like the chunks that are in a complex data structure, we need an abstraction like this.
|
||
// FIFOCache is a cache that evicts the least recently added item when the cache is full. Useful for situations | ||
// where time of addition is a better predictor of future access than time of most recent access. | ||
type FIFOCache[K comparable, V any] struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have test to indicate FIFO is better than LRU for relay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's just that FIFO is a lot easier to implement. I think it's premature to declare either to be faster at the moment, since we don't know for certain what the access pattern is going to be.
Signed-off-by: Cody Littley <cody@eigenlabs.org>
relay/cache/cache.go
Outdated
// WithWeightCalculator sets the weight calculator for the cache. May only be called | ||
// when the cache is empty. The weight calculator should be an idempotent function that | ||
// always returns the same output given the same input. | ||
WithWeightCalculator(weightCalculator WeightCalculator[K, V]) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be better to drop this method and pass it in from the constructor since it's required to have this calculator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If no weight calculator was provided, it would give every entry a weight of 1
. But I can understand how the API is cleaner by moving it to the constructor. Change made.
relay/server.go
Outdated
// BlobCacheSize is the maximum number of items in the blob cache. | ||
BlobCacheSize int | ||
// BlobCacheSize is the maximum size of the blob cache, in bytes. | ||
BlobCacheSize uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May call it BlobCacheSizeBytes
and then it's self-explaining
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
relay/cache/cache.go
Outdated
// By default, the weight of a key-value pair is 1. Cache capacity is always specified in terms of | ||
// the weight of the key-value pairs it can hold, rather than the number of key-value pairs. | ||
// | ||
// Unless otherwise noted, Cache implementations are not required to be thread safe. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why? It's quite often the use cases will need thread safety.
Also please move this line down to the Cache
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the cache accessor, we are forced to use a mutex in order to perform atomic operations on the cache and the map of in-progress operations. Since that is currently our only use case, adding thread safety to the cache would be redundant and would provide no immediate benefit.
I agree that having a thread safe cache may be useful if we want to reuse it as a general purpose cache. I see two options:
- we make the cache inherently thread safe, even though not all use cases may need thread safety.
- we create a wrapper object that implements
Cache
that adds thread safety to another cache implementation, e.g.func NewThreadSafeCache(cache Cache) Cache
. We could either write this now, or implement it later when we need it.
Of these options, which would you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this intends to be a generic cache (per comment of Cache
). If this has intended use case for cache accessor, then it's fine to keep it as it is.
For the two options, they both look fine. Mutex locking is very cheap so there is no concern even if some use cases don't need concurrency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently there are no intended use cases outside of the cache accessor. Let's wait to add locking until we have a use case that justifies it.
Signed-off-by: Cody Littley <cody@eigenlabs.org>
Why are these changes needed?
Makes relay caches aware of the size of their content (necessary for sane caching with if we want to support both large blobs and small blobs).
This should not be reviewed until #918 merges.
Checks