add cache_info #43

Merged: 3 commits merged into JuliaCollections:master from eph/cache-info on Dec 9, 2023

Conversation

ericphanson
Member

closes #28

I think this should have minimal performance impact, since we simply increment an integer on hits/misses. We are already inside the lock in each case, so we don't need to acquire it separately.

Some choices:

  • I reset the stats on empty!, like how functools clears both entries and stats on cache_clear().
  • I only count hits/misses on get and get!, since those are the only viable methods to use for caching. E.g. setindex! doesn't care about the existing value, so it doesn't make sense to count it as a hit/miss. And getindex errors if the value is not present, so that shouldn't count as a cache miss (rather, it is exceptional behavior). This aligns with the README, which only describes get and get! as fulfilling a caching interface.
  • I use a show method similar to functools's, since it clearly communicates how to access the entries. I switched the order slightly compared to functools (currentsize before maxsize, because it makes more sense to me).
  • cache_info provides an immutable snapshot rather than a reference to the cache. This way there isn't a need for locking on access to the cache info fields (see the sketch after this list).
  • I use "properties" rather than "fields" in the docstring/documentation so if needed we can change the fields while exposing getproperty overloads for api compatibility. I don't forsee that we would need this though.

To check that this is correct with respect to this API, verify that every branch of get and get! increments either a hit or a miss, and does so within the lock. (I believe I've done so.)
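
As an illustration of that check, and reusing the hypothetical ToyLRU fields from the sketch above (again, not the package's actual method), a get! satisfying the invariant could look like this:

# Every return path increments exactly one of hits/misses, inside the lock.
# Eviction logic is omitted; this only sketches the counting invariant.
function Base.get!(default::Base.Callable, lru::ToyLRU, key)
    lock(lru.lock) do
        if haskey(lru.dict, key)
            lru.hits += 1      # hit: the value is already cached
            return lru.dict[key]
        else
            lru.misses += 1    # miss: compute, store, and return the value
            value = default()
            lru.dict[key] = value
            return value
        end
    end
end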

@ericphanson
Member Author

BTW, functools can't really help us decide what should count as a hit/miss (just get/get! as I've done here, or also other methods), because functools only supports a @memoize-style function-wrapping approach. So the only API for accessing the cache with functools is calling the wrapped function, which is essentially get!.

@ericphanson ericphanson requested a review from Jutho November 21, 2023 11:32
@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison: base (0d2635d) 77.06% vs. head (534f84e) 78.45%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #43      +/-   ##
==========================================
+ Coverage   77.06%   78.45%   +1.39%     
==========================================
  Files           2        2              
  Lines         279      297      +18     
==========================================
+ Hits          215      233      +18     
  Misses         64       64              


@ericphanson ericphanson mentioned this pull request Nov 22, 2023
@ericphanson
Member Author

ericphanson commented Dec 2, 2023

I think the case that get/get! doesn't cover is when someone writes out the definition of those by hand, e.g.:

function weird_cache_user(cache, x)
    if haskey(cache, x)
        return cache[x]   # conceptually a cache hit
    else
        cache[x] = f(x)   # conceptually a cache miss; the assigned value is returned
    end
end

Here, we are clearly using cache as a cache, and the outcome of this function "should" be a hit if x was in the cache and a miss otherwise. To actually implement that, one could either count a hit/miss from haskey, or a hit from getindex and a miss from setindex!. The latter seems problematic, as one could imagine setindex! being used to update a cached value, or to force a value into the cache, which shouldn't really be a miss.

I think having haskey count as a hit/miss could make sense, but it is also a bit unintuitive for haskey to mutate state. One could imagine using haskey just to check cache status, in a way where one would not want it to affect the hit/miss counts.

Therefore, I think having get/get! be the only functions that update hits/misses still makes sense, since those seem like the only times we reliably know whether it was actually a cache hit/miss versus something else.
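
For comparison, the counting-friendly version of the earlier snippet is just get! with a callable default; f is the same hypothetical function as in weird_cache_user, and the wrapper name below is made up for illustration:

# Equivalent logic via get!, the form that does update the hit/miss counters.
counting_cache_user(cache, x) = get!(() -> f(x), cache, x)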

That's really the only design question I had open, and I've managed to convince myself it is right 😅. So then there's the question of "do we want this feature", and from #28 it sounds like the answer is yes. And there's the question "is it implemented correctly"; I checked again, and indeed every hit/miss is inside the lock, and every return path of get and get! increments a hit or a miss.

Therefore, I think we should reverse the inertia here: if there's no objection within a week, I'll merge and tag this. (If there is a review before then, much appreciated!)

@Jutho
Collaborator

Jutho commented Dec 9, 2023

I believe this can be merged and the new version can be tagged. Thanks for all this nice work @ericphanson.

@ericphanson ericphanson merged commit 3125353 into JuliaCollections:master Dec 9, 2023
12 checks passed
@ericphanson ericphanson deleted the eph/cache-info branch December 9, 2023 21:20
Successfully merging this pull request may close these issues: CacheInfo reporting.