Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stale data when using Expiry #298

Closed
kojilin opened this issue Feb 15, 2019 · 11 comments
Closed

Stale data when using Expiry #298

kojilin opened this issue Feb 15, 2019 · 11 comments

Comments

@kojilin
Copy link

kojilin commented Feb 15, 2019

Recently we start to use Expiry but found some data keep in memory even it exceeds our setting.
After dump we saw stale data all has write time near 150 years later. And after checking source, I think it's from Async.MAXIMUM_EXPIRY.

Below is my sample code.

        final AsyncLoadingCache<String, String> cache = Caffeine
                .newBuilder()
                .expireAfter(new Expiry<String, String>() {
                    @Override
                    public long expireAfterCreate(@Nonnull String key, @Nonnull String value,
                                                  long currentTime) {
                        return Duration.ofDays(1).toNanos();
                    }

                    @Override
                    public long expireAfterUpdate(@Nonnull String key, @Nonnull String value, long currentTime,
                                                  long currentDuration) {
                        return currentDuration;
                    }

                    @Override
                    public long expireAfterRead(@Nonnull String key, @Nonnull String value, long currentTime,
                                                long currentDuration) {
                        return currentDuration;
                    }
                })
                .buildAsync((key, executor) -> CompletableFuture.supplyAsync(() -> {
                    return key + "'s value";
                }, executor));

I think there is a possibility that LocalAsyncLoadingCache#get -> BoundedLocalCache#computeIfAbsent may have race condition with create/update.
If there is a read operation and stop at isComputingAsync until task thread finished(but keep it stop at BoundedLocalCache#replace).
2019-02-16 1 28 01
and
2019-02-16 1 30 32

Read thread may enter the block with long duration and make expireAfterRead return MAXIMUM_EXPIRY, so setVariableTime may write now+MAXIMUM_EXPIRY.

2019-02-16 1 29 47

@ben-manes
Copy link
Owner

Thanks. I've been meaning to switch reads to use a compare-and-swap to update the timestamp, but thought it was a benign race so it was in the backlog. I hadn't thought of this case, obviously.

Do you think you can write a test case to demonstrate the race? I'm hoping to make a release this weekend when I merge the async branch (fix test cases). I'll try to fix this too for that release.

@kojilin
Copy link
Author

kojilin commented Feb 15, 2019

I can't reproduce without pausing thread, a little hard to write test case.

@ben-manes
Copy link
Owner

Oh, take a look at Awaitility which is really handy for these cases. For example,

public void getFunc_absent_cancelled(AsyncCache<Integer, Integer> cache,
CacheContext context) {
AtomicBoolean done = new AtomicBoolean();
CompletableFuture<Integer> valueFuture = cache.get(context.absentKey(), k -> {
Awaits.await().until(done::get);
return null;
});
valueFuture.whenComplete((r, e) -> done.set(true));
valueFuture.cancel(true);
Awaits.await().untilTrue(done);
await().until(() -> context, both(hasMissCount(1)).and(hasHitCount(0)));
await().until(() -> context, both(hasLoadSuccessCount(0)).and(hasLoadFailureCount(1)));
assertThat(valueFuture.isDone(), is(true));
assertThat(cache.getIfPresent(context.absentKey()), is(nullValue()));
}

but I can also give it a shot when I get free over the weekend. Its a 3 day'r in the U.S. so I should be able to find the time.

@kojilin
Copy link
Author

kojilin commented Feb 15, 2019

I see. I tried it but looks like problem is Completable just finish but before calling BoundedLocalCache#replace, LocalAsyncLoadingCache#get -> BoundedLocalCache#computeIfAbsent need to check isComputingAsync and into block.(I mean I didn't find the place can let it await during completion of Completable and BoundedLocalCache#replace.

@ben-manes
Copy link
Owner

okay, thanks. I think this is an easy fix if I understand it correctly, as I've been meaning to get around to making the update on read stricter. Since you found its not easily testable, we can try testing using a thrashing approach, like the MultiThreadedTest does.

@ben-manes
Copy link
Owner

I have a possible fix, but I need to manually verify like you did to see it fail and (maybe) succeed. Then devise if I can get a test case by thrashing it with threads until it breaks. Hopefully have this fixed and released over the weekend.
https://github.com/ben-manes/caffeine/tree/expiry

@ben-manes
Copy link
Owner

ben-manes commented Feb 23, 2019

I wrote a test, got it to fail, and verified that my change passes it. This was using Awaitility to setup the at Expiry calls.

Technically I shouldn't call expireAfterRead if the currentDuration is greater than the user settable maximum (220yrs in-flight, 150yrs cap for user setting). That is both more correct if there is logic in the expiry (perhaps even expensive) and avoids this problem. It also breaks my test as I can't wait.

I'll take another crack at the test, but otherwise I'm close enough to promise a release over the weekend.

@kojilin
Copy link
Author

kojilin commented Feb 24, 2019

Thanks, didn't aware my speculation can replay like the test case. 🙇
I will try new version after new release.

@ben-manes
Copy link
Owner

Released 2.7

@kojilin
Copy link
Author

kojilin commented Mar 14, 2019

Hi, just wanna give the result, seems the problem is solved. Thanks.

@ben-manes
Copy link
Owner

Great! Thank you for investigating and reporting the bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants