Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't clear the GroupContactCache until we're ready to insert the new version #21384

Merged
merged 1 commit into from
Sep 17, 2021

Conversation

mattwire
Copy link
Contributor

@mattwire mattwire commented Sep 6, 2021

Overview

See technical details below. We appear to be clearing group_contact_cache multiple times and may be partially responsible for deadlocks I'm still seeing on a client site.

Before

Clear the contact cache and reset cache date part way through build of temp table for new cache.

After

Cache invalidated before build, cleared before update.

Technical Details

Previously we were clearing the cache twice:

  1. CRM_Contact_BAO_GroupContactCache::load() calls buildGroupContactTempTable() which calls insertGroupContactsIntoTempTable() which called clearGroupContactCache() which removed data from the cache table and cleared the cache date.
  2. updateCacheFromTempTable() had it's own version of the query to remove data from the cache table but the data should have already been removed by 1. above and will only exist if we have processes running in parallel to rebuild the caches.

Now we explicitly clear the cache once only:

    1. CRM_Contact_BAO_GroupContactCache::load() calls buildGroupContactTempTable() which calls insertGroupContactsIntoTempTable().
  1. clearGroupContactCache() is called explicitly but does not reset the cache date and does not need to run within a transaction.
  2. updateCacheFromTempTable() does not need to do any cache clearing and just inserts directly.

Comments

@eileenmcnaughton

@civibot
Copy link

civibot bot commented Sep 6, 2021

(Standard links)

@civibot civibot bot added the master label Sep 6, 2021
@mattwire mattwire changed the title WIP Don't clear the GroupContactCache until we're ready to insert the new version Don't clear the GroupContactCache until we're ready to insert the new version Sep 6, 2021
@eileenmcnaughton
Copy link
Contributor

@mattwire Looking at it I think maybe we should instead just remove self::clearGroupContactCache($groupID); from insertGroupContactsIntoTempTable

insertGroupContactsIntoTempTable is called from buildGroupContactTempTable (only)

buildGroupContactTempTable is called from 2 places - both call updateCacheFromTempTable afterwards - so in both cases they will clear & update the temp table after buildGroupContactTempTable has run

Obviously this code has history but I think the only reason to call self::clearGroupContactCache($groupID); from insertGroupContactsIntoTempTable would be as a form of 'I'm just about to do the expensive group calculation so hold my drink' signalling. However, I thinking we have other locking mechanisms....

@mattwire
Copy link
Contributor Author

mattwire commented Sep 7, 2021

@eileenmcnaughton Yes, I've made lot's of iterations over this code and I know you have recently too. So I think this is just a further iteration now it's a bit clearer.

I do like the idea of having the actual clear in it's own function rather than hidden inside the updateCacheFromTempTable() - it feels like that's easier to understand and allows for further optimisation later. But as long as we can get the improvement I don't really mind?

@eileenmcnaughton
Copy link
Contributor

@mattwire OK - I think I'm following now.

So what we are 'losing' in this approach is the idea that we don't set the cache_date to NULL unless we are sure it has cleared properly.

But I think the theory is that it doesn't matter - as long as it is 'invalid' it won't be used again until it is updated. So we don't need that bit.

So I think I'm OK with this but ideally you would deploy it in production for a wee while & then we merge.

As an aside - there was an idea that it is possible to spawn a process after the browser is released & we could transfer the temp table to the cache table in a shutdown process after results have been delivered. However, I haven't managed to figure out if that really is possible or just something @totten thought was possible.

@mattwire
Copy link
Contributor Author

@eileenmcnaughton This has been in production for 10 days now and is working well.

I'm still seeing some deadlocks but they're now consistently when the DELETE FROM civicrm_group_contact_cache query is run - before it was a bit more random. So still further improvements to be made but this definitely seems to be a good improvement.

Marking as merge ready based on your previous comments and that it has now had some production run-time.

@mattwire mattwire added the merge ready PR will be merged after a few days if there are no objections label Sep 17, 2021
@eileenmcnaughton
Copy link
Contributor

OK I read through this again and I think it's right. Let's merge

If we can figure out what is conflicting with the DELETE FROM civicrm_group_contact_cache query runs we might be able to improve the locking mechanism

@eileenmcnaughton
Copy link
Contributor

@pfigel fyi

@eileenmcnaughton eileenmcnaughton merged commit 4210cc4 into civicrm:master Sep 17, 2021
@mattwire mattwire deleted the groupcontactcache branch September 18, 2021 09:11
@mattwire
Copy link
Contributor Author

Thanks @eileenmcnaughton I may not be able to get further than #21430 in investigating as funding for this round has pretty much run out. But will see if there are any further clues in the logs etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
master merge ready PR will be merged after a few days if there are no objections
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants