(dev/core#217) PrevNext - Implement Redis. Decouple Query::getCachedContacts(). #12665

totten · 2018-08-14T23:41:55Z

Overview

NOTE: To avoid merge-conflicts, this PR extends/depends on #12664 -- which modifies similar sections of code. This is the final excerpt from #12377. Some commits have been rearranged/squashed to simplify the appearance, though the end result is the same.

This patch implements the first non-SQL storage mechanism for storing prev-next cache used by contact-search screens. It also includes a change in the CRM_Core_PrevNextCache_Interface and some small cleanups (as separate commits).

Before

The function CRM_Contact_BAO_Query::getCachedContacts() is hard-coded to only fetch CID's from the SQL-based cache.
The only implementation of CRM_Core_PrevNextCache_Interface is Sql.

After

The function CRM_Contact_BAO_Query::getCachedContacts() is more flexible -- accepting any CID's provided by CRM_Core_PrevNextCache_Interface::fetch().
There are two implementations of CRM_Core_PrevNextCache_Interface -- Sql and Redis.
There is an option ("Administer => Misc => PrevNext Cache") to choose the backend. By default, it auto-detects the best available backend.

(There's more discussion of changes in the commit messages.)

Technical Details

I had initially expected to store the entire result-set in Redis as one cache key, e.g.

$cache->set("{$prefix}/prevnext/{$cacheKey}", implode(',', $allMatchingContactIds);

$allMatchingContactIds = explode(',', $cache->get("{$prefix}/prevnext/{$cacheKey}"));

With a large result set (think "1 million contacts"), this would produce a large cache record -- and any page-view would have to fetch the entire list. But then I found that Redis and php-redis expose several data structures -- such hashes, lists, sets, and sorted-sets. The sorted-set API lets us track a list of integers, preserve the ordering, and fetch results on a paginated basis (ZRANGE / zRange()).

It seemed like the most performant approach would be to split the data into three pieces:

Store the main list of matching CID's as a sorted-set ({$prefix}/prevnext/{$cacheKey}/all)
Store the list of selected CID's as a sorted-set ({$prefix}/prevnext/{$cacheKey}/sel) with the same ordering (scores/weights) as the main list
Store the auxiliary data about each CID to a hash ({$prefix}/prevnext/{$cacheKey}/data)

If you open redis-cli while using the search interface, you can inspect the content of these keys, e.g.

#### Browse available cache keys
127.0.0.1:6380> keys */prevnext/*
1) "dmaster/prevnext/civicrm search 5221f7065db967c2256b4f2725912694_5264/all"
2) "dmaster/prevnext/civicrm search 5221f7065db967c2256b4f2725912694_5264/sel"
3) "dmaster/prevnext/civicrm search 5221f7065db967c2256b4f2725912694_5264/data"

#### Get list of all matching contacts from "all"
127.0.0.1:6380> ZRANGE "dmaster/prevnext/civicrm search 5221f7065db967c2256b4f2725912694_5264/all" 0 -1
  1) "51"
  2) "147"
  3) "123"
   ...

#### Get list of manually selected contacts from "sel"
127.0.0.1:6380> ZRANGE "dmaster/prevnext/civicrm search 5221f7065db967c2256b4f2725912694_5264/sel" 0 -1
1) "149"
2) "16"
3) "25"

#### Get auxiliary data from "data"
127.0.0.1:6380> hgetall "dmaster/prevnext/civicrm search 5221f7065db967c2256b4f2725912694_5264/data"
  1) "51"
  2) "Adams family"
  3) "147"
  4) "Adams, Andrew"
   ...

Comments

During development of the original #12377, I used this worksheet to organize fairly deep testing of the prev-next functionality. I haven't re-tested after the various cherry-picks/merges, but it could still be a good guide for thorough r-run testing.

civibot · 2018-08-14T23:41:57Z

(Standard links)

If this is your first pull-request for CiviCRM, please browse CONTRIBUTING.md for information about the development and testing processes.
If you are reviewing this pull-request, you may wish to consult the test sites and the Review Standards (long template, short template).

eileenmcnaughton · 2018-08-17T08:22:15Z

@totten so if you can rebase this I can review but I think it needs to be further split into

the rest of the cleanup
activate Redis caching for PrevNext

The reason being that we will need to performance test the second one which might take a bit longer to get answers on

eileenmcnaughton · 2018-08-27T03:29:58Z

@totten ^^

eileenmcnaughton · 2018-08-28T21:54:12Z

@totten can you rebase this?

(dev/core#217) Query::getCachedContacts - Use swappable fetch() instead of SQL JOIN The general context of this code is roughly as follows: * We've already filled up the prevnext cache with a bunch of contact-IDs. * The user wants to view a page of 50 contacts. * We want to lookup full information about 50 specific contacts for this page. It does makes sense to use `CRM_Contact_BAO_Query` for looking up the "full information" about contacts. However, the function `Query::getCachedContacts()` is hard-coded to read from the SQL-based prevnext cache. Before ------ * In `getCachedContacts()`, it grabbed the full SQL for `CRM_Contact_BAO_Query` and munged the query to: * Add an extra JOIN on `civicrm_prevnext_cache` (with a constraint on `cacheKey`) * Respect pagination (LIMIT/OFFSET) * Order results based on their position in the prevnext cache After ----- * In `CRM_Core_PrevNextCache_Interface`, the `fetch()` function provides one page-worth of contact IDs (in order). The `fetch()` function is tested by `E2E_Core_PrevNextTest`. * In `getCachedContacts()`, it doesn't know anything about `civicrm_prevnext_cache` or `cacheKey` or pagination. Instead, it just accepts CIDs for one page-worth of contacts. It returns contacts in the same order that was given. (dev/core#217) Implement Redis driver for PrevNext handling (dev/core#217) PrevNext - Cleanup parameter name in Sql::markSelection The new name is prettier and matches the names in `CRM_Core_PrevNextCache_{Interface,Redis}`. (dev/core#217) PrevNext - Add settings for admin to choose backend The auto-detection is a good default policy. However, this is new functionality. If some bug gets through the review/RC cycles, then this option provides an escape path.

totten · 2018-09-11T23:10:47Z

@eileenmcnaughton It's rebased now, so this of commits is now shorter.

(dev/core#217) Query::getCachedContacts - Use swappable fetch() instead of SQL JOIN The general context of this code is roughly as follows: * We've already filled up the prevnext cache with a bunch of contact-IDs. * The user wants to view a page of 50 contacts. * We want to lookup full information about 50 specific contacts for this page. It does makes sense to use `CRM_Contact_BAO_Query` for looking up the "full information" about contacts. However, the function `Query::getCachedContacts()` is hard-coded to read from the SQL-based prevnext cache. Before ------ * In `getCachedContacts()`, it grabbed the full SQL for `CRM_Contact_BAO_Query` and munged the query to: * Add an extra JOIN on `civicrm_prevnext_cache` (with a constraint on `cacheKey`) * Respect pagination (LIMIT/OFFSET) * Order results based on their position in the prevnext cache After ----- * In `CRM_Core_PrevNextCache_Interface`, the `fetch()` function provides one page-worth of contact IDs (in order). The `fetch()` function is tested by `E2E_Core_PrevNextTest`. * In `getCachedContacts()`, it doesn't know anything about `civicrm_prevnext_cache` or `cacheKey` or pagination. Instead, it just accepts CIDs for one page-worth of contacts. It returns contacts in the same order that was given. (dev/core#217) Implement Redis driver for PrevNext handling (dev/core#217) PrevNext - Cleanup parameter name in Sql::markSelection The new name is prettier and matches the names in `CRM_Core_PrevNextCache_{Interface,Redis}`. (dev/core#217) PrevNext - Add settings for admin to choose backend The auto-detection is a good default policy. However, this is new functionality. If some bug gets through the review/RC cycles, then this option provides an escape path.

eileenmcnaughton · 2018-09-18T22:40:47Z

@totten I have found a bug in this - when no results are found an sql error is generated in the getCachedContacts query - this fixes it but a better fix may be possible as I think (but haven't verified) that an early return would make sense in the event of no cids at this point

Bad query:

SELECT contact_a.id as contact_id, contact_a.contact_type as `contact_type`, contact_a.contact_sub_type as `contact_sub_type`, contact_a.sort_name as `sort_name`, contact_a.display_name as `display_name`, contact_a.do_not_email as `do_not_email`, contact_a.do_not_phone as `do_not_phone`, contact_a.do_not_mail as `do_not_mail`, contact_a.do_not_sms as `do_not_sms`, contact_a.do_not_trade as `do_not_trade`, contact_a.is_opt_out as `is_opt_out`, contact_a.legal_identifier as `legal_identifier`, 
contact_a.external_identifier as `external_identifier`, contact_a.nick_name as `nick_name`, contact_a.legal_name as `legal_name`, contact_a.image_URL as `image_URL`, contact_a.preferred_communication_method as `preferred_communication_method`, contact_a.preferred_language as `preferred_language`, contact_a.preferred_mail_format as `preferred_mail_format`, contact_a.first_name as `first_name`, contact_a.middle_name as `middle_name`, contact_a.last_name as `last_name`, contact_a.prefix_id as `prefix_id`, 
contact_a.suffix_id as `suffix_id`, contact_a.formal_title as `formal_title`, contact_a.communication_style_id as `communication_style_id`, contact_a.job_title as `job_title`, contact_a.gender_id as `gender_id`, contact_a.birth_date as `birth_date`, contact_a.is_deceased as `is_deceased`, contact_a.deceased_date as `deceased_date`, contact_a.household_name as `household_name`, IF ( contact_a.contact_type = 'Individual', NULL, contact_a.organization_name ) as organization_name, contact_a.sic_code as `sic_code`, contact_a.is_deleted as 
`contact_is_deleted`, IF ( contact_a.contact_type = 'Individual', contact_a.organization_name, NULL ) as current_employer, civicrm_address.id as address_id, civicrm_address.street_address as `street_address`, civicrm_address.supplemental_address_1 as `supplemental_address_1`, civicrm_address.supplemental_address_2 as `supplemental_address_2`, civicrm_address.supplemental_address_3 as `supplemental_address_3`, civicrm_address.city as `city`, civicrm_address.postal_code_suffix as `postal_code_suffix`, civicrm_address.postal_code as `postal_code`, civicrm_address.geo_code_1 as `geo_code_1`, civicrm_address.geo_code_2 as `geo_code_2`, civicrm_address.state_province_id as state_province_id, civicrm_address.country_id as country_id, civicrm_phone.id as phone_id, civicrm_phone.phone_type_id as phone_type_id, civicrm_phone.phone as `phone`, civicrm_email.id as email_id, civicrm_email.email as `email`, civicrm_email.on_hold as `on_hold`, civicrm_im.id as im_id, civicrm_im.provider_id as provider_id, civicrm_im.name as `im`, civicrm_worldregion.id as worldregion_id, civicrm_worldregion.name as `world_region`, 
(CASE
END
) AS _wgt 
 FROM civicrm_contact contact_a 
LEFT JOIN civicrm_address ON ( contact_a.id = civicrm_address.contact_id AND civicrm_address.is_primary = 1 ) LEFT JOIN civicrm_email ON (contact_a.id = civicrm_email.contact_id AND civicrm_email.is_primary = 1) LEFT JOIN civicrm_phone ON (contact_a.id = civicrm_phone.contact_id AND civicrm_phone.is_primary = 1)  
LEFT JOIN civicrm_im ON (contact_a.id = civicrm_im.contact_id AND civicrm_im.is_primary = 1) 
 LEFT JOIN civicrm_country ON civicrm_address.country_id = civicrm_country.id  
LEFT JOIN civicrm_worldregion ON civicrm_country.region_id = civicrm_worldregion.id  
WHERE  (  (  (  ( contact_a.sort_name LIKE 'kjkjhkhjkhjkjhjk%' ) )  )  )  AND (contact_a.is_deleted = 0) AND contact_a.id IN ()  
ORDER BY _wgt

eileenmcnaughton · 2018-11-04T21:04:48Z

test this please

eileenmcnaughton · 2018-11-04T21:39:33Z

@totten we've had this in production now for over a month without problems. I just tested the Redis implementation on staging. I exported the first name of all contacts with surnames starting with 'mc' - this gave me a set of contacts of a little over 180,000 which TBH I expected to be a bit slow / painful.

I am surprised to report that both with and without Redis on I would describe the experience as 'snappy'.

I also tried exporting contributions for that same number of contacts. It was anything but snappy and without altering indexes I couldn't get it to complete without a timeout (obviously 180k contacts represents more than 180k donations so the search is getting quite big now). I tried an index change and DID get it to complete - this is what I tried

ALTER TABLE civicrm_contact DROP KEY `index_is_deleted_sort_name`,
ADD KEY `index_sort_name_is_deleted_id` (`sort_name`,`is_deleted`,`id`);

The upshot of the testing being that this points to some new opportunities for improving speed but I think this patch is good to merge once the issue in my comment of Sep 19 is addressed

…ad of SQL JOIN The general context of this code is roughly as follows: * We've already filled up the prevnext cache with a bunch of contact-IDs. * The user wants to view a page of 50 contacts. * We want to lookup full information about 50 specific contacts for this page. It does makes sense to use `CRM_Contact_BAO_Query` for looking up the "full information" about contacts. However, the function `Query::getCachedContacts()` is hard-coded to read from the SQL-based prevnext cache. Before ------ * In `getCachedContacts()`, it grabbed the full SQL for `CRM_Contact_BAO_Query` and munged the query to: * Add an extra JOIN on `civicrm_prevnext_cache` (with a constraint on `cacheKey`) * Respect pagination (LIMIT/OFFSET) * Order results based on their position in the prevnext cache After ----- * In `CRM_Core_PrevNextCache_Interface`, the `fetch()` function provides one page-worth of contact IDs (in order). The `fetch()` function is tested by `E2E_Core_PrevNextTest`. * In `getCachedContacts()`, it doesn't know anything about `civicrm_prevnext_cache` or `cacheKey` or pagination. Instead, it just accepts CIDs for one page-worth of contacts. It returns contacts in the same order that was given.

The new name is prettier and matches the names in `CRM_Core_PrevNextCache_{Interface,Redis}`.

The auto-detection is a good default policy. However, this is new functionality. If some bug gets through the review/RC cycles, then this option provides an escape path.

totten · 2018-11-27T23:46:36Z

@eileenmcnaughton I've rebased and included https://gerrit.wikimedia.org/r/#/c/wikimedia/fundraising/crm/civicrm/+/461243/1/CRM/Contact/Selector.php as part of 2ca46d4#diff-143585e32dadc0e4857410eaad9206b4R585 (albeit with a ternary ? : notation).

I vaguely recall having some more discussion about whether to place the emptiness-check inside of or outside of getCachedContacts(); however, given that there is only one caller for getCachedContacts(), it doesn't seem like a big deal, so we'll just follow along with your patch.

Aside: re-reading this after some time, it's tempting to soften the transition policy -- though the fact that you've been using it in prod for a few months is a valid counter-point. If you're inclined toward 👍 on the more conservative policy, I can open it as a separate PR - but it's not a blocker here.

totten · 2018-11-28T00:37:07Z

jenkins, test this please

eileenmcnaughton · 2018-11-28T00:45:21Z

@totten so in fact we have been running in production still on msql caching - but I'm testing with Redis on staging. I'm just running some performance benchmarks at the moment - maybe we can get some Redis users to test the rc & we can add the conservative approach if need be? - I think @herbdool uses Redis

My understanding is that MemCache & APCCache are unaffected as they don't support the Redis change

totten · 2018-11-28T01:18:24Z

Aah, that's fine by me. It'd be great to get feedback from @herbdool or another Redis user. I've put the other commit (for more conservative transition policy) into its own PR (so that we can do it - or not - async).

You are correct that this only affects Redis -- there is no PrevNext driver for memcache or APC.

eileenmcnaughton · 2018-11-28T01:57:59Z

Breaking - I have some performance testing results

There are from hitting
civicrm/contact/search?_qf_Basic_display=true&qfKey=9c23cd81fab6d00c4172aa2f49613464_3875&force=1&crmPID=3

100 times with concurrency of 1 & of 5 (no noticeable difference) and each done on

array caching - blue
redis caching + sql prevnext caching (orangey)
redis caching + redis prevenext caching (green)

Note that I altered the page between each run. 3 runs per variant and I eliminated the slowest 20 % which have no obvious pattern but stretch the y axis making it hard to read

eileenmcnaughton · 2018-11-28T02:05:42Z

test this please

eileenmcnaughton · 2018-11-28T02:05:51Z

I think fails are unrelated

totten · 2018-11-28T21:03:52Z

For the record -- that's a really neat graph! 👍

Trying to simplify/analyze a bit, it seems like (with the data-set/use-case/environment under test):

The ArrayCache+SQL/blue-dots hover around 900ms (long-tail going higher, and dense block a little lower).
The Redis+SQL/orange-dots hover around 850ms (long-tail going higher, and dense block a little lower).
- That's roughly 50ms (6%) improvement over ArrayCache+SQL.
The Redis+Redis/green-dots hover around 790ms (long-tail going higher, and dense block a little lower).
- That's roughly 110ms (12%) improvement over ArrayCache+SQL.
- That's roughly 60ms (7%) improvement over Redis+SQL.

eileenmcnaughton · 2018-11-28T21:08:32Z

@totten yeah that sounds about right - I did some separate analysis on Redis on contact edit and I got about 12% performance lift there.

It DOES matter how your site is configured / provisioned. Locally I got a 38% improvement with Redis but it was actually worse once I added in concurrency. Obviously the people who set up Redis on our server did a better job than I did :-)

From a user experience POV - it takes longer locally (no Redis) to load the edit screen on a single contact than on our Redis enabled server. Obviously there are a LOT of differences between the 2 set ups - latency & DB size work against the remote whereas the server is much better provisioned & has Redis enabled. However, the difference is significant from a usablity POV - ie. the difference between 'snappy' and 'I got bored & flipped to a different tab'

eileenmcnaughton · 2018-11-28T21:10:38Z

@totten also locally I would note that while it is notably faster locally I have had to turn it off - just because there are things that cv flush misses with Redis turned on that make it a bit more painful to develop against when tinkering with things that might be cached (I think cached settings is one but I am not 100% sure what I was doing when I turned it off)

The setting `prevNextBackend` was introduced in PR civicrm#12665. The PR was originally written for an earlier version (circa 5.6) and eventually merged in a later version (circa 5.9). The metadata should match the version-number of the actual release.

civicrm-builder added the master label Aug 14, 2018

totten force-pushed the master-prevnext-redis branch from cb88479 to 120a87e Compare August 16, 2018 23:30

totten force-pushed the master-prevnext-redis branch from 120a87e to eb1e5ce Compare September 11, 2018 23:09

totten added 4 commits November 27, 2018 15:16

(dev/core#217) Implement Redis driver for PrevNext handling

751f3d9

(dev/core#217) PrevNext - Cleanup parameter name in Sql::markSelection

514813c

The new name is prettier and matches the names in `CRM_Core_PrevNextCache_{Interface,Redis}`.

(dev/core#217) PrevNext - Add settings for admin to choose backend

e28bf65

The auto-detection is a good default policy. However, this is new functionality. If some bug gets through the review/RC cycles, then this option provides an escape path.

totten force-pushed the master-prevnext-redis branch from eb1e5ce to e28bf65 Compare November 27, 2018 23:17

eileenmcnaughton added the merge on pass label Nov 28, 2018

totten mentioned this pull request Nov 28, 2018

(dev/core#217) PrevNext - More conservative transition (5.7=>5.8=>5.9) #13164

Merged

eileenmcnaughton merged commit cb5d518 into civicrm:master Nov 28, 2018

totten deleted the master-prevnext-redis branch November 28, 2018 21:04

totten mentioned this pull request Nov 28, 2018

(dev/core#217) Search.setting.php - Fix mismatched metadata #13168

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

(dev/core#217) PrevNext - Implement Redis. Decouple Query::getCachedContacts(). #12665

(dev/core#217) PrevNext - Implement Redis. Decouple Query::getCachedContacts(). #12665

totten commented Aug 14, 2018 •

edited

Loading

civibot bot commented Aug 14, 2018

eileenmcnaughton commented Aug 17, 2018

eileenmcnaughton commented Aug 27, 2018

eileenmcnaughton commented Aug 28, 2018

totten commented Sep 11, 2018

eileenmcnaughton commented Sep 18, 2018 •

edited

Loading

eileenmcnaughton commented Nov 4, 2018

eileenmcnaughton commented Nov 4, 2018 •

edited

Loading

totten commented Nov 27, 2018

totten commented Nov 28, 2018

eileenmcnaughton commented Nov 28, 2018

totten commented Nov 28, 2018

eileenmcnaughton commented Nov 28, 2018 •

edited

Loading

eileenmcnaughton commented Nov 28, 2018

eileenmcnaughton commented Nov 28, 2018

totten commented Nov 28, 2018 •

edited

Loading

eileenmcnaughton commented Nov 28, 2018 •

edited

Loading

eileenmcnaughton commented Nov 28, 2018 •

edited

Loading

(dev/core#217) PrevNext - Implement Redis. Decouple Query::getCachedContacts(). #12665

(dev/core#217) PrevNext - Implement Redis. Decouple Query::getCachedContacts(). #12665

Conversation

totten commented Aug 14, 2018 • edited Loading

Overview

Before

After

Technical Details

Comments

civibot bot commented Aug 14, 2018

eileenmcnaughton commented Aug 17, 2018

eileenmcnaughton commented Aug 27, 2018

eileenmcnaughton commented Aug 28, 2018

totten commented Sep 11, 2018

eileenmcnaughton commented Sep 18, 2018 • edited Loading

eileenmcnaughton commented Nov 4, 2018

eileenmcnaughton commented Nov 4, 2018 • edited Loading

totten commented Nov 27, 2018

totten commented Nov 28, 2018

eileenmcnaughton commented Nov 28, 2018

totten commented Nov 28, 2018

eileenmcnaughton commented Nov 28, 2018 • edited Loading

eileenmcnaughton commented Nov 28, 2018

eileenmcnaughton commented Nov 28, 2018

totten commented Nov 28, 2018 • edited Loading

eileenmcnaughton commented Nov 28, 2018 • edited Loading

eileenmcnaughton commented Nov 28, 2018 • edited Loading

totten commented Aug 14, 2018 •

edited

Loading

eileenmcnaughton commented Sep 18, 2018 •

edited

Loading

eileenmcnaughton commented Nov 4, 2018 •

edited

Loading

eileenmcnaughton commented Nov 28, 2018 •

edited

Loading

totten commented Nov 28, 2018 •

edited

Loading

eileenmcnaughton commented Nov 28, 2018 •

edited

Loading

eileenmcnaughton commented Nov 28, 2018 •

edited

Loading