(dev/core#217) PrevNext - Implement Redis. Decouple Query::getCachedContacts(). #12665
@totten so if you can rebase this I can review but I think it needs to be further split into
The reason being that we will need to performance test the second one which might take a bit longer to get answers on |
@totten ^^ |
@totten can you rebase this? |
(dev/core#217) Query::getCachedContacts - Use swappable fetch() instead of SQL JOIN

The general context of this code is roughly as follows:

* We've already filled up the prevnext cache with a bunch of contact-IDs.
* The user wants to view a page of 50 contacts.
* We want to look up full information about 50 specific contacts for this page.

It does make sense to use `CRM_Contact_BAO_Query` for looking up the "full information" about contacts. However, the function `Query::getCachedContacts()` is hard-coded to read from the SQL-based prevnext cache.

Before
------
* In `getCachedContacts()`, it grabbed the full SQL for `CRM_Contact_BAO_Query` and munged the query to:
  * Add an extra JOIN on `civicrm_prevnext_cache` (with a constraint on `cacheKey`)
  * Respect pagination (LIMIT/OFFSET)
  * Order results based on their position in the prevnext cache

After
-----
* In `CRM_Core_PrevNextCache_Interface`, the `fetch()` function provides one page-worth of contact IDs (in order). The `fetch()` function is tested by `E2E_Core_PrevNextTest`.
* In `getCachedContacts()`, it doesn't know anything about `civicrm_prevnext_cache` or `cacheKey` or pagination. Instead, it just accepts CIDs for one page-worth of contacts. It returns contacts in the same order that was given.

(dev/core#217) Implement Redis driver for PrevNext handling

(dev/core#217) PrevNext - Cleanup parameter name in Sql::markSelection

The new name is prettier and matches the names in `CRM_Core_PrevNextCache_{Interface,Redis}`.

(dev/core#217) PrevNext - Add settings for admin to choose backend

The auto-detection is a good default policy. However, this is new functionality. If some bug gets through the review/RC cycles, then this option provides an escape path.
@eileenmcnaughton It's rebased now, so this list of commits is now shorter.
@totten I have found a bug in this: when no results are found, an SQL error is generated in the getCachedContacts query. This fixes it, but a better fix may be possible, as I think (but haven't verified) that an early return would make sense in the event of no cids at this point. Bad query:
test this please |
@totten we've had this in production now for over a month without problems. I just tested the Redis implementation on staging. I exported the first name of all contacts with surnames starting with 'mc' - this gave me a set of contacts of a little over 180,000 which TBH I expected to be a bit slow / painful. I am surprised to report that both with and without Redis on I would describe the experience as 'snappy'. I also tried exporting contributions for that same number of contacts. It was anything but snappy and without altering indexes I couldn't get it to complete without a timeout (obviously 180k contacts represents more than 180k donations so the search is getting quite big now). I tried an index change and DID get it to complete - this is what I tried
The upshot of the testing being that this points to some new opportunities for improving speed but I think this patch is good to merge once the issue in my comment of Sep 19 is addressed |
@eileenmcnaughton I've rebased and included https://gerrit.wikimedia.org/r/#/c/wikimedia/fundraising/crm/civicrm/+/461243/1/CRM/Contact/Selector.php as part of 2ca46d4#diff-143585e32dadc0e4857410eaad9206b4R585 (albeit with a ternary).

I vaguely recall having some more discussion about whether to place the emptiness-check inside of or outside of

Aside: re-reading this after some time, it's tempting to soften the transition policy -- though the fact that you've been using it in prod for a few months is a valid counter-point. If you're inclined toward 👍 on the more conservative policy, I can open it as a separate PR - but it's not a blocker here.
jenkins, test this please |
@totten so in fact we have been running in production still on MySQL caching - but I'm testing with Redis on staging. I'm just running some performance benchmarks at the moment - maybe we can get some Redis users to test the RC & we can add the conservative approach if need be? - I think @herbdool uses Redis.

My understanding is that MemCache & APCCache are unaffected as they don't support the Redis change.
Aah, that's fine by me. It'd be great to get feedback from @herbdool or another Redis user. I've put the other commit (for more conservative transition policy) into its own PR (so that we can do it - or not - async). You are correct that this only affects Redis -- there is no PrevNext driver for memcache or APC. |
Breaking - I have some performance testing results. These are from hitting 100 times with concurrency of 1 & of 5 (no noticeable difference) and each done on
Note that I altered the page between each run. There were 3 runs per variant, and I eliminated the slowest 20%, which have no obvious pattern but stretch the y axis, making it hard to read.
test this please |
I think fails are unrelated |
For the record -- that's a really neat graph! 👍 Trying to simplify/analyze a bit, it seems like (with the data-set/use-case/environment under test):
@totten yeah that sounds about right - I did some separate analysis on Redis on contact edit and I got about 12% performance lift there. It DOES matter how your site is configured / provisioned. Locally I got a 38% improvement with Redis but it was actually worse once I added in concurrency. Obviously the people who set up Redis on our server did a better job than I did :-)

From a user experience POV - it takes longer locally (no Redis) to load the edit screen on a single contact than on our Redis-enabled server. Obviously there are a LOT of differences between the 2 setups - latency & DB size work against the remote whereas the server is much better provisioned & has Redis enabled. However, the difference is significant from a usability POV - i.e. the difference between 'snappy' and 'I got bored & flipped to a different tab'.
@totten also locally I would note that while it is notably faster locally I have had to turn it off - just because there are things that cv flush misses with Redis turned on that make it a bit more painful to develop against when tinkering with things that might be cached (I think cached settings is one but I am not 100% sure what I was doing when I turned it off) |
The setting `prevNextBackend` was introduced in PR civicrm#12665. The PR was originally written for an earlier version (circa 5.6) and eventually merged in a later version (circa 5.9). The metadata should match the version-number of the actual release.
Overview

This patch implements the first non-SQL storage mechanism for storing the prev-next cache used by contact-search screens. It also includes a change in `CRM_Core_PrevNextCache_Interface` and some small cleanups (as separate commits).

Before

* `CRM_Contact_BAO_Query::getCachedContacts()` is hard-coded to only fetch CID's from the SQL-based cache.
* The only implementation of `CRM_Core_PrevNextCache_Interface` is `Sql`.

After

* `CRM_Contact_BAO_Query::getCachedContacts()` is more flexible -- accepting any CID's provided by `CRM_Core_PrevNextCache_Interface::fetch()`.
* There are two implementations of `CRM_Core_PrevNextCache_Interface` -- `Sql` and `Redis`.

(There's more discussion of changes in the commit messages.)
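As a rough illustration of the "After" design, here is a minimal, language-neutral sketch in Python (not the PHP from this patch). The names `InMemoryPrevNext`, `fill`, and `get_cached_contacts`, the `fetch(cache_key, offset, row_count)` signature, and the dict standing in for the contact table are all assumptions made for the example. It shows the cache handing back one page of contact IDs in order, and the lookup step accepting only those IDs and returning rows in the same order, with an early return when the page is empty.

```python
from typing import Dict, List


class InMemoryPrevNext:
    """Hypothetical stand-in for a prev-next cache backend (not the real driver)."""

    def __init__(self) -> None:
        self._lists: Dict[str, List[int]] = {}

    def fill(self, cache_key: str, cids: List[int]) -> None:
        # Store the ordered list of contact IDs for one search result.
        self._lists[cache_key] = list(cids)

    def fetch(self, cache_key: str, offset: int, row_count: int) -> List[int]:
        # Return one page-worth of contact IDs, preserving cache order.
        return self._lists.get(cache_key, [])[offset:offset + row_count]


def get_cached_contacts(cids: List[int], contact_table: Dict[int, dict]) -> List[dict]:
    """Look up full rows for the given IDs; knows nothing about the cache or paging."""
    if not cids:
        # Early return: avoids building a lookup with an empty ID list.
        return []
    # Emulate an unordered "WHERE id IN (...)" lookup.
    rows = {cid: contact_table[cid] for cid in set(cids) if cid in contact_table}
    # Re-impose the order in which the IDs were given.
    return [rows[cid] for cid in cids if cid in rows]


# Usage: page 2 (offset 1, two rows) of a four-contact result set.
cache = InMemoryPrevNext()
cache.fill("search_1", [30, 10, 20, 40])
contacts = {cid: {"id": cid, "name": f"Contact {cid}"} for cid in (10, 20, 30, 40)}
page = cache.fetch("search_1", 1, 2)
page_rows = get_cached_contacts(page, contacts)
```

Because the lookup only sees a list of IDs, any backend that can produce an ordered page of IDs can be swapped in without touching the query code.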
Technical Details
I had initially expected to store the entire result-set in Redis as one cache key, e.g.
With a large result set (think "1 million contacts"), this would produce a large cache record -- and any page-view would have to fetch the entire list. But then I found that Redis and php-redis expose several data structures -- such as hashes, lists, sets, and sorted-sets. The sorted-set API lets us track a list of integers, preserve the ordering, and fetch results on a paginated basis (`ZRANGE` / `zRange()`).

It seemed like the most performant approach would be to split the data into three pieces:

* `{$prefix}/prevnext/{$cacheKey}/all`: the main list
* `{$prefix}/prevnext/{$cacheKey}/sel`: same ordering (scores/weights) as the main list
* `{$prefix}/prevnext/{$cacheKey}/data`

If you open `redis-cli` while using the search interface, you can inspect the content of these keys, e.g.

Comments
During development of the original #12377, I used this worksheet to organize fairly deep testing of the prev-next functionality. I haven't re-tested after the various cherry-picks/merges, but it could still be a good guide for thorough `r-run` testing.
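To make the key layout from Technical Details more concrete, the following is a small in-memory sketch in Python of how a sorted set supports ordered, paginated reads. It is not the Redis driver from this patch: `FakeRedis` is a hypothetical stand-in that imitates only the rank-based behaviour of `ZADD`, `ZSCORE`, and `ZRANGE`, and the helper names `fill`, `mark_selected`, and `fetch` are assumptions. The `/data` key is omitted because its structure is not described here.

```python
from typing import Dict, List, Optional


class FakeRedis:
    """Hypothetical in-memory imitation of Redis sorted sets (rank queries only)."""

    def __init__(self) -> None:
        self.zsets: Dict[str, Dict[int, float]] = {}

    def zadd(self, key: str, score: float, member: int) -> None:
        self.zsets.setdefault(key, {})[member] = score

    def zscore(self, key: str, member: int) -> Optional[float]:
        return self.zsets.get(key, {}).get(member)

    def zrange(self, key: str, start: int, stop: int) -> List[int]:
        # Order by score, then by member; stop is inclusive, as in ZRANGE.
        items = sorted(self.zsets.get(key, {}).items(), key=lambda kv: (kv[1], kv[0]))
        ordered = [member for member, _ in items]
        n = len(ordered)
        if start < 0:
            start = max(n + start, 0)
        if stop < 0:
            stop = n + stop
        if stop < 0:
            return []
        return ordered[start:stop + 1]


def key_for(prefix: str, cache_key: str, part: str) -> str:
    return f"{prefix}/prevnext/{cache_key}/{part}"


def fill(r: FakeRedis, prefix: str, cache_key: str, cids: List[int]) -> None:
    # The position in the result set becomes the score, which preserves ordering.
    for weight, cid in enumerate(cids):
        r.zadd(key_for(prefix, cache_key, "all"), weight, cid)


def mark_selected(r: FakeRedis, prefix: str, cache_key: str, cid: int) -> None:
    # Copy the score from the main list so the selection keeps the same ordering.
    score = r.zscore(key_for(prefix, cache_key, "all"), cid)
    if score is not None:
        r.zadd(key_for(prefix, cache_key, "sel"), score, cid)


def fetch(r: FakeRedis, prefix: str, cache_key: str, offset: int, row_count: int) -> List[int]:
    if row_count <= 0:
        return []
    # Only the requested page is read, not the whole list.
    return r.zrange(key_for(prefix, cache_key, "all"), offset, offset + row_count - 1)


r = FakeRedis()
fill(r, "civi", "abc", [30, 10, 20, 40, 50])
mark_selected(r, "civi", "abc", 40)
mark_selected(r, "civi", "abc", 10)
first_page = fetch(r, "civi", "abc", 0, 2)
selected = r.zrange(key_for("civi", "abc", "sel"), 0, -1)
```

Copying the score from the main list into the selection set is what lets the two sets agree on ordering, so a page of selected contacts appears in the same relative order as the full result.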