Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRM-20858: Fix Merging of Unselected Custom Fields #10831

Conversation

MiyaNoctem
Copy link
Contributor

@MiyaNoctem MiyaNoctem commented Aug 8, 2017

Overview

Custom field values were being migrated even when not specifically told to during the merge process. This was happening even though tests designed to verify this were running ok.

Before

If a contact that has values stored in a custom group table is merged into a contact that doesn't have a record in that table, all values in the source target will be merged into the target contact, even if those values were not explicitly selected in the merge operation.

This happened because part of the merging operation updates all database references from the source contact to the target contact on all tables. When the target contact has a record on a custom group table, this update will fail, since the table's configuration enforces the entity_id (the reference to the contacts table) to be unique. However, when the target contact has no record on that custom group table while the source contact does have a record, and the merge is performed, the original record will be updated to reference the target contact, and hence inherit ALL values of the source contact, even if none of the values were selected to be merged.

After

The solution checks for custom tables that have fields that were selected in the merge, and passes this list of tables as an argument to CRM_Dedupe_Merger::moveContactBelongings() method. This list is used to skip custom tables that have no fields being merged, and to create an empty record on the custom table for contacts that don't have a record already there, and have some fields of the custom table being merged into the contact.

Technical Details

The existing test that was supposed to capture this problem had some errors when setting up the contacts. When it called on the 'create' API action, it was sending the wrong custom field's name. This caused all contacts to have no record in the custom group table. Hence, when the contact's were merged, the problem did not present itself, as no references were updated (none existed). Fixing this caused all contacts to have records in the custom group table, thus the problem was still not detected.

Two more tests were added to verify merging into a contact with no record in a custom group table did not merge field values, unless explicitly selected. The first one to verify that merging a contact with a record on a custom group table does not merge its values into a target contact with no record in the custom group table; the second one to check that if only some fields of a custom group table are selected to be merged, only those values are copied, leaving all others unset.

Comments

This work is based on what was started on PR #10658.


1. I think we'll need some more error-handling. The code should probably throw an exception if any of those API calls fail?
-Done

2. There are two calls to 'moveContactBelongings' - won't the first call be affected by the additional code you've added that 'ignores' changes in they're not in the array?
- Handled
PhpStorm's tabs are now fixed to spaces.
If a contact that has values stored in a custom group table is merged into a
contact that doesn't have a record in that table, all values in the source
target will be merged into the target contact, even if those values were
not explicity selected to the merge operation.

The merging operation updates all database references from the source contact
to the target contact on all tables. When the target contact has a record on
a custom group table, this update will fail, since the table's configuration
enforces the entity_id (the reference to the contacts table) to be unique.
However, when the target contact has no record on that custom group table, the
source contact does have a record and the merge is performed, the original
record will be updated to reference the target contact, and hence inherit ALL
values of the source contact, even if none of the values were selected to be
merged.

Added two tests, the first one to verify that merging a contact with a record
on a custom group table does not merge its values into a target contact with
no record in the custom group table; the second one to check that if only some
fields of a custom group table are selected to be merged, only those values
are copied, leaving all others unset.
When a Custom Group had more than one custom field and only some of those
fields were selected to be merged, all the fields in the custom group were
merged. This happened exclusively on contacts that had no record on the
custom group's table.

Fixed this by inserting a record on the custom group table before updating
references from source contact to target contact, this way, all values in the
custom group will start empty and only submitted values will get assigned.

Also refactored the solution to make it easier to read and used more descriptive
variable and function names.
@MiyaNoctem MiyaNoctem force-pushed the CRM-20858-fix-merge-of-unselected-custom-fields branch from 29aa8f7 to 4f75197 Compare August 9, 2017 14:34
@MiyaNoctem MiyaNoctem changed the title CRM-20858: (WIP) Fix Merging of Unselected Custom Fields CRM-20858: Fix Merging of Unselected Custom Fields Aug 9, 2017
Test for merging custom fields with is_view flag on was failing. This was
happening because the same field was being used on prior tests, setting the
is_view flag at the start of the failing test. Fixed by creating a new custom
field with the is_view flag set as 1, and using that field on the test.
@JKingsnorth
Copy link
Contributor

I've started reviewing this, and I'll finish it tomorrow. Functionality-wise it looks good so far - thanks @Eaiman and @MiyaNoctem!

Copy link
Contributor

@JKingsnorth JKingsnorth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Eaiman and @MiyaNoctem for your work on this!

I've tested the functionality against the original steps to recreate and it works well.

I've also tested:

  • Taking across the fields on the merge screen when there are none on the main contact: works.
  • Taking across the values on the merge screen when there is a value on the main contact: overwrites the values - works.
  • Not taking across a custom text field: works, not taken across.
  • Not taking across 'Activities': works, not taken across.

I've also tested a force batch merge, and it seems to work as I would expect it to:

  • If there is a value on the duplicate but not the main: the duplicate value is taken across.
  • If there is a value on both: the main value is preserved.

A few questions regarding the code:

I don't think the file permissions should be changing to 755?

With the try-catch blocks - is 'skipping' the custom field the best approach? Shouldn't be at least also present a warning or error to the user? What is the normal procedure for when an internal API call, that we are expecting to succeed, fails?

Do we still need to call moveContactBelongings twice here (with different params)? Can this be optimised?
https://github.com/MiyaNoctem/civicrm-core/blob/4b047375cf185613b1ccd8a768eaf1fa646a82de/CRM/Dedupe/Merger.php#L1542-L1550
I don't want to over complicate things here though, I'd rather we get the issue fixed (without breaking other stuff) and look at optimisations later.

Finally, @Eileen I know you've done a lot of work with batch merging - does this all look OK from your point of view?

moveContactBelongings() was bing called twice, first to move selected related
tables, and the second to move all other tables that reference contacts table.

Refactored moveContactBelongings() method so that it can be called only once,
by calculating in one go the tables that have to be taken into account when
merging, according to what has been selected to move.  Basically, it searches
for all tables related to the contact, removes related tables, and then
includes only those related tables selected to be merged.
The try catch blocks were apparently added because not all fields passed were
custom fields, generating exceptions when the API call was made to obtain its
option group. This happened because actually all fields selected for merge
were being inserted into the $submittedCustomFields array, instead of just
custom fields. Plus, obtaining the field's ID on a field that was not custom
rendered strange strings to be used on search of the field, as this ID is
obtained by getting a substring from the field's key, that should have the
format 'move_custom_<fieldid>' (fields that are not custom don't have that
format).
Left out selected related tables from affected tables array, which caused
several tests to fail. Merged tables array into affected tables array.
@MiyaNoctem
Copy link
Contributor Author

@JKingsnorth, thanks for your feedback!

I've changed back file permissions to 644. I was also able to refactor the moveContactBelongings() method so it is called only once, by calculating in one go all tables that should be affected by the merge.

Finally, I removed the try/catch blocks used when building the list of affected custom tables, and found they were there because the values passed to the function where not all an actual ID for a custom field, so I had to fix how this array of custom fields' IDs was being populated too. The try/catch blocks are no longer there.

Waiting for Jenkins now!

@JKingsnorth
Copy link
Contributor

Thanks @MiyaNoctem - I'll run through the tests again tomorrow, including one for memberships, since I see there's a change to how that's handled too, but this sounds good.

My ping was wrong in the last comment, I meant to ask @eileenmcnaughton - I know you've done a lot of work with batch merging, does this all look OK from your point of view?

@JKingsnorth
Copy link
Contributor

Manually retested the original issue: OK
Manually tested that rel tables are not migrated when not selected: OK
Manually tested that rel tables are migrated when selected: OK

I manually tested the issue described in CRM-12695 - but I found that that issue has regressed anyway, and it is broken in 4.7.23. I've reopened that issue.

Could someone confirm? Since this PR simplifies the code around CRM-12695 I think we should proceed with the merge here, and tackle the other issue separately? (It is not really related to this PR).

These lines are no longer required, since the variable is no longer used:
https://github.com/MiyaNoctem/civicrm-core/blob/597697a7e5e43a2e99376974b4d15e3a6017aad4/CRM/Dedupe/Merger.php#L447-L448

Other than that I think we're good to go.

@MiyaNoctem
Copy link
Contributor Author

@JKingsnorth, I've removed the variable. Just waiting for Jenkins now, but should be ok.

@eileenmcnaughton
Copy link
Contributor

@JKingsnorth I'm going to re-close CRM-12695 - can you open a new one. It's really confusing re-opening issues when the are fixed in one version & then recur later (ahem did someone mention unit tests)

@eileenmcnaughton
Copy link
Contributor

I'm going to merge this based on @JKingsnorth review and the additional unit tests. For context I consider John to be an expert in this area of code and in testing it well (which he has done on other occassions).

The change affect merge functionality, both via the form and via the batch merge. There are already a number of unit tests covering this code from the batch merge perspective, as well as a lesser number from the form perspective.

@agh1 you may have a vested interest in the regression on CRM-12695 observed in conjunction with this PR (but not caused by it.)

@eileenmcnaughton eileenmcnaughton merged commit 4c46020 into civicrm:master Aug 31, 2017
@JKingsnorth
Copy link
Contributor

OK @eileenmcnaughton, @agh1 I've created a new issue here: https://issues.civicrm.org/jira/browse/CRM-21151 regarding the cascading of memberships after merge. (I shall do this in future rather than reopening - makes sense!)

Thanks everyone for your work on this. Great to finally get the fix in place!

@JKingsnorth
Copy link
Contributor

@MiyaNoctem FYI, this may have caused a regression, as reported here https://issues.civicrm.org/jira/browse/CRM-21773 .

@mlutfy mlutfy added this to the 4.7.32 milestone Mar 6, 2018
@mlutfy mlutfy removed this from the 4.7.32 milestone Mar 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants