Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Phase out CIVICRM_TEMP_FORCE_UTF8 #13658

Merged
merged 1 commit into from
Feb 25, 2019
Merged

Conversation

mfb
Copy link
Contributor

@mfb mfb commented Feb 21, 2019

Overview

Currently, temporary tables do not set utf8 as their default character set unless CRM_Utils_SQL_TempTable::setUtf8() is called. However, I cannot think of a reason to allow non-UTF-8 tables.

Before

Temporary tables default to whatever is the default on the MySQL server (e.g. latin1).

After

We could make them UTF-8 by default, and we could also prevent them from having some other default character set, as a simplification. A custom character set and/or collation can still be set for specific columns.

@civibot
Copy link

civibot bot commented Feb 21, 2019

(Standard links)

@eileenmcnaughton
Copy link
Contributor

@totten I'm OK with this - I thought that not making it always true in the first place was too cautious since from what I can tell all temp tables were created without being set to utf8 & then individually the ones that resulted in bug reports were changed. I think we should merge this & possibly fix up some more temp tables to use this class (but not make larger changes prior to the next rc being cut)

@mfb
Copy link
Contributor Author

mfb commented Feb 24, 2019

Personally I don't think it's necessary to add this to civicrm.settings.php.template if we're planning to deprecate it, and it also defaults true in the code.

@eileenmcnaughton
Copy link
Contributor

@mfb I agree. I'm on the fence about deprecating the function too.

However, changing to

$t->utf8 = CRM_Utils_Constant::value('CIVICRM_TEMP_FORCE_UTF8', TRUE);

Makes sense to me

@eileenmcnaughton
Copy link
Contributor

or even

 $t->utf8 = TRUE;

@mfb
Copy link
Contributor Author

mfb commented Feb 24, 2019

The method just seemed like overengineering to me, but not bad per se.

@mfb mfb marked this pull request as ready for review February 24, 2019 22:04
@eileenmcnaughton
Copy link
Contributor

@mfb this is odd - maybe try rebasing?

[GitScan\Exception\ProcessErrorException]
Process failed:
[[ COMMAND: git apply --check --ignore-whitespace ]]
[[ CWD: /home/jenkins/bknix-dfl/build/core-13658-6sbeu/sites/all/modules/civicrm ]]
[[ EXIT CODE: 1 ]]
[[ STDOUT ]]
[[ STDERR ]]
error: patch failed: CRM/Utils/SQL/TempTable.php:93
error: CRM/Utils/SQL/TempTable.php: patch does not apply

@eileenmcnaughton
Copy link
Contributor

test this please

@mfb
Copy link
Contributor Author

mfb commented Feb 25, 2019

At some point soon, I think it would be ok to completely remove CIVICRM_TEMP_FORCE_UTF8 and remove the (so far) two calls to setUtf8() since they aren't necessary.

@eileenmcnaughton
Copy link
Contributor

test this please

@seamuslee001
Copy link
Contributor

Jenkins re test this please

1 similar comment
@seamuslee001
Copy link
Contributor

Jenkins re test this please

@seamuslee001
Copy link
Contributor

Jenkins re test this please

@eileenmcnaughton
Copy link
Contributor

@seamuslee001 perhaps the reason it keeps failing is it keeps failing

@seamuslee001
Copy link
Contributor

@eileenmcnaughton but i don't think its the same test its failing on tho

@mfb
Copy link
Contributor Author

mfb commented Feb 25, 2019

jenkins test this please

@eileenmcnaughton eileenmcnaughton merged commit 41cef03 into civicrm:master Feb 25, 2019
@totten
Copy link
Member

totten commented Feb 25, 2019

I think we should merge this & possibly fix up some more temp tables to use this class (but not make larger changes prior to the next rc being cut)

@mfb @eileenmcnaughton - One suggestion: if you haven't already, then update on your production systems to set CIVICRM_TEMP_FORCE_UTF8 to TRUE, e.g. tweak civicrm.settings.php:

define('CIVICRM_TEMP_FORCE_UTF8', TRUE);

I agree with y'all that the policy change here sounds sensible. I merely used that default because it was hard to demonstrate its safety (e.g. it has shallow+wide impact which is hard to measure). But the sooner we get some systems using that policy (with an organically wide range of use-cases), the more reasonably we can infer that it's safe.

@mfb
Copy link
Contributor Author

mfb commented Feb 25, 2019

I have set this in production and will see if anything weird happens - although I think this constant is not used that often since various tables don't actually use the API.

Not using utf8 for temporary tables can definitely lead to weird bugs if the mysql server uses something aside from utf8 as its default character set (e.g. latin1) and there is non-ASCII text in the table.

@eileenmcnaughton
Copy link
Contributor

I've made the change on the small site I deal with - not WMF as making a change needs more steps there

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants