syncer(dm) : fix default collation with upstream in create table statement (#3575) #3753


ti-chi-bot
Member

This is an automated cherry-pick of #3575

What problem does this PR solve?

fix https://github.com/pingcap/ticdc/issues/3420

Consider these upstream statements:
sql1: create database if not exists test
sql2: create database test CHARACTER SET=utf8mb4
sql3: create table test.t1 (id int)
sql4: create table test.t1 (id int) CHARSET=utf8mb4

If the syncer sends the same DDL to the downstream TiDB, the database or table will have a different collation from the upstream MySQL, because the default collation differs between TiDB and MySQL.
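
As a rough illustration of the mismatch, the sketch below compares per-charset default collations. The two tables are assumptions for illustration: they list the defaults documented for MySQL 5.7 and for TiDB with its stock configuration, and `diverging_charsets` is a hypothetical helper, not part of DM. Check the values against the versions actually deployed.

```python
# Assumed defaults: MySQL 5.7 upstream and TiDB with its default configuration.
MYSQL57_DEFAULTS = {"utf8": "utf8_general_ci", "utf8mb4": "utf8mb4_general_ci"}
TIDB_DEFAULTS = {"utf8": "utf8_bin", "utf8mb4": "utf8mb4_bin"}


def diverging_charsets(upstream, downstream):
    """Return the charsets whose default collation differs between the two sides."""
    return sorted(
        charset
        for charset in upstream
        if charset in downstream and upstream[charset] != downstream[charset]
    )


print(diverging_charsets(MYSQL57_DEFAULTS, TIDB_DEFAULTS))
```

Any charset reported here is one where a DDL statement without an explicit COLLATE resolves to different collations upstream and downstream.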

What is changed and how it works?

It adjusts the collation when syncing a create database or create table statement.

For a database, it adds the collation taken from the binlog query event, so the DDL is handled like this:
sql1: create database if not exists test COLLATE = utf8mb4_general_ci
sql2: create database test CHARACTER SET=utf8mb4 COLLATE = utf8mb4_general_ci
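
The database rewrite can be sketched as follows. This is a string-based illustration only: `add_database_collation` is a hypothetical helper, and the collation is passed in as a plain argument standing for the value read from the binlog query event. The actual change presumably works on the parsed statement rather than on raw text.

```python
import re

_COLLATE_RE = re.compile(r"\bCOLLATE\b", re.IGNORECASE)


def add_database_collation(ddl: str, upstream_collation: str) -> str:
    """Append an explicit COLLATE option unless the statement already has one."""
    stmt = ddl.rstrip().rstrip(";").rstrip()
    if _COLLATE_RE.search(stmt):
        # The upstream statement is already explicit; leave it untouched.
        return stmt
    return f"{stmt} COLLATE = {upstream_collation}"


for ddl in (
    "create database if not exists test",
    "create database test CHARACTER SET=utf8mb4",
):
    print(add_database_collation(ddl, "utf8mb4_general_ci"))
```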

For a table with no charset and no collation, the collation cannot be obtained from the binlog, and there is a timeliness problem because DM can set the start position in incremental mode. So DM only checks for this case and writes a warning log.
In this case the downstream uses the database's collation for the table, so the user can let DM sync the database DDL. Otherwise, if the database is created manually, the user should make sure its collation is correct. The DDL is handled like this:
sql3: create table test.t1 (id int), plus the warning log: detect create table risk which use implicit charset and collation
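
A minimal sketch of that check, assuming a simple string-based approach: the table options are taken to be whatever follows the last closing parenthesis, which ignores column-level charsets but would misfire on option values that themselves contain `)`. `warn_if_implicit` and the logger name are hypothetical; only the warning text comes from the description above.

```python
import logging
import re

logger = logging.getLogger("syncer_sketch")  # hypothetical logger name

_CHARSET_RE = re.compile(r"\b(?:CHARSET|CHARACTER\s+SET)\b", re.IGNORECASE)
_COLLATE_RE = re.compile(r"\bCOLLATE\b", re.IGNORECASE)


def warn_if_implicit(ddl: str) -> bool:
    """Log a warning and return True when a create table statement has neither
    a table-level charset nor a table-level collation."""
    # Everything after the last ")" is treated as the table options.
    options = ddl.rpartition(")")[2]
    implicit = not _CHARSET_RE.search(options) and not _COLLATE_RE.search(options)
    if implicit:
        logger.warning(
            "detect create table risk which use implicit charset and collation: %s",
            ddl,
        )
    return implicit
```

The statement itself is passed through unchanged; the function only reports the risk.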

For a table with a charset but no collation, it adds the charset's default collation obtained from SHOW CHARACTER SET, so the DDL is handled like this:
sql4: create table test.t1 (id int) CHARSET=utf8mb4 COLLATE = utf8mb4_general_ci
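
The lookup can be sketched like this. `SAMPLE_ROWS` is an assumed excerpt of `SHOW CHARACTER SET` output (columns Charset, Description, Default collation, Maxlen) as a MySQL 5.7 server would return it; in practice the rows would come from a live query. The helper names are hypothetical and the string handling is a simplification of whatever the real change does.

```python
import re

# Assumed sample of SHOW CHARACTER SET rows from a MySQL 5.7 upstream.
SAMPLE_ROWS = [
    ("latin1", "cp1252 West European", "latin1_swedish_ci", 1),
    ("utf8mb4", "UTF-8 Unicode", "utf8mb4_general_ci", 4),
]

_CHARSET_RE = re.compile(r"\b(?:CHARSET|CHARACTER\s+SET)\s*=?\s*(\w+)", re.IGNORECASE)
_COLLATE_RE = re.compile(r"\bCOLLATE\b", re.IGNORECASE)


def default_collations(rows):
    """Map each charset name to its default collation."""
    return {charset.lower(): collation for charset, _desc, collation, _maxlen in rows}


def add_table_collation(ddl: str, defaults: dict) -> str:
    """Append the charset's default collation when only a charset is given."""
    stmt = ddl.rstrip().rstrip(";").rstrip()
    options = stmt.rpartition(")")[2]  # table options follow the last ")"
    match = _CHARSET_RE.search(options)
    if match is None or _COLLATE_RE.search(options):
        return stmt  # nothing to add: no charset, or collation already explicit
    collation = defaults.get(match.group(1).lower())
    if collation is None:
        return stmt  # unknown charset: leave the statement as it is
    return f"{stmt} COLLATE = {collation}"


print(add_table_collation(
    "create table test.t1 (id int) CHARSET=utf8mb4",
    default_collations(SAMPLE_ROWS),
))
```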

Check List

Tests

  • Unit test

Release note

Fix a bug when the upstream uses an implicit collation in create database or create table.
A collation is added when a create database statement has no charset and collation; it is not added to a create table statement, because the table can inherit it from the database.
Note that this does not directly alter the downstream schema, so users need to ensure consistency when creating a database or table manually.

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Member Author

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added do-not-merge/cherry-pick-not-approved release-note Denotes a PR that will be considered when it comes time to generate release notes. area/dm Issues or PRs related to DM. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 6, 2021
@ti-chi-bot
Member Author

@ti-chi-bot: This cherry pick PR is for a release branch and has not yet been approved by release team.
Adding the do-not-merge/cherry-pick-not-approved label.

To merge this cherry pick, it must first be approved by the collaborators.

AFTER it has been approved by collaborators, please ping the release team in a comment to request a cherry pick review.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. type/cherry-pick-for-release-5.3 This PR is cherry-picked to release-5.3 from a source PR. labels Dec 6, 2021
@WizardXiao
Contributor

/run-all-tests

@lance6716
Contributor

/assign @nongfushanquan

@ti-chi-bot
Member Author

@ti-chi-bot: PR needs rebase.


@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 23, 2021
@lance6716 lance6716 closed this Jan 6, 2022
@lance6716
Contributor

This feature should be combined with a switch in the configuration file; both will be introduced in v5.4.0.
