Handle aliasing of collation names #11433

Merged

@dbussink dbussink commented Oct 4, 2022

With MySQL 8.0.30 and later, SHOW CREATE TABLE always reports utf8mb3 as the charset, and that output is what schemadiff consumes. We already normalize all charset output to the utf8mb3 name to avoid any ambiguity about what is intended.

We did not do the same for collations, though. If schemadiff is asked to compare schemas generated by MySQL 8.0.30 and by older versions, it reports a difference where there is none.

This change always normalizes collation names to the more explicit utf8mb3_ form when a matching collation can be found, based on the configured charset aliases. This ensures that comparisons between such schemas don't show spurious diffs that are not real changes.
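
To illustrate the idea, here is a minimal sketch of alias-based collation normalization. It is not the code from this PR: `charsetAliases`, `knownCollations`, and `normalizeCollation` are hypothetical names, and the tables are tiny stand-ins for whatever alias and collation data is actually configured.

```go
package main

import "strings"

// charsetAliases is a hypothetical alias table mapping an ambiguous
// charset name to its explicit form.
var charsetAliases = map[string]string{
	"utf8": "utf8mb3",
}

// knownCollations is a hypothetical, deliberately tiny set of collations
// that "can be found". A real table would be far larger.
var knownCollations = map[string]bool{
	"utf8mb3_general_ci": true,
	"utf8mb3_bin":        true,
	"utf8mb4_0900_ai_ci": true,
}

// normalizeCollation rewrites a collation whose charset prefix is an alias
// (for example utf8_general_ci becomes utf8mb3_general_ci), but only when
// the rewritten name is a known collation. Anything else is returned as-is.
func normalizeCollation(name string) string {
	prefix, rest, found := strings.Cut(name, "_")
	if !found {
		return name
	}
	canonical, ok := charsetAliases[prefix]
	if !ok {
		return name
	}
	candidate := canonical + "_" + rest
	if knownCollations[candidate] {
		return candidate
	}
	return name
}
```

Under these assumptions, `normalizeCollation("utf8_general_ci")` and `normalizeCollation("utf8mb3_general_ci")` both yield `utf8mb3_general_ci`, so a schema dumped by an older MySQL and one dumped by 8.0.30 would compare as equal on that column.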

Related Issue(s)

This is part of the #10203 schemadiff work.

Checklist

  • "Backport me!" label has been added if this change should be backported
  • Tests were added or are not required
  • Documentation was added or is not required

@dbussink dbussink added Type: Enhancement Logical improvement (somewhere between a bug and feature) Component: Query Serving labels Oct 4, 2022

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
@dbussink dbussink force-pushed the dbussink/handle-collation-aliasing branch from 5d04bce to 970e381 Compare October 4, 2022 12:26

@GuptaManan100 GuptaManan100 left a comment

Looks good to me!

@GuptaManan100 GuptaManan100 merged commit 25b9271 into vitessio:main Oct 5, 2022
@GuptaManan100 GuptaManan100 deleted the dbussink/handle-collation-aliasing branch October 5, 2022 08:35
dbussink added a commit that referenced this pull request Oct 5, 2022