Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

schemadiff: foreign key validation (tables and columns) #11944

Merged
merged 10 commits into from
Jan 23, 2023

Conversation

shlomi-noach
Copy link
Contributor

@shlomi-noach shlomi-noach commented Dec 13, 2022

Description

We improve schemadiff's support/validation for foreign keys as follows:

  • schemadiff analyzes foreign key dependency across the schema
  • it sorts tables by fk dependency (ie parents first, then children)
  • it validates that there are no circular dependencies (only circle allowed is a self-referencing table)
  • schemadiff: validate columns referenced by FOREIGN KEY #10359 already validates that the local columns covered by a foreign key do in fact exist
  • we now also check that the referenced (parent) table exists
  • and that referenced columns in the parent table - exist
  • and that the foreign key lists same count of columns local-vs-referenced
  • and that the column types local-vs-referenced match: InnoDB only allows matching an int to an int, for example, and does not allow int-to-bigint. It does allow different VARCHAR length.

Most of this takes place at the schema level, not at the table level.

Many tests added.

Still todo (in a different PR):

  • validate indexes: each foreign key must have an appropriate index that covered columns in local table, and an appropriate index that covers referenced columns in parent table
    • an appropriate index covers the foreign key columns in same order as they appear in the foreign key. It may further extend to cover other columns, but the foreign key columns must be the prefix of the index column list.
  • apply(): when adding a foreign key and no appropriate index exists, implicitly add it
  • apply(): when creating a new table with foreign key and no appropriate index exists, implicitly add it
  • apply(): do not allow drop key if it leaves a foreign key without an appropriate index
  • validate uniqueness of constraints name across the schema (foreign keys and check constraints alike)

Related Issue(s)

Foreign keys tracking issue: #11975
schemadiff tracking issue: #10203
Followup to #10359

Checklist

  • "Backport to:" labels have been added if this change should be back-ported
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…nKeyColumnCountError, MismatchingForeignKeyColumnTypeError

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@shlomi-noach shlomi-noach added Type: Enhancement Logical improvement (somewhere between a bug and feature) Component: Query Serving labels Dec 13, 2022
@shlomi-noach shlomi-noach requested review from dbussink and a team December 13, 2022 08:48
@vitess-bot
Copy link
Contributor

vitess-bot bot commented Dec 13, 2022

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.

If a new flag is being introduced:

  • Is it really necessary to add this flag?
  • Flag names should be clear and intuitive (as far as possible)
  • Help text should be descriptive.
  • Flag names should use dashes (-) as word separators rather than underscores (_).

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow should be required, the maintainer team should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should include a link to an issue that describes the bug.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from VTop, if used there.

@shlomi-noach shlomi-noach changed the title schemadiff: foreign key validation (tables and column) schemadiff: foreign key validation (tables and columns) Dec 13, 2022
@shlomi-noach
Copy link
Contributor Author

Foreign keys tracking issue: #11975

sqlescape.EscapeID(e.ReferencedTable), sqlescape.EscapeID(e.ReferencedColumn), sqlescape.EscapeID(e.Constraint), sqlescape.EscapeID(e.Table))
}

type MismatchingForeignKeyColumnCountError struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name suggestion: ForeignKeyColumnCountMismatchError. I don't know if users ever see this though so it may not matter too much.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed

Table string
}

func (e *ForeignKeyDependencyUnresolvedError) Error() string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a unit test for this new error type. 4 new errors, but only 1 new unit test.

Copy link
Contributor Author

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@shlomi-noach
Copy link
Contributor Author

@deepthi for another look please 🙏

@shlomi-noach shlomi-noach requested a review from a team January 22, 2023 07:44
e.ReferencedColumnCount, sqlescape.EscapeID(e.Constraint), sqlescape.EscapeID(e.Table), e.ColumnCount)
}

type MismatchingForeignKeyColumnTypeError struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I didn't catch this in the previous review. It will be good to rename this as well - to ForeignKeyColumnTypeMismatchError.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed!

…smatchError

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@shlomi-noach shlomi-noach requested review from a team and dbussink and removed request for dbussink January 22, 2023 08:44
@shlomi-noach
Copy link
Contributor Author

/cc @dbussink for review

Copy link
Contributor

@dbussink dbussink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly some questions but nothing blocking.

// we stop when we have been unable to find a table in an iteration.
fkParents := map[string]bool{}
iterationLevel := 0
for ; ; iterationLevel++ {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably write this slight different like this:

for {
     iterationLevel++ 
     ...
}

Reads simpler to me, but no strong opinion really on this.

The other question though is, would it be possible to craft a schema manually that triggers an infinite loop here? Should we have a safeguard for a max level we support?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be possible to craft a schema manually that triggers an infinite loop

The algorithm addresses infinite loops. It leaves out any infinitely dependent views. There's a unit test for that:

func TestNewSchemaFromQueriesLoop(t *testing.T) {
// v7 and v8 depend on each other
queries := append(createQueries,
"create view v7 as select * from v8, t2",
"create view v8 as select * from t1, v7",
)
_, err := NewSchemaFromQueries(queries)
assert.Error(t, err)
assert.EqualError(t, err, (&ViewDependencyUnresolvedError{View: "v7"}).Error())
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just clarifying, that the rewrite would be:

for {
     ...
     iterationLevel++ 
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

// Now validate foreign key columns:
// - referenced table columns must exist
// - foreign key columns must match in count and type to referenced table columns
// - referenced table as an appropriate index over referenced columns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as an => has an?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

}
}

// TODO(shlomi): find a valid index
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is having an index a logical strict requirement or more an implementation detail of how MySQL uses it to enforce constraints?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MySQL requires you to have an index over foreign key columns; a future PR enforces that. If you create a foreign key without creating an index, MySQL implicitly creates one on your behalf. The key must cover all the FK columns as a prefix. See future PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The future PR being #12026

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@shlomi-noach shlomi-noach merged commit cbe0338 into vitessio:main Jan 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Query Serving Type: Enhancement Logical improvement (somewhere between a bug and feature)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants