-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
schemadiff: validate views' referenced columns #12147
schemadiff: validate views' referenced columns #12147
Conversation
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
In a followup PR: optimize such that we only analyze columns tables/views that are referenced by some view in the first place; there is no need to analyze e.g. a table that is never read by a view. This will allow us to save some memory & cpu. |
go/vt/schemadiff/diff_test.go
Outdated
@@ -667,6 +667,7 @@ func TestDiffSchemas(t *testing.T) { | |||
// validate schema1 unaffected by Apply | |||
assert.Equal(t, schema1SQL, schema1.ToSQL()) | |||
|
|||
require.NotNil(t, schema2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this check useful? We already check earlier that there's no error returned when schema2
is created?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
go/vt/schemadiff/schema.go
Outdated
@@ -750,3 +759,258 @@ func (s *Schema) Apply(diffs []EntityDiff) (*Schema, error) { | |||
} | |||
return dup, nil | |||
} | |||
|
|||
// TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this TODO
reference to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hehe, it references the fact that is is already done... removing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
go/vt/schemadiff/schema.go
Outdated
// TODO | ||
|
||
func (s *Schema) ValidateViewReferences() error { | ||
var allerrors error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually use errs
as a the name for this kinda thing to collect errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
go/vt/schemadiff/schema.go
Outdated
for _, e := range s.Entities() { | ||
entityColumns, err := s.getEntityColumnNames(e.Name(), availableColumns) | ||
if err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this add to the error list instead? And then we continue through the rest of the entities? If this is a totally unexpected error, maybe good to add a comment about that and why it's safe to return immediately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
go.mod
Outdated
@@ -113,6 +113,7 @@ require ( | |||
github.com/kr/text v0.2.0 | |||
github.com/nsf/jsondiff v0.0.0-20210926074059-1e845ec5d249 | |||
github.com/openark/golib v0.0.0-20210531070646-355f37940af8 | |||
go.uber.org/multierr v1.8.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also update to 1.9.0
here since that's the latest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
} | ||
return true, nil | ||
}, view.Select) | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI - sqlparser.Walk
will only return the error you return from inside the visitor function, and since you don't return any errors from that, no errors will ever make it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@systay ah, great! Thank you. I'll keep the check as it is, for safety, but good to know!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@systay Right, but the errors are gathered here in errs
? And those are at the end returned if the walker itself doesn't error (which is not expected).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dbussink yeah, that was my point. Not really necessary to catch the returned error from sqlparser.Walk
since that is not how we are dealing with the errors. When I know it will never return an error, I usually write.
_ = sqlparser.Walk(...)
OTOH - it's probably good defensive programming to do as @shlomi-noach is doing here and catching and checking the error anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@systay Right, guess as someone now knowing the details too much about how Walk
is implemented, I wouldn't know that it can't return an error and would still write it defensively 😄.
Looking for a 2nd review/approval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can/should avoid importing the external multierr
package but it otherwise LGTM.
Not adding a formal approval as we should, IMO, get @systay or @harshit-gangal to review as they are in the query serving team and working on the overall Vitess View support work. This way we can ensure that we're moving forward together and all the pieces will fit together nicely.
@systay or @harshit-gangal , could you please review? Please note if you cannot and I can add a formal approval here if needed. Thanks!
go/vt/schemadiff/schema.go
Outdated
for _, e := range s.Entities() { | ||
entityColumns, err := s.getEntityColumnNames(e.Name(), availableColumns) | ||
if err != nil { | ||
errs = multierr.Append(errs, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the Vitess concurrency package here along with vterrors.Aggregate
? You can see that used throughout the code base if you look for .AggrError(vterrors.Aggregate)
. Between the concurrency
and vterrors
packages we should have whatever related functionality you need here for error handling. For example:
vitess/go/vt/wrangler/traffic_switcher.go
Lines 146 to 161 in 390ddad
func (ts *trafficSwitcher) ForAllSources(f func(source *workflow.MigrationSource) error) error { | |
var wg sync.WaitGroup | |
allErrors := &concurrency.AllErrorRecorder{} | |
for _, source := range ts.sources { | |
wg.Add(1) | |
go func(source *workflow.MigrationSource) { | |
defer wg.Done() | |
if err := f(source); err != nil { | |
allErrors.RecordError(err) | |
} | |
}(source) | |
} | |
wg.Wait() | |
return allErrors.AggrError(vterrors.Aggregate) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like it does use locking which is unnecessary here. Dunno if we care about the overhead of that here? That logic seems designed for concurrent error gathering which isn't what we're doing here.
Using multierr
seems simpler here and it's already an indirect dependency? Not a really strong opinion though, we can also use this but it seems a bit off for what it was designed for.
I think this logic here is temporary anyway, since once golang/go#53435 is available with Go 1.20 we probably want to switch to that anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @dbussink
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heh, race condition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How strongly do people feel about this? Looking at the two implementation multierr
does seem to be more fitting to our purpose, but I don't feel strongly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you only need multiple errors you can use vterrors.Aggregate()
, no need for the concurrency piece (as noted, geared toward N goroutines) like we do here e.g.:
vitess/go/vt/vttablet/tabletmanager/vreplication/vcopier.go
Lines 597 to 621 in 3f55e40
var terrs []error | |
for !empty { | |
select { | |
case result := <-resultCh: | |
switch result.state { | |
case vcopierCopyTaskCancel: | |
// A task cancelation probably indicates an expired context due | |
// to a PlannedReparentShard or elapsed copy phase duration, | |
// neither of which are error conditions. | |
case vcopierCopyTaskComplete: | |
// Get the latest lastpk, purely for logging purposes. | |
lastpk = result.args.lastpk | |
case vcopierCopyTaskFail: | |
// Aggregate non-nil errors. | |
terrs = append(terrs, result.err) | |
} | |
default: | |
empty = true | |
} | |
} | |
if len(terrs) > 0 { | |
terr := vterrors.Aggregate(terrs) | |
log.Warningf("task error in workflow %s: %v", vc.vr.WorkflowName, terr) | |
return fmt.Errorf("task error: %v", terr) | |
} |
I don't feel strongly about it though. Up to you.
…t views reading from DUAL Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Incorporated #12189 into this PR, as it deals with view analysis |
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
I've switched away from |
Looking for another review/approval 🙏 |
ping @vitessio/query-serving for review |
ping https://github.com/orgs/vitessio/teams/query-serving for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything else LGTM!
go.mod
Outdated
@@ -187,7 +187,7 @@ require ( | |||
github.com/tidwall/pretty v1.2.0 // indirect | |||
go.opencensus.io v0.24.0 // indirect | |||
go.uber.org/atomic v1.10.0 // indirect | |||
go.uber.org/multierr v1.9.0 // indirect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a specific reason for this downgrade?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was upgraded in a previous PR, that was eventually not using multierr
. I just ran go mod tidy
and it auto-downgraded, so... I don't know?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shlomi-noach might be because the direct dependency was dropped? A go get -u go.uber.org/multierr
probably would add this back correctly (it should be there, so good catch @GuptaManan100).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fixed.
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Stalling this PR a bit as I'm to look into the |
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Replaced by #12565 ; will close shortly. |
…12551) * WIP: schemadiff, analyzing diff dependencies Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * ongoing work; many tests added Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * partitioning tests Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * function refactor Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * minor reordering of variables Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * copyright year Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * stages to be commited Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * math: permutations Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * mathutil Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * sort table alter options on diff Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * alter reordering: first drop foreign key, then drop key Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * remove tags logic from EquivalenceRelation Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * removed go/mathutil/permutations.go Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * refactor/rename Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * cleanup unused code Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * OrderedDiffs Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * added an impossible-order test Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * clarify comment Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * AllSequentialExecutionDeps() Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * test: expected order of entities in ordered diffs Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * function comments Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * remove unused code Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * code comments; privatize functions; remove unused code Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * code comments. Privatize functoin Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * test deferred till #12147 is merged Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * code comments Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * fix permutation logic, add tests Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * unindent using early returns Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * code comment Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * reposition some functions in file (no actualy changes) Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * relocated the functions back because the diff became really confusing Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * grammar Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * code comments, minor optimization Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * typo Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * Added tests, minor edits per review. changed language Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * handling empty diff slice; improved comments Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * Clarify test reasoning Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> * grammar Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com> --------- Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Description
Up till now,
schemadiff
views validation and analysis was limited to:As of this PR,
schemadiff
also validates columns used and returned by views:SELECT
and inWHERE
clause do in fact exist in tables/views referenced by the view. Error when an unknown column is encountered.create view v as select t.* from t, t2 ...
) or unqualified (create view v as select * from t1,t2
) ; infer list of columns ; support cascaded views based on star expression views.Plenty tests added to validate behavior.
Joint with @dbussink and based on his preliminary work.
Related Issue(s)
Checklist
Deployment Notes