-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vttablet refactor phase 2 #6461
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Introduce a new Seconds type with explicit conversions to and from time.Duration. Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
shutdown mysqls in parallel where possible. Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
We saw another shard_sync panic in the tests due to a mismatch between tablet type and master term start time because they get fetched at different times instead of together. Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
assert.Equal(t, want, shr) | ||
} | ||
|
||
func testStream(t *testing.T, hs *healthStreamer) (<-chan *querypb.StreamHealthResponse, context.CancelFunc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
t
is not used in this method
return err | ||
} | ||
if !si.HasMaster() { | ||
return fmt.Errorf("no master tablet for shard %v/%v", tablet.Keyspace, tablet.Shard) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason this file is using fmt.Error instead of vterrors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only moved this code from somewhere else. All these older functions are still using fmt.Errorf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could not see any test file for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are end2end tests for this. We should ideally have unit tests also, but it's legacy code.
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work. A few minor nits.
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
}) | ||
} | ||
|
||
func (hs *healthStreamer) ApppendDetails(details []*kv) []*kv { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see any test for this method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no framework to test the /debug/status
pages right now. But I tested it manually though.
if rm.markerFile == "" { | ||
return | ||
} | ||
if stopped { | ||
if file, err := os.Create(rm.markerFile); err == nil { | ||
file.Close() | ||
} | ||
} else { | ||
os.Remove(rm.markerFile) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When markerFile be empty? If it is only for test, can we change the test to not have empty markerFile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
markerFile
can be empty if the mysql is unmanaged. There are end2end tests for testing the overall behavior. I've written some unit tests to cover the rest of the corner cases that are not covered by the e2e tests.
return err | ||
} | ||
if !si.HasMaster() { | ||
return fmt.Errorf("no master tablet for shard %v/%v", tablet.Keyspace, tablet.Shard) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could not see any test file for this.
if ts.tm.UpdateStream != nil { | ||
if topo.IsRunningUpdateStream(ts.tablet.Type) { | ||
ts.tm.UpdateStream.Enable() | ||
} else { | ||
ts.tm.UpdateStream.Disable() | ||
} | ||
} | ||
|
||
if ts.tm.VREngine != nil { | ||
if ts.tablet.Type == topodatapb.TabletType_MASTER { | ||
ts.tm.VREngine.Open(ts.tm.BatchCtx) | ||
} else { | ||
ts.tm.VREngine.Close() | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UpdateStream and VREngine are not set in the test. See if can be added in test tm initialize
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The unit test framework doesn't support these. But there are many end2end tests for this.
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
This is a the second part of the vttablet refactor. We essentially rewrite entire portions of the TabletManager, So, it may be easier to review the new functionality directly instead of looking at the diffs.
At a high level, this refactor achieves the following goals:
/debug/status
.Details
healthStreamer
This is a new component in TabletServer. It owns streaming of health info to subscribers like VTGate. It receives updates from stateManager whenever there is a material change in serving state.
replTracker
This component encapsulates the heartbeat reader, writer, and a new poller. It uses only one of these methods to report on replication health. If it's the poller, it gets invoked everytime a status check is performed. This is triggered from the stateManager.
stateManager
This component was introduced in a previous PR. It has now been enhanced to fetch replication info from replTracker, and feed it to healthStreamer. If the replication is unhealthy, stateManager also goes into a non-serving state. These substates are exported in
/debug/status
, and also there is a json export at the new path/debug/status_details
.replManager
This component watches over replication and fixes it as needed. This job was previously done by the
replicationReporter
.tmState
This was functionality from state_change.go. There are now clear entry points for different state changes, like
RefreshFromTopo
,ChangeTabletType
, etc.