Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tabletserver: improved state transitions with new stateManager #6396

Merged
merged 19 commits into from
Jul 14, 2020

Conversation

sougou
Copy link
Contributor

@sougou sougou commented Jun 30, 2020

The tabletserver can now remember its desired state and continuously retry if the first attempt fails. This responsibility was previously with tabletmanager. This change allows us to delete the corresponding code from tabletmanager.

Most sub-component function calls have been normalized to simplify state management. All such functions are now idempotent.

The new stateManager invokes the necessary functions to reach any required state knowing that the functions are idempotent. This allows it to act in a manner that is agnostic of what the current state is.

Strict ordering rules have been defined for how to open and close various services.

@@ -190,6 +191,10 @@ func (c Connector) Connect(ctx context.Context) (*mysql.Conn, error) {

// MysqlParams returns the connections params
func (c Connector) MysqlParams() (*mysql.ConnParams, error) {
if c.connParams == nil {
// This is only possible during tests.
return nil, fmt.Errorf("parameters are empty")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vterrors FTW

"sync"
"time"

"golang.org/x/net/context"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"golang.org/x/net/context" -> "context"

const (
// StateNotConnected is the state where tabletserver is not
// connected to an underlying mysql instance.
StateNotConnected = iota
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be really nice if states are not just plain int64, but their own type, built on int64. It helps understanding and gives compiler protection for a bunch of situations.

assert.Equal(t, int64(StateNotConnected), sm.State())
}

func verifySubcomponent(t *testing.T, component interface{}, order int64, state testState) {
Copy link
Collaborator

@systay systay Jun 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super-nit: I think it would be nice if we changed the order of these, so that order comes before component. That way, assertions line up better:

verifySubcomponent(t, 1, sm.messager,  testStateClosed)
verifySubcomponent(t, 2, sm.te,  testStateClosed)
verifySubcomponent(t, 3, sm.txThrottler,  testStateClosed)
verifySubcomponent(t, 4, sm.qe,  testStateClosed)
verifySubcomponent(t, 5, sm.watcher,  testStateClosed)
...

@sougou sougou force-pushed the ss-ts1-state branch 2 times, most recently from c42ab86 to c6d295f Compare July 3, 2020 01:06
sougou added 18 commits July 11, 2020 14:01
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Tests still need fixing

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Looks like tm explicitly sets it expecting tablet server to
unset it. We'll need to revisit this.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Also address a few early comments.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
If the requested type is RESTORE, force StateNotConnected.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Some tests became flaky because they expect the state to change
immediately after a call. This change brings the behavior to
be closer to the existing one. But the tabletserver will continue
to retry if an request failed.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Copy link
Member

@harshit-gangal harshit-gangal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. Some nits.

Comment on lines +97 to +101
// InitDBConfig initializes the target parameters for the Engine.
func (vse *Engine) InitDBConfig(keyspace string) {
vse.keyspace = keyspace
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this can be renamed to setKeyspace as it is only setting keyspace.

Comment on lines +97 to +101
// InitDBConfig initializes the target parameters for the throttler.
func (t *TxThrottler) InitDBConfig(target querypb.Target) {
t.target = target
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: InitDBConfig -> setTarget

Comment on lines 237 to +241
// Open must be called before sending requests to QueryEngine.
func (qe *QueryEngine) Open() error {
if qe.isOpen {
return nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this check be not performed under lock or is it protected by higher order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The state manager ensures that Open and Close are not called concurrently. A lock is needed only if the variable is accessed by other functions, which is the case in the other situations.

There are a few places where I do this check inside a lock even though it's not needed. No harm.

Comment on lines +472 to +473
case <-tmr.C:
log.Fatal("Shutdown took too long. Crashing")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can't be tested because it will crash the test :).

Copy link
Contributor Author

@sougou sougou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The InitDBConfig is a naming convention that implies the completion of an incomplete "New" instantiation. So, giving it other meaningful names will cause this semantic to get lost.

I'm hoping to get rid of this two-phase initialization as part of my refactor. Let's see if I succeed.

Comment on lines 237 to +241
// Open must be called before sending requests to QueryEngine.
func (qe *QueryEngine) Open() error {
if qe.isOpen {
return nil
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The state manager ensures that Open and Close are not called concurrently. A lock is needed only if the variable is accessed by other functions, which is the case in the other situations.

There are a few places where I do this check inside a lock even though it's not needed. No harm.

Comment on lines +472 to +473
case <-tmr.C:
log.Fatal("Shutdown took too long. Crashing")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can't be tested because it will crash the test :).

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
@sougou sougou merged commit b6c7ffc into vitessio:master Jul 14, 2020
@sougou sougou deleted the ss-ts1-state branch July 14, 2020 02:03
@deepthi deepthi added this to the v7.0 milestone Jul 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants