vttablet: fast and reliable state transitions #7011

Merged
merged 18 commits into from
Nov 12, 2020

@sougou sougou commented Nov 10, 2020

Fixes #6982

Robust query kill

  • dbconn.go: the query killer now closes the client-side connection before killing the server-side query.
  • The resulting error is a non-connection error, so the code does not retry the query. The error still carries errno 2013, which clients can interpret as a killed connection.

StatefulConnection: ID -> ReservedID

The StatefulConnection had its own ID function that hid the underlying ID function of dbConn. It has been renamed to ReservedID so that a StatefulConnection can be given to the QueryList for killing connections on demand.

Multiple Query Lists

  • The previous streamQList in queryEngine has been replaced by three distinct lists: olapql, statelessql and statefulql. All three now live in TabletServer.
    • olapql is the same as the previous streamQList.
    • statelessql is for queries from the regular conn pool.
    • statefulql is for queries from the stateful pool.
  • The previous /stream_queryz URL has been renamed to /livequeryz. This URL now shows queries executing from all three query lists. An additional column has been added to show the pool of the query.
  • Changed the displayer to redact values if the redact flag is set.
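
As a rough illustration of how three lists can feed one combined view with a pool column, here is a minimal sketch. The `queryList`, `queryDetail`, and `liveQueries` names, and the pool labels used with them, are assumptions made for this example and are not the TabletServer types.

```go
package main

import (
	"sort"
	"sync"
)

// queryDetail describes one running query and how to kill it.
type queryDetail struct {
	ConnID int64
	Query  string
	Kill   func()
}

// queryList tracks the queries running against one pool.
type queryList struct {
	name    string
	mu      sync.Mutex
	queries map[int64]*queryDetail
}

func newQueryList(name string) *queryList {
	return &queryList{name: name, queries: make(map[int64]*queryDetail)}
}

// Add registers a running query under its connection ID.
func (ql *queryList) Add(qd *queryDetail) {
	ql.mu.Lock()
	defer ql.mu.Unlock()
	ql.queries[qd.ConnID] = qd
}

// snapshot copies the entries under the lock, sorted by connection ID.
func (ql *queryList) snapshot() []*queryDetail {
	ql.mu.Lock()
	defer ql.mu.Unlock()
	out := make([]*queryDetail, 0, len(ql.queries))
	for _, qd := range ql.queries {
		out = append(out, qd)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ConnID < out[j].ConnID })
	return out
}

// TerminateAll kills every query in the list, outside the lock.
func (ql *queryList) TerminateAll() {
	for _, qd := range ql.snapshot() {
		qd.Kill()
	}
}

// liveQueryRow is one row of the combined view; Pool names the source list.
type liveQueryRow struct {
	Pool   string
	ConnID int64
	Query  string
}

// liveQueries merges all lists into one view, in the order the lists are given.
func liveQueries(lists ...*queryList) []liveQueryRow {
	var rows []liveQueryRow
	for _, ql := range lists {
		for _, qd := range ql.snapshot() {
			rows = append(rows, liveQueryRow{Pool: ql.name, ConnID: qd.ConnID, Query: qd.Query})
		}
	}
	return rows
}
```

Keeping the lists separate lets each one be terminated under its own rules, while a single handler can still render all of them together.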

TransactionShutdownSeconds -> ShutdownSeconds

This variable has been renamed to reflect the new meaning. The old flag continues to exist for backward compatibility and sets the same underlying variable.

timer.SleepContext

Sleeping until either a duration elapses or a context is done has become a common pattern, so I created this wrapper. Other call sites can be changed later to use the new function.

StateManager handles the grace period

  • If a transition is going from master to non-master or non-serving, the state manager enforces the grace period. If the grace period is exceeded, stateful and stateless connections are killed, which allows the transactions to close and all pending requests to terminate. This results in a prompt shutdown.
  • If a transition is going from non-master to master, then stateful queries are killed immediately, which is enough for the tx pool to go read-write.

TxEngine simplified

The TxEngine had a race condition where it was possible for a Begin to slip through after shutdown has been initiated. The code was also too complex because it was trying to handle some unnecessary concurrency. There is no need for concurrency because the state manager enforces synchronization of calls. The refactored code is much simpler and avoids the Begin race.

pools.Numbered.GetByFilter

It was necessary for the tx pool to extract just the non-transactional connections from the stateful pool so that they are released first. I added a new GetByFilter function that can be used to get specific connections as defined by the filter func.
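
A filter-based getter of this kind might look like the sketch below. The `numberedPool`, `slot`, and `statefulConn` types are hypothetical simplifications, and the signature is an assumption rather than the one added to pools.Numbered.

```go
package main

import (
	"sort"
	"sync"
)

// statefulConn is a simplified stand-in for a stateful connection.
type statefulConn struct {
	ID   int64
	InTx bool
}

// slot tracks whether a registered connection is currently checked out.
type slot struct {
	conn    *statefulConn
	inUse   bool
	purpose string
}

// numberedPool maps connection IDs to slots.
type numberedPool struct {
	mu    sync.Mutex
	slots map[int64]*slot
}

func newNumberedPool() *numberedPool {
	return &numberedPool{slots: make(map[int64]*slot)}
}

// Register adds an idle connection to the pool.
func (p *numberedPool) Register(c *statefulConn) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.slots[c.ID] = &slot{conn: c}
}

// GetByFilter checks out every idle connection accepted by match and
// records why it was taken. Connections already in use are skipped.
func (p *numberedPool) GetByFilter(purpose string, match func(*statefulConn) bool) []*statefulConn {
	p.mu.Lock()
	defer p.mu.Unlock()
	var out []*statefulConn
	for _, s := range p.slots {
		if s.inUse || !match(s.conn) {
			continue
		}
		s.inUse = true
		s.purpose = purpose
		out = append(out, s.conn)
	}
	// Map iteration order is random; sort for a stable result.
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	return out
}
```

With this shape, the tx pool could pull out just the non-transactional connections by passing a filter such as `func(c *statefulConn) bool { return !c.InTx }`.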

StatefulConnectionPool shutdown modes

  • ShutdownNonTx should be called first. It releases all idle non-transactional connections. Non-idle connections are released as soon as they are unlocked.
  • ShutdownAll is called next. It returns the list of idle transactions so that the caller (TxPool) can roll them back. The rest of the connections are closed and released by the stateful pool as they are returned.

TxEngine shuts down in two phases

  • In immediate mode, the two shutdown phases are called back to back.
  • In normal mode, the non-tx shutdown is called first. After the grace period elapses, the All shutdown is invoked. This timing coincides with the state manager's kill of all queries, which should cause the tx pool to close immediately.

Query timeout

If a query is executed in a transaction, its timeout is now the minimum of the query timeout and the transaction timeout.

sougou added 18 commits November 9, 2020 19:16
The previous query kill waited for mysql to return an error after
initiating the kill. Sometimes, mysql can take a long time to kill
queries. In this new code, we proactively close the client side
connection. This will cause the execute to return immediately.

We also return a custom non-connection error that will prevent
the Exec function from unnecessarily retrying the query.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
The underlying dbConn has an ID function which is the mysql conn ID.
We need to expose that in StatefulConnection with the same meaning.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Add support for /livequeryz to replace /streamqueryz

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Also separate out oltp and olap into different query lists
because they have different kill rules.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Also renamed the var to olapql to align with oltpql.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
TxEngine state transitions don't get called concurrently. Removed
all the synchronization logic.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
We're going to split oltp into transactional and non-transactional.
This will lead to three URL paths for each query list. It's better
to group them into a single URL, and add a type for each query list.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Separate out oltp into two separate qls so we can do faster
transitions. With this change, a transition from REPLICA to
MASTER will be immediate because we kill all currently running
stateful queries.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Needed for extracting non-transactional stateful connections.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
sougou commented Nov 10, 2020

@mattlord

Comment on lines +117 to +118
gotBytes, _ := yaml2.Marshal(config)
log.Infof("Config:\n%s", gotBytes)
Member

Is this being added for debugging, or do you want to publish this on startup?

Contributor Author

It's good to publish this on startup. VTTablet does the same thing. It's useful for knowing what the default values are for each flag. But it doesn't spam the output because it shows up only if you set -alsologtostderr.

@sougou sougou merged commit 56d55e3 into vitessio:master Nov 12, 2020
@sougou sougou deleted the ss-fs2-fast-shutdown branch November 12, 2020 00:08
@askdba askdba added this to the v9.0 milestone Nov 27, 2020
Successfully merging this pull request may close these issues.

RFC: Fast and reliable vttablet state transitions