mpp: check the tiflash availabilities before launching mpp queries. (#26130) #26192

Merged 8 commits on Jul 16, 2021

@ti-srebot ti-srebot commented Jul 13, 2021

cherry-pick #26130 to release-5.1
You can switch your code base to this Pull Request by using git-extras:

# In tidb repo:
git pr https://github.com/pingcap/tidb/pull/26192

After applying your modifications, you can push your changes to this PR via:

git push git@github.com:ti-srebot/tidb.git pr/26192:release-5.1-d696ce33a79e

What problem does this PR solve?

Issue Number: close pingcap/tiflash#1807

Problem Summary:
Right now, TiDB's MPP client does not know which TiFlash nodes are available before an MPP query starts. We therefore adopted a very conservative method that only uses the TiFlash nodes that executed the last query. As a result, if regions have at least two TiFlash replicas, we might not detect newly scaled-out TiFlash nodes and cannot fully exploit the cluster's capabilities.

What is changed and how it works?

What's Changed:
To fully solve this problem, we send an "IsAlive" RPC request to every TiFlash node before the query is launched. That way we know exactly which nodes can be used.

Check List

Related Changes:
kvproto pr: pingcap/kvproto#781
client-go pr: tikv/client-go#225

Tests

  • Unit test: the original tests ensure that if none of the nodes respond, the query behaves as before.
  • Manual test (add detailed scripts or steps below)
    • env: three TiFlash nodes compatible with this PR, meaning they can respond to "IsAlive" requests, and two TiDB instances: one original and one patched by this PR.
    • With 1 replica: if we stop a TiFlash node, both the original and the patched TiDB work well. If we then add a TiFlash node, both can use the new one.
    • With 2 replicas: if we stop a TiFlash node and launch a query on each TiDB, the original TiDB reports an error on the first query and then executes normally. The patched one responds normally on the first query and on all following queries.
    • If we scale out by one TiFlash node, the original TiDB cannot detect it, but the patched one detects the new TiFlash node immediately.
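The scale-out case in the manual test can be illustrated with a toy model of why two replicas hide a new node from the old strategy. This is only a sketch under simplifying assumptions: `pickStore` and `plan` are hypothetical helpers, the store names are made up, and "take the first usable replica" is a stand-in for whatever selection the real client performs. Every region keeps one replica on an old store, so a client restricted to the stores from the last query never fails and never notices `s3`.

```go
package main

import "fmt"

// pickStore returns the first replica store of a region that is in the
// usable set (a simplified, hypothetical selection rule).
func pickStore(replicas []string, usable map[string]bool) (string, bool) {
	for _, s := range replicas {
		if usable[s] {
			return s, true
		}
	}
	return "", false
}

// plan assigns each region to a usable store and returns the set of
// distinct stores that end up running the query.
func plan(regions [][]string, usable map[string]bool) map[string]bool {
	used := map[string]bool{}
	for _, replicas := range regions {
		if s, ok := pickStore(replicas, usable); ok {
			used[s] = true
		}
	}
	return used
}

func main() {
	// Two replicas per region; s3 is the newly added store.
	regions := [][]string{{"s3", "s1"}, {"s2", "s3"}, {"s1", "s2"}}

	// Old strategy: only stores that served the last query are usable.
	lastUsed := map[string]bool{"s1": true, "s2": true}
	// New strategy: every store that answered the liveness probe is usable.
	probed := map[string]bool{"s1": true, "s2": true, "s3": true}

	fmt.Println("stores used, old strategy:", len(plan(regions, lastUsed)))
	fmt.Println("stores used, with probing:", len(plan(regions, probed)))
}
```

With a single replica the old strategy has no such blind spot in this model: a region whose only replica lives on `s3` cannot be served by the cached stores, which is consistent with both TiDB instances picking up the new node in the 1-replica test.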

Release note

  • mpp: check the tiflash availabilities before launching mpp queries.

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot
Contributor Author

/run-all-tests

@ti-srebot ti-srebot added sig/transaction SIG:Transaction size/L Denotes a PR that changes 100-499 lines, ignoring generated files. type/5.1-cherry-pick labels Jul 13, 2021
@ti-srebot
Contributor Author

@hanfei1991 you're already a collaborator in bot's repo.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jul 15, 2021
@ti-chi-bot
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • lysu
  • youjiali1995

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jul 15, 2021
@hanfei1991
Member

/merge

@ti-chi-bot
Member

@hanfei1991: /merge is only allowed for the committers, you can assign this pull request to the committer in list by filling /assign @committer in the comment to help merge this pull request.

In response to this:

/merge

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@lysu
Contributor

lysu commented Jul 15, 2021

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: f4aade0

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Jul 15, 2021
@ti-chi-bot
Member

@ti-srebot: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.


@hanfei1991
Member

/run_unit-test

@hanfei1991
Member

/run-unit-test

@ti-chi-bot ti-chi-bot merged commit baa3b72 into pingcap:release-5.1 Jul 16, 2021
@youjiali1995 youjiali1995 mentioned this pull request Jul 16, 2021
1 task
@zhouqiang-cl zhouqiang-cl added this to the v5.1.1 milestone Jul 26, 2021
Labels
cherry-pick-approved Cherry pick PR approved by release team. sig/transaction SIG:Transaction size/L Denotes a PR that changes 100-499 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/5.1-cherry-pick
6 participants