Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

coprocessor: parallel build cop tasks #14384

Closed
wants to merge 47 commits into from

Conversation

ekalinin
Copy link
Contributor

@ekalinin ekalinin commented Jan 8, 2020

PCP #14320

What problem does this PR solve?

Build all cop tasks in a separate goroutine, and send them immediately to copIteratorWorker.

What is changed and how it works?

Added three new functions which returns a channel instead of slice:

  • buildCopTasksChan which is almost the same as buildCopTasks, except it does all requests in a separate goroutines and returns a channel, where all new tasks will be sent.
  • buildTiDBMemCopTasksChan function. almost the same as
    buildTiDBMemCopTasks, except it returns a channel and sends tasks immediately

Also:

  • Added a new function type: splitRangesCallback (used in splitRanges)

Changes in other functions and structures:

  • CopClient.Send: removed buildCopTasks call (buildCopTasksChan will be used in copIterator.open)

  • copIterator:

    • tasks is a channel, will be filled from copIteratorTaskSender.run -> sendToTaskCh
  • copIteratorTaskSender:

    • added iterator field - reference to copIterator. Will be used to fill copIterator.tasks in copIteratorTaskSender.run
    • added newTasksCh field. It will be used to get new tasks from buildCopTasksChan
    • tasks is a channel to send tasks into copIterator.Next (order for processing)
  • copIteratorTaskSender.run:

    • iterating thru newTasksCh instead of tasks and fill copIterator.tasks during an iteration (via copIteratorTaskSender.sendToTaskCh)
  • copIterator.open:

    • used buildCopTasksChan to create a channel for getting new tasks
  • Added tests for buildCopTasksChan

Check List

Tests

  • Unit test

Code changes

  • Has exported function/method change
  • Has exported variable/fields change

Side effects

  • Increased code complexity

- CopClient.Send: removed buildCopTasks call
- copIterator:
  - tasks is empty at the start
  - added tasksFilled field to indicate thet tasks are filled
- Added buildCopTasksChan which is almost the same as buildCopTasks, except
  it returns a channel, where all new tasks will be sent
- Added a new function type: splitRangesCallback
- Added buildSplitRangesCallback to create a callback function wich is used in
  buildCopTasks & buildCopTasksChan for gettings tasks
- Added buildTiDBMemCopTasksChan. almost the same as
buildTiDBMemCopTasks, except it returns a channel
- copIteratorTaskSender:
  - added iter field. to be able to fill copIterator.tasks in
    copIteratorTaskSender.run
  - added newTasksCh field. to get new tasks from buildCopTasksChan
  - removed tasks field
- copIteratorTaskSender.run:
  - iterating via tasksCh instead of tasks
  - send a signal after the loop by tasksCh into iter.tasksFilled
- copIterator.open:
  - used buildCopTasksChan to create a channel for getting new tasks
- copIterator.Next:
  - for the KeepOrder branch only: start iterate over tasks slice only
    after tasksFilled channel got a message (from
    copIteratorTaskSender.run)
- Added tests for buildCopTasksChan
@sre-bot
Copy link
Contributor

sre-bot commented Jan 8, 2020

Thanks for your contribution. If your PR get merged, you will be rewarded 2500 points.

@sre-bot sre-bot added the contribution This PR is from a community contributor. label Jan 8, 2020
@francis0407 francis0407 requested a review from SunRunAway January 8, 2020 02:46
@zz-jason zz-jason added the sig/execution SIG execution label Jan 10, 2020
@zz-jason zz-jason requested a review from qw4990 January 10, 2020 00:30
@zz-jason zz-jason added the type/enhancement The issue or PR belongs to an enhancement. label Jan 10, 2020
Copy link
Contributor

@SunRunAway SunRunAway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @ekalinin do you have any clue to fix the tests you commented?

@ekalinin
Copy link
Contributor Author

Hi, @ekalinin do you have any clue to fix the tests you commented?

Not yet :( I hope today I will have time to fix them

@ekalinin
Copy link
Contributor Author

Hi @SunRunAway

Found the reason of failed tests. It was sender.sendRate (deadlock between sender.sendRate/it.putToken & <-it.tasksFilled).

So, I've removed it for a while. As far as i can see sendRate was added in #11578 as a fix to prevent OOM. Is it still actual?

@qw4990 qw4990 removed their request for review January 19, 2020 03:42
@ekalinin
Copy link
Contributor Author

Anyway, found a way to use sendRate with a new logic.
So now PR is ready to review (i think ;))

@ekalinin
Copy link
Contributor Author

/run-all-tests

@SunRunAway SunRunAway self-requested a review January 20, 2020 05:03
Copy link
Contributor

@SunRunAway SunRunAway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @ekalinin , it looks great.
I'm very sorry for the late reply, I've left several comments and hope to merge it soon.

@@ -526,8 +611,9 @@ func (worker *copIteratorWorker) run(ctx context.Context) {
}

// open starts workers and sender goroutines.
func (it *copIterator) open(ctx context.Context) {
func (it *copIterator) open(ctx context.Context) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you draw a text graph to illustrate how tasks will be transisted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will do it today. What would better: just add it here, or add it as a comment into code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a comment is good to me.

@@ -231,28 +267,45 @@ func buildCopTasks(bo *Backoffer, cache *RegionCache, ranges *copRanges, req *kv
tableStart, tableEnd = keyRange[0].StartKey, keyRange[0].EndKey
}

if shouldStop(finishCh) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not only check finishCh here, but also check it each time a task sends to receiver, to make sure it will not be blocked by receiver channel if copIteratorTaskSender.run exits.

replicaReadSeed: it.replicaReadSeed,
}
go worker.run(ctx)
}
bo := NewBackoffer(ctx, copBuildTaskMaxBackoff).WithVars(it.vars)
_, err := buildCopTasksChan(bo, it.store.regionCache,
Copy link
Contributor

@SunRunAway SunRunAway Feb 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The errCh should not be discarded here, and we should check errCh each time it tries to retrieve a task somewhere, cancel all tasks and throw the error when error of buildCopTasksChan happens.

}

func (sender *copIteratorTaskSender) run() {
// Send tasks to feed the worker goroutines.
for _, t := range sender.tasks {
for t := range sender.tasksFromBuilderCh {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This chanel should not be blocked when,

  1. buildCopTasksChan meets an error
  2. buildCopTasksChan get an exit signal from finishCh

And copIterator.Next should return error if buildCopTasksChan meets an error.

@sre-bot
Copy link
Contributor

sre-bot commented Feb 19, 2020

@ekalinin, please update your pull request.

1 similar comment
@sre-bot
Copy link
Contributor

sre-bot commented Feb 26, 2020

@ekalinin, please update your pull request.

@github-actions github-actions bot added the sig/sql-infra SIG: SQL Infra label Mar 8, 2020
@zz-jason zz-jason requested a review from SunRunAway March 9, 2020 14:39
@ekalinin ekalinin closed this Mar 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/tikv contribution This PR is from a community contributor. sig/sql-infra SIG: SQL Infra type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants