Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented PITR-2 #6408

Merged
merged 37 commits into from
Jul 18, 2020
Merged

Implemented PITR-2 #6408

merged 37 commits into from
Jul 18, 2020

Conversation

arindamnayak
Copy link
Contributor

@arindamnayak arindamnayak commented Jul 3, 2020

Implemented PITR-2
Fixes #6267, fixes #4886

Here are the changes I did.

  • vttablet will accept binlog server parameter
  • all those parameters will be used while restoring the tablet and it need a snapshot keyspace.
  • from the snapshot keyspace type, we will get the snapshot_time
  • we will get the GTID from the snapshot time
  • then we will replicate from binlog server upto that GTID
  • once we reach that GTID , the slave connection will be closed and reset.

- Added flags for restore to time
- Added flags for binlog server details

Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
- added ability to fetch the GTID from the restore time
- added ability to apply the binlogs till the above GTID

Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
…chan

Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
@arindamnayak arindamnayak requested a review from sougou as a code owner July 3, 2020 18:47
@arindamnayak arindamnayak linked an issue Jul 3, 2020 that may be closed by this pull request
@arindamnayak arindamnayak marked this pull request as draft July 3, 2020 18:48
@arindamnayak arindamnayak requested a review from deepthi July 3, 2020 18:48
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
@arindamnayak arindamnayak force-pushed the pitr-2 branch 2 times, most recently from 1bffb9e to dc8af79 Compare July 9, 2020 12:12
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
@arindamnayak arindamnayak force-pushed the pitr-2 branch 2 times, most recently from fb065be to a989cf2 Compare July 10, 2020 18:29
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
@@ -64,6 +64,11 @@ type Engine struct {
notifierMu sync.Mutex
notifiers map[string]notifier

// SkipMetaCheck skips the metadata about the database and table information
SkipMetaCheck bool
Copy link
Contributor Author

@arindamnayak arindamnayak Jul 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sougou , is it ok if we add this flag here? This flag will skip the sql query to get table information ( form information_schema). We need this when we need to get the GTID from timestamp , as the binlog server does not support the query and we also don't need that info in that process.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good for now. I see that you're just using it to covert timestamp to gtid. There may be a simpler way to do that without having to use vstream, but we can use this method until we figure that out.

…le names

Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Copy link
Member

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly comments and documentation, but there is an important logic issue as well.

.github/workflows/docker_test_2.yml Show resolved Hide resolved
go/mysql/flavor_mysql.go Outdated Show resolved Hide resolved
go/test/endtoend/recovery/pitr/pitr_test.go Outdated Show resolved Hide resolved
binlogPort = flag.Int("binlog_port", 0, "(PITR restore parameter) port of binlog server.")
binlogUser = flag.String("binlog_user", "", "(PITR restore parameter) username of binlog server.")
binlogPwd = flag.String("binlog_password", "", "(PITR restore parameter) password of binlog server.")
timeoutForGTIDLookup = flag.Duration("binlog_lookup_timeout", 60*time.Second, "(PITR restore parameter) timeout for fetching gtid from timestamp.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think of gtid_lookup_timeout for this flag name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that does not signifies which gtid we are referring to, let me know if we should use that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recovery_gtid_lookup_timeout?

go/vt/vttablet/tabletmanager/restore.go Outdated Show resolved Hide resolved
go/vt/vttablet/tabletmanager/restore.go Outdated Show resolved Hide resolved
go/vt/vttablet/tabletmanager/restore.go Outdated Show resolved Hide resolved
go/vt/vttablet/tabletmanager/restore.go Outdated Show resolved Hide resolved
go/vt/vttablet/tabletserver/schema/engine.go Outdated Show resolved Hide resolved
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
@arindamnayak
Copy link
Contributor Author

Mostly comments and documentation, but there is an important logic issue as well.

I have implemented the suggestion you mentioned.

Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
@@ -32,7 +32,8 @@ type mysqlFlavor struct{}

// masterGTIDSet is part of the Flavor interface.
func (mysqlFlavor) masterGTIDSet(c *Conn) (GTIDSet, error) {
qr, err := c.ExecuteFetch("SELECT @@GLOBAL.gtid_executed", 1, false)
// making @@global as lowercase, as the PITR depends on binlog server, which honours only lowercase `global` value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just make this:

// keep @@global as lowercase, as some servers like the Ripple binlog server only honors a lowercase `global` value

Otherwise people might get confused (since this is not PITR-specific code)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I have updated the comment.

@deepthi
Copy link
Member

deepthi commented Jul 17, 2020

Can you balance tests between docker_test_1 and docker_test_2?

docker test 2 / Docker Test 2 (pull_request) Successful in 19m
docker_test_1 / Docker Test 1 (pull_request) Successful in 9m

Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
@arindamnayak
Copy link
Contributor Author

Can you balance tests between docker_test_1 and docker_test_2?

docker test 2 / Docker Test 2 (pull_request) Successful in 19m
docker_test_1 / Docker Test 1 (pull_request) Successful in 9m

I have balanced the tests

Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
Copy link
Member

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new logic looks good.
A few more nits. We can merge once those are addressed.

go/test/endtoend/cluster/cluster_util.go Outdated Show resolved Hide resolved
go/test/endtoend/recovery/pitr/main_test.go Outdated Show resolved Hide resolved
if proc, err := tablet.MysqlctlProcess.StartProcess(); err != nil {
return err
} else {
// ignore golint warning, we need the else block to use proc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't you need a //nolint here? otherwise CI will fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why golangci-lint did not capture this earlier, it is passing without //nolint, to be on safe side I have added it.

Signed-off-by: Arindam Nayak <arindam.nayak@outlook.com>
@arindamnayak
Copy link
Contributor Author

The new logic looks good.
A few more nits. We can merge once those are addressed.

Implemented the suggested changes and also updated the flag name.

@deepthi deepthi merged commit e86a967 into vitessio:master Jul 18, 2020
@deepthi
Copy link
Member

deepthi commented Jul 18, 2020

Very nice work! This has been a long-standing request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RFC: Point in time recovery (PITR) Part 2 Point in time recovery (PITR)
4 participants