tidb-backup: add restoreUsingExistingVolume option #1708
hmm, I don't think this should be necessary. But either way it is better to use the tidb-lightning chart. tidb-lightning with the tidb backend is preferred over the tidb-backup restore.
what is the alternative?
@shenli recommended using DM for sub-TB and lightning for TB scale. DM has no chart so I was using the tidb-backup loader. What are the tradeoffs? I think we'll mostly be loading < 1TB of data.
@mightyguava DM is for streaming migration: it keeps running and synchronizes data from upstream. Lightning is for one-time import jobs.
LGTM
I'd recommend tidb-lightning too, it is just like
/merge
Your auto merge job has been accepted, waiting for 1710
@mightyguava
/run-all-tests
cherry pick to release-1.1 in PR #1712
cherry pick to release-1.0 failed
What problem does this PR solve?
It looks like for ad-hoc backup and restore (at least in the failing test), the Helm chart expects the PVC from the backup to be kept around and reused for the subsequent restore. This only works if the source and target clusters are in the same namespace.
I would like to restore to a cluster without any local backups, running on a different Kubernetes cluster in a different region, via S3.
This is split from #1705
What is changed and how does it work?
This adds a `restoreUsingExistingVolume` option, defaulting to `true`. If set to `false`, then a new volume is created in the `restore` mode to hold data loaded from s3, gcp, ceph, etc.
This option has the same effect as simply setting the mode to `scheduled-restore`, but I didn't really understand what `scheduled-restore` meant.
Does this PR introduce a user-facing change?:
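For illustration, a values override for the cross-cluster restore case described above might look roughly like the sketch below. Only `mode` and `restoreUsingExistingVolume` are taken from this PR; the other keys and all of the values are placeholder assumptions and should be checked against the chart's own `values.yaml`.

```yaml
# Hypothetical values override for the tidb-backup chart (a sketch, not a tested config).
mode: restore                      # from this PR: restore mode
restoreUsingExistingVolume: false  # from this PR: create a new volume instead of reusing the backup's PVC

# Everything below is an assumption for illustration only.
clusterName: target-cluster        # placeholder name of the cluster to restore into
name: fullbackup-example           # placeholder name of the backup to load
storage:
  size: 100Gi                      # placeholder size for the newly created volume
```

Leaving `restoreUsingExistingVolume` at its default of `true` keeps the previous behavior, where the restore expects the PVC from the backup to already exist in the same namespace.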