Add troubleshooting guide for corrupt repository (#88391)
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Showing 8 changed files with 272 additions and 0 deletions.
40 changes: 40 additions & 0 deletions
...ference/tab-widgets/troubleshooting/snapshot/corrupt-repository-widget.asciidoc
++++
<div class="tabs" data-tab-group="host">
  <div role="tablist" aria-label="Re-add repository">
    <button role="tab"
            aria-selected="true"
            aria-controls="cloud-tab-readd-repo"
            id="cloud-readd-repo">
      Elasticsearch Service
    </button>
    <button role="tab"
            aria-selected="false"
            aria-controls="self-managed-tab-readd-repo"
            id="self-managed-readd-repo"
            tabindex="-1">
      Self-managed
    </button>
  </div>
  <div tabindex="0"
       role="tabpanel"
       id="cloud-tab-readd-repo"
       aria-labelledby="cloud-readd-repo">
++++

include::corrupt-repository.asciidoc[tag=cloud]

++++
  </div>
  <div tabindex="0"
       role="tabpanel"
       id="self-managed-tab-readd-repo"
       aria-labelledby="self-managed-readd-repo"
       hidden="">
++++

include::corrupt-repository.asciidoc[tag=self-managed]

++++
  </div>
</div>
++++
219 changes: 219 additions & 0 deletions
docs/reference/tab-widgets/troubleshooting/snapshot/corrupt-repository.asciidoc
// tag::cloud[]
Fixing the corrupted repository will involve making changes in multiple deployments
that write to the same snapshot repository.
Only one deployment may write to a repository. The deployment
that will keep writing to the repository is called the "primary" deployment (the current cluster),
and the other one(s), where we'll mark the repository as read-only, are the "secondary"
deployments.

First, mark the repository as read-only on the secondary deployments:

**Use {kib}**

//tag::kibana-api-ex[]
. Log in to the {ess-console}[{ecloud} console].

. On the **Elasticsearch Service** panel, click the name of your deployment.
+
NOTE: If the name of your deployment is disabled your {kib} instances might be
unhealthy, in which case please contact https://support.elastic.co[Elastic Support].
If your deployment doesn't include {kib}, all you need to do is
{cloud}/ec-access-kibana.html[enable it first].

. Open your deployment's side navigation menu (placed under the Elastic logo in the upper left corner)
and go to **Stack Management > Snapshot and Restore > Repositories**.
+
[role="screenshot"]
image::images/repositories.png[{kib} Console,align="center"]

. The repositories table should now be visible. Click the pencil icon at the
right side of the repository to be marked as read-only. On the Edit page that opens,
scroll down, check "Read-only repository", and click "Save".
Alternatively, if deleting the repository altogether is preferable, select the checkbox
at the left of the repository name in the repositories table and click the
red "Remove repository" button at the top left of the table.
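
The same change can also be made through the API, for example from the {kib} Dev Tools console. This is a sketch only: `my-repo` and the settings shown are placeholders, and the request body must contain your repository's actual configuration plus the `readonly` flag:

[source,console]
----
PUT _snapshot/my-repo
{
  "type": "s3",
  "settings": {
    "bucket": "repo-bucket",
    "client": "elastic-internal-71bcd3",
    "base_path": "myrepo",
    "readonly": true
  }
}
----
// TEST[skip:we're not setting up repos in these tests]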

At this point, only the primary (current) deployment has the repository marked
as writeable.
{es} sees it as corrupt, so the repository needs to be removed and added back so that
{es} can resume using it:

Note that we're now configuring the primary (current) deployment.

. Open the primary deployment's side navigation menu (placed under the Elastic logo in the upper left corner)
and go to **Stack Management > Snapshot and Restore > Repositories**.
+
[role="screenshot"]
image::images/repositories.png[{kib} Console,align="center"]

. Get the details of the repository we'll recreate later by clicking the repository
name in the repositories table and noting all the repository configurations
displayed on the repository details page (we'll use them when we recreate
the repository). Close the details page using the link at
the bottom left of the page.
+
[role="screenshot"]
image::images/repo_details.png[{kib} Console,align="center"]

. With all the details above noted, delete the repository. Select the
checkbox at the left of the repository and click the red "Remove repository" button
at the top left of the page.

. Recreate the repository by clicking the "Register Repository" button
at the top right corner of the repositories table.
+
[role="screenshot"]
image::images/register_repo.png[{kib} Console,align="center"]

. Fill in the repository name, select the type, and click "Next".
+
[role="screenshot"]
image::images/register_repo_details.png[{kib} Console,align="center"]

. Fill in the repository details (client, bucket, base path, etc.) with the values
you noted down before deleting the repository and click the "Register" button
at the bottom.

. Select "Verify repository" to confirm that your settings are correct and the
deployment can connect to your repository.
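
Repository verification can also be triggered through the verify repository API, for example from the {kib} Dev Tools console (`my-repo` below is a placeholder for your repository name):

[source,console]
----
POST _snapshot/my-repo/_verify
----
// TEST[skip:we're not setting up repos in these tests]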
//end::kibana-api-ex[]
// end::cloud[]

// tag::self-managed[]
Fixing the corrupted repository will involve making changes in multiple clusters
that write to the same snapshot repository.
Only one cluster may write to a repository. Let's call the cluster
we want to keep writing to the repository the "primary" cluster (the current cluster),
and the other one(s), where we'll mark the repository as read-only, the "secondary"
clusters.

Let's first work on the secondary clusters:

. Get the configuration of the repository:
+
[source,console]
----
GET _snapshot/my-repo
----
// TEST[skip:we're not setting up repos in these tests]
+
The response will look like this:
+
[source,console-result]
----
{
  "my-repo": { <1>
    "type": "s3",
    "settings": {
      "bucket": "repo-bucket",
      "client": "elastic-internal-71bcd3",
      "base_path": "myrepo"
    }
  }
}
----
// TESTRESPONSE[skip:the result is for illustrating purposes only]
+
<1> Represents the current configuration for the repository.
. Using the settings retrieved above, add the `readonly: true` option to mark
the repository as read-only:
+
[source,console]
----
PUT _snapshot/my-repo
{
  "type": "s3",
  "settings": {
    "bucket": "repo-bucket",
    "client": "elastic-internal-71bcd3",
    "base_path": "myrepo",
    "readonly": true <1>
  }
}
----
// TEST[skip:we're not setting up repos in these tests]
+
<1> Marks the repository as read-only.
. Alternatively, you can delete the repository:
+
[source,console]
----
DELETE _snapshot/my-repo
----
// TEST[skip:we're not setting up repos in these tests]
+
The response will look like this:
+
[source,console-result]
------------------------------------------------------------------------------
{
  "acknowledged": true
}
------------------------------------------------------------------------------
// TESTRESPONSE[skip:the result is for illustrating purposes only]

At this point, only the primary (current) cluster has the repository marked
as writeable.
{es} sees it as corrupt though, so let's remove the repository and recreate it so that
{es} can resume using it.

Note that we're now configuring the primary (current) cluster.

. Get the configuration of the repository and save it, as we'll use it
to recreate the repository:
+
[source,console]
----
GET _snapshot/my-repo
----
// TEST[skip:we're not setting up repos in these tests]
. Delete the repository:
+
[source,console]
----
DELETE _snapshot/my-repo
----
// TEST[skip:we're not setting up repos in these tests]
+
The response will look like this:
+
[source,console-result]
------------------------------------------------------------------------------
{
  "acknowledged": true
}
------------------------------------------------------------------------------
// TESTRESPONSE[skip:the result is for illustrating purposes only]
. Using the configuration we obtained above, recreate the repository:
+
[source,console]
----
PUT _snapshot/my-repo
{
  "type": "s3",
  "settings": {
    "bucket": "repo-bucket",
    "client": "elastic-internal-71bcd3",
    "base_path": "myrepo"
  }
}
----
// TEST[skip:we're not setting up repos in these tests]
+
The response will look like this:
+
[source,console-result]
------------------------------------------------------------------------------
{
  "acknowledged": true
}
------------------------------------------------------------------------------
// TESTRESPONSE[skip:the result is for illustrating purposes only]
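. Optionally, verify the recreated repository so the cluster confirms it can
connect to it (this sketch uses the same example repository name, `my-repo`):
+
[source,console]
----
POST _snapshot/my-repo/_verify
----
// TEST[skip:we're not setting up repos in these tests]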
// end::self-managed[]
11 changes: 11 additions & 0 deletions
docs/reference/troubleshooting/snapshot/add-repository.asciidoc
[[add-repository]]
== Multiple deployments writing to the same snapshot repository

Multiple {es} deployments are writing to the same snapshot repository. {es} doesn't
support this configuration; only one cluster is allowed to write to a given
repository.
To remedy the situation, mark the repository as read-only or remove it from all the
other deployments, then re-add (recreate) the repository in the current deployment:

include::{es-repo-dir}/tab-widgets/troubleshooting/snapshot/corrupt-repository-widget.asciidoc[]