Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Index Deletion During Partial Snapshot Create #50234

Merged

Conversation

original-brownbear
Copy link
Member

We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to #50202
Closes #50200

We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to elastic#50202
Closes elastic#50200
@original-brownbear original-brownbear added >bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.6.0 labels Dec 16, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@original-brownbear
Copy link
Member Author

@ywelsch turns out for 7.6+ this is a really trivial fix. 7.5 requires a different fix though since we use a list of indices to track what indices to finalize instead of the shard generations object.

@original-brownbear
Copy link
Member Author

Jenkins run elasticsearch-ci/bwc (known Docker issue)

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this affect the "create snapshot" response? Will the index still be returned as part of the response (i.e. returned as successfully snapshotted index)?

snapshot.indices().forEach(idx -> indexLookup.put(idx.getName(), idx));
snapshot.shards().forEach(c -> builder.put(indexLookup.get(c.key.getIndexName()), c.key.id(), c.value.generation()));
snapshot.indices().forEach(idx -> {
if (metaData.index(idx.getName()) != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit annoying that we have to lookup by index name here, not index uuid. In theory, an index could have been deleted and recreated between snapshot start and completion. Do we have index uuid perhaps in the shard snapshots?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, this is actually a possibility (I reproduced it with the test added in d219dfa) and fixed it now by using the Index instances in the shards map from SnapshotInProgress :)

@original-brownbear
Copy link
Member Author

original-brownbear commented Dec 16, 2019

@ywelsch thanks for taking a look.

How does this affect the "create snapshot" response? Will the index still be returned as part of the response (i.e. returned as successfully snapshotted index)?

The response uses SnapshotInfo which is built from ShardGenerations so all is well here :) I did in fact find that the totalShards was off though and fixed that to be computed from the ShardGenerations for partial snapshots as well and it's covered by the new test.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@original-brownbear
Copy link
Member Author

Thanks Yannick!

@original-brownbear original-brownbear merged commit e70c7c9 into elastic:master Dec 17, 2019
@original-brownbear original-brownbear deleted the partial-snapshot-behavior branch December 17, 2019 08:42
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Dec 17, 2019
* Fix Index Deletion During Partial Snapshot Create

We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to elastic#50202
Closes elastic#50200
original-brownbear added a commit that referenced this pull request Dec 17, 2019
We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to #50202
Closes #50200
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020
* Fix Index Deletion During Partial Snapshot Create

We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to elastic#50202
Closes elastic#50200
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 26, 2020
We make sure to filter shard generations for indices that are missing
from the metadata when finalizing a partial snapshot (from concurrent index deletion)
but we failed to account for the case where we manually build a fake metadata instance
for snapshots without the global state.
Fixed this by handling missing indices by skipping, same way we do it for filtering the
shard generations.

Relates elastic#50234
original-brownbear added a commit that referenced this pull request Apr 27, 2020
We make sure to filter shard generations for indices that are missing
from the metadata when finalizing a partial snapshot (from concurrent index deletion)
but we failed to account for the case where we manually build a fake metadata instance
for snapshots without the global state.
Fixed this by handling missing indices by skipping, same way we do it for filtering the
shard generations.

Relates #50234
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 27, 2020
We make sure to filter shard generations for indices that are missing
from the metadata when finalizing a partial snapshot (from concurrent index deletion)
but we failed to account for the case where we manually build a fake metadata instance
for snapshots without the global state.
Fixed this by handling missing indices by skipping, same way we do it for filtering the
shard generations.

Relates elastic#50234
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 27, 2020
We make sure to filter shard generations for indices that are missing
from the metadata when finalizing a partial snapshot (from concurrent index deletion)
but we failed to account for the case where we manually build a fake metadata instance
for snapshots without the global state.
Fixed this by handling missing indices by skipping, same way we do it for filtering the
shard generations.

Relates elastic#50234
original-brownbear added a commit that referenced this pull request Apr 27, 2020
We make sure to filter shard generations for indices that are missing
from the metadata when finalizing a partial snapshot (from concurrent index deletion)
but we failed to account for the case where we manually build a fake metadata instance
for snapshots without the global state.
Fixed this by handling missing indices by skipping, same way we do it for filtering the
shard generations.

Relates #50234
original-brownbear added a commit that referenced this pull request Apr 27, 2020
We make sure to filter shard generations for indices that are missing
from the metadata when finalizing a partial snapshot (from concurrent index deletion)
but we failed to account for the case where we manually build a fake metadata instance
for snapshots without the global state.
Fixed this by handling missing indices by skipping, same way we do it for filtering the
shard generations.

Relates #50234
@original-brownbear original-brownbear restored the partial-snapshot-behavior branch August 6, 2020 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v7.6.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Snapshot Finalization Throws NPE in 7.5
4 participants