storage: avoid RaftGroupDeletedError from RHS in splitTrigger #39571

Merged: 1 commit, Aug 13, 2019
47 changes: 46 additions & 1 deletion pkg/storage/store.go
@@ -2134,8 +2134,44 @@ func splitPostApply(
log.Fatalf(ctx, "unable to find RHS replica: %+v", err)
}
{
// Already holding raftMu, see above.
rightRng.mu.Lock()
// The right hand side of the split may have been removed and re-added
// in the meantime, and the replicaID in RightDesc may be stale.
// Consequently the call below may fail with a RaftGroupDeletedError. In
// general, this protects earlier incarnations of the replica that were
// since replicaGC'ed from reneging on promises made earlier (for
// example, once the HardState is removed, a replica could cast a
// different vote for the same term).
//
// It is safe to circumvent that in this case because the RHS must have
// remained uninitialized (the LHS blocks all user data, and since our
// Raft logs start at a nonzero index a snapshot must go through before
// any log entries are appended). This means the data in that range is
// just a HardState which we "logically" snapshot by assigning it data
// formerly located within the LHS.
//
// Note that if we ever have a way to replicaGC uninitialized replicas,
// the RHS may have been gc'ed and so the HardState would be gone. In
// that case, the requirement that the HardState remains would have been
// violated if the bypass below were used, which is why we place an
// assertion.
//
// See:
// https://github.com/cockroachdb/cockroach/issues/21146#issuecomment-365757329
tombstoneKey := keys.RaftTombstoneKey(rightRng.RangeID)
var tombstone roachpb.RaftTombstone
if ok, err := engine.MVCCGetProto(
ctx, r.store.Engine(), tombstoneKey, hlc.Timestamp{}, &tombstone, engine.MVCCGetOptions{},
); err != nil {
log.Fatalf(ctx, "unable to load tombstone for RHS: %+v", err)
} else if ok {
log.Fatalf(ctx, "split trigger found right-hand side with tombstone %+v: %v", tombstone, rightRng)
}
// NB: the safety argument above implies that we don't have to worry
// about restoring the existing minReplicaID if it's nonzero. No
// promises have been made, so none need to be kept.
rightRng.mu.minReplicaID = 0
err := rightRng.initRaftMuLockedReplicaMuLocked(&split.RightDesc, r.store.Clock(), 0)
rightRng.mu.Unlock()
if err != nil {
@@ -2536,6 +2572,15 @@ func (s *Store) removeReplicaImpl(
return err
}

if !rep.IsInitialized() {
// The split trigger relies on the fact that it can bypass the tombstone
// check for the RHS, but this is only true as long as we never delete
// its HardState.
//
// See the comment in splitPostApply for details.
log.Fatalf(ctx, "can not replicaGC uninitialized replicas")
}

// During merges, the context might have the subsuming range, so we explicitly
// log the replica to be removed.
log.Infof(ctx, "removing replica r%d/%d", rep.RangeID, replicaID)