etcdserver/*, wal/*: changes to snapshots and WAL logic to fix #10219 #10356
Codecov Report
@@ Coverage Diff @@
## master #10356 +/- ##
=========================================
- Coverage 71.6% 71.51% -0.1%
=========================================
Files 392 392
Lines 36461 36522 +61
=========================================
+ Hits 26109 26119 +10
- Misses 8524 8576 +52
+ Partials 1828 1827 -1
cc @jpbetz
@@ -125,6 +126,23 @@ func (s *Snapshotter) Load() (*raftpb.Snapshot, error) {
	return snap, nil
}

func (s *Snapshotter) LoadIndex(i uint64) (*raftpb.Snapshot, error) {
Let's document this function.
		return nil, err

	// Find a snapshot to start/restart a raft node
	for i := uint64(0); ; i++ {
Can this be extracted out into a separate function?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I'd like to get this fixed. @brk0v any interest in following up? If not, I'm happy to shepherd it through.
@jingyih When you get some cycles, would you also review this? I'd like to get this fix (or some variant of it) merged.
Superseded by #11888 (which retains the commits from this PR with full attribution).
Proposal to fix inconsistent state between WAL and saved Snapshot #10219
Saving logic was changed from the current order:
1. storage.Save()
2. storage.WAL.SaveSnapshot()
3. storage.Snapshotter.SaveSnap()
4. storage.WAL.ReleaseLockTo()
5. raftStorage.ApplySnapshot()

to:
1. storage.SaveSnapshot()
2. storage.SaveAll()
3. raftStorage.ApplySnapshot()
4. storage.Release()

Also added a snapshot check during node start/restart to prevent loading a snapshot that has no corresponding WAL record.
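The proposed ordering can be illustrated with a small sketch. It models only the sequence of steps named in this description. The `recorder` type, the `failAt` field, and `persistSnapshot` are hypothetical and are not etcd APIs, and what each real step does internally is not modeled. The point it tries to show is that the release step runs last, and only if every earlier step succeeded.

```go
package main

import "fmt"

// recorder is a hypothetical stand-in for the server storage and the raft
// storage. It only records the order in which steps run.
type recorder struct {
	calls  []string
	failAt string // optional: name of a step that should fail, for illustration
}

// step records the call and fails if it matches failAt.
func (r *recorder) step(name string) error {
	r.calls = append(r.calls, name)
	if name == r.failAt {
		return fmt.Errorf("%s failed", name)
	}
	return nil
}

// persistSnapshot runs the steps in the order proposed in this PR and stops
// at the first error, so nothing is released after a failed save or apply.
func persistSnapshot(r *recorder) error {
	steps := []string{
		"storage.SaveSnapshot",
		"storage.SaveAll",
		"raftStorage.ApplySnapshot",
		"storage.Release",
	}
	for _, s := range steps {
		if err := r.step(s); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	ok := &recorder{}
	if err := persistSnapshot(ok); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println(ok.calls)

	// Simulate a failure in the second step; Release is never reached.
	bad := &recorder{failAt: "storage.SaveAll"}
	fmt.Println(persistSnapshot(bad), bad.calls)
}
```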