Follower keep rejecting MsgAppend from leader #428
Comments
Can it be reproduced reliably? I guess it's caused by not overwriting conflicting entries correctly. /cc @gengliqi I believe Lines 521 to 523 in a5423b2
It would be helpful if you shared the storage implementation and how ready is processed.
I don't know how to reproduce it reliably, but it can occur when the network load is heavy and messages are delayed for a long time. It's not a rare case.
ready handling: It's the same as
storage implementation: It's a very poor implementation, basically the same as MemStorage's but storing state in files. It has the problem that state is not stored atomically. Since no peers restarted, I don't think that is the cause here.
The following is wrong if the ready's entries conflict with the vector. The vector should be truncated first and then appended to. An example can be found here: Lines 296 to 298 in a5423b2
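To illustrate the truncate-then-append advice, here is a minimal sketch. It is not the raft-rs code referenced above: `Entry` and `append_overwriting` are hypothetical names, and the log is assumed to be a plain vector sorted by index.

```rust
// Hypothetical stand-in for a raft log entry; only the fields needed here.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    index: u64,
    term: u64,
}

/// Append `new_entries` to `log`, first dropping every stored entry whose
/// index is at or after the index of the first new entry.
/// Assumes `log` is sorted by index.
fn append_overwriting(log: &mut Vec<Entry>, new_entries: &[Entry]) {
    let first_new = match new_entries.first() {
        Some(e) => e.index,
        None => return, // nothing to append
    };
    // Number of stored entries that precede the first new entry.
    let keep = log.partition_point(|e| e.index < first_new);
    // Discard the conflicting suffix, then append the new entries.
    log.truncate(keep);
    log.extend_from_slice(new_entries);
}
```

With a stored entry (index 3827, term 15) and an incoming entry (index 3827, term 16), this leaves only the term 16 entry at index 3827. A plain push would keep both.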
Oh, thanks! I'll fix that and see if the problem is solved. By the way, what's the recommended version of
TiKV uses a version that is almost the same as master. The latest master fixes bugs in general cases that don't affect TiKV, so TiKV hasn't updated to it yet. There are no known bugs in the latest master, and I hope we can release a new version once the documentation is polished.
If the term check fails, it generally means the entry at this index is stale and there should be an entry with a different term at this index.
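As a rough sketch of what such a term check looks like on the follower side (an illustration of the standard Raft rule, not the raft-rs implementation; `Entry`, `term_at`, and `accepts_append` are hypothetical names):

```rust
// Hypothetical stand-in for a raft log entry.
struct Entry {
    index: u64,
    term: u64,
}

/// Term of the stored entry at `index`, or None if no such entry exists.
fn term_at(log: &[Entry], index: u64) -> Option<u64> {
    log.iter().find(|e| e.index == index).map(|e| e.term)
}

/// An append that follows (prev_index, prev_term) is accepted only if the
/// follower holds an entry at prev_index with exactly prev_term.
fn accepts_append(log: &[Entry], prev_index: u64, prev_term: u64) -> bool {
    term_at(log, prev_index) == Some(prev_term)
}
```

If the follower still holds (index 3827, term 15), an append whose previous entry is (index 3827, term 16) fails this check until the stale entry is replaced.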
Hi, I've encountered a problem where a follower (the old leader) keeps rejecting MsgAppend from the new leader.
There are three peers: peer_1, peer_2, and peer_3.
peer_2 is the leader at term 15 in the beginning.
peer_3 becomes a candidate and requests votes from peer_1 and peer_2.
peer_1 votes for peer_3, and peer_3 becomes leader at term 16.
peer_2, as the term 15 leader, appended an entry (index 3827, term 15).
peer_2 receives a message with a higher term and becomes a follower at term 16.
peer_3 tries to append entry (index 3827, term 16) to peer_2.
peer_2 finds a conflict at index 3827 (conflicting term 16, existing term 15).
peer_3 tries to append entry (index 3828, term 16) to peer_2, but peer_2 rejects it because of a term mismatch at index 3827 (conflicting term 16, existing term 15).
peer_3 keeps appending these two entries to peer_2:
peer_2 gets stuck, but consensus is still proceeding between peer_1 and peer_3.
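The repeating conflict in the sequence above can be replayed with a small sketch. It assumes, hypothetically, that the follower's persist step pushes incoming entries without truncating and that reads return the first stored entry for an index. All names here are illustrative and not taken from raft-rs or from the reporter's code.

```rust
// Hypothetical stand-in for a raft log entry.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    index: u64,
    term: u64,
}

/// Term of the first stored entry with this index, if any.
fn term_at(log: &[Entry], index: u64) -> Option<u64> {
    log.iter().find(|e| e.index == index).map(|e| e.term)
}

/// Index of the first incoming entry that is missing from `log` or is
/// stored with a different term; None if every incoming entry matches.
fn first_conflict(log: &[Entry], incoming: &[Entry]) -> Option<u64> {
    incoming
        .iter()
        .find(|e| term_at(log, e.index) != Some(e.term))
        .map(|e| e.index)
}

/// Assumed buggy persist step: pushes without dropping conflicting entries.
fn persist_without_truncate(log: &mut Vec<Entry>, incoming: &[Entry]) {
    log.extend_from_slice(incoming);
}
```

Under these assumptions, persisting (index 3827, term 16) leaves the term 15 entry in place, so the same conflict at index 3827 is reported on every retry.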
It looks like the old entry (index 3827, term 15) in peer_2 isn't overwritten by the new leader's MsgAppend.
This issue has happened several times since I updated raft-rs from a six-month-old master to the latest master.
I've updated my ready handling and storage implementation according to the latest five_mem_node example and recent commits, and no peers were restarted during the whole time.
The raft-rs I used is the latest master branch at rev a5423b241ed641ca0941c816153a9dfc0fdbfad9.
I'm not sure whether it's a problem with my implementation. Can someone help me with it?