
tocommit(3730) is out of range [lastIndex(0)]. Was the raft log corrupted, truncated, or lost #16220

Closed
qixiaoyang0 opened this issue Jul 11, 2023 · 5 comments · Fixed by #17078

Comments


qixiaoyang0 commented Jul 11, 2023

What would you like to be added?

The heartbeat sent by the leader contains the leader's committed log index. If that index is higher than the follower's last log index, the follower panics.
etcd should recheck the log instead of exiting the process.
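
For context, here is a minimal, self-contained Go sketch of the check that produces this panic. It paraphrases `raftLog.commitTo` from the vendored raft package (log.go in the stack trace below); the `raftLog` struct and its `last` field here are simplified stand-ins for illustration, not the library's actual types.

```go
package main

import "fmt"

// raftLog is a stand-in for the raft library's log type, keeping only the
// fields needed to illustrate the commit check.
type raftLog struct {
	committed uint64 // highest index known to be committed locally
	last      uint64 // index of the last entry actually stored in the log
}

func (l *raftLog) lastIndex() uint64 { return l.last }

// commitTo mirrors the check behind the reported panic: a follower may only
// advance its commit index up to the last entry it has actually stored.
func (l *raftLog) commitTo(tocommit uint64) {
	if l.committed < tocommit { // never decrease the commit index
		if l.lastIndex() < tocommit {
			panic(fmt.Sprintf("tocommit(%d) is out of range [lastIndex(%d)]. Was the raft log corrupted, truncated, or lost?",
				tocommit, l.lastIndex()))
		}
		l.committed = tocommit
	}
}

func main() {
	// A follower with an empty log (lastIndex == 0) applies a heartbeat that
	// carries the leader's commit index 3730 -- this reproduces the panic text.
	l := &raftLog{committed: 0, last: 0}
	l.commitTo(3730)
}
```

In the real code path, `handleHeartbeat` passes the commit index carried in the leader's heartbeat into this check, which is why a heartbeat alone can crash a follower whose log is far behind.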

Why is this needed?

We deployed a 3-node cluster and tested it. One of the test cases injects a 65% packet error rate on the network between nodes. About 10 minutes into the test, a follower node restarted. The log shows the panic stack:

```
msg:
tocommit(3730) is out of range [lastIndex(0)]. Was the raft log corrupted, truncated, or lost?
stacktrace:
vendor/go.etcd.io/etcd/server/v3/etcdserver.(*zapRaftLogger).Panicf
	vendor/go.etcd.io/etcd/server/v3/etcdserver/zap_raft.go:101
vendor/go.etcd.io/etcd/raft/v3.(*raftLog).commitTo
	vendor/go.etcd.io/etcd/raft/v3/log.go:237
vendor/go.etcd.io/etcd/raft/v3.(*raft).handleHeartbeat
	vendor/go.etcd.io/etcd/raft/v3/raft.go:1509
vendor/go.etcd.io/etcd/raft/v3.stepFollower
	vendor/go.etcd.io/etcd/raft/v3/raft.go:1435
vendor/go.etcd.io/etcd/raft/v3.(*raft).Step
	vendor/go.etcd.io/etcd/raft/v3/raft.go:975
vendor/go.etcd.io/etcd/raft/v3.(*node).run
	vendor/go.etcd.io/etcd/raft/v3/node.go:356
```

We have found similar issues: #13509, #15699.


chaochn47 commented Jul 11, 2023

Duplicate of etcd-io/raft#18. Please refer to the issue in the raft repo. Thanks!

Closing current issue.

@CabinfeverB

Hi @qixiaoyang0, I would like to know whether the follower that experienced the panic had restarted before the panic. I also encountered a similar problem, and from the logs I can confirm that the followers did not restart. But I don't think this could have happened without a restart. PTAL @chaochn47

@qixiaoyang0 (Author)

> Hi @qixiaoyang0, I would like to know whether the follower that experienced the panic had restarted before the panic. I also encountered a similar problem, and from the logs I can confirm that the followers did not restart. But I don't think this could have happened without a restart. PTAL @chaochn47

The follower process exits because of the panic. The process may exit too quickly to output the panic message in your system.

@CabinfeverB

What I meant is, if it's just a network issue, it should not cause a panic. I want to confirm with you if there were any other indications before the panic occurred.

@qixiaoyang0 (Author)

> What I meant is, if it's just a network issue, it should not cause a panic. I want to confirm with you if there were any other indications before the panic occurred.

There were no other indications in my test case.
