What happened?
We have a deployment of 3 pods that had been running for the past 90 days. After 90 days, pod-0 suddenly started restarting. We applied a workaround: deleting the pod-0 member ID, deleting pod-0, and then deleting the WAL file of pod-0. After this workaround, we found the error below in the pod-0 log:
What did you expect to happen?
The pod should run without restarting and without the error panic: tocommit(587192) is out of range [lastIndex(587189)].
How can we reproduce it (as minimally and precisely as possible)?
1. Delete the member ID of pod-0.
2. Delete pod-0.
3. Delete the WAL file of pod-0.
After the above steps, the error panic: tocommit(587192) is out of range [lastIndex(587189)] appeared in the pod-0 logs.
Anything else we need to know?
We found a similar issue, #13509, but the workaround/solution there did not work for us.
Etcd version (please run commands below)
bash-4.4$ etcd --version
etcd Version: 3.5.3
Git SHA: 0452fee
Go Version: go1.16.15
Go OS/Arch: linux/amd64
bash-4.4$ etcdctl version
etcdctl version: 3.5.3
API version: 3.5
Etcd configuration (command line flags or environment variables)
paste your configuration here
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
$ etcdctl member list -w table
# paste output here
$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here
Essentially, you can use the etcdctl snapshot functionality on one of your two live members and then restore the failing third member with etcdutl snapshot restore.
I'm going to close this issue as I don't believe there is an etcd bug here, and the etcd operations guide I've linked should assist you to resolve the problematic member.
You're welcome to reply below with new information if you would like this to be re-opened. If you do run into any issues with the operations guide I linked, please let us know by raising an issue on the etcd-io/website repository.
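For anyone landing here later, the two commands named in the reply above can be sketched roughly as follows. This is only an illustration under assumptions, not a verified recovery procedure: the endpoint, member name, peer URLs, and paths are hypothetical placeholders that do not come from this issue, and the operations guide mentioned in the reply remains the authority on the full sequence (including how the member is re-registered with the cluster). The script defaults to a dry run that prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder, not taken from the issue.
# With DRY_RUN=1 (the default) commands are printed rather than executed.
DRY_RUN="${DRY_RUN:-1}"
HEALTHY_ENDPOINT="${HEALTHY_ENDPOINT:-http://etcd-1.example:2379}"   # assumed live member
SNAPSHOT="${SNAPSHOT:-/tmp/etcd-snapshot.db}"                        # assumed snapshot path
MEMBER_NAME="${MEMBER_NAME:-etcd-0}"                                 # assumed failing member
PEER_URL="${PEER_URL:-http://etcd-0.example:2380}"                   # assumed peer URL
INITIAL_CLUSTER="${INITIAL_CLUSTER:-etcd-0=http://etcd-0.example:2380,etcd-1=http://etcd-1.example:2380,etcd-2=http://etcd-2.example:2380}"
DATA_DIR="${DATA_DIR:-/var/lib/etcd-restored}"                       # must be a new, empty directory

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Take a snapshot from one of the healthy members.
run etcdctl --endpoints="$HEALTHY_ENDPOINT" snapshot save "$SNAPSHOT"

# 2. Restore the snapshot into a fresh data directory for the failing member.
run etcdutl snapshot restore "$SNAPSHOT" \
  --name "$MEMBER_NAME" \
  --initial-cluster "$INITIAL_CLUSTER" \
  --initial-advertise-peer-urls "$PEER_URL" \
  --data-dir "$DATA_DIR"
```

Running it as is only prints the two commands; setting DRY_RUN=0 would execute them, which should be done only after the placeholders have been replaced with the real values for the cluster and checked against the operations guide.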