This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

parity often stops syncing and needs to be restarted #11854

Open
TomWang10 opened this issue Aug 12, 2020 · 15 comments

Comments

@TomWang10

OpenEthereum version: OpenEthereum/v3.0.1-stable-8ca8089-20200601/x86_64-unknown-linux-gnu/rustc1.43.1
Operating system: Linux
Installation: Binary executable
Fully synchronized: yes
Network: ethereum
Restarted: yes

parity often stops syncing and needs to be restarted

rakita commented Aug 14, 2020

Hm, not enough information to deduce anything about the problem.

gsalzer commented Aug 17, 2020

Same here, over and over again. Every other day.

gsalzer commented Aug 17, 2020

And when sending openethereum the signal to stop, it acknowledges it in the log file ("Finishing work, please wait..."), but it won't stop. After killing the process it restarts without problem.
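The manual procedure described here (send TERM, wait, then kill) can be automated. The following is only a minimal sketch in Python, assuming the node was started by the supervising script as a `subprocess.Popen` object; the function name `stop_with_timeout` and the grace period are hypothetical and not part of OpenEthereum.

```python
import signal
import subprocess


def stop_with_timeout(proc: subprocess.Popen, grace_seconds: float) -> str:
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    Returns "terminated" if the process exited after SIGTERM,
    or "killed" if it had to be force-killed.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        # The process acknowledged nothing in time; SIGKILL cannot be ignored.
        proc.kill()
        proc.wait()
        return "killed"
```

A force-kill skips any clean shutdown work, so the grace period should be generous; the reports in this thread waited between 9 and 15 minutes before killing.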

gsalzer commented Aug 17, 2020

"Hm, not enough information to deduce anything about problem."

How can we provide enough information? Any particular debug flags to activate?

rakita commented Aug 20, 2020

Hello @gsalzer, you can start by including logs and giving us more information on your setup. Saying that the app just stops syncing is a very broad statement.

@TomWang10 Saying "often stops" does not help, because "often" is a very relative word: does it mean once per hour, day, or week?

gsalzer commented Aug 21, 2020

@rakita My setup is identical to TomWang10's in the original posting. See the toml below.

"And saying that the app just stops syncing is very broad sentence."

There is not much more to say than I already said above: openethereum just hangs without consuming many resources except the memory it sits on, doesn't write any log messages, and doesn't react properly to TERM signals. In the log below you see two such incidents:

  • Openethereum stops importing at block 1063068 on 2020-08-10 08:55:42 and hangs until I discover it a few days later. I send a TERM signal on 2020-08-13 12:16:37, which is acknowledged by the line "Finishing work", but openethereum does not stop. 15 minutes later I kill and restart it, after which openethereum starts syncing.
  • Same on 2020-08-17 02:43:42 at block 10674419. I send a TERM signal 8 hours later, kill openethereum after waiting 9 minutes, and openethereum starts syncing again.
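Since both incidents show up as a log file that simply stops growing, one way to notice a hang sooner is to check how long ago the log was last written. This is a rough sketch in Python, not an official tool; the function name `log_is_stale` and the threshold are assumptions chosen for illustration.

```python
import os
import time
from typing import Optional


def log_is_stale(
    log_path: str, max_silence_seconds: float, now: Optional[float] = None
) -> bool:
    """Return True if the log file was last modified more than
    max_silence_seconds before `now` (defaults to the current time)."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(log_path) > max_silence_seconds
```

Run from cron or a similar scheduler, a call such as `log_is_stale("openethereum.log", 600)` would flag ten minutes of silence. The threshold should be longer than the node's normal interval between periodic log lines.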

openethereum.log

config.txt

rakita commented Aug 21, 2020

@gsalzer yep, this looks like issue #11758. The fact that there are no logs (not even the periodic ones) for hours fits that issue.

You can read more on the topic there.

liamaharon commented Aug 24, 2020

Same here. OpenEthereum is not viable for certain production use cases until this is fixed.

cogmon commented Aug 24, 2020

"Same here. OpenEthereum is useless until this is fixed."

Totally, same here. OpenEthereum is broken for node operators.

adria0 commented Aug 24, 2020

@TomWang10 @gsalzer @liamaharon @cogmon

The current OE team released 3.0 from the 2.7 branch without being aware that 2.7 was not production-ready. We have been working on these issues for months without success, and the Berlin hard fork gets closer every day.

After seeing that there are a lot of problems with the DB version upgrade and super-hard-to-catch async errors, we finally decided to backport, starting from the 2.5.13 branch, which is the most stable version known at the moment. We expect to release 3.1, based on 2.5.13, by mid-September.

cogmon commented Aug 24, 2020

@adria0

In the release notes of v2.7.2, it doesn't say "not production-ready". The "stable" in the release name suggests differently.

Do I understand correctly that anyone who updated to 2.7.2 is now screwed and has to resync - which for certain node configurations is a matter of several months?

The problem seems to be the production release of 2.7 in the first place, not that OE builds on top of 3.0.

roninkaizen commented Aug 27, 2020

In my purely personal experience, limiting the cache size to 4096 brought my nodes back to stability. The next step would have been reducing the peer count to 15.
To be clear: I do not run them in archive mode or with a persistent tx queue, but JSON-RPC is enabled, on mainnet.

[image: grafik]
And yes, the versions mentioned above were all compiled and tested :) to give an idea of how persistently I have been watching this.

mmwanga commented Sep 3, 2020

This is happening to me too on version 3.0.1. The absence of logs for a period of time is sufficient evidence of this issue.

./openEthereum/openethereum -d /data/parity/ --jsonrpc-interface all --jsonrpc-apis all --jsonrpc-cors "*" --geth --tx-queue-mem-limit=256 --tx-queue-size=8192 --cache-size=4096
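With JSON-RPC enabled as in the command above, a stalled sync can also be spotted from outside by polling the standard `eth_blockNumber` method and checking whether the height still advances. The sketch below is an illustration under assumptions, not a tested monitoring tool: the class name `StallDetector`, the threshold, and the endpoint URL in the usage note are hypothetical.

```python
import json
import urllib.request
from typing import Optional


def fetch_block_number(rpc_url: str, timeout: float = 5.0) -> int:
    """Ask the node for its latest block via the eth_blockNumber JSON-RPC call."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "method": "eth_blockNumber", "params": [], "id": 1}
    ).encode()
    request = urllib.request.Request(
        rpc_url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # The result is a hex string such as "0xa2d5f3".
        return int(json.load(response)["result"], 16)


class StallDetector:
    """Tracks when the reported block height last advanced."""

    def __init__(self, max_stall_seconds: float) -> None:
        self.max_stall_seconds = max_stall_seconds
        self.best_block: Optional[int] = None
        self.last_progress: Optional[float] = None

    def observe(self, block_number: int, now: float) -> bool:
        """Record a height reading; return True if the node looks stalled."""
        if self.best_block is None or block_number > self.best_block:
            self.best_block = block_number
            self.last_progress = now
            return False
        return now - self.last_progress > self.max_stall_seconds
```

A polling loop would call `detector.observe(fetch_block_number("http://127.0.0.1:8545"), time.time())` every minute or so, assuming the node listens on that address. A hung node may not answer RPC at all, so a timeout or connection error from `fetch_block_number` should probably be treated as a warning sign too.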

mmwanga commented Sep 15, 2020

"@TomWang10 @gsalzer @liamaharon @cogmon

The current OE team released 3.0 from the 2.7 branch without being aware that 2.7 was not production-ready. We have been working on these issues for months without success, and the Berlin hard fork gets closer every day.

After seeing that there are a lot of problems with the DB version upgrade and super-hard-to-catch async errors, we finally decided to backport, starting from the 2.5.13 branch, which is the most stable version known at the moment. We expect to release 3.1, based on 2.5.13, by mid-September."

Any updates on this? Eagerly waiting

adria0 commented Sep 16, 2020

"Any updates on this? Eagerly waiting"

Please, read https://github.com/openethereum/openethereum/issues/11858

9 participants