Before #2057 I had some issues with the rialto nodes. They were consuming ~300MB at startup and then growing until taking up the entire memory of the system, at which point the system froze.
We should check whether the rialto parachain not producing blocks was causing this. If so, we should see whether that's expected or whether it's caused by a wrong rialto configuration.
Found the issue. It was because of the node_impl_version error, not because of rialto-parachain not producing blocks.
I'm not very familiar with this logic, but at a high level what was happening was:
1. At some point this logic was called, which executes prepare-worker internally. This was failing because of the node_impl_version problem, so it was retried in an infinite loop, once every 3 seconds. This doesn't cause any issue in itself.
2. For each block after that, the PVF logic would try to get the result of this validation, and the code would get here because the validation was still in progress (in the infinite loop from point 1). So it would do awaiting_prepare.add(...), which basically adds a handler to be called when the validation is ready. This would slowly fill the memory with these handlers.
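To illustrate the growth pattern described above, here is a minimal sketch. It is not the actual PVF host code: `Host`, `ArtifactState`, `execute_pvf`, `on_prepare_attempt` and `pending_waiters` are hypothetical names made up for this example, and only the `awaiting_prepare` name comes from the description above. It only assumes that a failed preparation leaves the artifact in the "preparing" state and that every new request registers one more waiter.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified state of one artifact (not the real PVF types).
enum ArtifactState {
    /// Preparation is still in progress; `awaiting_prepare` holds the ids of
    /// the requests to notify once it finishes.
    Preparing { awaiting_prepare: Vec<u64> },
    /// Preparation finished successfully.
    Prepared,
}

/// Hypothetical host that maps an artifact id to its state.
struct Host {
    artifacts: HashMap<u32, ArtifactState>,
}

impl Host {
    fn new() -> Self {
        Host { artifacts: HashMap::new() }
    }

    /// Called once per block: asks for the validation result of `artifact`.
    /// While the artifact is still being prepared, the request is queued.
    fn execute_pvf(&mut self, artifact: u32, request_id: u64) {
        let state = self
            .artifacts
            .entry(artifact)
            .or_insert(ArtifactState::Preparing { awaiting_prepare: Vec::new() });
        match state {
            ArtifactState::Preparing { awaiting_prepare } => awaiting_prepare.push(request_id),
            ArtifactState::Prepared => {}
        }
    }

    /// One preparation attempt. On failure nothing changes (the attempt is
    /// assumed to be retried later), so queued requests are kept. On success
    /// the queued requests are returned so they can be notified.
    fn on_prepare_attempt(&mut self, artifact: u32, succeeded: bool) -> Vec<u64> {
        if !succeeded {
            return Vec::new();
        }
        match self.artifacts.insert(artifact, ArtifactState::Prepared) {
            Some(ArtifactState::Preparing { awaiting_prepare }) => awaiting_prepare,
            _ => Vec::new(),
        }
    }

    /// Number of requests currently queued for `artifact`.
    fn pending_waiters(&self, artifact: u32) -> usize {
        match self.artifacts.get(&artifact) {
            Some(ArtifactState::Preparing { awaiting_prepare }) => awaiting_prepare.len(),
            _ => 0,
        }
    }
}
```

In this sketch, calling `execute_pvf` once per block while every `on_prepare_attempt` fails makes `pending_waiters` grow by one per block without bound. The queue is only drained when a preparation attempt finally succeeds.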
I'm not very familiar with this logic. Maybe this failure could be handled better, but I guess that to some extent this is expected if prepare-worker has issues. I'll be thinking a bit more about it. Not sure if it's an issue at all.