Lotus market gets unresponsive with too many active transfers #7264
Comments
Example of hanging deals from Estuary, causing the market node to freeze up on certain commands:
|
@benjaminh83 could you follow up later to see if gs 0.9.1 in v1.11.3-rc1 helps? |
It was not fixed in v1.11.3-rc3:
No response after a couple of minutes. Now I have this additional process hanging there...
And the logs on the market node say:
But nothing happens. The cancel process is still hanging and the transfer is still on the list. |
$ lotus-miner --call-on-markets stop
And we see the logs go:
So now I have to kill the market node, start it again, and then cancel all the transfers. Same issue, now on GS9. |
Just to clarify -> the issue is specifically about not being able to restart deals/cancel deals/stop the markets process with too many stuck transfers. |
FYI, it can render a markets process unresponsive to transfer cancellations. You can have 15 stalled transfers, and as soon as you cancel a particular one the process becomes unresponsive, with no correlating messages in the log. v1.11.2 |
v1.11.3-rc1 |
@aarshkshah1992 we also see the market node not processing any new transfers once it enters this state. I would say it's more like the transfer module in the market node crashes, and although the market node is responding to some queries, it's actually not in a position to serve incoming deals without a hard restart. |
might be resolved by #7359 |
Not fixed, still having this issue on multiple miners running 1.13.1-rc1 :( |
running into the same problem. v1.13.0 |
Hi everybody! Thanks for the report. This issue has been resolved in release v1.13.2-rc4 and will be included in the upcoming release v1.13.2. Reproduced: Lotus version:
To keep things running great:
Closing this Lotus issue ticket. If you are still experiencing this issue on v1.13.2-rc4 or later, please open a new ticket or let me know and I will re-open the ticket. Thanks! |
Checklist
Latest release, or the most recent RC (release candidate) for the upcoming release or the dev branch (master), or have an issue updating to any of these.
Lotus component
Lotus Version
Describe the Bug
Estuary has been testing with massive numbers of deals on our market nodes. While the client end (Estuary) is struggling with IO and bandwidth to actually transfer all the deals, we also see issues at our end (the market node).
Estuary has been opening 100+ deal transfers against a single Lotus market node, and this typically ends with deal transfers breaking down, with errors like "stream reset".
Normally the cleanup would be to run:
lotus-miner --call-on-markets data-transfers cancel 1630379090449711315(some ID)
or
lotus-miner --call-on-markets data-transfers restart 1630379090449711315(some ID)
Unexpectedly, these commands hang and are never processed by the market node.
The only solution is to stop the market node:
lotus-miner --call-on-markets stop
But this does not work either (see the logging info). This is basically how the node hangs. I believe it is not able to shut down the data-transfer module, so the process needs a kick: kill -9.
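The stop-then-kill sequence described above could be scripted roughly as follows. This is only a sketch under assumptions: the function name `stop_or_kill_markets` and the variables `LOTUS_MINER` and `STOP_TIMEOUT` are made up for illustration, `timeout` is the GNU coreutils tool, and the PID of the markets process has to be found and passed in by the operator.

```shell
#!/usr/bin/env bash
# Sketch: ask the markets node to stop, and fall back to SIGKILL if it hangs.
# LOTUS_MINER and STOP_TIMEOUT are hypothetical knobs, not Lotus settings.
LOTUS_MINER="${LOTUS_MINER:-lotus-miner}"
STOP_TIMEOUT="${STOP_TIMEOUT:-60}"

stop_or_kill_markets() {
  local pid="$1" waited=0

  # Request a graceful stop, but do not wait forever for the CLI to return.
  timeout "$STOP_TIMEOUT" "$LOTUS_MINER" --call-on-markets stop || true

  # Give the process up to STOP_TIMEOUT seconds to exit on its own.
  while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$STOP_TIMEOUT" ]; do
    sleep 1
    waited=$((waited + 1))
  done

  # Still alive: this is the "kick" from the report.
  if kill -0 "$pid" 2>/dev/null; then
    echo "markets process $pid still running, sending SIGKILL" >&2
    kill -9 "$pid"
  fi
}
```

It would be called as `stop_or_kill_markets <pid>`. SIGKILL gives the process no chance to clean up, so it is kept as the last resort after the graceful stop has had its full timeout.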
Once restarted, it responds fine to the command:
lotus-miner --call-on-markets data-transfers cancel
It is then possible to prune all the stuck deal transfers. Last time I ran this on over 130 Estuary deal transfers.
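Pruning that many transfers by hand is tedious, so the cancel command can be wrapped in a loop. The sketch below is an assumption-laden illustration, not part of Lotus: `cancel_transfers`, `LOTUS_MINER` and `CANCEL_TIMEOUT` are hypothetical names, the transfer IDs must be collected by the operator, and `timeout` (GNU coreutils) is used so that one hanging cancel does not block the rest.

```shell
#!/usr/bin/env bash
# Sketch: cancel a batch of stuck data transfers, one transfer ID per argument.
# LOTUS_MINER and CANCEL_TIMEOUT are hypothetical knobs, not Lotus settings.
LOTUS_MINER="${LOTUS_MINER:-lotus-miner}"
CANCEL_TIMEOUT="${CANCEL_TIMEOUT:-30}"

cancel_transfers() {
  local id failed=0
  for id in "$@"; do
    # Give up on this ID if the markets node does not answer in time.
    if timeout "$CANCEL_TIMEOUT" "$LOTUS_MINER" --call-on-markets data-transfers cancel "$id"; then
      echo "cancelled $id"
    else
      echo "failed or timed out: $id" >&2
      failed=$((failed + 1))
    fi
  done
  # Succeed only if every cancel succeeded.
  [ "$failed" -eq 0 ]
}
```

For example, `cancel_transfers 1630379090449711315` cancels a single transfer, and more IDs can be appended as further arguments. If cancels start timing out, that suggests the node is back in the hung state described in this issue.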
Logging Information
Repo Steps
No response