I/O timeouts during bench test #2744
As load appears on the test network, I/O timeouts start to happen. This leads to disconnects/reconnects and isn't particularly good for performance. While having some of them is almost fine (the system is under stress and uses 2s blocks), they can still be a sign of some messages requiring the reader thread to do more work than it ought to. We need to check the message handlers and see if anything can be improved there to minimize the chances of these timeouts/disconnects.
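To make the failure mode concrete, here is a minimal Go sketch of the mechanics; `sendReplies` and the 2s deadline are illustrative assumptions, not neo-go's actual code. Writes to a peer carry a deadline, and if the remote side's reader thread is busy in a slow handler, TCP backpressure can keep `Write` blocked until that deadline expires:

```go
package network

import (
	"net"
	"time"
)

// sendReplies pushes a batch of serialized reply messages to a peer with a
// per-message write deadline, roughly the way a getdata handler might.
// Hypothetical sketch, not the actual neo-go code.
func sendReplies(conn net.Conn, replies [][]byte) error {
	for _, msg := range replies {
		if err := conn.SetWriteDeadline(time.Now().Add(2 * time.Second)); err != nil {
			return err
		}
		// If the remote's single reader goroutine is stuck in a slow
		// handler, TCP backpressure can make this Write block until the
		// deadline expires, returning an i/o timeout.
		if _, err := conn.Write(msg); err != nil {
			return err // a timeout here leads to a disconnect/reconnect
		}
	}
	return nil
}
```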
Until the consensus process starts working on a new block and actually needs some transactions, we can spare some cycles by not delivering transactions to it. In tests this doesn't affect TPS, but it makes block delays a bit more stable. Related to #2744: I think eager delivery may also cause timeouts during transaction processing (the network thread waits on the consensus process channel while consensus does something dBFT-related).
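A minimal sketch of the gating idea, with hypothetical names (`txRelay`, `needTxs` and the channels are illustrative, not neo-go's actual identifiers):

```go
package network

import "sync/atomic"

type Transaction struct{ /* ... */ }

type txRelay struct {
	needTxs   atomic.Bool       // flipped by consensus when it starts on a new block
	queue     chan *Transaction // transactions coming from the network
	consensus chan<- *Transaction
}

func (r *txRelay) run() {
	for t := range r.queue {
		if !r.needTxs.Load() {
			// Consensus doesn't need transactions yet; skip the send
			// (the transaction is assumed to stay in the mempool, so
			// consensus can still pick it up later).
			continue
		}
		r.consensus <- t // may block while consensus is busy with dBFT work
	}
}
```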
It makes sense in general (further narrowing down the time window in which transactions are processed by the consensus thread) and it improves block times a little too, especially in the 7+2 scenario. Related to #2744.
This allows transaction processing to scale naturally if we have one peer sending a lot of transactions while the others are mostly silent. It can also help somewhat in the event that 50 peers all send transactions. The 4+1 scenario benefits a lot from it, while 7+2 slows down a little; delayed scenarios don't care. Surprisingly, this also makes disconnects (#2744) much rarer, and the 4-node scenario almost never sees them now. Most probably this is a case where peers affect each other a lot: a single-threaded transaction receiver can be slow enough to trigger a timeout in its peer's getdata handler (because that handler tries to push a number of replies).
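A minimal sketch of the per-peer handoff under the same assumptions (`peerQueue` and its names are hypothetical): the network reader only enqueues a transaction and returns immediately, so a flood from one peer is processed by that peer's own goroutine instead of stalling everyone:

```go
package network

type Transaction struct{ /* ... */ }

type peerQueue struct {
	txCh chan *Transaction
}

// newPeerQueue starts one worker goroutine per peer that does the actual
// verification/mempool work off the network reader thread.
func newPeerQueue(process func(*Transaction)) *peerQueue {
	p := &peerQueue{txCh: make(chan *Transaction, 256)} // small per-peer buffer
	go func() {
		for t := range p.txCh {
			process(t)
		}
	}()
	return p
}

// handleTx is called from the peer's network reader; it must not block for
// long, otherwise deadlines on the connection may expire.
func (p *peerQueue) handleTx(t *Transaction) {
	select {
	case p.txCh <- t:
	default:
		// Queue full: drop the transaction, the peer can re-announce it.
	}
}
```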
When a block is being spread through the network we can get a lot of invs with the same hash, and some more stale nodes may also announce the previous or some earlier block. We can avoid a full DB lookup for these and minimize inv handling time (timeouts in the inv handler had happened in #2744). It doesn't affect tests, it just makes the node a little less likely to spend a considerable amount of time in the inv handler.
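A minimal sketch of the kind of shortcut involved (`hashCache` is a hypothetical name, not neo-go's actual implementation): a small set of recently seen block hashes is consulted before doing the full DB lookup in the inv handler:

```go
package network

import "sync"

type Uint256 = [32]byte

type hashCache struct {
	mu     sync.RWMutex
	hashes map[Uint256]bool
}

func newHashCache() *hashCache {
	return &hashCache{hashes: make(map[Uint256]bool)}
}

// Has reports whether h is a recently processed block hash, letting the
// inv handler skip the full DB lookup for blocks we obviously have.
func (c *hashCache) Has(h Uint256) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.hashes[h]
}

func (c *hashCache) Add(h Uint256) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.hashes[h] = true
	// A real cache would also evict old entries to stay small.
}
```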
Timeouts were mostly happening in the inv handler. #2757 and #2759 also make timeouts less likely to happen because we now have one … Things also tried, but less successful:
I think that's enough for this task; we have a better networking subsystem now and timeouts are less likely to happen.