This repository has been archived by the owner on Oct 28, 2021. It is now read-only.

Improve eth_flush performance by disabling wait in Client::doWork #5547

Merged
merged 2 commits into master from fix-eth-flush on Apr 8, 2019

Conversation

gumb0
Member

@gumb0 gumb0 commented Apr 4, 2019

eth_flush is a non-standard, undocumented method. I don't know for sure why it was created, but it seems to be a suitable workaround for the problems the Solidity team has been having lately running soltest+aleth on CI (ethereum/solidity#6457).

Their problem basically boils down to this: eth_sendTransaction followed by test_mineBlocks(1) doesn't guarantee that the submitted transaction will end up included in the mined block.
(The reason is the way Client::doWork works: it first checks for new transactions in the Transaction Queue, then checks whether it should start sealing the current pending block. If the eth_sendTransaction and test_mineBlocks calls happen in between these two checks, the mined block will not contain the transaction.)

So a quick workaround, without complicating the Client logic further, is to call eth_flush right after eth_sendTransaction but before test_mineBlocks. That runs Client::doWork once, which pushes the new transaction from the Transaction Queue into the pending block.
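To make the race concrete, here is a minimal, hypothetical model of the two-step doWork behavior described above. All names (MiniClient, syncQueue, sealIfRequested, etc.) are illustrative, not aleth's real API; the point is only the ordering of the two checks.

```cpp
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for aleth's Client. doWork does two things in
// order: (1) drain the Transaction Queue into the pending block,
// (2) seal the pending block if mining was requested.
struct MiniClient
{
    std::queue<std::string> txQueue;      // Transaction Queue
    std::vector<std::string> pending;     // current pending block
    std::vector<std::string> minedBlock;  // last sealed block
    bool mineRequested = false;

    void sendTransaction(std::string tx) { txQueue.push(std::move(tx)); }  // eth_sendTransaction
    void requestMine() { mineRequested = true; }                           // test_mineBlocks(1)

    // doWork step 1: push queued transactions into the pending block.
    void syncQueue()
    {
        while (!txQueue.empty())
        {
            pending.push_back(txQueue.front());
            txQueue.pop();
        }
    }

    // doWork step 2: seal the pending block if mining was requested.
    void sealIfRequested()
    {
        if (mineRequested)
        {
            minedBlock = pending;
            pending.clear();
            mineRequested = false;
        }
    }

    // One iteration of the doWork loop; also what eth_flush triggers.
    void doWork()
    {
        syncQueue();
        sealIfRequested();
    }
};
```

In this model, if sendTransaction and requestMine land after syncQueue but before sealIfRequested, the sealed block is empty and the transaction is stranded in the queue. Calling doWork once more after sendTransaction (the eth_flush workaround) moves the transaction into the pending block before the seal.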

This PR disables the 1-second sleep at the end of Client::doWork (when called from eth_flush), which looks unnecessary for this case and slows down the Solidity tests significantly.
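A minimal sketch of what such a change looks like, assuming doWork gains a boolean wait parameter (aleth's exact signature and internals may differ). The sleep is shortened and the function reports whether it waited, purely so the behavior is observable in this sketch:

```cpp
#include <chrono>
#include <thread>

// Sketch: the trailing wait in Client::doWork becomes optional, and
// eth_flush calls doWork with the wait disabled. Returns whether it
// actually waited (illustrative only; the real method returns void).
bool doWork(bool _doWait)
{
    // ... drain the Transaction Queue, seal the pending block ...
    if (!_doWait)
        return false;
    // The original code sleeps ~1s here to give a newly mined block
    // time to propagate to the block queue; shortened for the sketch.
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    return true;
}
```

The normal Client loop would keep calling doWork(true), while eth_flush calls doWork(false) and returns promptly.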

@codecov-io

codecov-io commented Apr 4, 2019

Codecov Report

Merging #5547 into master will increase coverage by <.01%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #5547      +/-   ##
=========================================
+ Coverage   61.89%   61.9%   +<.01%     
=========================================
  Files         344     344              
  Lines       28757   28757              
  Branches     3267    3267              
=========================================
+ Hits        17800   17802       +2     
+ Misses       9787    9785       -2     
  Partials     1170    1170

@gumb0 gumb0 requested review from chfast and halfalicious April 5, 2019 13:35
@halfalicious
Contributor

halfalicious commented Apr 6, 2019

@gumb0 Since this only impacts eth_flush and there don't appear to be any callers in our codebase, I think this change is okay. However, I'm a little confused about the scenario:

> Their problem basically boils down to this: eth_sendTransaction followed by test_mineBlocks(1) doesn't guarantee that the submitted transaction will end up included in the mined block.
> (The reason is the way Client::doWork works: it first checks for new transactions in the Transaction Queue, then checks whether it should start sealing the current pending block. If the eth_sendTransaction and test_mineBlocks calls happen in between these two checks, the mined block will not contain the transaction.)

This implies that doWork is already running when eth_sendTransaction / test_mineBlocks(1) is called (presumably on the Client thread), which means that this test case is susceptible to race conditions (since you have 2 instances of doWork running concurrently). Is it possible to make this test more deterministic by removing the Client doWork loop and just have the single doWork call that eth_flush makes on the RPC thread? Or does creating the RPC interface require a Client instance (and creating a Client instance kicks off the doWork loop)?

> This PR disables the 1-second sleep at the end of Client::doWork (when called from eth_flush), which looks unnecessary for this case and slows down the Solidity tests significantly.
Yea, according to the code comments this sleep is needed to give a newly mined block time to propagate to the block queue, presumably so it can be both sync'd with the local blockchain and sent out to other nodes. If we don't care about that for the Solidity tests then I don't think we need the wait.

Another possible approach would be to parameterize the wait in eth_flush so the caller can decide whether or not they want to wait. This might be helpful if we decide to author blockchain sync or block propagation tests and we end up needing the wait.

Contributor

@halfalicious halfalicious left a comment


Looks good, have some questions though (see comment in this PR)

@gumb0 gumb0 merged commit 937236d into master Apr 8, 2019
@gumb0 gumb0 deleted the fix-eth-flush branch April 8, 2019 09:20
@gumb0
Member Author

gumb0 commented Apr 8, 2019

> This implies that doWork is already running when eth_sendTransaction / test_mineBlocks(1) is called (presumably on the Client thread)

Correct

> which means that this test case is susceptible to race conditions (since you have 2 instances of doWork running concurrently).

Good point. I think the Client class code in general is not very thread-safe in the face of RPC calls and other threads (some other RPC methods call doWork or access member variables), so it's hard to address this in this quick-fix PR.

> Is it possible to make this test more deterministic by removing the Client doWork loop and just have the single doWork call that eth_flush makes on the RPC thread?

We can't just remove the Client::doWork loop; it's essentially the main loop of the aleth application.
(What it does exactly: it executes blocks and writes them to the DB during sync, constructs the pending block from the Transaction Queue, controls the mining threads, and removes expired filters set with the eth_...Filter RPC calls.)

> Another possible approach would be to parameterize the wait in eth_flush so the caller can decide whether or not they want to wait. This might be helpful if we decide to author blockchain sync or block propagation tests and we end up needing the wait.

It doesn't look useful to me; it's just a sleep at the very end of the method. You could do the sleep on the calling side if you need it.

... Now that I think of alternative approaches, a better way to flush would be to wait until the Transaction Queue is empty and all transactions are included in the pending block (that's basically what soltest ended up doing on their side for now).
But I'm inclined not to touch this until soltest needs another fix.
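That "wait until drained" idea can be sketched as a generic polling helper: instead of a fixed one-second sleep, poll a condition (e.g. "the Transaction Queue is empty") with a timeout. The helper below is an illustration only, not code from aleth or soltest:

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Poll `done` until it returns true or `timeout` elapses.
// Returns true if the condition was met, false on timeout.
bool waitUntil(std::function<bool()> done,
               std::chrono::milliseconds timeout,
               std::chrono::milliseconds interval = std::chrono::milliseconds(5))
{
    auto const deadline = std::chrono::steady_clock::now() + timeout;
    while (!done())
    {
        if (std::chrono::steady_clock::now() >= deadline)
            return false;
        std::this_thread::sleep_for(interval);
    }
    return true;
}
```

A flush built this way would pass something like `[&] { return client.transactionQueueStatus().current == 0; }` as the condition (the exact queue-status accessor is an assumption), returning as soon as the transactions have reached the pending block.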
