This repository has been archived by the owner on Oct 28, 2021. It is now read-only.

Improve eth_flush performance by disabling wait in Client::doWork #5547

Merged
merged 2 commits into master from fix-eth-flush on Apr 8, 2019

Conversation

gumb0
Member

@gumb0 gumb0 commented Apr 4, 2019

eth_flush is a non-standard, undocumented method. I don't know for sure why it was created, but it seems to be a suitable workaround for the problems the Solidity team has been having lately running soltest+aleth on CI (ethereum/solidity#6457).

Their problem basically boils down to this: eth_sendTransaction followed by test_mineBlocks(1) doesn't guarantee that the submitted transaction will end up included in the mined block.
(The reason is the way Client::doWork works: it first checks for new transactions in the Transaction Queue, then checks whether it should start sealing the current pending block. If the eth_sendTransaction and test_mineBlocks calls happen in between these two checks, the mined block will not contain the transaction.)

So a quick workaround, without complicating the Client logic further, is to call eth_flush right after eth_sendTransaction but before test_mineBlocks. That runs Client::doWork once, which pushes the new transaction from the Transaction Queue into the pending block.
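To make the race concrete, here is a minimal, hypothetical model of the two-step doWork behavior described above. All names (MiniClient, syncQueue, sealIfRequested, etc.) are illustrative, not aleth's real API; the point is only the ordering of the two checks.

```cpp
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for aleth's Client. doWork does two things in
// order: (1) drain the Transaction Queue into the pending block,
// (2) seal the pending block if mining was requested.
struct MiniClient
{
    std::queue<std::string> txQueue;      // Transaction Queue
    std::vector<std::string> pending;     // current pending block
    std::vector<std::string> minedBlock;  // last sealed block
    bool mineRequested = false;

    void sendTransaction(std::string tx) { txQueue.push(std::move(tx)); }  // eth_sendTransaction
    void requestMine() { mineRequested = true; }                           // test_mineBlocks(1)

    // doWork step 1: push queued transactions into the pending block.
    void syncQueue()
    {
        while (!txQueue.empty())
        {
            pending.push_back(txQueue.front());
            txQueue.pop();
        }
    }

    // doWork step 2: seal the pending block if mining was requested.
    void sealIfRequested()
    {
        if (mineRequested)
        {
            minedBlock = pending;
            pending.clear();
            mineRequested = false;
        }
    }

    // One iteration of the doWork loop; also what eth_flush triggers.
    void doWork()
    {
        syncQueue();
        sealIfRequested();
    }
};
```

In this model, if sendTransaction and requestMine land after syncQueue but before sealIfRequested, the sealed block is empty and the transaction is stranded in the queue. Calling doWork once more after sendTransaction (the eth_flush workaround) moves the transaction into the pending block before the seal.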

This PR disables the 1-second sleep at the end of Client::doWork (when called from eth_flush), which looks unnecessary for this case and slows down the Solidity tests significantly.
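A minimal sketch of what such a change looks like, assuming doWork gains a boolean wait parameter (aleth's exact signature and internals may differ). The sleep is shortened and the function reports whether it waited, purely so the behavior is observable in this sketch:

```cpp
#include <chrono>
#include <thread>

// Sketch: the trailing wait in Client::doWork becomes optional, and
// eth_flush calls doWork with the wait disabled. Returns whether it
// actually waited (illustrative only; the real method returns void).
bool doWork(bool _doWait)
{
    // ... drain the Transaction Queue, seal the pending block ...
    if (!_doWait)
        return false;
    // The original code sleeps ~1s here to give a newly mined block
    // time to propagate to the block queue; shortened for the sketch.
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    return true;
}
```

The normal Client loop would keep calling doWork(true), while eth_flush calls doWork(false) and returns promptly.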

@codecov-io

codecov-io commented Apr 4, 2019

Codecov Report

Merging #5547 into master will increase coverage by <.01%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #5547      +/-   ##
=========================================
+ Coverage   61.89%   61.9%   +<.01%     
=========================================
  Files         344     344              
  Lines       28757   28757              
  Branches     3267    3267              
=========================================
+ Hits        17800   17802       +2     
+ Misses       9787    9785       -2     
  Partials     1170    1170

@gumb0 gumb0 requested review from chfast and halfalicious April 5, 2019 13:35
@halfalicious
Contributor

halfalicious commented Apr 6, 2019

@gumb0 Since this only impacts eth_flush and there don't appear to be any callers in our codebase, I think this change is okay. However, I'm a little confused about the scenario:

> Their problem basically boils down to this: eth_sendTransaction followed by test_mineBlocks(1) doesn't guarantee that the submitted transaction will end up included in the mined block.
> (The reason is the way Client::doWork works: it first checks for new transactions in the Transaction Queue, then checks whether it should start sealing the current pending block. If the eth_sendTransaction and test_mineBlocks calls happen in between these two checks, the mined block will not contain the transaction.)

This implies that doWork is already running when eth_sendTransaction / test_mineBlocks(1) is called (presumably on the Client thread), which means that this test case is susceptible to race conditions (since you have 2 instances of doWork running concurrently). Is it possible to make this test more deterministic by removing the Client doWork loop and just have the single doWork call that eth_flush makes on the RPC thread? Or does creating the RPC interface require a Client instance (and creating a Client instance kicks off the doWork loop)?

> This PR disables the 1-second sleep at the end of Client::doWork (when called from eth_flush), which looks unnecessary for this case and slows down the Solidity tests significantly.
Yea, according to the code comments this sleep is needed to give a newly mined block time to propagate to the block queue, presumably so it can be both sync'd with the local blockchain and sent out to other nodes. If we don't care about that for the Solidity tests then I don't think we need the wait.

Another possible approach would be to parameterize the wait in eth_flush so the caller can decide whether or not they want to wait. This might be helpful if we decide to author blockchain sync or block propagation tests and we end up needing the wait.

Contributor

@halfalicious halfalicious left a comment


Looks good, have some questions though (see comment in this PR)

@gumb0 gumb0 merged commit 937236d into master Apr 8, 2019
@gumb0 gumb0 deleted the fix-eth-flush branch April 8, 2019 09:20
@gumb0
Member Author

gumb0 commented Apr 8, 2019

> This implies that doWork is already running when eth_sendTransaction / test_mineBlocks(1) is called (presumably on the Client thread)

Correct

> which means that this test case is susceptible to race conditions (since you have 2 instances of doWork running concurrently).

Good point. I think the Client class code in general is not very thread-safe in the face of RPC calls and other threads (some other RPC methods call doWork or access member variables), so it's hard to address this in this quick-fix PR.

> Is it possible to make this test more deterministic by removing the Client doWork loop and just have the single doWork call that eth_flush makes on the RPC thread?

We can't just remove the Client::doWork loop; it's essentially the main loop of the aleth application.
(What it does exactly: it executes blocks and writes them to the DB during sync, constructs the pending block from the Transaction Queue, controls the mining threads, and removes expired filters set with the eth_...Filter RPC calls.)

> Another possible approach would be to parameterize the wait in eth_flush so the caller can decide whether or not they want to wait. This might be helpful if we decide to author blockchain sync or block propagation tests and we end up needing the wait.

It doesn't look useful to me; it's just a sleep at the very end of the method. You could do the sleep on the calling side if you need it.

... Now that I think of alternative approaches, a better way to flush would be to wait until the Transaction Queue is empty and all transactions are included in the pending block (that's basically what soltest ended up doing on their side for now).
But I'm inclined not to touch this until soltest needs another fix.
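That "wait until drained" idea can be sketched as a generic polling helper: instead of a fixed one-second sleep, poll a condition (e.g. "the Transaction Queue is empty") with a timeout. The helper below is an illustration only, not code from aleth or soltest:

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Poll `done` until it returns true or `timeout` elapses.
// Returns true if the condition was met, false on timeout.
bool waitUntil(std::function<bool()> done,
               std::chrono::milliseconds timeout,
               std::chrono::milliseconds interval = std::chrono::milliseconds(5))
{
    auto const deadline = std::chrono::steady_clock::now() + timeout;
    while (!done())
    {
        if (std::chrono::steady_clock::now() >= deadline)
            return false;
        std::this_thread::sleep_for(interval);
    }
    return true;
}
```

A flush built this way would pass something like `[&] { return client.transactionQueueStatus().current == 0; }` as the condition (the exact queue-status accessor is an assumption), returning as soon as the transactions have reached the pending block.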
