Study alternative containers for Storage #2344
I am doubtful of the higher performance of these things. For example, see the below excerpt from https://www.chromium.org/developers/lock-and-condition-variable/
In our specific situation, we have the below structure:
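Roughly, as a sketch (with a plain `HashMap` standing in for Massa's `PreHashMap`, and illustrative stand-ins for `BlockId`/`BlockInfo`):

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Illustrative stand-ins; the real types live in the Massa codebase.
type BlockId = u64;
#[derive(Default)]
struct BlockInfo;

// The planned container: an outer lock around the map, plus one lock per entry.
type Storage = Arc<RwLock<HashMap<BlockId, Arc<RwLock<BlockInfo>>>>>;

fn read_block(storage: &Storage, id: BlockId) {
    // The top-level lock is held only long enough to clone the entry's Arc...
    let entry = storage.read().unwrap().get(&id).cloned();
    if let Some(info) = entry {
        // ...while the actual work happens under the per-block lock.
        let _info = info.read().unwrap();
        // inspect `_info` here
    }
}
```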
Most of the "work" will occur while locking an individual `BlockInfo`. The lock around the top-level map could be replaced by using some sort of lock-free map; however, we should consider:
I don't have a full vision of the code, so I can't say whether the synchronization will be a real problem. But according to this SO answer: https://stackoverflow.com/a/5680988, it may depend on the algorithm and also on the use case. If we look at this benchmark: https://github.com/xacrimon/conc-map-bench, with a lot of reads and fewer writes, it seems that we can get a huge improvement.
Tried to run the benchmark, which is pretty cool: https://github.com/ibraheemdev/conc-map-bench (this fork has a fix for flurry), with sha256 from several crates (bitcoin_hashes and RustCrypto), but they don't implement `Hasher`. Following this issue: RustCrypto/hashes#49, it's not a good fit to implement this trait for cryptographic hashes. Also, following the SO answer above (and my own thinking), the best way is to test in close-to-real conditions by updating the existing benchmark at https://github.com/xacrimon/conc-map-bench so that it's possible to give a number of writing threads and reading threads, and also to use our own hashes.
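On the "use our own hashes" point, a hypothetical sketch of the pre-hash idea (names are illustrative, not Massa's actual `PreHashMap`): since the keys are already cryptographic hashes, the map's `Hasher` can simply pass bytes through instead of hashing again.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

#[derive(Default)]
struct PassThroughHasher(u64);

impl Hasher for PassThroughHasher {
    fn write(&mut self, bytes: &[u8]) {
        // Keys are assumed to be uniformly distributed hashes already,
        // so the first 8 bytes are as good as a fresh hash.
        let mut buf = [0u8; 8];
        let n = bytes.len().min(8);
        buf[..n].copy_from_slice(&bytes[..n]);
        self.0 = u64::from_le_bytes(buf);
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

// A map keyed by already-hashed identifiers, skipping SipHash entirely.
type PreHashMap<K, V> = HashMap<K, V, BuildHasherDefault<PassThroughHasher>>;
```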
Btw @damip, left-right is a child project of evmap: jonhoo/left-right@abe3769
I have forked and run the benchmark with the hash of Massa and prehash. Scenario Heavy Read:
Scenario Exchange:
Scenario rapid grow:
I am working on a fork of bustle (the crate used in the benchmark) to split the read and write operations into dedicated threads and be able to specify how many read/write threads we want (sketched below). Links:
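The gist of the change, as a sketch (the function name and the DashMap-backed map are illustrative; the thread counts mirror the flags being added):

```rust
use std::sync::Arc;
use std::thread;

use dashmap::DashMap;

// Dedicated reader and writer threads instead of one mixed workload per
// thread; `n_readers`/`n_writers` correspond to the new knobs.
fn bench_split(n_readers: usize, n_writers: usize, ops: u64) {
    let map = Arc::new(DashMap::<u64, u64>::new());
    let mut handles = Vec::new();
    for _ in 0..n_writers {
        let map = Arc::clone(&map);
        handles.push(thread::spawn(move || {
            for i in 0..ops {
                map.insert(i, i);
            }
        }));
    }
    for _ in 0..n_readers {
        let map = Arc::clone(&map);
        handles.push(thread::spawn(move || {
            for i in 0..ops {
                let _ = map.get(&i);
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
}
```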
@AurelienFT Thank you for your investigations. It is expected that those maps will show an improvement over locks, since these lock-free structures are specifically designed with that in mind. The question is whether they will show an improvement in our specific application, which will depend on whether the reading/writing is on the critical path. It could be worth running benchmarks after the refactoring has been completed, and after having determined that the application spends a significant amount of time in that read/write workflow (either manipulating the map, or waiting on the lock). The currently proposed API of
I totally agree. The real-case situation is always the best, and having metrics measurements around the node is pretty useful and can help us see whether there are real improvements or not.
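For instance, a minimal sketch of a measurement that could answer the "critical path" question (illustrative names; real code would feed a metrics sink rather than stderr):

```rust
use std::sync::RwLock;
use std::time::Instant;

// Measure time spent waiting for the lock vs. time spent holding it.
fn timed_read<T, R>(lock: &RwLock<T>, f: impl FnOnce(&T) -> R) -> R {
    let t0 = Instant::now();
    let guard = lock.read().unwrap();
    let waited = t0.elapsed();
    let t1 = Instant::now();
    let out = f(&guard);
    eprintln!("waited {:?}, held {:?}", waited, t1.elapsed());
    out
}
```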
I have successfully modified the two crates to allow specifying a number of read and write threads. It's a bit "hacky", and if we want to contribute it upstream I will have to refactor it a bit, but I don't think that's our priority right now. Here are the results. Scenario Heavy Read:
Scenario Exchange:
Scenario rapid grow:
If you want me to rerun with different data or numbers of threads, that's possible. You can also run it directly with my own fork of the crates:
To run the benchmark on the fork, you can run this command:
A --read-threads flag must always be paired with a --write-threads flag; together they create one read/write pair for the benchmark, and as shown above you can add multiple pairs. (Will try to change it to accept a comma-separated list directly.) To render the plots:
Conclusion

We can see that when we have a lot of reads, DashMap or Flurry could be nice, and when we have a lot of writes, the locks start to be more efficient (which is not our case).
DashMap works with sharding; that's smart, not magic, and scalable!! 🧔 🤟
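Conceptually (this is the idea, not DashMap's actual internals): keys hash to one of N independently locked sub-maps, so threads touching different shards never contend.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

struct ShardedMap<K, V> {
    shards: Vec<RwLock<HashMap<K, V>>>,
}

impl<K: Hash + Eq, V> ShardedMap<K, V> {
    fn new(n: usize) -> Self {
        Self {
            shards: (0..n).map(|_| RwLock::new(HashMap::new())).collect(),
        }
    }

    // Pick the shard a key belongs to; only that shard's lock is taken.
    fn shard(&self, key: &K) -> &RwLock<HashMap<K, V>> {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        &self.shards[(h.finish() as usize) % self.shards.len()]
    }

    fn insert(&self, key: K, value: V) {
        self.shard(&key).write().unwrap().insert(key, value);
    }

    fn get_with<R>(&self, key: &K, f: impl FnOnce(&V) -> R) -> Option<R> {
        self.shard(key).read().unwrap().get(key).map(f)
    }
}
```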
Flurry also:
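If it helps, a minimal flurry usage sketch (assuming the crate's documented pin() API; epoch-based memory reclamation lets readers and writers proceed concurrently without an external lock):

```rust
fn main() {
    // pin() yields a guard-backed view of the map; no external lock needed.
    let map = flurry::HashMap::new();
    map.pin().insert(1, "block info");
    assert_eq!(map.pin().get(&1), Some(&"block info"));
}
```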
@gterzian Do you have a branch with the implementation of Storage that has been used at least once and that can be run (not necessarily on the testnet, but on the labnet for example)? Since there is an interface, the change of storage system will be transparent for the rest of the code. I will try to make sub-branches of this one with implementations of Storage using the crates benchmarked above. Also, if you have some tasks on the integration of the storage, I can help you with them, because the more usage we have, the more meaningful the bench will be!
@AurelienFT Thank you for your enthusiasm. I would like to reiterate what I wrote above:
For clarity, this means that I don't think we should be making any efforts on this until the following has been reached:
Point 2 is especially important, since we can always optimize things, but most of that is wasted effort if it does not materially affect the overall performance of the application. As an example, we could have a 10 times improvement on "something", but this would be meaningless if the app spends only 0.00001% of its time doing that "thing".
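To make that concrete with Amdahl's law: speeding up a fraction p of total runtime by a factor s gives an overall speedup of 1 / ((1 - p) + p / s). With p = 0.0000001 and s = 10, that works out to roughly 1.00000009, i.e. indistinguishable from no change.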
Appreciated; the main discussion is over at #2178. I think a good start would be to familiarize yourself with the various issues and the currently open PRs. I think there will be more specific tasks next week, when some initial work is finished.
As per the above, the goal is not to optimize this unless we notice it becoming a problem :)
This could be interesting to revisit after Storage is used across the whole codebase.
The performance of the current version is good. Closing.
The shared Storage has few concurrent writes by a couple of threads, and heavy concurrent reads by many threads.
We are currently planning on using

```rust
Arc<RwLock<PreHashMap<BlockId, Arc<RwLock<BlockInfo>>>>>
```

as the underlying storage container. Now, I recently watched this: https://www.youtube.com/watch?v=HJ-719EGIts
Basically, it could be possible to do most of the work we want to do lock-free, with much higher performance!
Then I looked around and found crates proposing this kind of structure:
It looks like those structures can yield an order-of-magnitude performance improvement under the right conditions (conditions that look like ours).
However, depending on the implementation, there could be some unexpected problems such as a temporary lack of read/write consistency.
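As an example of that caveat, a sketch assuming evmap's read/write-handle API, where writes only become visible to readers after an explicit refresh():

```rust
fn main() {
    // evmap splits the map into a read handle and a write handle.
    let (read, mut write) = evmap::new();
    write.insert("block", 1);

    // Readers do not see the pending write yet.
    assert!(read.get_one(&"block").is_none());

    // refresh() publishes pending writes to readers.
    write.refresh();
    assert_eq!(read.get_one(&"block").map(|v| *v), Some(1));
}
```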
We need to: