
Study alternative containers for Storage #2344

Closed
damip opened this issue Feb 26, 2022 · 16 comments

Comments

@damip
Member

damip commented Feb 26, 2022

The shared Storage has few concurrent writes by a couple of threads, heavy concurrent reads by many threads.

We are currently planning on using Arc<RwLock<PreHashMap<BlockId, Arc<RwLock<BlockInfo>>>>> as the underlying storage container.

Now I recently watched this: https://www.youtube.com/watch?v=HJ-719EGIts

Basically it could be possible to do most of the work we want to do lock-free, with much higher performance!

I then looked around and found crates offering this kind of structure:

It looks like those structures can yield an order-of-magnitude performance improvement under the right conditions (which look like the conditions we have).

However, depending on the implementation, there could be some unexpected problems such as a temporary lack of read/write consistency.

We need to:

  • study the proposed containers (and look for others)
  • get a more precise idea of the read/write concurrency and throughput expectations in our architecture; it may be best to wait until the first version of Storage is implemented so we can run an actual benchmark
  • try changing the underlying Storage container with the various options and bench the impact on performance
@gterzian
Contributor

gterzian commented Mar 1, 2022

Basically it could be possible to do most of the work we want to do lock-free, with much higher performance!

I am doubtful of the higher performance of these things.

For example, see the below excerpt from https://www.chromium.org/developers/lock-and-condition-variable/

Programmers assume that locking is expensive, and that using atomic
operations will be more efficient. But in reality, acquiring and releasing a
lock is cheaper than a cache miss; attention to cache behaviour is usually a
more fruitful way to improve performance. Furthermore, lock-free data
structures are often more expensive than using locks. This is because a lock
allows arbitrary changes to be made to a complex data structure; if the same
changes must be made without a lock, the result is likely to take more
atomic read-modify-write instructions, not fewer.

In our specific situation, we have the below structure:

Arc<RwLock<Map<BlockId, Arc<RwLock<StoredBlock>>>>>

Most of the "work" will occur while an individual StoredBlock is locked, so there should be little contention on the lock around the top-level map.

That lock around the top-level map could be replaced by using some sort of lock-free map, however we should consider:

  1. Probably losing the optimization of our current pre-hashed maps.
  2. That going lock-free there probably doesn't give any relevant performance boost.
  3. That we may also want the synchronization that comes with a lock. For example, if a block is pruned, we may not want it to be logically readable after that point. Lock-free structures may have looser synchronization, and if they do allow such "hard" synchronization, they might not be any faster than simply a lock around a map.

@AurelienFT
Contributor

AurelienFT commented Mar 1, 2022

I don't have a full view of the code, so I can't say whether synchronization will be a real problem.

But according to this SO answer (https://stackoverflow.com/a/5680988), it may depend on the algorithm and on the use case. If we look at this benchmark (https://github.com/xacrimon/conc-map-bench), with a lot of reads and few writes it seems we can get a huge improvement.

@AurelienFT
Contributor

AurelienFT commented Mar 1, 2022

Tried to run the benchmark, which is pretty cool: https://github.com/ibraheemdev/conc-map-bench (this fork has a fix for flurry), with sha256 from several crates (bitcoin_hashes and RustCrypto), but they don't implement the Hasher trait.

According to this issue (RustCrypto/hashes#49), implementing that trait for cryptographic hashes is not a good approach. Also, following the SO answer above (and my own view), the best way is to test in close-to-real conditions by updating the existing benchmark at https://github.com/xacrimon/conc-map-bench so that we can specify the number of writing and reading threads and use our own hashes.

@AurelienFT
Contributor

Btw @damip, left-right is a child project of evmap: jonhoo/left-right@abe3769

@AurelienFT
Contributor

I have forked and run the benchmark with Massa's hash and the pre-hash from massa_models. Here are the results:

Scenario Heavy Read :

read   98%
insert  1%
remove  1%

[plots: ReadHeavy latency, ReadHeavy throughput]

Scenario Exchange :

read    10%
insert  40%
remove  40%
update  10%

[plots: Exchange latency, Exchange throughput]

Scenario rapid grow :

read    5%
insert 80%
remove  5%
update 10%

[plots: RapidGrow latency, RapidGrow throughput]

I am working on a fork of bustle (crate used in the benchmark) to split the read and write operations in specific threads and be able to specify how many read/write threads we want.

Links:

@gterzian
Contributor

gterzian commented Mar 2, 2022

@AurelienFT Thank you for your investigations.

It is expected that those maps will show an improvement over locks, since these lock-free structures are specifically designed with that in mind.

The question is whether it will show an improvement in our specific application, which will depend on whether the reading/writing is on the critical path.

Could be worth running benchmarks after the refactoring has been completed, and after having determined that the application spends a significant amount of time in that read/write workflow (either manipulating the map, or waiting on the lock).

The currently proposed API of Storage would make such experimentation quite easy, since the use of the lock around the top-level map is hidden from the user.

@AurelienFT
Contributor

Could be worth running benchmarks after the refactoring has been completed, and after having determined that the application spends a significant amount of time in that read/write workflow (either manipulating the map, or waiting on the lock).

I totally agree. The real case situation is always the best and having metrics measurement around the node is pretty useful and can help us see real improvements or not.

@AurelienFT
Contributor

AurelienFT commented Mar 2, 2022

I have successfully modified the two crates to allow specifying the number of read and write threads. It's a bit "hacky", and if we want to contribute it upstream I will have to refactor it a bit, but I don't think that's our priority right now.

Here are the results:

Scenario Heavy Read :

read   98%
insert  1%
remove  1%

[plots: ReadHeavy throughput, ReadHeavy latency]

Scenario Exchange :

read    10%
insert  40%
remove  40%
update  10%

[plots: Exchange throughput, Exchange latency]

Scenario rapid grow :

read    5%
insert 80%
remove  5%
update 10%

[plots: RapidGrow throughput, RapidGrow latency]

If you want me to rerun with different data or thread counts, that's possible; you can also run it directly with my forks of the crates:

To run the benchmark on the fork, use this command:

./scripts/bench.bash --read-threads="9" --write-threads="2" --read-threads="8"  --write-threads="2" --read-threads="7"  --write-threads="2" --read-threads="7"  --write-threads="3" --read-threads="7"  --write-threads="4"

Each --read-threads flag must be paired with a --write-threads flag; each pair creates one benchmark configuration, and as shown above you can add multiple pairs. (I will try to change this to accept a comma-separated list directly.)

To render the plots:

./scripts/plot.bash

Conclusion

We can see that with a read-heavy workload, DashMap or Flurry could be a good fit, while with a lot of writes locks start to be more efficient (which is not our case).

@adrien-zinger
Contributor

Dashmap works with sharding, that's smart, not magic and scalable!! 🧔 🤟

@AurelienFT
Contributor

Dashmap works with sharding, that's smart, not magic and scalable!! 🧔 🤟

Flurry does too:

This map usually acts as a binned (bucketed) hash table. Each key-value mapping is held in a BinEntry

@AurelienFT
Contributor

New benchmark with random data as values instead of 0.

Here are the results:

Scenario Heavy Read :

read   98%
insert  1%
remove  1%

[plots: ReadHeavy throughput, ReadHeavy latency]

Scenario Exchange :

read    10%
insert  40%
remove  40%
update  10%

[plots: Exchange throughput, Exchange latency]

Scenario rapid grow :

read    5%
insert 80%
remove  5%
update 10%

[plots: RapidGrow throughput, RapidGrow latency]

@AurelienFT
Contributor

AurelienFT commented Mar 3, 2022

@gterzian Do you have a branch with the Storage implementation that has been used at least once and that can be run (not necessarily on the testnet, but on the labnet for example)?

Since there is an interface, changing the storage backend will be transparent to the rest of the code. I will try to make sub-branches of this one with implementations of Storage using the DashMap crate and the Flurry crate, so that we can test and benchmark them.

Also, if you have some tasks on the Storage integration, I'd like to help with them, because the more usage we have, the better the benchmark will be!

@gterzian
Contributor

gterzian commented Mar 4, 2022

@AurelienFT Thank you for your enthusiasm.

I would like to reiterate what I wrote above:

Could be worth running benchmarks after the refactoring has been completed, and after having determined that the application spends a significant amount of time in that read/write workflow (either manipulating the map, or waiting on the lock).

For clarity, this means that I don't think we should be making any efforts on this until the following has been reached:

  1. We have (largely) finished the integration of storage into the project.
  2. We have noticed that the application spends a significant amount of time waiting on the lock on the top-level map becoming available.

Point 2 is especially important: we can always optimize things, but most optimization is wasted effort if it does not materially affect the overall performance of the application.

As an example, we could get a 10x improvement on "something", but this would be meaningless if the app spends only 0.00001% of its time doing that "thing".

if you have some tasks to help you on the integration of the storage so that I can help you

Appreciated, the main discussion would be over at #2178.

I think a good start would be to familiarize yourself with the various issues and currently open PR. I think there will be more specific tasks next week when some initial work is finished.

because the more usage we have the more the bench will be

As per above, the goal is not to optimize this, unless we notice it becomes a problem :)

@AurelienFT
Contributor

I have run a simulator with 26 nodes and 100 tx/s on main with the Storage, and on a branch where I implemented DashMap and Flurry; the results are very close, so it doesn't seem to be a bottleneck:

RwLock

[plot: 100_operations_storage_26_nodes]

DashMap

[plot: 100_operations_storage_lockfree_26_nodes]

Flurry

[plot: 100_operations_storage_lockfree_flurry_26_nodes]

@AureliaDolo AureliaDolo removed the good first issue Good for newcomers label Mar 31, 2022
@damip
Member Author

damip commented Aug 24, 2022

This could be interesting to revisit after Storage is used across the whole codebase.

@damip
Member Author

damip commented Jun 5, 2023

The performance of the current version is good. Closing

@damip damip closed this as completed Jun 5, 2023