Optimized Rothschild performance on high TPS #371

Merged
merged 10 commits into kaspanet:master from rothschild-unleashed
Jan 3, 2024

Conversation

@coderofstuff (Collaborator) commented Dec 29, 2023

Two bottlenecks were found:

  1. Hashing of the TX ID causes the generation of txs (which is done serially) to significantly slow down the entire loop.
  2. UTXO selection - currently, each run of select_utxo will:
  • loop over all UTXOs, filtering out those that are pending
  • then select the UTXOs it hasn't used yet
  • this causes each TX generation iteration to run in O(N) time, where N = # of UTXOs. Effectively, each iteration runs in O(N × M) time, where M is the passed TPS.
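As a rough illustration of the cost described above, here is a minimal sketch of a select-by-filtering pattern (hypothetical names, not the actual Rothschild code): every call re-scans the whole UTXO set, so generating M txs per second costs O(N × M) work.

```rust
use std::collections::HashSet;

// Illustrative only: each call filters the full UTXO set to skip
// pending/used entries, which is an O(N) scan per selection.
fn select_utxo_old(
    utxos: &[u64],
    pending: &HashSet<usize>,
    used: &HashSet<usize>,
) -> Option<usize> {
    utxos
        .iter()
        .enumerate()
        .filter(|(i, _)| !pending.contains(i)) // O(N) scan on every call
        .find(|(i, _)| !used.contains(i))
        .map(|(i, _)| i)
}

fn main() {
    let utxos = vec![100, 200, 300];
    let pending: HashSet<usize> = [0].into_iter().collect();
    let used: HashSet<usize> = [1].into_iter().collect();
    // index 0 is pending, index 1 was already used, so index 2 is picked
    assert_eq!(select_utxo_old(&utxos, &pending, &used), Some(2));
    println!("ok");
}
```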

Solutions implemented (tested with --tps=1000):

  1. Parallelize TX generation - hashing is a bottleneck, so generating TXs serially slows the entire loop down significantly. Instead of creating 1 TX per loop iteration, make the loop tick every 1 second and, for each iteration, create all the TXs needed in parallel.
  2. Optimize UTXO selection - since we need to ensure each UTXO is used only once anyway, use an index tracker to record which UTXO index to use next. This index is monotonically increasing, so it never tries to re-use earlier UTXOs. It resets only when the UTXO set is refreshed.
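Both fixes can be sketched roughly as follows. This is a toy model with illustrative names (`UtxoSelector`, `generate_tick`, `fake_txid`); the real Rothschild code differs.

```rust
use std::thread;

// Fix 2 (sketch): a monotonically increasing cursor replaces the per-call
// O(N) filter; each UTXO index is handed out at most once per refresh.
struct UtxoSelector {
    utxos: Vec<u64>,
    next_index: usize, // only resets on refresh()
}

impl UtxoSelector {
    fn new(utxos: Vec<u64>) -> Self {
        Self { utxos, next_index: 0 }
    }

    fn select(&mut self) -> Option<u64> {
        let utxo = self.utxos.get(self.next_index).copied()?;
        self.next_index += 1; // never look back at earlier indices
        Some(utxo)
    }

    fn refresh(&mut self, utxos: Vec<u64>) {
        self.utxos = utxos;
        self.next_index = 0;
    }
}

// Fix 1 (sketch): one tick per second builds all txs for that second
// across worker threads, instead of one tx per loop iteration. A toy
// hash stands in for the TX ID hashing that was the serial bottleneck.
fn fake_txid(seed: u64) -> u64 {
    (0..1_000).fold(seed, |h, i| h.wrapping_mul(31).wrapping_add(i))
}

fn generate_tick(seeds: Vec<u64>, workers: usize) -> Vec<u64> {
    let chunk = (seeds.len() + workers - 1) / workers;
    let handles: Vec<_> = seeds
        .chunks(chunk.max(1))
        .map(|c| {
            let c = c.to_vec();
            thread::spawn(move || c.into_iter().map(fake_txid).collect::<Vec<u64>>())
        })
        .collect();
    // joining in spawn order preserves the original seed order
    handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
}

fn main() {
    let mut sel = UtxoSelector::new(vec![10, 20, 30]);
    let seeds: Vec<u64> = (0..3).filter_map(|_| sel.select()).collect();
    assert_eq!(seeds, vec![10, 20, 30]);
    assert!(sel.select().is_none()); // exhausted until the next refresh
    sel.refresh(vec![40]);
    assert_eq!(sel.select(), Some(40));

    // all 1000 txs for this tick are hashed in parallel
    let txs = generate_tick((0..1000).collect(), 8);
    assert_eq!(txs.len(), 1000);
    println!("ok");
}
```

The key property of the cursor is that selection becomes O(1) amortized: each UTXO is visited at most once between refreshes, instead of the whole set being re-filtered on every call.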

Sample performance of the new implementation:

2023-12-28 21:44:39.327-07:00 [INFO ] Tx rate: 982.6/sec, avg UTXO amount: 788482289, avg UTXOs per tx: 1, avg outs per tx: 2, estimated available UTXOs: 743223
2023-12-28 21:44:49.506-07:00 [INFO ] Tx rate: 982.4/sec, avg UTXO amount: 489091181, avg UTXOs per tx: 1, avg outs per tx: 2, estimated available UTXOs: 733223
2023-12-28 21:44:59.646-07:00 [INFO ] Tx rate: 986.2/sec, avg UTXO amount: 489082566, avg UTXOs per tx: 1, avg outs per tx: 2, estimated available UTXOs: 723223
2023-12-28 21:45:09.681-07:00 [INFO ] Tx rate: 996.5/sec, avg UTXO amount: 489081965, avg UTXOs per tx: 1, avg outs per tx: 2, estimated available UTXOs: 713223
2023-12-28 21:45:19.703-07:00 [INFO ] Tx rate: 997.8/sec, avg UTXO amount: 489081965, avg UTXOs per tx: 1, avg outs per tx: 2, estimated available UTXOs: 703223
2023-12-28 21:45:29.818-07:00 [INFO ] Tx rate: 988.6/sec, avg UTXO amount: 489079501, avg UTXOs per tx: 1, avg outs per tx: 2, estimated available UTXOs: 693223

coderofstuff and others added 8 commits December 31, 2023 10:09
So we can compare the performance difference between serial generation and parallel generation
1. Parallelize TX generation - hashing is a bottleneck so generating TXs serially will slow
the entire loop down significantly
2. Optimize UTXO selection - previously it was running in O(UTXO) time due to filtering logic
@coderofstuff force-pushed the rothschild-unleashed branch 2 times, most recently from b96fbf1 to ecac582, December 31, 2023 19:10
Due to integer division, when tps < 100 we end up creating no txs
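The pitfall and one possible fix can be sketched as follows. The carry-accumulation approach below is an assumption for illustration, not necessarily what the PR actually does; names like `txs_this_tick` are invented.

```rust
// With a 10 ms tick the loop runs 100 times per second, so a naive
// txs_per_tick = tps / TICKS_PER_SEC truncates to 0 whenever tps < 100.
const TICKS_PER_SEC: u64 = 100;

// One illustrative fix: accumulate the fractional remainder across ticks
// so low TPS values still emit txs over the course of a second.
fn txs_this_tick(tps: u64, carry: &mut u64) -> u64 {
    *carry += tps; // accumulate in units of 1/TICKS_PER_SEC of a tx
    let n = *carry / TICKS_PER_SEC;
    *carry %= TICKS_PER_SEC;
    n
}

fn main() {
    // tps = 50: the naive division yields 0 on every tick, so no tx is sent
    assert_eq!(50 / TICKS_PER_SEC, 0);
    // with carry accumulation, 100 ticks (1 second) still yield 50 txs total
    let mut carry = 0u64;
    let total: u64 = (0..100).map(|_| txs_this_tick(50, &mut carry)).sum();
    assert_eq!(total, 50);
    println!("ok");
}
```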
@michaelsutton michaelsutton merged commit 1905cee into kaspanet:master Jan 3, 2024
6 checks passed
@coderofstuff coderofstuff deleted the rothschild-unleashed branch January 5, 2024 04:58
KashProtocol pushed a commit to Kash-Protocol/rusty-kash that referenced this pull request Jan 5, 2024
KashProtocol pushed a commit to Kash-Protocol/rusty-kash that referenced this pull request Jan 6, 2024
KashProtocol pushed a commit to Kash-Protocol/rusty-kash that referenced this pull request Jan 8, 2024
KashProtocol pushed a commit to Kash-Protocol/rusty-kash that referenced this pull request Jan 9, 2024
D-Stacks pushed a commit to D-Stacks/rusty-kaspa that referenced this pull request Jan 23, 2024
* Add rothschild tx benchmark

So we can compare the performance difference between serial generation and parallel generation

* Optimize high TPS generation

1. Parallelize TX generation - hashing is a bottleneck so generating TXs serially will slow
the entire loop down significantly
2. Optimize UTXO selection - previously it was running in O(UTXO) time due to filtering logic

* Move criterion to dev-dependency

* Some more rothschild improvements

* Fix from_fmillis typo

* Add flags

* Parallelize txid hashing in simpa

* Moving ClientPool to grpc/client

* Fix rothschild sending less than 100 txs

Due to integer divide, when tps is < 100, we end up creating no tx

* Remove obsolete temp init comment

---------

Co-authored-by: Ori Newman <orinewman1@gmail.com>
smartgoo pushed a commit to smartgoo/rusty-kaspa that referenced this pull request Jun 18, 2024
(same commit list as above)