Parallelize Read-only Transaction Execution Design Document #130

linh2931 · 2023-01-17T19:13:32Z

Initial release for review.

…n execution

Analyze all RPC APIs, use tables, elaborate how to queue non-thread safe requests, restructure document.

Discuss further the options when priorities are tied.

heifner · 2023-01-19T14:43:31Z

transactions/read-only/parallel.md

+- `read-window-time`: time in milliseconds the `read` window lasts. Must be equal to or greater than `max-read-only-transaction-time`. Default to 200 milliseconds
+- `read-only-max-queued-time`: time in milliseconds. When the earliest queued transaction has waited longer than this, the node switches to the `read` window, even if it is before the end of the `write` window.
+- `read-window-min-time`: time in milliseconds that must remain in the `read` window when new transactions are scheduled for execution. Default to 5 milliseconds. This is to avoid unnecessary incomplete transaction execution.
+- `max-read-only-transaction-time`: time in milliseconds a read-only transaction can execute before being considered invalid. Default to 150 milliseconds. This option has already been implemented by #558


Currently the default, and widely used, max-transaction-time is 30ms. We recently merged AntelopeIO/leap#649 which interrupts start_block when a block is received. We also have AntelopeIO/leap#590 that allows a block to be propagated as long as the previous one has been consumed which I assume will make it in soon. Currently on-chain max_transaction_cpu_usage for EOS is 150ms.

As part of this effort, should read-only transactions (or all transactions) be killed when a block is received to be processed? With the read window and max-read-only-transaction-time being 150ms it does seem like we are potentially delaying block consumption longer than today.

There could be a case for leaving this as milliseconds to match max-transaction-time. If so, I think it should be called read-only-max-transaction-time-ms.

Thanks. We should break the read window when a new block is received. We could lower the default values for the read and write windows to toggle rapidly between writes and reads.

Will change to read-only-max-transaction-time-ms.

Oh, I didn't think more than one cycle of write-window read-window would happen over a block interval. Since transactions have to fit in those windows, it doesn't seem reasonable to have more than one of each per block interval.
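For reference, the constraints in the option text quoted at the top of this thread can be sketched as a small validation helper. This is only an illustration under stated assumptions: the class and function names are hypothetical, the defaults are the ones quoted in the diff, and the option names are the ones in use at this point in the review (the rename to `read-only-max-transaction-time-ms` is not applied here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadOnlyConfig:
    """Hypothetical holder for the quoted options; all values in milliseconds."""
    read_window_time: int = 200
    read_window_min_time: int = 5
    max_read_only_transaction_time: int = 150

    def validate(self) -> None:
        # From the quoted option text: the read window must be able to hold
        # at least one maximum-length read-only transaction.
        if self.read_window_time < self.max_read_only_transaction_time:
            raise ValueError(
                "read-window-time must be >= max-read-only-transaction-time"
            )


def can_schedule(cfg: ReadOnlyConfig, elapsed_in_read_window_ms: int) -> bool:
    """Allow scheduling only if at least read-window-min-time remains."""
    remaining = cfg.read_window_time - elapsed_in_read_window_ms
    return remaining >= cfg.read_window_min_time
```

With the quoted defaults, a transaction arriving 100 ms into the read window can still be scheduled, while one arriving at 196 ms cannot.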

heifner · 2023-01-19T14:45:57Z

transactions/read-only/parallel.md

+### Configuration Options
+
+- `read-only-num-threads`: the number of threads in read-only transaction execution thread pool. Default to `0`. If it is `0`, read-only transactions are executed on the main thread sequentially as they arrive
+- `write-window-time`: time in milliseconds the `write` window lasts. Default to 500 milliseconds


Can you add info on how this interacts with cpu-effort-percent and produce-time-offset-us, and the last-block versions of these?

heifner · 2023-01-19T14:47:14Z

transactions/read-only/parallel.md

+
+- `read-only-num-threads`: the number of threads in read-only transaction execution thread pool. Default to `0`. If it is `0`, read-only transactions are executed on the main thread sequentially as they arrive
+- `write-window-time`: time in milliseconds the `write` window lasts. Default to 500 milliseconds
+- `read-window-time`: time in milliseconds the `read` window lasts. Must be equal to or greater than `max-read-only-transaction-time`. Default to 200 milliseconds


Why do we need two config options? Why is read or write not just the remaining time?

Which remaining time? The cycle period doesn't necessarily have anything to do with block time intervals.

Current

idle = get_info and other http requests are processed, push_trx are queued

BP Current
[ start-block, (cpu-effort-percent | max_block_cpu_usage), idle ]
[ process-block, cpu-effort-percent, idle ]

Validation Node Current
[ process-block, (process-trxs | http_requests) ]

Proposed in this design:

BP Current
[ start-block, (cpu-effort-percent | max_block_cpu_usage | write-window-time - start-block-time), read-window-time ]
[ process-block, (cpu-effort-percent | write-window-time - process-block-time), read-window-time ]

Validation Node Current
[ process-block, (write-window-time - process-block-time), read-window-time, idle ]

Correct?
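To make the proposed validation-node layout above concrete, here is a minimal sketch that turns it into durations. It reflects only the reading posed in this comment, which the thread does not confirm. The function name and the `interval_ms` parameter are hypothetical, and reporting leftover time as idle is an assumption.

```python
def validation_node_cycle(process_block_ms, write_window_ms,
                          read_window_ms, interval_ms):
    """Return (label, duration_ms) segments for one cycle of the
    proposed validation-node layout."""
    # The write window is assumed to start with block processing, so only
    # the remainder is left for other write-window work.
    write_rest = max(0, write_window_ms - process_block_ms)
    # Assumption: whatever the cycle does not use is reported as idle.
    used = process_block_ms + write_rest + read_window_ms
    idle = max(0, interval_ms - used)
    return [
        ("process-block", process_block_ms),
        ("write-window remainder", write_rest),
        ("read-window", read_window_ms),
        ("idle", idle),
    ]
```

For example, `validation_node_cycle(50, 200, 200, 500)` leaves 150 ms of write window after block processing, a 200 ms read window, and 100 ms idle.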

heifner · 2023-01-19T15:08:31Z

transactions/read-only/parallel.md

+- Safety between read-only transaction threads and other `nodeos` threads
+   - _main_ thread: The `main` thread only performs functions safe to read-only transaction execution.
+   - _chain_ thread: `chain` threads are used in `apply_block`, `log_irreversible`, `finalize_block`,  `create_block_state_future`. Those do not run while in `read` window.
+   - _net_ thread: It is used for low-level networking. No conflicts with read-only transaction execution.


With AntelopeIO/leap#590 net threads can process block header validation. But this should be fine and will not conflict with read-only transaction execution.

greg7mdp · 2023-01-19T19:42:55Z

transactions/read-only/parallel.md

+    R --> T[read_window_deadline passed?]
+    T -->|yes| A
+    T -->|no| R  
+```


I think the flowchart could be simplified to:

    flowchart TD
        A(((write window))) -->|push a new ro trx| B[longest_queued_time > read-only-max-queued-time]
        B -->|yes| R(((read window)))
        B -->|no| A
        A --> D[write_window_deadline passed <b>AND</b> read-only trx queue not empty?]
        D -->|yes| R
        D -->|no| A
        R --> S[read-only trx queue empty <b>OR</b> read_window_deadline passed?]
        S -->|yes| A
        S -->|no| R

The checks for read-only transaction queue and read-only window time are done independently. That's why I separate them.

Why reflect such details in a state diagram, if it doesn't add any useful information?
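The simplified flowchart suggested in this thread can be read as a two-state transition function. The sketch below is an illustration of that reading, not the `nodeos` implementation: the function and parameter names are hypothetical, times are plain numbers in any consistent unit, and treating "deadline passed" as `now >= deadline` is an assumption.

```python
from enum import Enum


class Window(Enum):
    WRITE = "write"
    READ = "read"


def next_window(current, now, *, queue_len, longest_queued_time,
                read_only_max_queued_time, write_window_deadline,
                read_window_deadline):
    """Return the window to be in after evaluating the flowchart's checks."""
    if current is Window.WRITE:
        # Box B: a queued read-only transaction has waited too long.
        # The queue_len guard is an assumption; the flowchart only reaches
        # this check right after a read-only transaction is pushed.
        if queue_len > 0 and longest_queued_time > read_only_max_queued_time:
            return Window.READ
        # Box D: the write window is over AND there is something to run.
        if now >= write_window_deadline and queue_len > 0:
            return Window.READ
        return Window.WRITE
    # Box S: leave the read window when the queue is empty OR time is up.
    if queue_len == 0 or now >= read_window_deadline:
        return Window.WRITE
    return Window.READ
```

Keeping both read-window exit conditions in one expression mirrors the merged box in the simplified chart; splitting them into two separate checks, as in the original diagram, would give the same transitions.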

greg7mdp · 2023-01-19T19:50:51Z

transactions/read-only/parallel.md

+  - In `write` window, compare the priorities of the top functions in `read-only-safe` and `not-read-only-safe`. The one with higher priority is executed. If tied, three options are considered:
+    - `not-read-only-safe` function is favored 
+    - randomly pick one
+    - add a time attribute to the functions and the older one is picked. This keeps the original behavior. Even though at a cost of the extra time field and an extra comparison, this option seems best.


Why? My understanding was that during the write window, we are not supposed to process readonly transactions (instead of processing them, we queue them for later execution in parallel). So we should process only from the write queue.

In write window, we handle both read and write operations, but not read-only transactions. Will define the terms at the beginning of the document.

So should we have 3 queues?

- write transactions and write operations (only in write window)
- readonly transactions (only in read window)
- readonly operations (both read and write window)

We do plan to have 3 queues:

- write operation queue (in appbase)
- read-only operation queue (in appbase)
- read-only transaction queue

Where do the write transactions go?
Also shouldn't they all be in appbase since we typically want to process items from multiple queues?

Write transactions go to write operation queue. I will define the terms to make them less confusing.
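As an illustration of the write-window selection discussed in this thread, the sketch below keeps two priority queues (`read-only-safe` and `not-read-only-safe`) and picks the higher-priority top element, breaking ties in favor of the older entry. It is a sketch under assumptions, not the appbase code: the class and function names are hypothetical, the priorities are made up, and a shared monotonically increasing sequence number stands in for the "time attribute" mentioned in the quoted text.

```python
import heapq
import itertools


class OpQueue:
    """Priority queue whose entries carry a shared sequence number."""

    def __init__(self, counter):
        self._heap = []
        self._counter = counter  # shared between queues so ages are comparable

    def push(self, priority, name):
        # heapq is a min-heap: negate the priority so higher runs first;
        # the sequence number makes the older entry win a priority tie.
        heapq.heappush(self._heap, (-priority, next(self._counter), name))

    def peek(self):
        return self._heap[0] if self._heap else None

    def pop(self):
        return heapq.heappop(self._heap)[2]


def pop_next_in_write_window(read_only_safe, not_read_only_safe):
    """Pop the function that should run next during the write window."""
    a, b = read_only_safe.peek(), not_read_only_safe.peek()
    if a is None and b is None:
        return None
    # Compare (negated priority, sequence); smaller means run first.
    if b is None or (a is not None and a[:2] < b[:2]):
        return read_only_safe.pop()
    return not_read_only_safe.pop()
```

A short usage example, with illustrative operation names:

```python
counter = itertools.count()
ro_safe, not_ro_safe = OpQueue(counter), OpQueue(counter)
not_ro_safe.push(50, "push_trx")
ro_safe.push(50, "get_info")
ro_safe.push(90, "get_block")
order = [pop_next_in_write_window(ro_safe, not_ro_safe) for _ in range(3)]
# order == ["get_block", "push_trx", "get_info"]
```

The read-only transaction queue is left out on purpose, since per the discussion it is only drained in the read window.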

greg7mdp · 2023-01-19T19:55:41Z

transactions/read-only/parallel.md

+
+### Configuration Options
+
+- `read-only-num-threads`: the number of threads in read-only transaction execution thread pool. Default to `0`. If it is `0`, read-only transactions are executed on the main thread sequentially as they arrive


> Default to 0. If it is 0, read-only transactions are executed on the main thread sequentially as they arrive

Does this mean that, by default, this new functionality is disabled? Is it really what we want?

Good point. Not sure what's a good number.

As this feature should normally not run on producer nodes, disabling it by default seems better.

> As this feature should normally not run on producer nodes

So if not run on producer nodes, it will not improve the chain max TPS, right?

It helps indirectly by speeding up API (and PTP if servicing RPCs) nodes.

I don't see how this would change the max chain TPS, though.

That is not the point of these changes. If anything, this will decrease max TPS of a single producer node. That is one thing we will need to test. We should verify that making the priority queue thread safe and other changes do not greatly decrease max TPS for a single producer node.

That's fine. Is there a document that describes the point of this change then? If not, maybe this document should start with a rationale for the changes. I originally assumed (apparently incorrectly) that the intent was to increase the chain TPS number.
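The behavior of `read-only-num-threads` described in the quoted option text (0 means sequential execution on the main thread, otherwise a thread pool) can be sketched as follows. This is a structural illustration only: the function name is hypothetical, callables stand in for read-only transactions, and Python's `ThreadPoolExecutor` stands in for the `nodeos` thread pool.

```python
from concurrent.futures import ThreadPoolExecutor


def run_read_only_batch(transactions, num_threads):
    """Run callables standing in for read-only transactions.

    num_threads == 0 mirrors the quoted default: everything runs
    sequentially on the calling ("main") thread.
    """
    if num_threads == 0:
        return [trx() for trx in transactions]
    # Otherwise hand the batch to a pool; map() keeps results in input order.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(lambda trx: trx(), transactions))
```

Both paths return the same results for the same batch; only where the work runs differs, which is why a default of 0 leaves existing behavior unchanged.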

linh2931 and others added 6 commits January 12, 2023 12:07

initial version of parallel.md for parallelizing read-only transactio…

b4c7787

…n execution

Minor editorial corrections.

8d7e0c6

Restructure and complete all analysis

db5e9e9

Analyze all RPC APIs, use tables, elaborate how to queue non-thread safe requests, restructure document.

Incorporate Areg's comments, add more design

01b2e15

Minor editorial changes

f728ebf

Update tests

e3149f4

linh2931 requested review from heifner, arhag and spoonincode January 17, 2023 19:13

More appbase priority queue option discussion

c90d46d

Discuss further the options when priorities are tied.

linh2931 requested review from larryk85 and greg7mdp January 18, 2023 22:26

heifner requested changes Jan 19, 2023

greg7mdp requested changes Jan 19, 2023

greg7mdp reviewed Jan 19, 2023

heifner reviewed Jan 19, 2023

linh2931 added 4 commits January 22, 2023 22:14

Incorporate review comments

cf6f2ed

Add windows size discussion.

ea6202c

Use microseconds for read and write window time

d5823a6

Clarify write and read queues description

bed3a08

heifner approved these changes Jan 26, 2023

greg7mdp approved these changes Jan 26, 2023

Update with further discussion with Areg

954b15a

arhag requested changes Jan 27, 2023

linh2931 added 2 commits January 27, 2023 18:08

Incorporate Areg's review comments.

ae62970

Minor spelling corrections

bd9ea26

arhag approved these changes Jan 28, 2023

linh2931 merged commit d017ece into main Jan 28, 2023

linh2931 deleted the readonly_trx branch January 28, 2023 02:14
