-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parallelize Read-only Transaction Execution Design Document #130
Conversation
Analyze all RPC APIs, use tables, elaborate how to queue non-thread safe requests, restructure document.
Discuss further the options when priorities are tied.
transactions/read-only/parallel.md
Outdated
- `read-window-time`: time in milliseconds the `read` window lasts. Must be equal to or greater than `max-read-only-transaction-time`. Default to 200 milliseconds | ||
- `read-only-max-queued-time`: time in milliseconds when is exceeded by the time the earliest transaction, node switches to `read` window, even it is before the end of `write` window. | ||
- `read-window-min-time`: time in milliseconds which must be remained in the `read` window when new transactions are scheduled for execution. Default to 5 milliseconds. This is to avoid unnecessary incomplete transaction execution. | ||
- `max-read-only-transaction-time`: time in milliseconds a read-only transaction can execute before being considered invalid. Default to 150 milliseconds. This option has already been implemented by #558 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently the default, and widely used, max-transaction-time
is 30ms. We recently merged AntelopeIO/leap#649 which interrupts start_block
when a block is received. We also have AntelopeIO/leap#590 that allows a block to be propagated as long as the previous one has been consumed which I assume will make it in soon. Currently on-chain max_transaction_cpu_usage
for EOS is 150ms.
As part of this effort, should read-only transactions (or all transactions) be killed when a block is received to be processed? With the read window and max-read-only-transaction-time
being 150ms it does seem like we are potentially delaying block consumption longer than today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There could be case for leaving this as milliseconds to match max-transaction-time
. If so, I think it should be called read-only-max-transaction-time-ms
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. We should break the read window when a new block is received. We could lower the default values for read and write windows for rapidly toggling. between writes and reads.
Will change to read-only-max-transaction-time-ms
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't think more than one cycle of write-window read-window would happen over a block interval. With having to fit transactions in those windows that doesn't seem reasonable to have more than one of each per block interval.
transactions/read-only/parallel.md
Outdated
### Configuration Options | ||
|
||
- `read-only-num-threads`: the number of threads in read-only transaction execution thread pool. Default to `0`. If it is `0`, read-only transactions are executed on the main thread sequentially as they arrive | ||
- `write-window-time`: time in milliseconds the `write` window lasts. Default to 500 milliseconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add info on how this interacts with cpu-effort-percent
and produce-time-offset-us
and the last block versions of these.
transactions/read-only/parallel.md
Outdated
|
||
- `read-only-num-threads`: the number of threads in read-only transaction execution thread pool. Default to `0`. If it is `0`, read-only transactions are executed on the main thread sequentially as they arrive | ||
- `write-window-time`: time in milliseconds the `write` window lasts. Default to 500 milliseconds | ||
- `read-window-time`: time in milliseconds the `read` window lasts. Must be equal to or greater than `max-read-only-transaction-time`. Default to 200 milliseconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need two config options? Why is read or write not just the remaining time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which remaining time? The cycle period doesn't necessarily have anything to do with block time intervals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Current
idle = get_info and other http requests are processed, push_trx are queued
BP Current
[ start-block, (cpu-effort-percent | max_block_cpu_usage), idle ]
[ process-block, cpu-effort-precent, idle ]
Validation Node Current
[ process-block, (process-trxs | http_requests) ]
- Proposed in this design:
BP Current
[ start-block, (cpu-effort-percent | max_block_cpu_usage | write-window-time - start-block-time), read-window-time ]
[ process-block, (cpu-effort-precent | write-window-time - process-block-time), read-window-time ]
Validation Node Current
[ process-block, (write-window-time - process-block-time), read-window-time, idle ]
Correct?
- Safety between read-only transaction threads and other `nodeos` threads | ||
- _main_ thread: The `main` thread only performs functions safe to read-only transaction execution. | ||
- _chain_ thread: `chain` threads are used in `apply_block`, `log_irreversible`, `finalize_block`, `create_block_state_future`. Those do not run while in `read` window. | ||
- _net_ thread: It is used for low-level networking. No conflicts with read-only transaction execution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With AntelopeIO/leap#590 net threads can process block header validation. But this should be fine and will not conflict with read-only transaction execution.
R --> T[read_window_deadline passed?] | ||
T -->|yes| A | ||
T -->|no| R | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the flowchart could be simplified to:
flowchart TD
A(((write window))) -->|push a new ro trx| B[longest_queued_time > read-only-max-queued-time]
B -->|yes| R(((read window)))
B -->|no| A
A --> D[write_window_deadline passed <b>AND</b> read-only trx queue not empty?]
D -->|yes| R
D -->|no| A
R --> S[read-only trx queue empty <b>OR</b> read_window_deadline passed?]
S -->|yes| A
S -->|no| R
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The checks for read-only transaction queue and read-only window time are done independently. That's why I separate them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why reflect such details in a state diagram, if it doesn't add any useful information?
transactions/read-only/parallel.md
Outdated
- In `write` window, compare the priorities of the top functions in `read-only-safe` and `not-read-only-safe`. The one with higher priority is executed. If tied, three options are considered: | ||
- `not-read-only-safe` function is favored | ||
- randomly pick one | ||
- add a time attribute to the functions and the older one is picked. This keeps the original behavior. Even though at a cost of the extra time field and an extra comparison, this option seems best. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? My understanding was that during the write
window, we are not supposed to process readonly transactions (instead of processing them, we queue them for later execution in parallel). So we should process only from the write
queue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In write window, we handle both read and write operations, but not read-only transactions. Will define the terms at the beginning of the document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So should we have 3 queues?
- write transactions and write operations (only in write window)
- readonly transactions (only in read window)
- readonly operations (both read and write window)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do plan to have 3 queues:
- write operation queue (in appbase)
- read-only operation queue (in appbase)
- read-only transaction queue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do the write transactions go?
Also shouldn't they all be in appbase since we typically want to process items from multiple queues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Write transactions go to write operation queue. I will define the terms to make them less confusing.
transactions/read-only/parallel.md
Outdated
|
||
### Configuration Options | ||
|
||
- `read-only-num-threads`: the number of threads in read-only transaction execution thread pool. Default to `0`. If it is `0`, read-only transactions are executed on the main thread sequentially as they arrive |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default to
0
. If it is0
, read-only transactions are executed on the main thread sequentially as they arrive
Does this mean that, by default, this new functionality is disabled? Is it really what we want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Not sure what's a good number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this feature should normally not run on producer nodes, disabling it by default seems better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this feature should normally not run on producer nodes
So if not run on producer nodes, it will not improve the chain max TPS, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It helps indirectly by speeding up API (and PTP if servicing RPCs) nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how this would change the max chain TPS, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is not the point of these changes. If anything, this will decrease max TPS of a single producer node. That is one thing we will need to test. We should verify that making the priority queue thread safe and other changes do not greatly decrease max TPS for a single producer node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine. Is there a document that describes the point of this change then? If not, maybe this document should start with a rationale for the changes. I originally assumed (apparently incorrectly) that the intent was to increase the chain TPS number.
Initial release for review.