Parallelize Read-only Transaction Execution Design Document #130
I think the flowchart could be simplified to: [simplified flowchart attached]
The checks for the read-only transaction queue and the read-only window time are done independently; that's why I separated them.
Why reflect such details in a state diagram if they don't add any useful information?
Why? My understanding was that during the write window, we are not supposed to process read-only transactions (instead of processing them, we queue them for later execution in parallel). So we should process only from the write queue.
In the write window, we handle both read and write operations, but not read-only transactions. Will define the terms at the beginning of the document.
So should we have 3 queues?
We do plan to have 3 queues: a write operation queue, a read operation queue, and a read-only transaction queue.
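For illustration, a minimal sketch of the three queues under discussion; all names and types here are hypothetical, not from the design document:

```cpp
#include <deque>
#include <functional>
#include <mutex>

// Hypothetical sketch of the three queues discussed above; the actual
// design may differ. Each entry is a deferred unit of work.
struct pending_work {
   std::function<void()> run;
};

class transaction_queues {
   std::mutex               mtx;            // the thread above discusses making queues thread safe
   std::deque<pending_work> write_ops;      // write operations, drained only in the write window
   std::deque<pending_work> read_ops;       // read operations, also handled in the write window
   std::deque<pending_work> read_only_trxs; // read-only transactions, drained in parallel in the read window
public:
   void push_write(pending_work w)     { std::lock_guard g(mtx); write_ops.push_back(std::move(w)); }
   void push_read(pending_work w)      { std::lock_guard g(mtx); read_ops.push_back(std::move(w)); }
   void push_read_only(pending_work w) { std::lock_guard g(mtx); read_only_trxs.push_back(std::move(w)); }
};
```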
Where do the write transactions go?
Also shouldn't they all be in appbase since we typically want to process items from multiple queues?
Write transactions go to the write operation queue. I will define the terms to make them less confusing.
Do we need to limit this according to how much virtual memory we have? AntelopeIO/leap#645
Yeah, I think this is one of those knobs we need to set a limit on so we can grok what the maximal resource usage is. For example, with read-only-num-threads = 512 and max_sync_call_depth = 4, the maximum number of simultaneous wasm executions is 2048 (or maybe 2560, depending on how you define sync call depth). 512 or 1024 seem like good numbers to start with for now, imo.
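To make the arithmetic concrete, a throwaway calculation of both interpretations; the option names and values are the ones from the comment above:

```cpp
#include <cstdio>

int main() {
   const unsigned read_only_num_threads = 512;
   const unsigned max_sync_call_depth   = 4;

   // If depth counts the root execution as level 1:
   unsigned depth_includes_root = read_only_num_threads * max_sync_call_depth;       // 2048

   // If depth counts only nested sync calls, each thread can have the root
   // execution plus max_sync_call_depth nested ones in flight:
   unsigned depth_excludes_root = read_only_num_threads * (max_sync_call_depth + 1); // 2560

   std::printf("%u or %u simultaneous wasm executions\n",
               depth_includes_root, depth_excludes_root);
}
```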
Does this mean that, by default, this new functionality is disabled? Is that really what we want?
Good point. Not sure what's a good number.
As this feature should normally not run on producer nodes, disabling it by default seems better.
So if it does not run on producer nodes, it will not improve the chain's max TPS, right?
It helps indirectly by speeding up API (and P2P, if servicing RPCs) nodes.
I don't see how this would change the max chain TPS, though.
That is not the point of these changes. If anything, this will decrease max TPS of a single producer node. That is one thing we will need to test. We should verify that making the priority queue thread safe and other changes do not greatly decrease max TPS for a single producer node.
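For reference, a minimal sketch of the kind of mutex-wrapped priority queue whose overhead would need to be measured; this is illustrative, not the actual leap implementation:

```cpp
#include <functional>
#include <mutex>
#include <optional>
#include <queue>
#include <vector>

// Illustrative thread-safe priority queue wrapper. The concern above is
// whether this locking measurably reduces a single producer node's max TPS.
template <typename T, typename Compare = std::less<T>>
class concurrent_priority_queue {
   std::priority_queue<T, std::vector<T>, Compare> q;
   std::mutex mtx;
public:
   void push(T v) {
      std::lock_guard g(mtx);
      q.push(std::move(v));
   }
   std::optional<T> try_pop() {
      std::lock_guard g(mtx);
      if (q.empty())
         return std::nullopt;
      T v = q.top(); // copy, then pop; avoids moving out of a const reference
      q.pop();
      return v;
   }
};
```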
That's fine. Is there a document that describes the point of this change then? If not, maybe this document should start with a rationale for the changes. I originally assumed (apparently incorrectly) that the intent was to increase the chain TPS number.
Can you add info on how this interacts with cpu-effort-percent and produce-time-offset-us, and the last-block versions of these?
Why do we need two config options? Why is read or write not just the remaining time?
Which remaining time? The cycle period doesn't necessarily have anything to do with block time intervals.
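To illustrate why the cycle period is decoupled from block timing, a throwaway calculation with assumed window values (the defaults were still under discussion):

```cpp
#include <cstdio>

int main() {
   // Illustrative values only, not proposed defaults.
   const unsigned write_window_ms   = 200;
   const unsigned read_window_ms    = 60;
   const unsigned block_interval_ms = 500;

   unsigned cycle_ms = write_window_ms + read_window_ms; // 260ms per cycle
   // The cycle need not divide the block interval evenly, which is why the
   // two windows are configured independently rather than derived from it.
   std::printf("%.2f cycles per block interval\n",
               double(block_interval_ms) / cycle_ms);
}
```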
idle = get_info and other http requests are processed, push_trx requests are queued

BP Current:
[ start-block, (cpu-effort-percent | max_block_cpu_usage), idle ]
[ process-block, cpu-effort-percent, idle ]

Validation Node Current:
[ process-block, (process-trxs | http_requests) ]

BP Proposed:
[ start-block, (cpu-effort-percent | max_block_cpu_usage | write-window-time - start-block-time), read-window-time ]
[ process-block, (cpu-effort-percent | write-window-time - process-block-time), read-window-time ]

Validation Node Proposed:
[ process-block, (write-window-time - process-block-time), read-window-time, idle ]

Correct?
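A minimal sketch of the alternating cycle implied by the proposed rows above; the helper names and stub bodies are hypothetical:

```cpp
#include <chrono>
#include <thread>

// Stubs standing in for the real window bodies (illustrative only).
void run_write_window_for(std::chrono::microseconds d) { std::this_thread::sleep_for(d); }
void run_read_window_for (std::chrono::microseconds d) { std::this_thread::sleep_for(d); }

// Hypothetical sketch of the alternating schedule. Note the cycle period
// (write-window-time + read-window-time) is independent of the 500ms
// block interval.
void schedule_loop(std::chrono::microseconds write_window_time,
                   std::chrono::microseconds read_window_time) {
   for (;;) { // runs for the life of the node
      // start-block / process-block plus queued write and read operations,
      // bounded by cpu-effort-percent / max_block_cpu_usage
      run_write_window_for(write_window_time);
      // drain the read-only transaction queue on the read-only thread pool
      run_read_window_for(read_window_time);
   }
}
```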
Currently the default, and widely used, max-transaction-time is 30ms. We recently merged AntelopeIO/leap#649, which interrupts start_block when a block is received. We also have AntelopeIO/leap#590, which allows a block to be propagated as long as the previous one has been consumed; I assume it will make it in soon. Currently the on-chain max_transaction_cpu_usage for EOS is 150ms. As part of this effort, should read-only transactions (or all transactions) be killed when a block is received to be processed? With the read window and max-read-only-transaction-time being 150ms, it does seem like we are potentially delaying block consumption longer than today.
There could be a case for leaving this as milliseconds to match max-transaction-time. If so, I think it should be called read-only-max-transaction-time-ms.
Thanks. We should break the read window when a new block is received. We could lower the default values for the read and write windows to toggle rapidly between writes and reads. Will change to read-only-max-transaction-time-ms.
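A minimal sketch of breaking the read window when a new block is received, assuming an atomic flag set by the net thread; all names are hypothetical:

```cpp
#include <atomic>
#include <chrono>

std::atomic<bool> block_received{false}; // set by the net thread when a block arrives

// Stub standing in for executing one queued read-only transaction;
// returns false when the queue is empty (illustrative only).
bool execute_next_read_only_trx() { return false; }

// Run the read window, but exit early when a block arrives so block
// consumption is not delayed by up to the full read-window-time.
void run_read_window(std::chrono::microseconds read_window_time) {
   auto deadline = std::chrono::steady_clock::now() + read_window_time;
   while (std::chrono::steady_clock::now() < deadline) {
      if (block_received.exchange(false))
         break; // switch back to the write window to apply the new block
      if (!execute_next_read_only_trx())
         break; // queue drained; nothing left to do this window
   }
}
```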
Oh, I didn't think more than one write-window/read-window cycle would happen within a block interval. Given that transactions have to fit within those windows, it doesn't seem reasonable to have more than one of each per block interval.
With AntelopeIO/leap#590, net threads can process block header validation. But this should be fine and will not conflict with read-only transaction execution.