Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallelize Read-only Transaction Execution Design Document #130

Merged
merged 14 commits into from
Jan 28, 2023

Conversation

linh2931
Copy link
Member

Initial release for review.

Discuss further the options when priorities are tied.
@linh2931 linh2931 requested review from larryk85 and greg7mdp January 18, 2023 22:26
- `read-window-time`: time in milliseconds the `read` window lasts. Must be equal to or greater than `max-read-only-transaction-time`. Default to 200 milliseconds
- `read-only-max-queued-time`: time in milliseconds when is exceeded by the time the earliest transaction, node switches to `read` window, even it is before the end of `write` window.
- `read-window-min-time`: time in milliseconds which must be remained in the `read` window when new transactions are scheduled for execution. Default to 5 milliseconds. This is to avoid unnecessary incomplete transaction execution.
- `max-read-only-transaction-time`: time in milliseconds a read-only transaction can execute before being considered invalid. Default to 150 milliseconds. This option has already been implemented by #558
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the default, and widely used, max-transaction-time is 30ms. We recently merged AntelopeIO/leap#649 which interrupts start_block when a block is received. We also have AntelopeIO/leap#590 that allows a block to be propagated as long as the previous one has been consumed which I assume will make it in soon. Currently on-chain max_transaction_cpu_usage for EOS is 150ms.

As part of this effort, should read-only transactions (or all transactions) be killed when a block is received to be processed? With the read window and max-read-only-transaction-time being 150ms it does seem like we are potentially delaying block consumption longer than today.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There could be case for leaving this as milliseconds to match max-transaction-time. If so, I think it should be called read-only-max-transaction-time-ms.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. We should break the read window when a new block is received. We could lower the default values for read and write windows for rapidly toggling. between writes and reads.

Will change to read-only-max-transaction-time-ms.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't think more than one cycle of write-window read-window would happen over a block interval. With having to fit transactions in those windows that doesn't seem reasonable to have more than one of each per block interval.

### Configuration Options

- `read-only-num-threads`: the number of threads in read-only transaction execution thread pool. Default to `0`. If it is `0`, read-only transactions are executed on the main thread sequentially as they arrive
- `write-window-time`: time in milliseconds the `write` window lasts. Default to 500 milliseconds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add info on how this interacts with cpu-effort-percent and produce-time-offset-us and the last block versions of these.


- `read-only-num-threads`: the number of threads in read-only transaction execution thread pool. Default to `0`. If it is `0`, read-only transactions are executed on the main thread sequentially as they arrive
- `write-window-time`: time in milliseconds the `write` window lasts. Default to 500 milliseconds
- `read-window-time`: time in milliseconds the `read` window lasts. Must be equal to or greater than `max-read-only-transaction-time`. Default to 200 milliseconds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need two config options? Why is read or write not just the remaining time?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which remaining time? The cycle period doesn't necessarily have anything to do with block time intervals.

Copy link
Member

@heifner heifner Jan 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Current

idle = get_info and other http requests are processed, push_trx are queued

BP Current
[ start-block, (cpu-effort-percent | max_block_cpu_usage), idle ]
[ process-block, cpu-effort-precent, idle ]

Validation Node Current
[ process-block, (process-trxs | http_requests) ]

  • Proposed in this design:

BP Current
[ start-block, (cpu-effort-percent | max_block_cpu_usage | write-window-time - start-block-time), read-window-time ]
[ process-block, (cpu-effort-precent | write-window-time - process-block-time), read-window-time ]

Validation Node Current
[ process-block, (write-window-time - process-block-time), read-window-time, idle ]

Correct?

transactions/read-only/parallel.md Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
- Safety between read-only transaction threads and other `nodeos` threads
- _main_ thread: The `main` thread only performs functions safe to read-only transaction execution.
- _chain_ thread: `chain` threads are used in `apply_block`, `log_irreversible`, `finalize_block`, `create_block_state_future`. Those do not run while in `read` window.
- _net_ thread: It is used for low-level networking. No conflicts with read-only transaction execution.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With AntelopeIO/leap#590 net threads can process block header validation. But this should be fine and will not conflict with read-only transaction execution.

transactions/read-only/parallel.md Outdated Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
R --> T[read_window_deadline passed?]
T -->|yes| A
T -->|no| R
```

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the flowchart could be simplified to:

flowchart TD
    A(((write window))) -->|push a new ro trx| B[longest_queued_time > read-only-max-queued-time]
    B -->|yes| R(((read window)))
    B -->|no| A
    A --> D[write_window_deadline passed <b>AND</b> read-only trx queue not empty?]
    D -->|yes| R
    D -->|no| A
    
    R --> S[read-only trx queue empty <b>OR</b> read_window_deadline passed?]
    S -->|yes| A
    S -->|no| R
Loading

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The checks for read-only transaction queue and read-only window time are done independently. That's why I separate them.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why reflect such details in a state diagram, if it doesn't add any useful information?

transactions/read-only/parallel.md Outdated Show resolved Hide resolved
Comment on lines 159 to 162
- In `write` window, compare the priorities of the top functions in `read-only-safe` and `not-read-only-safe`. The one with higher priority is executed. If tied, three options are considered:
- `not-read-only-safe` function is favored
- randomly pick one
- add a time attribute to the functions and the older one is picked. This keeps the original behavior. Even though at a cost of the extra time field and an extra comparison, this option seems best.
Copy link

@greg7mdp greg7mdp Jan 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? My understanding was that during the write window, we are not supposed to process readonly transactions (instead of processing them, we queue them for later execution in parallel). So we should process only from the write queue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In write window, we handle both read and write operations, but not read-only transactions. Will define the terms at the beginning of the document.

Copy link

@greg7mdp greg7mdp Jan 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So should we have 3 queues?

  1. write transactions and write operations (only in write window)
  2. readonly transactions (only in read window)
  3. readonly operations (both read and write window)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do plan to have 3 queues:

  1. write operation queue (in appbase)
  2. read-only operation queue (in appbase)
  3. read-only transaction queue

Copy link

@greg7mdp greg7mdp Jan 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do the write transactions go?
Also shouldn't they all be in appbase since we typically want to process items from multiple queues?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Write transactions go to write operation queue. I will define the terms to make them less confusing.


### Configuration Options

- `read-only-num-threads`: the number of threads in read-only transaction execution thread pool. Default to `0`. If it is `0`, read-only transactions are executed on the main thread sequentially as they arrive

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default to 0. If it is 0, read-only transactions are executed on the main thread sequentially as they arrive

Does this mean that, by default, this new functionality is disabled? Is it really what we want?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Not sure what's a good number.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this feature should normally not run on producer nodes, disabling it by default seems better.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this feature should normally not run on producer nodes

So if not run on producer nodes, it will not improve the chain max TPS, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It helps indirectly by speeding up API (and PTP if servicing RPCs) nodes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this would change the max chain TPS, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not the point of these changes. If anything, this will decrease max TPS of a single producer node. That is one thing we will need to test. We should verify that making the priority queue thread safe and other changes do not greatly decrease max TPS for a single producer node.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine. Is there a document that describes the point of this change then? If not, maybe this document should start with a rationale for the changes. I originally assumed (apparently incorrectly) that the intent was to increase the chain TPS number.

transactions/read-only/parallel.md Outdated Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
transactions/read-only/parallel.md Outdated Show resolved Hide resolved
@linh2931 linh2931 merged commit d017ece into main Jan 28, 2023
@linh2931 linh2931 deleted the readonly_trx branch January 28, 2023 02:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants