Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust the output of EXPLAIN / EXPLAIN ANALYZE to show some placement info #29421

Open
morgo opened this issue Nov 4, 2021 · 4 comments
Open
Assignees
Labels
sig/planner SIG: Planner sig/sql-infra SIG: SQL Infra type/enhancement The issue or PR belongs to an enhancement.

Comments

@morgo
Copy link
Contributor

morgo commented Nov 4, 2021

Enhancement

Subtask of #18030

In Placement Policies, we have some "built in" tags that we support syntactic sugar placement hints for. Such as CREATE PLACEMENT POLICY x PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-west-1".

Because PRIMARY_REGION and PRIMARY_ZONE in particular imply higher latency, it would be helpful to treat these differently and how some information in EXPLAIN that communicates this might be adding to the query latency.

It might be harder to demonstrate this for non syntactic sugar varieties of placement such as +disk=ssd. However, ideas welcome if there is an idea that can include both.

@morgo morgo added the type/enhancement The issue or PR belongs to an enhancement. label Nov 4, 2021
@morgo
Copy link
Contributor Author

morgo commented Nov 4, 2021

@yudongusa @AilinKid what do you think?

@AilinKid
Copy link
Contributor

AilinKid commented Nov 8, 2021

mysql> explain select * from t;
+-----------------------+----------+-----------+---------------+--------------------------------+
| id                    | estRows  | task      | access object | operator info                  |
+-----------------------+----------+-----------+---------------+--------------------------------+
| TableReader_5         | 10000.00 | root      |               | data:TableFullScan_4           |
| └─TableFullScan_4     | 10000.00 | cop[tikv] | table:t       | keep order:false, stats:pseudo |
+-----------------------+----------+-----------+---------------+--------------------------------+
2 rows in set (0.00 sec)

mysql> explain select * from t;
+---------------------------+----------+-----------+-----------------------+--------------------------------+
| id                        | estRows  | task      | access object         | operator info                  |
+---------------------------+----------+-----------+-----------------------+--------------------------------+
| PartitionUnion_8          | 10001.00 | root      |                       |                                |
| ├─TableReader_10          | 1.00     | root      |                       | data:TableFullScan_9           |
| │ └─TableFullScan_9       | 1.00     | cop[tikv] | table:t, partition:p0 | keep order:false, stats:pseudo |
| └─TableReader_12          | 10000.00 | root      |                       | data:TableFullScan_11          |
|   └─TableFullScan_11      | 10000.00 | cop[tikv] | table:t, partition:p1 | keep order:false, stats:pseudo |
+---------------------------+----------+-----------+-----------------------+--------------------------------+
5 rows in set (0.00 sec)

Example above show that explain info are showed with the granularity of single operator and the leave nodes only reach the table level. (partition table has multi table level leave nodes)

Since placement rules also have some placement constraints at table level, we can show them with the data source nodes adding the query latency references.

It's promising that:

mysql> explain select * from t;
+-----------------------+----------+-----------+---------------+--------------------------------+
| id                    | estRows  | task      |       ...     |  placement rules               |
+-----------------------+----------+-----------+---------------+--------------------------------+
| TableReader_5         | 10000.00 | root      |               | None                           |
| └─TableFullScan_4     | 10000.00 | cop[tikv] |       ...     | PRIMARY_REGION="us-east-1"     |
+-----------------------+----------+-----------+---------------+--------------------------------+
2 rows in set (0.00 sec)

For non syntactic sugar varieties of placement such as +disk=ssd, non-location labels can't tell us more unless user knows where these storage nodes with that labels are.

For more fine-grained detail, a follow-read will be routed to REGIONS rather than PRIMARY_REGION, which adds some complexity for explain.

@SunRunAway
Copy link
Contributor

What I want to know basicly is:

  1. I need to know which of the regions would the query scans in, when an "explain" is executed.
  2. I need to know how many rows the query scans in each of the regions, when an "explain analyze" is executed.

@dveeden
Copy link
Contributor

dveeden commented Nov 8, 2021

Maybe this should be part of access object. Another thing to consider is to not show this by default and only show this when running EXPLAIN PLACEMENT ... or EXPLAIN EXTENDED ... ? This is because the explain output is already using very long lines by default, and with the tree-like output of the id column this makes the output not very readble if line wrapping is enabled. While less -S as pager helps, this is not the default in any MySQL client.

@AilinKid AilinKid added sig/planner SIG: Planner sig/sql-infra SIG: SQL Infra and removed sig/planner SIG: Planner labels Nov 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/planner SIG: Planner sig/sql-infra SIG: SQL Infra type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

No branches or pull requests

5 participants