Commit

Apply suggestions from code review
Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com>
shichun-0415 and TomShawn authored Jan 13, 2023
1 parent 2da5054 commit 835e5b9
Showing 1 changed file with 10 additions and 10 deletions.
20 changes: 10 additions & 10 deletions tiflash/tune-tiflash-performance.md
@@ -6,7 +6,7 @@ aliases: ['/docs/dev/tiflash/tune-tiflash-performance/','/docs/dev/reference/tif

# Tune TiFlash Performance

This document introduces how to tune the performance of TiFlash by properly planning machine resources and tuning TiDB parameters. By following these methods, you can achieve optimal performance of your TiFlash cluster.
This document introduces how to tune the performance of TiFlash by properly planning machine resources and tuning TiDB parameters. By following these methods, your TiFlash cluster can achieve optimal performance.

## Plan resources

@@ -17,7 +17,7 @@ If you want to save machine resources and have no requirement on isolation, you
This section describes how to improve TiFlash performance by tuning TiDB parameters, including:

- [Forcibly enable the MPP mode](#forcibly-enable-the-mpp-mode)
- [Push down aggregate functions before `Join` or `Union`](#push-down-aggregate-functions-before-join-or-union)
- [Push down aggregate functions to a position before `Join` or `Union`](#push-down-aggregate-functions-to-a-position-before-join-or-union)
- [Enable `Distinct` optimization](#enable-distinct-optimization)
- [Compact data using the `ALTER TABLE ... COMPACT` statement](#compact-data-using-the-alter-table--compact-statement)
- [Replace Shuffled Hash Join with Broadcast Hash Join](#replace-shuffled-hash-join-with-broadcast-hash-join)
@@ -26,17 +26,17 @@ This section describes how to improve TiFlash performance by tuning TiDB paramet

### Forcibly enable the MPP mode

MPP execution plans can fully utilize distributed compute resources, thereby significantly improving the efficiency of batch data queries. When a query does not generate an MPP execution plan, you can forcibly enable the MPP mode:
MPP execution plans can fully utilize distributed computing resources, thereby significantly improving the efficiency of batch data queries. When the optimizer does not generate an MPP execution plan for a query, you can forcibly enable the MPP mode:

The variable [`tidb_enforce_mpp`](/system-variables.md#tidb_enforce_mpp-new-in-v51) controls whether to ignore the optimizer's cost estimation and to forcibly use TiFlash's MPP mode for query execution. To enable MPP mode forcibly, run the following command:

```sql
set @@tidb_enforce_mpp = ON;
```
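
To check whether the setting takes effect, one option is to inspect the execution plan. The following is a minimal sketch, assuming a hypothetical table `t` with a column `a` and an available TiFlash replica (these names are placeholders, not part of the original document). In an MPP plan, the `task` column is expected to show `mpp[tiflash]`, and `ExchangeSender` and `ExchangeReceiver` operators are expected to appear. If MPP mode still cannot be used, the reason is typically reported as a warning.

```sql
-- Assumption: `t` is a placeholder table that has a TiFlash replica.
set @@tidb_enforce_mpp = ON;

-- Inspect the plan and look for `mpp[tiflash]` tasks and Exchange operators.
EXPLAIN SELECT a, COUNT(*) FROM t GROUP BY a;

-- If the plan is not an MPP plan, check the warnings for the reason.
SHOW WARNINGS;
```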

### Push down aggregate functions before `Join` or `Union`
### Push down aggregate functions to a position before `Join` or `Union`

By pushing down aggregate operations before `Join` or `Union`, you can reduce the data to be processed by `Join` or `Union`, thereby improving performance.
By pushing down aggregate operations to the position before `Join` or `Union`, you can reduce the data to be processed in the `Join` or `Union` operation, thereby improving performance.

The variable [`tidb_opt_agg_push_down`](/system-variables.md#tidb_opt_agg_push_down) controls whether the optimizer executes the optimization operation of pushing down the aggregate function to the position before `Join` or `Union`. When the aggregate operations are quite slow in the query, you can set this variable to `ON`.

@@ -46,17 +46,17 @@ set @@tidb_opt_agg_push_down = ON;

### Enable `Distinct` optimization

TiFlash does not support some aggregate functions that accept the `Distinct` column, such as `Sum`. By default, the entire aggregate function is calculated in TiDB. By enabling the `Distinct` optimization, some operations can be pushed down to TiFlash, thereby improving query performance:
TiFlash does not support some aggregate functions that accept the `Distinct` column, such as `Sum`. By default, the entire aggregate function is calculated in TiDB. By enabling the `Distinct` optimization, some operations can be pushed down to TiFlash, thereby improving query performance.

If the aggregate function with the `distinct` operation is slow in the query, you can enable the optimizer to execute the optimization operation of pushing down the aggregate function with `Distinct` (such as `select sum(distinct a) from t`) to Coprocessor by setting the value of the [`tidb_opt_distinct_agg_push_down`](/system-variables.md#tidb_opt_distinct_agg_push_down) variable to `ON`.
If the aggregate function with the `distinct` operation is slow in a query, you can enable the optimization operation of pushing down the aggregate function with `Distinct` (such as `select sum(distinct a) from t`) to Coprocessor by setting the value of the [`tidb_opt_distinct_agg_push_down`](/system-variables.md#tidb_opt_distinct_agg_push_down) variable to `ON`.

```sql
set @@tidb_opt_distinct_agg_push_down = ON;
```
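
To see the effect of this variable, you can compare the execution plans before and after enabling it. The following sketch reuses the `select sum(distinct a) from t` example above and assumes that `t` has a TiFlash replica. The exact operators in the output depend on the TiDB version and the table statistics, so treat it as a way to observe where the aggregation runs rather than as a guaranteed result.

```sql
-- Plan with the optimization disabled: note which task type runs the aggregation.
set @@tidb_opt_distinct_agg_push_down = OFF;
EXPLAIN SELECT SUM(DISTINCT a) FROM t;

-- Plan with the optimization enabled: compare the `task` column with the previous plan.
set @@tidb_opt_distinct_agg_push_down = ON;
EXPLAIN SELECT SUM(DISTINCT a) FROM t;
```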

### Compact data using the `ALTER TABLE ... COMPACT` statement

Executing the [`ALTER TABLE ... COMPACT`](/sql-statements/sql-statement-alter-table-compact.md) statement can initiate compaction for a specific table or partition on a TiFlash node. During the compaction, storage nodes rewrite physical data, including cleaning up deleted rows and merging multiple versions of data caused by updates. This helps enhance read performance and reduce disk usage. The following are examples:
Executing the [`ALTER TABLE ... COMPACT`](/sql-statements/sql-statement-alter-table-compact.md) statement can initiate compaction for a specific table or partition on a TiFlash node. During the compaction, the physical data on the node is rewritten, including cleaning up deleted rows and merging multiple versions of data caused by updates. This helps enhance access performance and reduce disk usage. The following are examples:

```sql
ALTER TABLE employees COMPACT TIFLASH REPLICA;
@@ -68,15 +68,15 @@ ALTER TABLE employees COMPACT PARTITION pNorth, pEast TIFLASH REPLICA;

### Replace Shuffled Hash Join with Broadcast Hash Join

For `Join` operations with small tables, the Broadcast Hash Join algorithm can avoid transmitting large tables, thereby improving the compute performance.
For `Join` operations with small tables, the Broadcast Hash Join algorithm can avoid transferring large tables, thereby improving the computing performance.

- The [`tidb_broadcast_join_threshold_size`](/system-variables.md#tidb_broadcast_join_threshold_size-new-in-v50) variable controls whether to use the Broadcast Hash Join algorithm. If the table size (unit: byte) is smaller than the value of this variable, the Broadcast Hash Join algorithm is used. Otherwise, the Shuffled Hash Join algorithm is used.

```sql
set @@tidb_broadcast_join_threshold_size = 2000000;
```

- The [`tidb_broadcast_join_threshold_count`](/system-variables.md#tidb_broadcast_join_threshold_count-new-in-v50) variable also controls whether to use the Broadcast Hash Join algorithm. If the objects of the join operation belong to a subquery, the optimizer cannot estimate the size of the subquery result set. In this situation, the size is determined by the number of rows in the result set. If the estimated number of rows in the subquery is fewer than the value of this variable, the Broadcast Hash Join algorithm is used. Otherwise, the Shuffled Hash Join algorithm is used.
- The [`tidb_broadcast_join_threshold_count`](/system-variables.md#tidb_broadcast_join_threshold_count-new-in-v50) variable also controls whether to use the Broadcast Hash Join algorithm. If the objects of the join operation belong to a subquery, the optimizer cannot estimate the size of the subquery result set. In this situation, the size is determined by the number of rows in the result set. If the estimated number of rows for the subquery is fewer than the value of this variable, the Broadcast Hash Join algorithm is used. Otherwise, the Shuffled Hash Join algorithm is used.

```sql
set @@tidb_broadcast_join_threshold_count = 100000;