Commit fe46f0b (1 parent: 4019382). Showing 1 changed file with 70 additions and 26 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,58 +1,102 @@ | ||
--- | ||
title: Tune TiFlash Performance | ||
summary: Learn how to tune the performance of TiFlash. | ||
summary: Learn how to tune the performance of TiFlash by planning machine resources and tuning TiDB parameters. | ||
--- | ||
|
||
# Tune TiFlash Performance | ||
|
||
This document introduces how to tune the performance of TiFlash, including planning machine resources and tuning TiDB parameters. | ||
This document describes how to tune TiFlash performance by planning machine resources and tuning TiDB parameters. Following these methods helps your TiFlash cluster perform better. | ||
|
||
## Plan resources | ||
|
||
If you want to save machine resources and do not require isolation, you can deploy TiKV and TiFlash together on the same machines. It is recommended that you reserve enough resources for TiKV and TiFlash respectively, and that they do not share disks. | ||
|
||
## Tune TiDB parameters | ||
|
||
1. For the TiDB node dedicated to OLAP/TiFlash, it is recommended that you increase the value of the [`tidb_distsql_scan_concurrency`](/system-variables.md#tidb_distsql_scan_concurrency) configuration item for this node to `80`: | ||
This section describes how to improve TiFlash performance by tuning TiDB parameters, including: | ||
|
||
```sql | ||
set @@tidb_distsql_scan_concurrency = 80; | ||
``` | ||
- [Forcibly enable the MPP mode](#forcibly-enable-the-mpp-mode) | ||
- [Push down aggregate functions to a position before `Join` or `Union`](#push-down-aggregate-functions-to-a-position-before-join-or-union) | ||
- [Enable `Distinct` optimization](#enable-distinct-optimization) | ||
- [Compact data using the `ALTER TABLE ... COMPACT` statement](#compact-data-using-the-alter-table--compact-statement) | ||
- [Replace Shuffled Hash Join with Broadcast Hash Join](#replace-shuffled-hash-join-with-broadcast-hash-join) | ||
- [Set a greater execution concurrency](#set-a-greater-execution-concurrency) | ||
- [Configure `tiflash_fine_grained_shuffle_stream_count`](#configure-tiflash_fine_grained_shuffle_stream_count) | ||
|
||
2. Enable the super batch feature: | ||
### Forcibly enable the MPP mode | ||
|
||
You can use the [`tidb_allow_batch_cop`](/system-variables.md#tidb_allow_batch_cop-new-in-v40) variable to set whether to merge Region requests when reading from TiFlash. | ||
MPP execution plans can fully utilize distributed computing resources, thereby significantly improving the efficiency of batch data queries. When the optimizer does not generate an MPP execution plan for a query, you can forcibly enable the MPP mode: | ||
|
||
When the number of Regions involved in the query is relatively large, try to set this variable to `1` (effective for coprocessor requests with `aggregation` operators that are pushed down to TiFlash), or set this variable to `2` (effective for all coprocessor requests that are pushed down to TiFlash). | ||
The variable [`tidb_enforce_mpp`](/system-variables.md#tidb_enforce_mpp-new-in-v51) controls whether to ignore the optimizer's cost estimation and to forcibly use TiFlash's MPP mode for query execution. To enable MPP mode forcibly, run the following command: | ||
|
||
```sql | ||
set @@tidb_allow_batch_cop = 1; | ||
``` | ||
```sql | ||
set @@tidb_enforce_mpp = ON; | ||
``` | ||
|
||
3. Enable the optimization of pushing down aggregate functions before TiDB operators such as `JOIN` or `UNION`: | ||
### Push down aggregate functions to a position before `Join` or `Union` | ||
|
||
You can use the [`tidb_opt_agg_push_down`](/system-variables.md#tidb_opt_agg_push_down) variable to control the optimizer to execute this optimization. When the aggregate operations are quite slow in the query, try to set this variable to `1`. | ||
By pushing down aggregate operations to the position before `Join` or `Union`, you can reduce the data to be processed in the `Join` or `Union` operation, thereby improving performance. | ||
|
||
```sql | ||
set @@tidb_opt_agg_push_down = 1; | ||
``` | ||
The variable [`tidb_opt_agg_push_down`](/system-variables.md#tidb_opt_agg_push_down) controls whether the optimizer pushes down aggregate functions to the position before `Join` or `Union`. When aggregate operations in a query are quite slow, you can set this variable to `ON`. | ||
|
||
4. Enable the optimization of pushing down aggregate functions with `Distinct` before TiDB operators such as `JOIN` or `UNION`: | ||
```sql | ||
set @@tidb_opt_agg_push_down = ON; | ||
``` | ||
|
||
You can use the [`tidb_opt_distinct_agg_push_down`](/system-variables.md#tidb_opt_distinct_agg_push_down) variable to control the optimizer to execute this optimization. When the aggregate operations with `Distinct` are quite slow in the query, try to set this variable to `1`. | ||
### Enable `Distinct` optimization | ||
|
||
```sql | ||
set @@tidb_opt_distinct_agg_push_down = 1; | ||
``` | ||
TiFlash does not support some aggregate functions that accept the `Distinct` column, such as `Sum`. By default, the entire aggregate function is calculated in TiDB. By enabling the `Distinct` optimization, some operations can be pushed down to TiFlash, thereby improving query performance. | ||
|
||
If the aggregate function with the `distinct` operation is slow in a query, you can enable the optimization operation of pushing down the aggregate function with `Distinct` (such as `select sum(distinct a) from t`) to Coprocessor by setting the value of the [`tidb_opt_distinct_agg_push_down`](/system-variables.md#tidb_opt_distinct_agg_push_down) variable to `ON`. | ||
|
||
```sql | ||
set @@tidb_opt_distinct_agg_push_down = ON; | ||
``` | ||
|
||
### Compact data using the `ALTER TABLE ... COMPACT` statement | ||
|
||
Executing the [`ALTER TABLE ... COMPACT`](/sql-statements/sql-statement-alter-table-compact.md) statement can initiate compaction for a specific table or partition on a TiFlash node. During the compaction, the physical data on the node is rewritten, including cleaning up deleted rows and merging multiple versions of data caused by updates. This helps enhance access performance and reduce disk usage. The following are examples: | ||
|
||
5. Compact data using the `ALTER TABLE ... COMPACT` statement if necessary: | ||
```sql | ||
ALTER TABLE employees COMPACT TIFLASH REPLICA; | ||
``` | ||
|
||
Executing the [`ALTER TABLE ... COMPACT`](/sql-statements/sql-statement-alter-table-compact.md) statement can initiate compaction for a specific table or partition on a TiFlash node. During the compaction, storage nodes rewrite physical data, including cleaning up deleted rows and merging multiple versions of data caused by updates. This helps enhance read performance and reduce disk usage. The following are examples: | ||
```sql | ||
ALTER TABLE employees COMPACT PARTITION pNorth, pEast TIFLASH REPLICA; | ||
``` | ||
|
||
### Replace Shuffled Hash Join with Broadcast Hash Join | ||
|
||
For `Join` operations with small tables, the Broadcast Hash Join algorithm can avoid transferring large tables, thereby improving the computing performance. | ||
|
||
- The [`tidb_broadcast_join_threshold_size`](/system-variables.md#tidb_broadcast_join_threshold_size-new-in-v50) variable controls whether to use the Broadcast Hash Join algorithm. If the table size (unit: byte) is smaller than the value of this variable, the Broadcast Hash Join algorithm is used. Otherwise, the Shuffled Hash Join algorithm is used. | ||
|
||
```sql | ||
ALTER TABLE employees COMPACT TIFLASH REPLICA; | ||
set @@tidb_broadcast_join_threshold_size = 2000000; | ||
``` | ||
|
||
- The [`tidb_broadcast_join_threshold_count`](/system-variables.md#tidb_broadcast_join_threshold_count-new-in-v50) variable also controls whether to use the Broadcast Hash Join algorithm. If the objects of the join operation belong to a subquery, the optimizer cannot estimate the size of the subquery result set. In this situation, the size is determined by the number of rows in the result set. If the estimated number of rows for the subquery is fewer than the value of this variable, the Broadcast Hash Join algorithm is used. Otherwise, the Shuffled Hash Join algorithm is used. | ||
|
||
```sql | ||
ALTER TABLE employees COMPACT PARTITION pNorth, pEast TIFLASH REPLICA; | ||
set @@tidb_broadcast_join_threshold_count = 100000; | ||
``` | ||
|
||
### Set a greater execution concurrency | ||
|
||
A greater execution concurrency allows TiFlash to occupy more CPU resources of the system, thereby improving query performance. | ||
|
||
The [`tidb_max_tiflash_threads`](/system-variables.md#tidb_max_tiflash_threads-new-in-v610) variable is used to set the maximum concurrency for TiFlash to execute a request. The unit is threads. | ||
|
||
```sql | ||
set @@tidb_max_tiflash_threads = 20; | ||
``` | ||
|
||
### Configure `tiflash_fine_grained_shuffle_stream_count` | ||
|
||
You can increase the concurrency for executing window functions by configuring [`tiflash_fine_grained_shuffle_stream_count`](/system-variables.md#tiflash_fine_grained_shuffle_stream_count-new-in-v620) of the Fine Grained Shuffle feature. In this way, the execution of window functions can occupy more system resources, which improves query performance. | ||
|
||
When a window function is pushed down to TiFlash for execution, you can use this variable to control the concurrency level of the window function execution. The unit is threads. | ||
|
||
```sql | ||
set @@tiflash_fine_grained_shuffle_stream_count = 20; | ||
``` |
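
The changed document explains how to force MPP mode but not how to confirm that a query actually runs in it. The following is a minimal sketch of one way to check, assuming a hypothetical table `t` that already has a TiFlash replica. The table name and query are illustrative and are not part of this commit.

```sql
-- Assumption: `t` is a hypothetical table with a TiFlash replica.
-- MPP must be allowed in the session for enforcement to take effect.
SET @@session.tidb_allow_mpp = ON;
SET @@session.tidb_enforce_mpp = ON;

-- Inspect the plan. An MPP plan contains ExchangeSender and
-- ExchangeReceiver operators whose task type is mpp[tiflash].
EXPLAIN SELECT COUNT(*) FROM t;

-- If the plan is still not an MPP plan, the warnings usually
-- describe why MPP mode could not be selected.
SHOW WARNINGS;
```

Setting the variables at session scope keeps the change limited to the current connection, which is convenient when testing the effect on a single query.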