glossary: add some terms to glossary (#14298)
ran-huang authored Feb 11, 2025
1 parent 7165179 commit 513a2b2
Showing 1 changed file with 105 additions and 9 deletions.
114 changes: 105 additions & 9 deletions glossary.md
@@ -42,18 +42,30 @@ Baseline Capturing captures queries that meet capturing conditions and create bi

### Batch Create Table

Batch Create Table is a feature introduced in TiDB v6.0.0. This feature is enabled by default. When restoring data with a large number of tables (nearly 50000) using BR (Backup & Restore), the feature can greatly speed up the restore process by creating tables in batches. For details, see [Batch Create Table](/br/br-batch-create-table.md).
The Batch Create Table feature greatly speeds up the creation of multiple tables at a time by creating tables in batches. For example, when restoring thousands of tables using the [Backup & Restore (BR)](/br/backup-and-restore-overview.md) tool, this feature helps reduce the overall recovery time. For more information, see [Batch Create Table](/br/br-batch-create-table.md).

### Bucket

A [Region](#regionpeerraft-group) is logically divided into several small ranges called bucket. TiKV collects query statistics by buckets and reports the bucket status to PD. For details, see the [Bucket design doc](https://github.com/tikv/rfcs/blob/master/text/0082-dynamic-size-region.md#bucket).
A [Region](#regionpeerraft-group) is logically divided into several small ranges called buckets. TiKV collects query statistics by bucket and reports the bucket status to PD. For more information, see the [Bucket design doc](https://github.com/tikv/rfcs/blob/master/text/0082-dynamic-size-region.md#bucket).

## C

### Cached Table

With the cached table feature, TiDB loads the data of an entire table into the memory of the TiDB server, and TiDB directly gets the table data from the memory without accessing TiKV, which improves the read performance.

### Cluster

A cluster is a group of nodes that work together to provide services. By using clusters in a distributed system, TiDB achieves higher availability and greater scalability compared to a single-node setup.

In the distributed architecture of the TiDB database:

- TiDB nodes provide a scalable SQL layer for client interactions.
- PD nodes provide a resilient metadata layer for TiDB.
- TiKV nodes, using the Raft protocol, provide highly available, scalable, and resilient storage for TiDB.

For more information, see [TiDB Architecture](/tidb-architecture.md).

### Coalesce Partition

Coalesce Partition is a way of decreasing the number of partitions in a Hash or Key partitioned table. For more information, see [Manage Hash and Key partitions](/partitioned-table.md#manage-hash-and-key-partitions).
@@ -64,14 +76,24 @@ In RocksDB and TiKV, a Column Family (CF) represents a logical grouping of key-v

### Common Table Expression (CTE)

A Common Table Expression (CTE) enables you to define a temporary result set that can be referred to multiple times within a SQL statement using the [`WITH`](/sql-statements/sql-statement-with.md) clause. For more information, see [Common Table Expression](/develop/dev-guide-use-common-table-expression.md).
A Common Table Expression (CTE) enables you to define a temporary result set that can be referred to multiple times within a SQL statement using the [`WITH`](/sql-statements/sql-statement-with.md) clause, which improves the statement readability and execution efficiency. For more information, see [Common Table Expression](/develop/dev-guide-use-common-table-expression.md).
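
To illustrate referring to one result set several times within a statement, here is a minimal sketch. It runs against SQLite through Python's built-in `sqlite3` module purely as a convenient stand-in, not against TiDB, and the `orders` table and its rows are hypothetical.

```python
import sqlite3

# SQLite is only a stand-in here; the WITH clause itself is standard SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        ('alice', 30), ('alice', 70), ('bob', 20), ('carol', 90);
""")

# The CTE `totals` is defined once and referenced twice:
# in the outer FROM clause and in the subquery that computes the average.
query = """
    WITH totals AS (
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
    )
    SELECT customer, total
    FROM totals
    WHERE total > (SELECT AVG(total) FROM totals)
    ORDER BY customer
"""
above_average = conn.execute(query).fetchall()
print(above_average)
```

Without the CTE, the aggregation over `orders` would have to be written out twice, once for each place where `totals` is used.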

### Continuous Profiling

Introduced in TiDB 5.3.0, Continuous Profiling is a way to observe resource overhead at the system call level. With the support of Continuous Profiling, TiDB provides performance insight as clear as directly looking into the database source code, and helps R&D and operation and maintenance personnel to locate the root cause of performance problems using a flame graph. For details, see [TiDB Dashboard Instance Profiling - Continuous Profiling](/dashboard/continuous-profiling.md).
Continuous Profiling is a way to observe resource overhead at the system call level. With Continuous Profiling, TiDB provides fine-grained observations of performance issues, helping operations teams identify the root cause using a flame graph. For more information, see [TiDB Dashboard Instance Profiling - Continuous Profiling](/dashboard/continuous-profiling.md).

### Coprocessor

Coprocessor is a coprocessing mechanism that shares the computation workload with TiDB. It is located in the storage layer (TiKV or TiFlash) and collaboratively processes computations [pushed down](/functions-and-operators/expressions-pushed-down.md) from TiDB on a per-Region basis.

## D

### Dumpling

Dumpling is a data export tool for exporting data stored in TiDB, MySQL, or MariaDB as SQL or CSV data files. It can also be used for logical full backups or exports. Additionally, Dumpling supports exporting data to Amazon S3.

For more information, see [Use Dumpling to Export Data](/dumpling-overview.md).

### Data Definition Language (DDL)

Data Definition Language (DDL) is a part of the SQL standard that deals with creating, modifying, and dropping tables and other objects. For more information, see [DDL Introduction](/ddl-introduction.md).
@@ -100,6 +122,14 @@ Distributed eXecution Framework (DXF) is the framework used by TiDB to centrally

Dynamic pruning mode is one of the modes in which TiDB accesses partitioned tables. In dynamic pruning mode, each operator supports direct access to multiple partitions, so TiDB no longer uses Union. Omitting the Union operation can improve execution efficiency and avoid the problems caused by concurrent Union execution.

## E

### Expression index

The expression index is a special type of index created on an expression. Once an expression index is created, TiDB can use this index for expression-based queries, significantly improving query performance.

For more information, see [CREATE INDEX - Expression index](/sql-statements/sql-statement-create-index.md#expression-index).
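
The idea of indexing an expression rather than a raw column can be sketched as follows. The example uses SQLite through Python's `sqlite3` module as a stand-in, so the exact `CREATE INDEX` syntax shown is SQLite's and may differ from TiDB's (see the linked page for the TiDB form). The `users` table and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users (name) VALUES ('Alice'), ('BOB'), ('carol');
    -- The index is built on the expression LOWER(name), not on the column itself.
    CREATE INDEX idx_users_lower_name ON users (LOWER(name));
""")

# A query that filters on the same expression is a candidate for the index.
rows = conn.execute(
    "SELECT id, name FROM users WHERE LOWER(name) = 'bob'"
).fetchall()

# List the indexes defined on the table (column 1 of the pragma row is the name).
index_names = [row[1] for row in conn.execute("PRAGMA index_list('users')")]
print(rows, index_names)
```

The query must use the same expression as the index definition for the optimizer to consider the index at all.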

## G

### Garbage Collection (GC)
@@ -116,9 +146,13 @@ Global Transaction Identifiers (GTIDs) are unique transaction IDs used in MySQL

## H

### Hotspot

Hotspot refers to a situation where the read and write workloads in TiKV are concentrated on one or a few Regions or nodes. This can lead to performance bottlenecks, preventing optimal system performance. To solve hotspot issues, see [Troubleshoot Hotspot Issues](/troubleshoot-hot-spot-issues.md).

### Hybrid Transactional and Analytical Processing (HTAP)

Hybrid Transactional and Analytical Processing (HTAP) is a database feature that enables both OLTP (Online Transactional Processing) and OLAP (Online Analytical Processing) workloads within the same database. For TiDB, the HTAP feature is provided by using TiKV for row storage and TiFlash for columnar storage. For more information, see [the definition of HTAP on the Gartner website](https://www.gartner.com/en/information-technology/glossary/htap-enabling-memory-computing-technologies).
Hybrid Transactional and Analytical Processing (HTAP) is a database feature that enables both OLTP (Online Transactional Processing) and OLAP (Online Analytical Processing) workloads within the same database. For TiDB, the HTAP feature is provided by using TiKV for row storage and TiFlash for columnar storage. For more information, see [Quick Start with TiDB HTAP](/quick-start-with-htap.md) and [Explore HTAP](/explore-htap.md).

## I

@@ -150,6 +184,12 @@ Leader/Follower/Learner each corresponds to a role in a Raft group of [peers](#r

Lightweight Directory Access Protocol (LDAP) is a standardized way of accessing a directory with information. It is commonly used for account and user data management. TiDB supports LDAP via [LDAP authentication plugins](/security-compatibility-with-mysql.md#authentication-plugin-status).

### Lock View

The Lock View feature provides more information about lock conflicts and lock waits in pessimistic locking, making it convenient for DBAs to observe transaction locking situations and troubleshoot deadlock issues.

For more information, see system table documentation: [`TIDB_TRX`](/information-schema/information-schema-tidb-trx.md), [`DATA_LOCK_WAITS`](/information-schema/information-schema-data-lock-waits.md), and [`DEADLOCKS`](/information-schema/information-schema-deadlocks.md).

### Long Term Support (LTS)

Long Term Support (LTS) refers to software versions that are extensively tested and maintained for extended periods. For more information, see [TiDB Versioning](/releases/versioning.md).
@@ -201,12 +241,22 @@ Currently, available steps generated by PD include:
- `PromoteLearner`: Promotes a specified learner to a voting member
- `SplitRegion`: Splits a specified Region into two

### Optimistic transaction

Optimistic transactions use optimistic concurrency control, which assumes that conflicts between concurrent transactions are rare. In optimistic transaction mode, TiDB checks for conflicts only when the transaction is finally committed. This mode suits read-heavy, write-light concurrent scenarios with few conflicts, where it can improve the performance of TiDB.

For more information, see [TiDB Optimistic Transaction Model](/optimistic-transaction.md).
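
Commit-time conflict checking can be sketched with a toy in-memory store. This is a generic illustration of optimistic concurrency control, not TiDB's actual implementation; the `Store`, `Transaction`, and `ConflictError` names are hypothetical.

```python
class ConflictError(Exception):
    """Raised when commit-time validation finds a conflicting write."""


class Store:
    """Toy key-value store that remembers when each key was last committed."""

    def __init__(self):
        self.data = {}
        self.last_commit = {}  # key -> logical time of the latest committed write
        self.clock = 0         # logical clock, advanced on every commit

    def begin(self):
        return Transaction(self, start_ts=self.clock)


class Transaction:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}  # buffered locally, invisible to others until commit

    def put(self, key, value):
        # No lock is taken and no conflict check happens at write time.
        self.writes[key] = value

    def commit(self):
        # Conflicts are detected only here: fail if another transaction
        # committed a write to one of our keys after we started.
        for key in self.writes:
            if self.store.last_commit.get(key, -1) > self.start_ts:
                raise ConflictError(f"write conflict on key {key!r}")
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.last_commit[key] = self.store.clock


store = Store()
t1, t2 = store.begin(), store.begin()  # two concurrent transactions
t1.put("balance", 100)
t2.put("balance", 200)
t1.commit()                            # first committer succeeds
try:
    t2.commit()
    outcome = "committed"
except ConflictError:
    outcome = "conflict"               # second committer must retry
print(outcome, store.data)
```

Because the losing transaction only learns about the conflict at commit, the application (or the database) has to retry it, which is why this approach works best when conflicts are uncommon.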

## P

### Partitioning

[Partitioning](/partitioned-table.md) refers to physically dividing a table into smaller table partitions, which can be done by partition methods such as RANGE, LIST, HASH, and KEY partitioning.

### PD Control (pd-ctl)

PD Control (pd-ctl) is a command-line tool used to interact with the Placement Driver (PD) in the TiDB cluster. You can use it to obtain cluster status information and modify the cluster configuration. For more information, see [PD Control User Guide](/pd-control.md).

### Pending/Down

"Pending" and "down" are two special states of a peer. "Pending" indicates that the Raft log of a follower or learner is vastly different from that of the leader. Followers in the pending state cannot be elected as leader. "Down" refers to a state in which a peer has not responded to the leader for a long time, which usually means that the corresponding node is down or isolated from the network.
@@ -215,6 +265,12 @@ Currently, available steps generated by PD include:

Placement Driver (PD) is a core component in the [TiDB Architecture](/tidb-architecture.md#placement-driver-pd-server) responsible for storing metadata, assigning [Timestamp Oracle (TSO)](/tso.md) for transaction timestamps, orchestrating data placement on TiKV, and running [TiDB Dashboard](/dashboard/dashboard-overview.md). For more information, see [TiDB Scheduling](/tidb-scheduling.md).

### Placement Rules

Placement rules are used to configure the placement of data in a TiKV cluster. With this feature, you can specify the deployment of tables and partitions to different regions, data centers, cabinets, or hosts. Use cases include optimizing data availability strategies at low cost, ensuring that local data replicas are available for local stale reads, and meeting local data compliance requirements.

For more information, see [Placement Rules in SQL](/placement-rules-in-sql.md).

### Point Get

Point get means reading a single row of data by a unique index or primary index. The returned result set contains at most one row.
@@ -225,7 +281,7 @@ Point in Time Recovery (PITR) enables you to restore data to a specific point in

### Predicate columns

In most cases, when executing SQL statements, the optimizer only uses statistics of some columns (such as columns in the `WHERE`, `JOIN`, `ORDER BY`, and `GROUP BY` statements). These used columns are called predicate columns. For details, see [Collect statistics on some columns](/statistics.md#collect-statistics-on-some-columns).
In most cases, when executing SQL statements, the optimizer only uses statistics of some columns (such as columns in the `WHERE`, `JOIN`, `ORDER BY`, and `GROUP BY` statements). These used columns are called predicate columns. For more information, see [Collect statistics on some columns](/statistics.md#collect-statistics-on-some-columns).

## Q

@@ -241,7 +297,7 @@ Quota Limiter is an experimental feature introduced in TiDB v6.0.0. If the machi

### Raft Engine

Raft Engine is an embedded persistent storage engine with a log-structured design. It is built for TiKV to store multi-Raft logs. Since v5.4, TiDB supports using Raft Engine as the log storage engine. For details, see [Raft Engine](/tikv-configuration-file.md#raft-engine).
Raft Engine is an embedded persistent storage engine with a log-structured design. It is built for TiKV to store multi-Raft logs. Since v5.4, TiDB supports using Raft Engine as the log storage engine. For more information, see [Raft Engine](/tikv-configuration-file.md#raft-engine).

### Region Split

@@ -265,6 +321,10 @@ Request Unit (RU) is a unified abstraction unit for the resource usage in TiDB.

Restore is the reverse of the backup operation. It is the process of bringing back the system to an earlier state by retrieving data from a prepared backup.

### RocksDB

[RocksDB](https://rocksdb.org/) is an LSM-tree structured engine that provides key-value storage and read-write functionality. It was developed by Facebook and is based on LevelDB. RocksDB is the core storage engine of TiKV.

## S

### Scheduler
@@ -276,6 +336,18 @@ Schedulers are components in PD that generate scheduling tasks. Each scheduler i
- `hot-region-scheduler`: Balances the distribution of hot Regions
- `evict-leader-{store-id}`: Evicts all leaders of a node (often used for rolling upgrades)

### Security Enhanced Mode (SEM)

The Security Enhanced Mode (SEM) is used for finer-grained permission control of TiDB administrators. Inspired by systems such as [Security-Enhanced Linux](https://en.wikipedia.org/wiki/Security-Enhanced_Linux), SEM reduces the abilities of users with the `SUPER` privilege and instead requires `RESTRICTED` fine-grained privileges, which must be explicitly granted to control specific administrative actions.

For more information, see [System Variables documentation - `tidb_enable_enhanced_security`](/system-variables.md#tidb_enable_enhanced_security).

### Stale Read

Stale Read is a mechanism that TiDB uses to read historical versions of data stored in TiDB. Using this mechanism, you can read the historical data as of a specific point in time or within a specified time range, and thus avoid the latency caused by data replication between storage nodes. When you use Stale Read, TiDB randomly selects a replica for data reading, which means that all replicas are available for data reading.

For more information, see [Stale Read](/stale-read.md).
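
Reading a historical version as of a point in time can be sketched with a toy multi-version record. This is a generic illustration of looking up the newest version at or before a timestamp, not TiDB's storage format; the `VersionedKey` class and the timestamps are hypothetical.

```python
import bisect


class VersionedKey:
    """Toy multi-version record: every write is kept with its commit timestamp."""

    def __init__(self):
        self.commit_ts = []  # ascending commit timestamps
        self.values = []     # values[i] was committed at commit_ts[i]

    def write(self, ts, value):
        # This sketch assumes writes arrive in increasing timestamp order.
        self.commit_ts.append(ts)
        self.values.append(value)

    def read_at(self, ts):
        # Return the newest version committed at or before `ts`, if any.
        i = bisect.bisect_right(self.commit_ts, ts)
        return self.values[i - 1] if i else None


row = VersionedKey()
row.write(10, "v1")
row.write(20, "v2")
row.write(30, "v3")

stale = row.read_at(25)   # sees the version committed at ts=20
latest = row.read_at(30)  # sees the newest version
before = row.read_at(5)   # nothing had been committed yet
print(stale, latest, before)
```

Any replica that already holds the versions up to the requested timestamp can answer such a read, which is what makes reading from non-leader replicas possible.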

### Static Sorted Table / Sorted String Table (SST)

Static Sorted Table or Sorted String Table is a file storage format used in RocksDB (a storage engine used by [TiKV](/storage-engine/rocksdb-overview.md)).
@@ -286,13 +358,37 @@ A store refers to the storage node in the TiKV cluster (an instance of `tikv-ser

## T

### Temporary table

Temporary tables enable you to store intermediate calculation results temporarily, eliminating the need to create and drop tables repeatedly. Once the data is no longer needed, TiDB automatically cleans up and recycles the temporary tables. This feature simplifies application logic, reduces table management overhead, and improves performance.

For more information, see [Temporary Tables](/temporary-tables.md).
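
The general idea of a session-scoped table that is cleaned up automatically can be sketched as follows. The example uses SQLite through Python's `sqlite3` module as a stand-in, so it shows SQLite's temporary-table behavior rather than TiDB's; the file name, the `sales` table, and the `region_totals` table are hypothetical.

```python
import os
import sqlite3
import tempfile

# Two connections to the same database file play the role of two sessions.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
session_a = sqlite3.connect(path)
session_b = sqlite3.connect(path)

session_a.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('east', 10), ('east', 15), ('west', 7);
    -- Intermediate result kept in a temporary table owned by session A.
    CREATE TEMPORARY TABLE region_totals AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;
""")

totals = session_a.execute(
    "SELECT region, total FROM region_totals ORDER BY region"
).fetchall()

# The temporary table is not visible from the other session.
try:
    session_b.execute("SELECT * FROM region_totals")
    visible_in_b = True
except sqlite3.OperationalError:
    visible_in_b = False

# Closing the owning session discards the temporary table; no DROP is needed.
session_a.close()
session_c = sqlite3.connect(path)
leftover = session_c.execute(
    "SELECT name FROM sqlite_master WHERE name = 'region_totals'"
).fetchall()
print(totals, visible_in_b, leftover)
```

The permanent `sales` table survives the session, while the intermediate `region_totals` table disappears without any explicit cleanup.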

### TiCDC

[TiCDC](/ticdc/ticdc-overview.md) is a tool that enables incremental data replication from TiDB to various downstream targets. These downstream targets can include other TiDB instances, MySQL-compatible databases, storage services, and streaming processors (such as Kafka and Pulsar). TiCDC pulls the data change logs from the upstream TiKV, parses them into ordered row-level change data, and then outputs the data to the downstream. For more information about the concepts and terms of TiCDC, see [TiCDC Glossary](/ticdc/ticdc-glossary.md).

### TiDB Lightning

[TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) is a tool for importing Terabyte-level data from static files into TiDB clusters. It is commonly used for the initial data import into TiDB clusters.

For more information on the concepts and terminology of TiDB Lightning, see [TiDB Lightning Glossary](/tidb-lightning/tidb-lightning-glossary.md).

### TiFlash

[TiFlash](/tiflash/tiflash-overview.md) is a key component of TiDB's HTAP architecture. It is a columnar extension of TiKV that provides both strong consistency and good isolation. TiFlash maintains columnar replicas by asynchronously replicating data from TiKV using the **Raft Learner protocol**. For reads, it leverages the **Raft consensus index** and **MVCC (Multi-Version Concurrency Control)** to achieve **Snapshot Isolation** consistency. This architecture effectively addresses isolation and synchronization challenges in HTAP workloads, enabling efficient analytical queries while maintaining real-time data consistency.

### Timestamp Oracle (TSO)

Because TiKV is a distributed storage system, it requires a global timing service, Timestamp Oracle (TSO), to assign a monotonically increasing timestamp. In TiKV, such a feature is provided by PD, and in Google [Spanner](http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf), this feature is provided by multiple atomic clocks and GPS. For details, see [TSO](/tso.md).
Because TiKV is a distributed storage system, it requires a global timing service, Timestamp Oracle (TSO), to assign a monotonically increasing timestamp. In TiKV, such a feature is provided by PD, and in Google [Spanner](http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf), this feature is provided by multiple atomic clocks and GPS. For more information, see [TSO](/tso.md).
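
A monotonically increasing timestamp service can be sketched as a physical clock combined with a logical counter. This is a simplified single-process illustration, not PD's implementation; the `TimestampOracle` class, the 18-bit width of the logical part, and the injected clock are assumptions of this sketch.

```python
import threading
import time

LOGICAL_BITS = 18  # assumed width of the logical counter in this sketch


class TimestampOracle:
    """Hands out strictly increasing integers built from (physical ms, logical)."""

    def __init__(self, clock=lambda: int(time.time() * 1000)):
        self._clock = clock          # returns the current time in milliseconds
        self._lock = threading.Lock()
        self._physical = 0
        self._logical = 0

    def next_ts(self):
        with self._lock:
            now = self._clock()
            if now > self._physical:
                # The clock moved forward: restart the logical counter.
                self._physical, self._logical = now, 0
            else:
                # Same millisecond, or the clock went backwards:
                # keep the old physical part and bump the logical counter.
                self._logical += 1
                if self._logical >= (1 << LOGICAL_BITS):
                    self._physical += 1
                    self._logical = 0
            return (self._physical << LOGICAL_BITS) | self._logical


# A scripted clock that repeats a value and then steps backwards.
ticks = iter([1000, 1000, 999, 1001])
oracle = TimestampOracle(clock=lambda: next(ticks))
timestamps = [oracle.next_ts() for _ in range(4)]
print(timestamps)
```

Even when the scripted clock repeats or goes backwards, each returned timestamp is larger than the previous one, which is the property that transaction ordering relies on.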

### TiUP

[TiUP](/tiup/tiup-overview.md) is a management tool used for deploying, upgrading, and managing TiDB clusters, as well as managing various components within the TiDB cluster including TiDB, PD, and TiKV. With TiUP, you can easily run any component within TiDB by executing a single command, greatly simplifying the management process.

### Top SQL

Top SQL helps locate SQL queries that contribute to a high load of a TiDB or TiKV node in a specified time range. For details, see [Top SQL user document](/dashboard/top-sql.md).
Top SQL helps locate SQL queries that contribute to a high load of a TiDB or TiKV node in a specified time range. For more information, see [Top SQL user document](/dashboard/top-sql.md).

### Transactions Per Second (TPS)
