From ca331fb386d4cb9ce7a88420704f65b65b784c4b Mon Sep 17 00:00:00 2001
From: Morgan Tocker
Date: Tue, 16 Mar 2021 23:54:55 -0600
Subject: [PATCH] Update garbage collection docs (#4552)

---
 alert-rules.md | 4 +-
 ...up-and-restore-using-dumpling-lightning.md | 18 +--
 br/backup-and-restore-tool.md | 25 +--
 br/backup-and-restore-use-cases.md | 24 +--
 br/use-br-command-line-tool.md | 13 --
 dumpling-overview.md | 4 +-
 error-codes.md | 16 +-
 faq/migration-tidb-faq.md | 12 +-
 faq/sql-faq.md | 8 +-
 faq/tidb-faq.md | 4 +-
 garbage-collection-configuration.md | 144 +++--------------
 garbage-collection-overview.md | 6 +-
 read-historical-data.md | 16 +-
 .../sql-statement-flashback-table.md | 6 +-
 system-variables.md | 40 +++++
 tidb-troubleshooting-map.md | 8 +-
 16 files changed, 108 insertions(+), 240 deletions(-)

diff --git a/alert-rules.md b/alert-rules.md
index c42f34cffbf59..8550522d833d1 100644
--- a/alert-rules.md
+++ b/alert-rules.md
@@ -437,7 +437,7 @@ This section gives the alert rules for the TiKV component.

    * Solution:

-        1. Perform `select VARIABLE_VALUE from mysql.tidb where VARIABLE_NAME = "tikv_gc_leader_desc"` to locate the `tidb-server` corresponding to the GC leader;
+        1. Perform `SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME = "tikv_gc_leader_desc"` to locate the `tidb-server` corresponding to the GC leader;
         2. View the log of the `tidb-server`, and grep gc_worker tidb.log;
         3. If you find that the GC worker has been resolving locks (the last log is "start resolve locks") or deleting ranges (the last log is “start delete {number} ranges”) during this time, it means the GC process is running normally. Otherwise, contact [support@pingcap.com](mailto:support@pingcap.com) to resolve this issue.

@@ -623,7 +623,7 @@ This section gives the alert rules for the TiKV component.

    * Solution:

         1. It is normally because the GC concurrency is set too high. You can moderately lower the GC concurrency degree, and you need to first confirm that the failed GC is caused by the busy server.
-        2. You can moderately lower the concurrency degree by running `update set VARIABLE_VALUE="{number}” where VARIABLE_NAME=”tikv_gc_concurrency”`.
+        2. You can moderately lower the concurrency degree by adjusting [`tidb_gc_concurrency`](/system-variables.md#tidb_gc_concurrency).
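As a hedged illustration of the new approach (the value `3` below is only an example; per the FAQ guidance removed elsewhere in this patch, keep the value at or below roughly 50% of the number of TiKV instances):

{{< copyable "sql" >}}

```sql
-- Check the current GC concurrency; the default -1 lets TiDB decide automatically.
SHOW GLOBAL VARIABLES LIKE 'tidb_gc_concurrency';

-- Moderately lower the concurrency degree; 3 is an illustrative value.
SET GLOBAL tidb_gc_concurrency = 3;
```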
### Warning-level alerts

diff --git a/backup-and-restore-using-dumpling-lightning.md b/backup-and-restore-using-dumpling-lightning.md
index 31e8bff36b924..abb685e3ca38c 100644
--- a/backup-and-restore-using-dumpling-lightning.md
+++ b/backup-and-restore-using-dumpling-lightning.md
@@ -59,22 +59,22 @@ The steps to manually modify the GC time are as follows:

    {{< copyable "sql" >}}

    ```sql
-    SELECT * FROM mysql.tidb WHERE VARIABLE_NAME = 'tikv_gc_life_time';
+    SHOW GLOBAL VARIABLES LIKE 'tidb_gc_life_time';
    ```

    ```sql
-    +-----------------------+------------------------------------------------------------------------------------------------+
-    | VARIABLE_NAME | VARIABLE_VALUE |
-    +-----------------------+------------------------------------------------------------------------------------------------+
-    | tikv_gc_life_time | 10m0s |
-    +-----------------------+------------------------------------------------------------------------------------------------+
-    1 rows in set (0.02 sec)
+    +-------------------+-------+
+    | Variable_name | Value |
+    +-------------------+-------+
+    | tidb_gc_life_time | 10m0s |
+    +-------------------+-------+
+    1 row in set (0.03 sec)
    ```

    {{< copyable "sql" >}}

    ```sql
-    UPDATE mysql.tidb SET VARIABLE_VALUE = '720h' WHERE VARIABLE_NAME = 'tikv_gc_life_time';
+    SET GLOBAL tidb_gc_life_time = '720h';
    ```

2. After executing the `dumpling` command, restore the GC value of the TiDB cluster to the initial value in step 1:

    {{< copyable "sql" >}}

    ```sql
-    UPDATE mysql.tidb SET VARIABLE_VALUE = '10m' WHERE VARIABLE_NAME = 'tikv_gc_life_time';
+    SET GLOBAL tidb_gc_life_time = '10m';
    ```

## Restore data into TiDB

diff --git a/br/backup-and-restore-tool.md b/br/backup-and-restore-tool.md
index e47c66b88ad61..edac616c28e10 100644
--- a/br/backup-and-restore-tool.md
+++ b/br/backup-and-restore-tool.md
@@ -6,11 +6,11 @@ aliases: ['/docs/dev/br/backup-and-restore-tool/','/docs/dev/reference/tools/br/

# BR Tool Overview

-[BR](http://github.com/pingcap/br) (Backup & Restore) is a command-line tool for distributed backup and restoration of the TiDB cluster data. It is supported to use BR only in TiDB v3.1 and later versions.
+[BR](http://github.com/pingcap/br) (Backup & Restore) is a command-line tool for distributed backup and restoration of the TiDB cluster data. Compared with [`dumpling`](/backup-and-restore-using-dumpling-lightning.md), BR is more suitable for scenarios with large volumes of data.

-This document describes BR's implementation principles, recommended deployment configuration, usage restrictions, several methods to use BR, etc.
+This document describes BR's implementation principles, recommended deployment configuration, usage restrictions, and several methods to use BR.

## Implementation principles

@@ -113,21 +113,15 @@ After the restoration operation is completed, BR performs a checksum calculation

>
> - If you do not mount a network disk or use other shared storage, the data backed up by BR will be generated on each TiKV node. Because BR only backs up leader replicas, you should estimate the space reserved for each node based on the leader size.
>
-> - Meanwhile, because TiDB v4.0 uses leader count for load balancing by default, leaders are greatly different in size, resulting in uneven distribution of backup data on each node.
+> - Because TiDB uses leader count for load balancing by default, leaders can greatly differ in size. This might result in uneven distribution of backup data on each node.
### Usage restrictions The following are the limitations of using BR for backup and restoration: -- It is supported to use BR only in TiDB v3.1 and later versions. - When BR restores data to the upstream cluster of TiCDC/Drainer, TiCDC/Drainer cannot replicate the restored data to the downstream. - BR supports operations only between clusters with the same [`new_collations_enabled_on_first_bootstrap`](/character-set-and-collation.md#collation-support-framework) value because BR only backs up KV data. If the cluster to be backed up and the cluster to be restored use different collations, the data validation fails. Therefore, before restoring a cluster, make sure that the switch value from the query result of the `select VARIABLE_VALUE from mysql.tidb where VARIABLE_NAME='new_collation_enabled';` statement is consistent with that during the backup process. - - For v3.1 clusters, the new collation framework is not supported, so you can see it as disabled. - - For v4.0 clusters, check whether the new collation is enabled by executing `SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME='new_collation_enabled';`. - - For example, assume that data is backed up from a v3.1 cluster and will be restored to a v4.0 cluster. The `new_collation_enabled` value of the v4.0 cluster is `true`, which means that the new collation is enabled in the cluster to be restored when this cluster is created. If you perform the restore in this situation, an error might occur. - ### Minimum machine configuration required for running BR The minimum machine configuration required for running BR is as follows: @@ -159,20 +153,11 @@ Currently, the following methods are supported to run the BR tool: #### Use SQL statements -In TiDB v4.0.2 and later versions, you can run the BR tool using SQL statements. - -For detailed operations, see the following documents: - -- [Backup syntax](/sql-statements/sql-statement-backup.md#backup) -- [Restore syntax](/sql-statements/sql-statement-restore.md#restore) +TiDB supports both [`BACKUP`](/sql-statements/sql-statement-backup.md#backup) and [`RESTORE`](/sql-statements/sql-statement-restore.md#restore) SQL statements. The progress of these operations can be monitored with the statement [`SHOW BACKUPS|RESTORES`](/sql-statements/sql-statement-show-backups.md). #### Use the command-line tool -In TiDB versions above v3.1, you can run the BR tool using the command-line tool. - -First, you need to download the binary file of the BR tool. See [download link](/download-ecosystem-tools.md#br-backup-and-restore). - -For how to use the command-line tool to perform backup and restore operations, see [Use the BR command-line tool](/br/use-br-command-line-tool.md). +The `br` command-line utility is available as a [separate download](/download-ecosystem-tools.md#br-backup-and-restore). For details, see [Use BR Command-line for Backup and Restoration](/br/use-br-command-line-tool.md). #### In the Kubernetes environment diff --git a/br/backup-and-restore-use-cases.md b/br/backup-and-restore-use-cases.md index 6087457d804dc..8a8372a7fc6f2 100644 --- a/br/backup-and-restore-use-cases.md +++ b/br/backup-and-restore-use-cases.md @@ -24,7 +24,7 @@ This document aims to help you achieve the following goals: ## Audience -You are expected to have a basic understanding of [TiDB](https://docs.pingcap.com/tidb/v4.0) and [TiKV](https://tikv.org/). +You are expected to have a basic understanding of TiDB and [TiKV](https://tikv.org/). 
Before reading on, make sure you have read [BR Tool Overview](/br/backup-and-restore-tool.md), especially [Usage Restrictions](/br/backup-and-restore-tool.md#usage-restrictions) and [Best Practices](/br/backup-and-restore-tool.md#best-practices). @@ -82,26 +82,10 @@ Before the backup or restoration operations, you need to do some preparations: ### Preparation for backup -In TiDB v4.0.8 and later versions, BR supports the self-adaptive Garbage Collection (GC). So to avoid manually configuring GC, you only need to register `backupTS` in `safePoint` in PD and make sure that `safePoint` does not move forward during the backup process. +For the detailed usage of the `br backup` command, refer to [Use BR Command-line for Backup and Restoration](/br/use-br-command-line-tool.md). -In TiDB v4.0.7 and earlier versions, you need to manually configure GC before and after the BR backup through the following steps: - -1. Before executing the [`br backup` command](/br/use-br-command-line-tool.md#br-command-line-description), check the value of the [`tikv_gc_life_time`](/garbage-collection-configuration.md#tikv_gc_life_time) configuration item, and adjust the value appropriately in the MySQL client to make sure that GC does not run during the backup operation. - - {{< copyable "sql" >}} - - ```sql - SELECT * FROM mysql.tidb WHERE VARIABLE_NAME = 'tikv_gc_life_time'; - UPDATE mysql.tidb SET VARIABLE_VALUE = '720h' WHERE VARIABLE_NAME = 'tikv_gc_life_time'; - ``` - -2. After the backup operation, set the parameter back to the original value. - - {{< copyable "sql" >}} - - ```sql - UPDATE mysql.tidb SET VARIABLE_VALUE = '10m' WHERE VARIABLE_NAME = 'tikv_gc_life_time'; - ``` +1. Before executing the `br backup` command, ensure that no DDL is running on the TiDB cluster. +2. Ensure that the storage device where the backup will be created has sufficient space. ### Preparation for restoration diff --git a/br/use-br-command-line-tool.md b/br/use-br-command-line-tool.md index f35317be1af33..4194e3ccfafcf 100644 --- a/br/use-br-command-line-tool.md +++ b/br/use-br-command-line-tool.md @@ -71,19 +71,6 @@ Each of the above three sub-commands might still include the following three sub To back up the cluster data, use the `br backup` command. You can add the `full` or `table` sub-command to specify the scope of your backup operation: the whole cluster or a single table. -If the BR version is earlier than v4.0.8, and the backup duration might exceed the [`tikv_gc_life_time`](/garbage-collection-configuration.md#tikv_gc_life_time) configuration which is `10m0s` by default (`10m0s` means 10 minutes), increase the value of this configuration item. - -For example, set `tikv_gc_life_time` to `720h`: - -{{< copyable "sql" >}} - -```sql -mysql -h${TiDBIP} -P4000 -u${TIDB_USER} ${password_str} -Nse \ - "update mysql.tidb set variable_value='720h' where variable_name='tikv_gc_life_time'"; -``` - -Since v4.0.8, BR automatically adapts to GC and you do not need to manually adjust the `tikv_gc_life_time` value. - ### Back up all the cluster data To back up all the cluster data, execute the `br backup full` command. To get help on this command, execute `br backup full -h` or `br backup full --help`. 
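For reference, a minimal sketch of taking the same full backup through the SQL interface mentioned earlier (the `local:///tmp/backup` destination is illustrative only; any storage URL supported by BR works):

{{< copyable "sql" >}}

```sql
-- Back up all cluster data; the storage URL below is an example destination.
BACKUP DATABASE * TO 'local:///tmp/backup';
```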
diff --git a/dumpling-overview.md b/dumpling-overview.md index 88110bd7de417..d51d4fae13aa6 100644 --- a/dumpling-overview.md +++ b/dumpling-overview.md @@ -307,7 +307,7 @@ In other scenarios, if the data size is very large, to avoid export failure due {{< copyable "sql" >}} ```sql -update mysql.tidb set VARIABLE_VALUE = '720h' where VARIABLE_NAME = 'tikv_gc_life_time'; +SET GLOBAL tidb_gc_life_time = '720h'; ``` After your operation is completed, set the GC time back (the default value is `10m`): @@ -315,7 +315,7 @@ After your operation is completed, set the GC time back (the default value is `1 {{< copyable "sql" >}} ```sql -update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life_time'; +SET GLOBAL tidb_gc_life_time = '10m'; ``` Finally, all the exported data can be imported back to TiDB using [Lightning](/tidb-lightning/tidb-lightning-backends.md). diff --git a/error-codes.md b/error-codes.md index bff7bc1e61e3d..4b7bd5e6e6c2f 100644 --- a/error-codes.md +++ b/error-codes.md @@ -148,9 +148,9 @@ TiDB is compatible with the error codes in MySQL, and in most cases returns the * Error Number: 8048 - An unsupported database isolation level is set. - - If you cannot modify the codes because you are using a third-party tool or framework, consider using `tidb_skip_isolation_level_check` to bypass this check. + An unsupported database isolation level is set. + + If you cannot modify the codes because you are using a third-party tool or framework, consider using [`tidb_skip_isolation_level_check`](/system-variables.md#tidb_skip_isolation_level_check) to bypass this check. {{< copyable "sql" >}} @@ -178,16 +178,10 @@ TiDB is compatible with the error codes in MySQL, and in most cases returns the * Error Number: 8055 - The current snapshot is too old. The data may have been garbage collected. You can increase the value of `tikv_gc_life_time` to avoid this problem. The new version of TiDB automatically reserves data for long-running transactions. Usually this error does not occur. - + The current snapshot is too old. The data may have been garbage collected. You can increase the value of [`tidb_gc_life_time`](/system-variables.md#tidb_gc_life_time) to avoid this problem. The new version of TiDB automatically reserves data for long-running transactions. Usually this error does not occur. + See [garbage collection overview](/garbage-collection-overview.md) and [garbage collection configuration](/garbage-collection-configuration.md). - {{< copyable "sql" >}} - - ```sql - update mysql.tidb set VARIABLE_VALUE="24h" where VARIABLE_NAME="tikv_gc_life_time"; - ``` - * Error Number: 8059 The auto-random ID is exhausted and cannot be allocated. There is no way to recover from such errors currently. It is recommended to use bigint when using the auto random feature to obtain the maximum number of assignment. And try to avoid manually assigning values to the auto random column. diff --git a/faq/migration-tidb-faq.md b/faq/migration-tidb-faq.md index 0a2b815e086f0..65b2b113824c0 100644 --- a/faq/migration-tidb-faq.md +++ b/faq/migration-tidb-faq.md @@ -19,7 +19,7 @@ Restart the TiDB service, add the `-skip-grant-table=true` parameter in the conf ### How to export the data in TiDB? -Currently, TiDB does not support `select into outfile`. 
You can use the following methods to export the data in TiDB:

- See [MySQL uses mysqldump to export part of the table data](https://blog.csdn.net/xin_yu_xin/article/details/7574662) in Chinese and export data using mysqldump and the `WHERE` clause.
- Use the MySQL client to export the results of `select` to a file.

@@ -120,13 +120,3 @@ If the amount of data that needs to be deleted at a time is very large, this loo

- The [Lightning](/tidb-lightning/tidb-lightning-overview.md) tool is developed for distributed data import. It should be noted that the data import process does not perform a complete transaction process for performance reasons. Therefore, the ACID constraint of the data being imported during the import process cannot be guaranteed. The ACID constraint of the imported data can only be guaranteed after the entire import process ends. Therefore, the applicable scenarios mainly include importing new data (such as a new table or a new index) or the full backup and restoring (truncate the original table and then import data).
- Data loading in TiDB is related to the status of disks and the whole cluster. When loading data, pay attention to metrics like the disk usage rate of the host, TiClient Error, Backoff, Thread CPU and so on. You can analyze the bottlenecks using these metrics.
-
-### What should I do if it is slow to reclaim storage space after deleting data?
-
-You can configure concurrent GC to increase the speed of reclaiming storage space. The default concurrency is 1, and you can modify it to at most 50% of the number of TiKV instances using the following command:
-
-{{< copyable "sql" >}}
-
-```sql
-update mysql.tidb set VARIABLE_VALUE="3" where VARIABLE_NAME="tikv_gc_concurrency";
-```
diff --git a/faq/sql-faq.md b/faq/sql-faq.md
index bbbbee41e4d62..a487ab7eeeeff 100644
--- a/faq/sql-faq.md
+++ b/faq/sql-faq.md
@@ -134,13 +134,9 @@ Deleting a large amount of data leaves a lot of useless keys, affecting the quer

## What should I do if it is slow to reclaim storage space after deleting data?

-You can configure concurrent GC to increase the speed of reclaiming storage space. The default concurrency is 1, and you can modify it to at most 50% of the number of TiKV instances using the following command:
+Because TiDB uses multi-version concurrency control (MVCC), deleting data does not immediately reclaim space. Garbage collection is delayed so that concurrent transactions are able to see earlier versions of rows. This can be configured via the [`tidb_gc_life_time`](/system-variables.md#tidb_gc_life_time) (default: `10m0s`) system variable.

-{{< copyable "sql" >}}
-
-```sql
-update mysql.tidb set VARIABLE_VALUE="3" where VARIABLE_NAME="tikv_gc_concurrency";
-```
+When performing a backup, the `tidb_gc_life_time` value is automatically extended so that the backup can complete successfully.

## Does `SHOW PROCESSLIST` display the system process ID?

diff --git a/faq/tidb-faq.md b/faq/tidb-faq.md
index 8b9fe014f038b..69efa7777d61e 100644
--- a/faq/tidb-faq.md
+++ b/faq/tidb-faq.md
@@ -124,12 +124,12 @@ The accessed Region is not available. A Raft Group is not available, with possib

#### ERROR 9006 (HY000): GC life time is shorter than transaction duration

-The interval of `GC Life Time` is too short. The data that should have been read by long transactions might be deleted. You can add `GC Life Time` using the following command:
+The interval of `GC Life Time` is too short.
The data that should have been read by long transactions might be deleted. You can adjust [`tidb_gc_life_time`](/system-variables.md#tidb_gc_life_time) using the following command: {{< copyable "sql" >}} ```sql -update mysql.tidb set variable_value='30m' where variable_name='tikv_gc_life_time'; +SET GLOBAL tidb_gc_life_time = '30m'; ``` > **Note:** diff --git a/garbage-collection-configuration.md b/garbage-collection-configuration.md index 088e6aa0c7489..d1c37f419c53b 100644 --- a/garbage-collection-configuration.md +++ b/garbage-collection-configuration.md @@ -1,134 +1,18 @@ --- -title: GC Configuration +title: Garbage Collection Configuration summary: Learn about GC configuration parameters. aliases: ['/docs/dev/garbage-collection-configuration/','/docs/dev/reference/garbage-collection/configuration/'] --- -# GC Configuration +# Garbage Collection Configuration -The GC (Garbage Collection) configuration and operational status are recorded in the `mysql.tidb` system table. You can use SQL statements to query or modify them: +Garbage collection is configured via the following system variables: -{{< copyable "sql" >}} - -```sql -select VARIABLE_NAME, VARIABLE_VALUE from mysql.tidb where VARIABLE_NAME like "tikv_gc%"; -``` - -```sql -+--------------------------+----------------------------------------------------------------------------------------------------+ -| VARIABLE_NAME | VARIABLE_VALUE | -+--------------------------+----------------------------------------------------------------------------------------------------+ -| tikv_gc_leader_uuid | 5afd54a0ea40005 | -| tikv_gc_leader_desc | host:tidb-cluster-tidb-0, pid:215, start at 2019-07-15 11:09:14.029668932 +0000 UTC m=+0.463731223 | -| tikv_gc_leader_lease | 20190715-12:12:14 +0000 | -| tikv_gc_enable | true | -| tikv_gc_run_interval | 10m0s | -| tikv_gc_life_time | 10m0s | -| tikv_gc_last_run_time | 20190715-12:09:14 +0000 | -| tikv_gc_safe_point | 20190715-11:59:14 +0000 | -| tikv_gc_auto_concurrency | true | -| tikv_gc_mode | distributed | -+--------------------------+----------------------------------------------------------------------------------------------------+ -13 rows in set (0.00 sec) -``` - -For example, the following statement makes GC keep history data for the most recent 24 hours: - -```sql -update mysql.tidb set VARIABLE_VALUE="24h" where VARIABLE_NAME="tikv_gc_life_time"; -``` - -> **Note:** -> -> In addition to the following GC configuration parameters, the `mysql.tidb` system table also contains records that store the status of the storage components in a TiDB cluster, among which GC related ones are included, as listed below: -> -> - `tikv_gc_leader_uuid`, `tikv_gc_leader_desc` and `tikv_gc_leader_lease`: Records the information of the GC leader -> - `tikv_gc_last_run_time`: The duration of the latest GC (updated at the beginning of each round of GC) -> - `tikv_gc_safe_point`: The current safe point (updated at the beginning of each round of GC) - -## `tikv_gc_enable` - -- Enables or disables GC -- Default: `true` - -## `tikv_gc_run_interval` - -- Specifies the GC interval, in the format of Go Duration, for example, `"1h30m"`, and `"15m"` -- Default: `"10m0s"` - -## `tikv_gc_life_time` - -- The time limit during which data is retained for each GC, in the format of Go Duration. When a GC happens, the current time minus this value is the safe point. 
-- Default: `"10m0s"` - - > **Note:** - > - > - In scenarios of frequent updates, a large value (days or even months) for `tikv_gc_life_time` may cause potential issues, such as: - > - Larger storage use - > - A large amount of history data may affect performance to a certain degree, especially for range queries such as `select count(*) from t` - > - If there is any transaction that has been running longer than `tikv_gc_life_time`, during GC, the data since `start_ts` is retained for this transaction to continue execution. For example, if `tikv_gc_life_time` is configured to 10 minutes, among all transactions being executed, the transaction that starts earliest has been running for 15 minutes, GC will retain data of the recent 15 minutes. - -## `tikv_gc_mode` - -- Specifies the GC mode. Possible values are: - - - `"distributed"` (default): Distributed GC mode. In the [Do GC](/garbage-collection-overview.md#do-gc) step, the GC leader on the TiDB side uploads the safe point to PD. Each TiKV node obtains the safe point respectively and performs GC on all leader Regions on the current node. This mode is supported from TiDB 3.0. - - - `"central"`: Central GC mode. In the [Do GC](/garbage-collection-overview.md#do-gc) step, the GC leader sends GC requests to all Regions. This mode is adopted by TiDB 2.1 or earlier versions. Starting from TiDB 5.0, this mode is not supported. Clusters set to this mode automatically switch to the `distributed` mode. - -## `tikv_gc_auto_concurrency` - -- Controls whether to let TiDB automatically specify the GC concurrency, or the maximum number of GC threads allowed concurrently. - - When `tikv_gc_mode` is set to `"distributed"`, GC concurrency works in the [Resolve Locks](/garbage-collection-overview.md#resolve-locks) step. When `tikv_gc_mode` is set to `"central"`, it is applied to both the Resolve Locks and [Do GC](/garbage-collection-overview.md#do-gc) steps. - - - `true`(default): Automatically use the number of TiKV nodes in the cluster as the GC concurrency - - `false`: Use the value of [`tikv_gc_concurrency`](#tikv_gc_concurrency) as the GC concurrency - -## `tikv_gc_concurrency` - -- Specifies the GC concurrency manually. This parameter works only when you set [`tikv_gc_auto_concurrency`](#tikv_gc_auto_concurrency) to `false`. -- Default: 2 - -## `tikv_gc_scan_lock_mode` (**experimental feature**) - -> **Warning:** -> -> Green GC is still an experimental feature. It is recommended **NOT** to use it in the production environment. - -This parameter specifies the way of scanning locks in the Resolve Locks step of GC, that is, whether to enable Green GC (experimental feature) or not. In the Resolve Locks step of GC, TiKV needs to scan all locks in the cluster. With Green GC disabled, TiDB scans locks by Regions. Green GC provides the "physical scanning" feature, which means that each TiKV node can bypass the Raft layer to directly scan data. This feature can effectively mitigate the impact of GC wakening up all Regions when the [Hibernate Region](/tikv-configuration-file.md#hibernate-regions-experimental) feature is enabled, thus improving the execution speed in the Resolve Locks step. - -- `"legacy"` (default): Uses the old way of scanning, that is, disable Green GC. -- `"physical"`: Uses the physical scanning method, that is, enable Green GC. - -> **Note:** -> -> The configuration of Green GC is hidden. 
Execute the following statement when you enable Green GC for the first time: -> -> {{< copyable "sql" >}} -> -> ```sql -> insert into mysql.tidb values ('tikv_gc_scan_lock_mode', 'legacy', ''); -> ``` - -## Notes on GC process changes - -Since TiDB 3.0, some configuration options have changed with support for the distributed GC mode and concurrent Resolve Locks processing. The changes are shown in the following table: - -| Version/Configuration | Resolve Locks | Do GC | -|-------------------|---------------|----------------| -| 2.x | Serial | Concurrent | -| 3.0
`tikv_gc_mode = centered`
`tikv_gc_auto_concurrency = false` | Concurrent | Concurrent | -| 3.0
`tikv_gc_mode = centered`
`tikv_gc_auto_concurrency = true` | Auto-concurrent | Auto-concurrent | -| 3.0
`tikv_gc_mode = distributed`
`tikv_gc_auto_concurrency = false` | Concurrent | Distributed | -| 3.0
`tikv_gc_mode = distributed`
`tikv_gc_auto_concurrency = true`
(default) | Auto-concurrent | Distributed |
-
-- Serial: requests are sent from TiDB Region by Region.
-- Concurrent: requests are sent to each Region concurrently based on the number of threads specified in the `tikv_gc_concurrency`.
-- Auto-concurrent: requests are sent to each Region concurrently with the number of TiKV nodes as concurrency value.
-- Distributed: no need for TiDB to send requests to TiKV to trigger GC because each TiKV handles GC on its own.
-
-In addition, if Green GC (experimental feature) is enabled, that is, setting the value of [`tikv_gc_scan_lock_mode`](#tikv_gc_scan_lock_mode-experimental-feature) to `physical`, the processing of Resolve Lock is not affected by the concurrency configuration above.
+* [`tidb_gc_enable`](/system-variables.md#tidb_gc_enable)
+* [`tidb_gc_run_interval`](/system-variables.md#tidb_gc_run_interval)
+* [`tidb_gc_life_time`](/system-variables.md#tidb_gc_life_time)
+* [`tidb_gc_concurrency`](/system-variables.md#tidb_gc_concurrency)
+* [`tidb_gc_scan_lock_mode`](/system-variables.md#tidb_gc_scan_lock_mode)

## GC I/O limit

@@ -144,9 +28,17 @@ You can dynamically modify this configuration using tikv-ctl:

tikv-ctl --host=ip:port modify-tikv-config -m server -n gc.max_write_bytes_per_sec -v 10MB
```

-## GC in Compaction Filter
+## Changes in TiDB 5.0
+
+In previous releases of TiDB, garbage collection was configured via the `mysql.tidb` system table. While changes to this table continue to be supported, it is recommended to use the system variables provided. This helps ensure that any changes to configuration can be validated, and prevents unexpected behavior ([#20655](https://github.com/pingcap/tidb/issues/20655)).
+
+The `CENTRAL` garbage collection mode is no longer supported. The `DISTRIBUTED` GC mode (which has been the default since TiDB 3.0) will automatically be used in its place. This mode is more efficient, since TiDB no longer needs to send requests to each TiKV Region to trigger garbage collection.
+
+For information on changes in previous releases, refer to earlier versions of this document using the _TiDB version selector_ in the left-hand menu.
+
+### GC in Compaction Filter

Since v5.0.0-rc, TiDB introduces the mechanism of GC in Compaction Filter. Based on the `DISTRIBUTED` GC mode, the mechanism uses the compaction process of RocksDB, instead of a separate GC worker thread, to run GC. This new GC mechanism helps to avoid extra disk read caused by GC. Also, after clearing the obsolete data, it avoids a large number of left tombstone marks which degrade the sequential scan performance. This GC mechanism is disabled by default. The following example shows how to enable the mechanism in the TiKV configuration file:

{{< copyable "" >}}

diff --git a/garbage-collection-overview.md b/garbage-collection-overview.md
index eb85a8923d312..0beb59ea1e264 100644
--- a/garbage-collection-overview.md
+++ b/garbage-collection-overview.md
@@ -18,7 +18,7 @@ GC runs periodically on TiDB.
For each GC, TiDB firstly calculates a timestamp c
2. Delete Ranges. During this step, the obsolete data of the entire range generated from the `DROP TABLE`/`DROP INDEX` operation is quickly cleared.
3. Do GC. During this step, each TiKV node scans data on it and deletes unneeded old versions of each key.

-In the default configuration, GC is triggered every 10 minutes. Each GC retains data of the recent 10 minutes, which means that the the GC life time is 10 minutes by default (safe point = the current time - GC life time). If one round of GC has been running for too long, before this round of GC is completed, the next round of GC will not start even if it is time to trigger the next GC. In addition, for long-duration transactions to run properly after exceeding the GC life time, the safe point does not exceed the start time (start_ts) of the ongoing transactions.
+In the default configuration, GC is triggered every 10 minutes. Each GC retains data of the recent 10 minutes, which means that the GC life time is 10 minutes by default (safe point = the current time - GC life time). If one round of GC has been running for too long, before this round of GC is completed, the next round of GC will not start even if it is time to trigger the next GC. In addition, for long-duration transactions to run properly after exceeding the GC life time, the safe point does not exceed the start time (start_ts) of the ongoing transactions.

## Implementation details

@@ -28,7 +28,7 @@ The TiDB transaction model is implemented based on [Google's Percolator](https:/

The Resolve Locks step clears the locks before the safe point. This means that if the primary key of a lock is committed, this lock needs to be committed; otherwise, it needs to be rolled back. If the primary key is still locked (not committed or rolled back), this transaction is seen as timing out and rolled back.

-In the Resolve Lock step, the GC leader sends requests to all Regions to scan obsolete locks, checks the primary key statuses of scanned locks, and sends requests to commit or roll back the corresponding transaction. By default, this process is performed concurrently, and the concurrency number is the same as the number of TiKV nodes.
+By default, TiDB bypasses the Raft layer and directly scans data on each TiKV node. This is configurable via the system variable [`tidb_gc_scan_lock_mode`](/system-variables.md#tidb_gc_scan_lock_mode). With the previous default (`LEGACY`), the GC leader sends requests to all Regions to scan obsolete locks, checks the primary key statuses of scanned locks, and sends requests to commit or roll back the corresponding transaction.

### Delete Ranges

@@ -42,4 +42,4 @@ In this step, TiDB only needs to send the safe point to PD, and then the whole r

> **Note:**
>
-> In TiDB v2.1 or earlier versions, the Do GC step is implemented by TiDB sending requests to each Region. In v3.0 or later versions, you can modify the `tikv_gc_mode` to use the previous GC mechanism. For more details, refer to [GC Configuration](/garbage-collection-configuration.md#tikv_gc_mode).
+> Starting with TiDB 5.0, the Do GC step will always use the `DISTRIBUTED` GC mode. This replaces the earlier `CENTRAL` GC mode, which was implemented by TiDB servers sending GC requests to each Region.
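As a hedged sketch of switching between the two lock-scanning behaviors described above (variable and values as defined in the `system-variables.md` changes in this patch):

{{< copyable "sql" >}}

```sql
-- Check which scanning mode is in effect; PHYSICAL means Green GC is enabled.
SHOW GLOBAL VARIABLES LIKE 'tidb_gc_scan_lock_mode';

-- Fall back to Region-based scanning if needed.
SET GLOBAL tidb_gc_scan_lock_mode = 'LEGACY';
```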
diff --git a/read-historical-data.md b/read-historical-data.md
index 55cf5ec4f9f75..242218ef0be7e 100644
--- a/read-historical-data.md
+++ b/read-historical-data.md
@@ -17,19 +17,19 @@ TiDB implements a feature to read history data using the standard SQL interface

## How TiDB reads data from history versions

-The `tidb_snapshot` system variable is introduced to support reading history data. About the `tidb_snapshot` variable:
+The [`tidb_snapshot`](/system-variables.md#tidb_snapshot) system variable is introduced to support reading history data. About the `tidb_snapshot` variable:

-- The variable is valid in the `Session` scope.
-- Its value can be modified using the `Set` statement.
+- The variable is valid in the `SESSION` scope.
+- Its value can be modified using the `SET` statement.
 - The data type for the variable is text.
 - The variable accepts TSO (Timestamp Oracle) and datetime. TSO is a globally unique time service, which is obtained from PD. The acceptable datetime format is "2016-10-08 16:45:26.999". Generally, the datetime can be set using second precision, for example "2016-10-08 16:45:26".
-- When the variable is set, TiDB creates a Snapshot using its value as the timestamp, just for the data structure and there is no any overhead. After that, all the `Select` operations will read data from this Snapshot.
+- When the variable is set, TiDB creates a Snapshot using its value as the timestamp, just for the data structure, and there is no overhead. After that, all the `SELECT` operations will read data from this Snapshot.

> **Note:**
>
> Because the timestamp in TiDB transactions is allocated by Placement Driver (PD), the version of the stored data is also marked based on the timestamp allocated by PD. When a Snapshot is created, the version number is based on the value of the `tidb_snapshot` variable. If there is a large difference between the local time of the TiDB server and the PD server, use the time of the PD server.

-After reading data from history versions, you can read data from the latest version by ending the current Session or using the `Set` statement to set the value of the `tidb_snapshot` variable to "" (empty string).
+After reading data from history versions, you can read data from the latest version by ending the current Session or using the `SET` statement to set the value of the `tidb_snapshot` variable to "" (empty string).

## How TiDB manages the data versions

@@ -37,10 +37,10 @@ TiDB implements Multi-Version Concurrency Control (MVCC) to manage data versions

In TiDB, Garbage Collection (GC) runs periodically to remove the obsolete data versions. For GC details, see [TiDB Garbage Collection (GC)](/garbage-collection-overview.md)

-Pay special attention to the following two variables:
+Pay special attention to the following:

-- `tikv_gc_life_time`: It is used to configure the retention time of the history version. You can modify it manually.
-- `tikv_gc_safe_point`: It records the current `safePoint`. You can safely create the snapshot to read the history data using the timestamp that is later than `safePoint`. `safePoint` automatically updates every time GC runs.
+- [`tidb_gc_life_time`](/system-variables.md#tidb_gc_life_time): This system variable is used to configure the retention time of earlier modifications (default: `10m0s`).
+- The output of `SELECT * FROM mysql.tidb WHERE variable_name = 'tikv_gc_safe_point'`. This is the current `safePoint`. You can safely read historical data at any timestamp later than this point. It is updated every time the garbage collection process is run.
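A hedged sketch tying these together (`t` is a hypothetical table, and the datetime is illustrative; it must be later than the current `tikv_gc_safe_point`):

{{< copyable "sql" >}}

```sql
-- Confirm the earliest point in time that historical reads can target.
SELECT * FROM mysql.tidb WHERE variable_name = 'tikv_gc_safe_point';

-- Read from a historical snapshot, then return to the latest version.
SET @@tidb_snapshot = '2021-03-16 16:00:00';
SELECT * FROM t;
SET @@tidb_snapshot = '';
```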
## Example

diff --git a/sql-statements/sql-statement-flashback-table.md b/sql-statements/sql-statement-flashback-table.md
index 814540f68b499..cb4ab73204387 100644
--- a/sql-statements/sql-statement-flashback-table.md
+++ b/sql-statements/sql-statement-flashback-table.md
@@ -8,14 +8,16 @@ aliases: ['/docs/dev/sql-statements/sql-statement-flashback-table/','/docs/dev/r

The `FLASHBACK TABLE` syntax is introduced since TiDB 4.0. You can use the `FLASHBACK TABLE` statement to restore the tables and data dropped by the `DROP` or `TRUNCATE` operation within the Garbage Collection (GC) lifetime.

-Use the following command to query the TiDB cluster's `tikv_gc_safe_point` and `tikv_gc_life_time`. As long as the table is dropped by `DROP` or `TRUNCATE` statements after the `tikv_gc_safe_point` time, you can restore the table using the `FLASHBACK TABLE` statement.
+The system variable [`tidb_gc_life_time`](/system-variables.md#tidb_gc_life_time) (default: `10m0s`) defines the retention time of earlier versions of rows. The current `safePoint`, up to which garbage collection has been performed, can be obtained with the following query:

{{< copyable "sql" >}}

```sql
-select * from mysql.tidb where variable_name in ('tikv_gc_safe_point','tikv_gc_life_time');
+SELECT * FROM mysql.tidb WHERE variable_name = 'tikv_gc_safe_point';
```

+As long as the table is dropped by `DROP` or `TRUNCATE` statements after the `tikv_gc_safe_point` time, you can restore the table using the `FLASHBACK TABLE` statement.
+
## Syntax

{{< copyable "sql" >}}

diff --git a/system-variables.md b/system-variables.md
index 66deafffaf2c7..f3a5978ab41b2 100644
--- a/system-variables.md
+++ b/system-variables.md
@@ -553,6 +553,46 @@ For a system upgraded to v5.0.0-rc from an earlier version, if you have not modi

- This variable is used to change the default priority for statements executed on a TiDB server. A use case is to ensure that a particular user that is performing OLAP queries receives lower priority than users performing OLTP queries.
- You can set the value of this variable to `NO_PRIORITY`, `LOW_PRIORITY`, `DELAYED` or `HIGH_PRIORITY`.

+### tidb_gc_concurrency
+
+- Scope: GLOBAL
+- Default value: `-1`
+- Specifies the number of threads in the [Resolve Locks](/garbage-collection-overview.md#resolve-locks) step of GC. A value of `-1` means that TiDB will automatically decide the number of garbage collection threads to use.
+
+### tidb_gc_enable
+
+- Scope: GLOBAL
+- Default value: ON
+- Enables garbage collection for TiKV. Disabling garbage collection will reduce system performance, as old versions of rows will no longer be purged.
+
+### tidb_gc_life_time
+
+- Scope: GLOBAL
+- Default value: `"10m0s"`
+- The time limit during which data is retained for each GC, in the format of Go Duration. When a GC happens, the current time minus this value is the safe point.
+
+> **Note:**
+>
+> - In scenarios of frequent updates, a large value (days or even months) for `tidb_gc_life_time` may cause potential issues, such as:
+>     - Larger storage use
+>     - A large amount of history data may affect performance to a certain degree, especially for range queries such as `select count(*) from t`
+> - If there is any transaction that has been running longer than `tidb_gc_life_time`, during GC, the data since `start_ts` is retained for this transaction to continue execution.
For example, if `tidb_gc_life_time` is configured to 10 minutes and, among all transactions being executed, the transaction that started earliest has been running for 15 minutes, GC retains the data of the recent 15 minutes.

+### tidb_gc_run_interval
+
+- Scope: GLOBAL
+- Default value: `"10m0s"`
+- Specifies the GC interval, in the format of Go Duration, for example, `"1h30m"` and `"15m"`.
+
+### tidb_gc_scan_lock_mode
+
+- Scope: GLOBAL
+- Default value: `PHYSICAL`
+- Possible values:
+    - `LEGACY`: Uses the old way of scanning, that is, disables Green GC.
+    - `PHYSICAL`: Uses the physical scanning method, that is, enables Green GC.
+- This variable specifies the way of scanning locks in the Resolve Locks step of GC. When set to `LEGACY`, TiDB scans locks by Regions. The value `PHYSICAL` enables each TiKV node to bypass the Raft layer and directly scan data. This feature can effectively mitigate the impact of GC waking up all Regions when the [Hibernate Region](/tikv-configuration-file.md#hibernate-regions-experimental) feature is enabled, thus improving the execution speed in the Resolve Locks step.
+
### tidb_general_log

- Scope: INSTANCE

diff --git a/tidb-troubleshooting-map.md b/tidb-troubleshooting-map.md
index 276e2222acec1..d68847e73c085 100644
--- a/tidb-troubleshooting-map.md
+++ b/tidb-troubleshooting-map.md
@@ -555,7 +555,7 @@ Check the specific cause for busy by viewing the monitor **Grafana** -> **TiKV**

      The transaction duration exceeds the GC lifetime (10 minutes by default).

      You can increase the GC lifetime by modifying the [`tidb_gc_life_time`](/system-variables.md#tidb_gc_life_time) system variable. Generally, it is not recommended to modify this variable, because changing it might cause many old versions to pile up if this transaction has a large number of `UPDATE` and `DELETE` statements.

- 7.1.2 `txn takes too much time`.

@@ -582,12 +582,10 @@ Check the specific cause for busy by viewing the monitor **Grafana** -> **TiKV**

- 7.1.5 `distsql.go` reports `inconsistent index`.

      The data index seems to be inconsistent. Run the `admin check table <table_name>` command on the table where the reported index is. If the check fails, disable garbage collection by running the following command, and [report a bug](https://github.com/pingcap/tidb/issues/new?labels=type%2Fbug&template=bug-report.md):

      ```sql
-      begin;
-      update mysql.tidb set variable_value='72h' where variable_name='tikv_gc_life_time';
-      commit;
+      SET GLOBAL tidb_gc_enable = 0;
      ```

### 7.2 TiKV