Update garbage collection docs #4552

Merged
merged 21 commits into from
Mar 17, 2021
Changes from 18 commits
4 changes: 2 additions & 2 deletions alert-rules.md
@@ -445,7 +445,7 @@ Emergency-level alerts are often caused by a service or node failure. Manual int

* Solution:

1. Perform `select VARIABLE_VALUE from mysql.tidb where VARIABLE_NAME = "tikv_gc_leader_desc"` to locate the `tidb-server` corresponding to the GC leader;
1. Perform `SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME = "tikv_gc_leader_desc"` to locate the `tidb-server` corresponding to the GC leader;
2. View the log of the `tidb-server` by running `grep gc_worker tidb.log`;
3. If you find that the GC worker has been resolving locks (the last log is "start resolve locks") or deleting ranges (the last log is "start delete {number} ranges") during this time, it means the GC process is running normally. Otherwise, contact [[email protected]](mailto:[email protected]) to resolve this issue.
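
Step 1 above can be sketched as follows, assuming a MySQL client connected to the cluster. The two extra rows, `tikv_gc_last_run_time` and `tikv_gc_safe_point`, are not part of the steps above; they are added here only as an assumption about what is useful to read alongside the leader description.

{{< copyable "sql" >}}

```sql
-- Locate the tidb-server that currently holds the GC leader role.
-- The last-run time and safe point are extra context (an assumption,
-- not part of the documented steps) for judging whether GC is advancing.
SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM mysql.tidb
WHERE VARIABLE_NAME IN ('tikv_gc_leader_desc', 'tikv_gc_last_run_time', 'tikv_gc_safe_point');
```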

@@ -633,7 +633,7 @@ For the critical-level alerts, a close watch on the abnormal metrics is required
* Solution:

1. This is usually because the GC concurrency is set too high. First confirm that the failed GC is caused by a busy server, and then moderately lower the GC concurrency.
2. You can moderately lower the concurrency degree by running `update set VARIABLE_VALUE="{number}” where VARIABLE_NAME=”tikv_gc_concurrency”`.
2. You can moderately lower the concurrency degree by adjusting [`tidb_gc_concurrency`](/system-variables.md#tidb_gc_concurrency).
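
A minimal sketch of that adjustment, assuming a TiDB version that exposes `tidb_gc_concurrency` as a system variable. The value `2` is purely illustrative and not a recommendation; pick a value based on how busy the cluster is.

{{< copyable "sql" >}}

```sql
-- Inspect the current GC concurrency setting.
SHOW GLOBAL VARIABLES LIKE 'tidb_gc_concurrency';

-- Lower it moderately; 2 is an illustrative value only.
SET GLOBAL tidb_gc_concurrency = 2;
```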

### Warning-level alerts

18 changes: 9 additions & 9 deletions backup-and-restore-using-dumpling-lightning.md
@@ -59,30 +59,30 @@ The steps to manually modify the GC time are as follows:
{{< copyable "sql" >}}

```sql
SELECT * FROM mysql.tidb WHERE VARIABLE_NAME = 'tikv_gc_life_time';
SHOW GLOBAL VARIABLES LIKE 'tidb_gc_life_time';
```

```sql
+-----------------------+------------------------------------------------------------------------------------------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+-----------------------+------------------------------------------------------------------------------------------------+
| tikv_gc_life_time | 10m0s |
+-----------------------+------------------------------------------------------------------------------------------------+
1 rows in set (0.02 sec)
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| tidb_gc_life_time | 10m0s |
+-------------------+-------+
1 row in set (0.03 sec)
```

{{< copyable "sql" >}}

```sql
UPDATE mysql.tidb SET VARIABLE_VALUE = '720h' WHERE VARIABLE_NAME = 'tikv_gc_life_time';
SET GLOBAL tidb_gc_life_time = '720h';
```

2. After executing the `dumpling` command, restore the GC value of the TiDB cluster to the initial value in step 1:

{{< copyable "sql" >}}

```sql
UPDATE mysql.tidb SET VARIABLE_VALUE = '10m' WHERE VARIABLE_NAME = 'tikv_gc_life_time';
SET GLOBAL tidb_gc_life_time = '10m';
```

## Restore data into TiDB
25 changes: 5 additions & 20 deletions br/backup-and-restore-tool.md
@@ -6,11 +6,11 @@ aliases: ['/docs/dev/br/backup-and-restore-tool/','/docs/dev/reference/tools/br/

# BR Tool Overview

[BR](http://github.com/pingcap/br) (Backup & Restore) is a command-line tool for distributed backup and restoration of the TiDB cluster data. It is supported to use BR only in TiDB v3.1 and later versions.
[BR](http://github.com/pingcap/br) (Backup & Restore) is a command-line tool for distributed backup and restoration of the TiDB cluster data.

Compared with [`dumpling`](/backup-and-restore-using-dumpling-lightning.md), BR is more suitable for scenarios of huge data volume.

This document describes BR's implementation principles, recommended deployment configuration, usage restrictions, several methods to use BR, etc.
This document describes BR's implementation principles, recommended deployment configuration, usage restrictions, and several ways to use BR.

## Implementation principles

@@ -113,21 +113,15 @@ After the restoration operation is completed, BR performs a checksum calculation
>
> - If you do not mount a network disk or use other shared storage, the data backed up by BR will be generated on each TiKV node. Because BR only backs up leader replicas, you should estimate the space reserved for each node based on the leader size.
>
> - Meanwhile, because TiDB v4.0 uses leader count for load balancing by default, leaders are greatly different in size, resulting in uneven distribution of backup data on each node.
> - Because TiDB uses leader count for load balancing by default, leaders can differ greatly in size. This might result in an uneven distribution of backup data across nodes.

### Usage restrictions

The following are the limitations of using BR for backup and restoration:

- It is supported to use BR only in TiDB v3.1 and later versions.
- When BR restores data to the upstream cluster of TiCDC/Drainer, TiCDC/Drainer cannot replicate the restored data to the downstream.
- BR supports operations only between clusters with the same [`new_collations_enabled_on_first_bootstrap`](/character-set-and-collation.md#collation-support-framework) value because BR only backs up KV data. If the cluster to be backed up and the cluster to be restored use different collations, the data validation fails. Therefore, before restoring a cluster, make sure that the switch value from the query result of the `select VARIABLE_VALUE from mysql.tidb where VARIABLE_NAME='new_collation_enabled';` statement is consistent with that during the backup process.

- For v3.1 clusters, the new collation framework is not supported, so you can see it as disabled.
- For v4.0 clusters, check whether the new collation is enabled by executing `SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME='new_collation_enabled';`.

For example, assume that data is backed up from a v3.1 cluster and will be restored to a v4.0 cluster. The `new_collation_enabled` value of the v4.0 cluster is `true`, which means that the new collation is enabled in the cluster to be restored when this cluster is created. If you perform the restore in this situation, an error might occur.

### Minimum machine configuration required for running BR

The minimum machine configuration required for running BR is as follows:
@@ -159,20 +153,11 @@ Currently, the following methods are supported to run the BR tool:

#### Use SQL statements

In TiDB v4.0.2 and later versions, you can run the BR tool using SQL statements.

For detailed operations, see the following documents:

- [Backup syntax](/sql-statements/sql-statement-backup.md#backup)
- [Restore syntax](/sql-statements/sql-statement-restore.md#restore)
TiDB supports both [`BACKUP`](/sql-statements/sql-statement-backup.md#backup) and [`RESTORE`](/sql-statements/sql-statement-restore.md#restore) SQL statements. The progress of these operations can be monitored with the statement [`SHOW BACKUPS|RESTORES`](/sql-statements/sql-statement-show-backups.md).
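
As a hedged sketch of how these statements fit together: the `local:///mnt/backup/full/` path below is a hypothetical storage location chosen for illustration, and with `local://` storage the files are written on the TiKV nodes rather than on the client.

{{< copyable "sql" >}}

```sql
-- Back up all databases to a hypothetical local directory.
BACKUP DATABASE * TO 'local:///mnt/backup/full/';

-- Check the progress of backup tasks.
SHOW BACKUPS;

-- Restore from the same hypothetical location, then check progress.
RESTORE DATABASE * FROM 'local:///mnt/backup/full/';
SHOW RESTORES;
```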

#### Use the command-line tool

In TiDB versions above v3.1, you can run the BR tool using the command-line tool.

First, you need to download the binary file of the BR tool. See [download link](/download-ecosystem-tools.md#br-backup-and-restore).

For how to use the command-line tool to perform backup and restore operations, see [Use the BR command-line tool](/br/use-br-command-line-tool.md).
The `br` command-line utility is available as a separate download. For details, see [Use BR Command-line for Backup and Restoration](/br/use-br-command-line-tool.md).
Review comment from @yikeke (Contributor), Mar 15, 2021:

The "Use BR Command-line for Backup and Restoration" document does not provide the download link?


#### In the Kubernetes environment

24 changes: 4 additions & 20 deletions br/backup-and-restore-use-cases.md
@@ -24,7 +24,7 @@ This document aims to help you achieve the following goals:

## Audience

You are expected to have a basic understanding of [TiDB](https://docs.pingcap.com/tidb/v4.0) and [TiKV](https://tikv.org/).
You are expected to have a basic understanding of TiDB and [TiKV](https://tikv.org/).

Before reading on, make sure you have read [BR Tool Overview](/br/backup-and-restore-tool.md), especially [Usage Restrictions](/br/backup-and-restore-tool.md#usage-restrictions) and [Best Practices](/br/backup-and-restore-tool.md#best-practices).

@@ -82,26 +82,10 @@ Before the backup or restoration operations, you need to do some preparations:

### Preparation for backup

In TiDB v4.0.8 and later versions, BR supports the self-adaptive Garbage Collection (GC). So to avoid manually configuring GC, you only need to register `backupTS` in `safePoint` in PD and make sure that `safePoint` does not move forward during the backup process.
For the detailed usage of the `br backup` command, refer to [Use BR Command-line for Backup and Restoration](/br/use-br-command-line-tool.md).

In TiDB v4.0.7 and earlier versions, you need to manually configure GC before and after the BR backup through the following steps:

1. Before executing the [`br backup` command](/br/use-br-command-line-tool.md#br-command-line-description), check the value of the [`tikv_gc_life_time`](/garbage-collection-configuration.md#tikv_gc_life_time) configuration item, and adjust the value appropriately in the MySQL client to make sure that GC does not run during the backup operation.

{{< copyable "sql" >}}

```sql
SELECT * FROM mysql.tidb WHERE VARIABLE_NAME = 'tikv_gc_life_time';
UPDATE mysql.tidb SET VARIABLE_VALUE = '720h' WHERE VARIABLE_NAME = 'tikv_gc_life_time';
```

2. After the backup operation, set the parameter back to the original value.

{{< copyable "sql" >}}

```sql
UPDATE mysql.tidb SET VARIABLE_VALUE = '10m' WHERE VARIABLE_NAME = 'tikv_gc_life_time';
```
1. Before executing the `br backup` command, ensure that no DDL is running on the TiDB cluster.
2. Ensure that the storage device where the backup will be created has sufficient space.
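
One way to check the first item, offered as a sketch rather than the documented procedure: list the DDL jobs and confirm that none is still queued or running before starting `br backup`.

{{< copyable "sql" >}}

```sql
-- List current and recent DDL jobs. Before backing up, confirm that
-- no job is still waiting in the queue or executing.
ADMIN SHOW DDL JOBS;
```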

### Preparation for restoration

13 changes: 0 additions & 13 deletions br/use-br-command-line-tool.md
@@ -71,19 +71,6 @@ Each of the above three sub-commands might still include the following three sub

To back up the cluster data, use the `br backup` command. You can add the `full` or `table` sub-command to specify the scope of your backup operation: the whole cluster or a single table.

If the BR version is earlier than v4.0.8, and the backup duration might exceed the [`tikv_gc_life_time`](/garbage-collection-configuration.md#tikv_gc_life_time) configuration which is `10m0s` by default (`10m0s` means 10 minutes), increase the value of this configuration item.

For example, set `tikv_gc_life_time` to `720h`:

{{< copyable "sql" >}}

```sql
mysql -h${TiDBIP} -P4000 -u${TIDB_USER} ${password_str} -Nse \
"update mysql.tidb set variable_value='720h' where variable_name='tikv_gc_life_time'";
```

Since v4.0.8, BR automatically adapts to GC and you do not need to manually adjust the `tikv_gc_life_time` value.

### Back up all the cluster data

To back up all the cluster data, execute the `br backup full` command. To get help on this command, execute `br backup full -h` or `br backup full --help`.
4 changes: 2 additions & 2 deletions dumpling-overview.md
@@ -307,15 +307,15 @@ In other scenarios, if the data size is very large, to avoid export failure due
{{< copyable "sql" >}}

```sql
update mysql.tidb set VARIABLE_VALUE = '720h' where VARIABLE_NAME = 'tikv_gc_life_time';
SET GLOBAL tidb_gc_life_time = '720h';
```

After your operation is completed, set the GC time back (the default value is `10m`):

{{< copyable "sql" >}}

```sql
update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life_time';
SET GLOBAL tidb_gc_life_time = '10m';
```

Finally, all the exported data can be imported back to TiDB using [Lightning](/tidb-lightning/tidb-lightning-backends.md).
16 changes: 5 additions & 11 deletions error-codes.md
@@ -148,9 +148,9 @@ TiDB is compatible with the error codes in MySQL, and in most cases returns the

* Error Number: 8048

An unsupported database isolation level is set.

If you cannot modify the codes because you are using a third-party tool or framework, consider using `tidb_skip_isolation_level_check` to bypass this check.
An unsupported database isolation level is set.
If you cannot modify the code because you are using a third-party tool or framework, consider using [`tidb_skip_isolation_level_check`](/system-variables.md#tidb_skip_isolation_level_check) to bypass this check.

{{< copyable "sql" >}}

@@ -178,16 +178,10 @@ TiDB is compatible with the error codes in MySQL, and in most cases returns the

* Error Number: 8055

The current snapshot is too old. The data may have been garbage collected. You can increase the value of `tikv_gc_life_time` to avoid this problem. The new version of TiDB automatically reserves data for long-running transactions. Usually this error does not occur.

The current snapshot is too old. The data may have been garbage collected. You can increase the value of [`tidb_gc_life_time`](/system-variables.md#tidb_gc_life_time) to avoid this problem. The new version of TiDB automatically reserves data for long-running transactions, so this error does not usually occur.
See [garbage collection overview](/garbage-collection-overview.md) and [garbage collection configuration](/garbage-collection-configuration.md).

{{< copyable "sql" >}}

```sql
update mysql.tidb set VARIABLE_VALUE="24h" where VARIABLE_NAME="tikv_gc_life_time";
```

* Error Number: 8059

The auto-random ID is exhausted and cannot be allocated. There is currently no way to recover from such errors. To obtain the maximum number of assignments, it is recommended to use `BIGINT` with the auto-random feature and to avoid manually assigning values to the auto-random column.
12 changes: 1 addition & 11 deletions faq/migration-tidb-faq.md
@@ -19,7 +19,7 @@ Restart the TiDB service, add the `-skip-grant-table=true` parameter in the conf

### How to export the data in TiDB?

Currently, TiDB does not support `select into outfile`. You can use the following methods to export the data in TiDB:
You can use the following methods to export the data in TiDB:

- See [MySQL uses mysqldump to export part of the table data](https://blog.csdn.net/xin_yu_xin/article/details/7574662) in Chinese and export data using mysqldump and the `WHERE` clause.
- Use the MySQL client to export the results of `select` to a file.
@@ -120,13 +120,3 @@ If the amount of data that needs to be deleted at a time is very large, this loo

- The [Lightning](/tidb-lightning/tidb-lightning-overview.md) tool is developed for distributed data import. Note that, for performance reasons, the import process does not run as a complete transaction, so the ACID constraint cannot be guaranteed for data while it is being imported; it is guaranteed only after the entire import process ends. Therefore, the applicable scenarios mainly include importing new data (such as a new table or a new index) or full backup and restoration (truncate the original table and then import the data).
- Data loading in TiDB is related to the status of disks and the whole cluster. When loading data, pay attention to metrics like the disk usage rate of the host, TiClient Error, Backoff, Thread CPU and so on. You can analyze the bottlenecks using these metrics.

### What should I do if it is slow to reclaim storage space after deleting data?

You can configure concurrent GC to increase the speed of reclaiming storage space. The default concurrency is 1, and you can modify it to at most 50% of the number of TiKV instances using the following command:

{{< copyable "sql" >}}

```sql
update mysql.tidb set VARIABLE_VALUE="3" where VARIABLE_NAME="tikv_gc_concurrency";
```
8 changes: 2 additions & 6 deletions faq/sql-faq.md
@@ -134,13 +134,9 @@ Deleting a large amount of data leaves a lot of useless keys, affecting the quer

## What should I do if it is slow to reclaim storage space after deleting data?

You can configure concurrent GC to increase the speed of reclaiming storage space. The default concurrency is 1, and you can modify it to at most 50% of the number of TiKV instances using the following command:
Because TiDB uses multi-version concurrency control (MVCC), deleting data does not immediately reclaim space. Garbage collection is delayed so that concurrent transactions are able to see earlier versions of rows. This delay can be configured via the [`tidb_gc_life_time`](/system-variables.md#tidb_gc_life_time) (default: `10m0s`) system variable.

{{< copyable "sql" >}}

```sql
update mysql.tidb set VARIABLE_VALUE="3" where VARIABLE_NAME="tikv_gc_concurrency";
```
When performing a backup, the `tidb_gc_life_time` is also automatically extended so that the backup can complete successfully.
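
To see the GC settings currently in effect, a query along these lines can help. This is a sketch that assumes a TiDB version in which the GC settings are exposed as `tidb_gc_*` system variables; the exact set of variables returned depends on the version.

{{< copyable "sql" >}}

```sql
-- List the GC-related system variables, including tidb_gc_life_time.
SHOW GLOBAL VARIABLES LIKE 'tidb_gc%';
```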

## Does `SHOW PROCESSLIST` display the system process ID?

4 changes: 2 additions & 2 deletions faq/tidb-faq.md
@@ -124,12 +124,12 @@ The accessed Region is not available. A Raft Group is not available, with possib

#### ERROR 9006 (HY000): GC life time is shorter than transaction duration

The interval of `GC Life Time` is too short. The data that should have been read by long transactions might be deleted. You can add `GC Life Time` using the following command:
The `GC Life Time` interval is too short, so data that long transactions still need to read might already be deleted. You can increase [`tidb_gc_life_time`](/system-variables.md#tidb_gc_life_time) using the following command:

{{< copyable "sql" >}}

```sql
update mysql.tidb set variable_value='30m' where variable_name='tikv_gc_life_time';
SET GLOBAL tidb_gc_life_time = '30m';
```

> **Note:**