docs: update for 2.1 (pingcap#80)
* docs: update for 2.1

* docs: added Chinese translation

* docs: fixed comments

* Update docs/en_US/02-Deployment.md

Co-Authored-By: kennytm <[email protected]>


* Update docs/en_US/05-Errors.md

Co-Authored-By: kennytm <[email protected]>


* docs: addressed comments

* Update README.md

Co-Authored-By: kennytm <[email protected]>
kennytm authored Oct 30, 2018
1 parent f26b834 commit e0d3290
Showing 20 changed files with 2,667 additions and 300 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -6,3 +6,4 @@ test_*/
*.local
go.sum
*.pyc
*.ezdraw
67 changes: 5 additions & 62 deletions README.md
@@ -1,66 +1,9 @@
# TiDB Lightning

TiDB Lightning is a data import tool used to quickly import large amounts of data into a TiDB cluster. Currently, it only supports source data in the Mydumper file format; support for more formats, such as CSV, is planned.
**TiDB Lightning** is a tool for fast full import of large amounts of data into a TiDB cluster.
Currently, it supports reading SQL dumps exported via mydumper.

Currently, TiDB Lightning only supports full import into new tables. During the import process, the cluster cannot provide services normally, so TiDB Lightning is not suitable for importing data online.
![](docs/en_US/tidb-lightning.svg)

## TiDB Lightning architecture

The following diagram shows the architecture of TiDB Lightning:

![](media/tidb-lightning-architecture.png)

One set of TiDB Lightning has two components:

- `tidb-lightning`

The front-end part of TiDB Lightning. It transforms the source data into Key-Value (KV) pairs and writes the data into `tikv-importer`.

- `tikv-importer`

The back-end part of TiDB Lightning. It caches, sorts, and splits the KV pairs written by `tidb-lightning` and imports the KV pairs to the TiKV cluster.

## TiDB Lightning workflow

1. Before importing data, `tidb-lightning` automatically switches the TiKV mode to the import mode via API.
2. `tidb-lightning` obtains data from the data source, transforms the source data into KV data, and then writes the data into `tikv-importer`.
3. When the data written by `tidb-lightning` reaches a specific size, `tidb-lightning` sends the `Import` command to `tikv-importer`.
4. `tikv-importer` divides the data, schedules it across the target cluster, and then imports it into the TiKV cluster.
5. `tidb-lightning` transforms and imports the source data continuously until it finishes importing the data in the source data directory.
6. `tidb-lightning` performs the `Compact`, `Checksum`, and `Analyze` operations on tables in the target cluster.
7. `tidb-lightning` automatically switches the TiKV mode to the normal mode. Then the TiDB cluster can provide services normally.
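The buffering behavior in steps 2 through 4 can be sketched as a toy model. This is an illustration only, not the actual implementation (which is written in Go and encodes rows into TiKV's binary key format); the names, the encoding, and the flush threshold below are all hypothetical:

```python
# Toy model of the tidb-lightning write path (steps 2-4 above).
# Everything here is a hypothetical simplification: the real tools flush
# based on accumulated bytes rather than a pair count.

FLUSH_THRESHOLD = 3  # stand-in for the "specific size" trigger

def encode_row(table, row_id, row):
    """Turn one source row into a (key, value) pair."""
    key = f"t_{table}_r_{row_id}"
    value = ",".join(str(v) for v in row)
    return key, value

def import_table(table, rows):
    """Buffer KV pairs and emit sorted batches, like sending the
    `Import` command to tikv-importer once enough data accumulates."""
    buffer, batches = [], []
    for row_id, row in enumerate(rows, start=1):
        buffer.append(encode_row(table, row_id, row))
        if len(buffer) >= FLUSH_THRESHOLD:
            batches.append(sorted(buffer))  # the importer sorts before ingest
            buffer = []
    if buffer:
        batches.append(sorted(buffer))  # final partial batch
    return batches

batches = import_table("users", [("alice", 30), ("bob", 25), ("carol", 41), ("dave", 19)])
```

With four rows and a threshold of three, this produces one full batch and one partial batch, mirroring how the importer ingests data in chunks rather than row by row.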

## Deploy process

### Notes

Before deploying TiDB Lightning, take note of the following:

- When TiDB Lightning is running, the TiDB cluster cannot provide services normally.
- When you import data using TiDB Lightning, some source data constraints, such as primary key conflicts and unique index conflicts, cannot be checked. If needed, run `ADMIN CHECK TABLE` via the MySQL client after the import completes, but note that this may take a long time.
- Currently, TiDB Lightning does not support resuming an interrupted import. If any error occurs during the import, delete the imported data from the target cluster using `DROP TABLE` and import the data again.
- If TiDB Lightning exits abnormally, you need to use the `-switch-mode` command line parameter of `tidb-lightning` to manually switch the TiKV cluster out of import mode back to normal mode:

```
./bin/tidb-lightning -switch-mode normal
```

### Hardware requirements

See [Hardware requirements of TiDB Lightning](docs/tidb-lightning-user-guide.md#hardware-requirements)

### Prepare

Before importing, you should:

- Deploy a TiDB cluster (version 2.0.4 or later) to serve as the target cluster for the import.
- Prepare the binary file and the configuration file of `tikv-importer`. It is recommended to use standalone deployment.
- Prepare the binary file and the configuration file of `tidb-lightning`. It is recommended to use standalone deployment.
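As a rough orientation, a minimal `tidb-lightning` configuration file might look like the sketch below. The key names follow the 2.1-era layout, but the addresses and paths are placeholders; always start from the example configuration shipped in the release package rather than this sketch:

```toml
# Hypothetical minimal tidb-lightning.toml; all addresses and paths
# are placeholders for illustration.
[lightning]
level = "info"

[tikv-importer]
# Address of the tikv-importer process
addr = "127.0.0.1:8287"

[mydumper]
# Directory containing the mydumper SQL dump
data-source-dir = "/data/my_dump"

[tidb]
host = "127.0.0.1"
port = 4000
user = "root"
status-port = 10080
pd-addr = "127.0.0.1:2379"
```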

Download the installation packages of `tikv-importer` and `tidb-lightning` via:

https://download.pingcap.org/tidb-lightning-latest-linux-amd64.tar.gz

### Deploy

See [TiDB Lightning User Guide](docs/tidb-lightning-user-guide.md#deploy)
* [Detailed documentation](docs/en_US/README.md)
* [简体中文文档](docs/zh_CN/README.md)
37 changes: 37 additions & 0 deletions docs/en_US/01-Architecture.md
@@ -0,0 +1,37 @@
Architecture
============

![Architecture of TiDB Lightning tool set](./tidb-lightning.svg)

The TiDB Lightning tool set consists of two components:

- **`tidb-lightning`** (the "front end") reads the SQL dump, imports the database structure
  into the TiDB cluster, transforms the data into Key-Value (KV) pairs,
  and sends them to `tikv-importer`.

- **`tikv-importer`** (the "back end") combines and sorts the KV pairs and then
imports these sorted pairs as a whole into the TiKV cluster.

The complete import process is like this:

1. Before importing, `tidb-lightning` switches the TiKV cluster to "import mode", which optimizes
the cluster for writing and disables automatic compaction.

2. `tidb-lightning` creates the skeleton of all tables from the data source.

3. For each table, `tidb-lightning` informs `tikv-importer` via gRPC to create an *engine file*
to store KV pairs. `tidb-lightning` then reads the SQL dump in parallel, transforms the data
into KV pairs according to the TiDB rules, and sends them to `tikv-importer`'s engine files.

4. Once a full table of KV pairs is received, `tikv-importer` divides and schedules the data
   and imports it into the target TiKV cluster.

5. `tidb-lightning` then compares the checksum computed from the local data source with
   the one calculated from the cluster, to ensure no data corruption occurred in the process.

6. After all tables are imported, `tidb-lightning` performs a global compaction on the TiKV
cluster, and tells TiDB to `ANALYZE` all imported tables, to prepare for optimal query planning.

7. Finally, `tidb-lightning` switches the TiKV cluster back to "normal mode" so the cluster
resumes normal services.
