This repository has been archived by the owner on Jul 24, 2024. It is now read-only.
br supports fast fail when restoring with incompatible settings #352
Labels
difficulty/2-medium: Medium-difficulty issue
help wanted: Extra attention is needed
priority/P2: Medium priority issue
type/feature-request: New feature or request
Comments
IANTHEREAL added the type/feature-request and help wanted labels on Jun 13, 2020
kennytm changed the title from "br supports fast fail" to "br supports fast fail when restoring with incompatible settings" on Jun 13, 2020
kennytm added the difficulty/2-medium and priority/P2 labels on Jun 13, 2020
For collation:
@bb7133 Are there any other cluster-wide incompatible settings?
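A fast-fail check of the kind this issue asks for could look roughly like the sketch below. It is only an illustration in Go, not BR's actual implementation: the `ClusterSettings` type, the `checkCompatible` function, and the setting key `new_collation_enabled` are hypothetical names invented for this example. The idea is to compare the cluster-wide settings recorded at backup time with those of the target cluster before any data is restored, and to abort immediately on a mismatch.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ClusterSettings is a hypothetical snapshot of cluster-wide settings,
// keyed by setting name. It is not a type taken from BR.
type ClusterSettings map[string]string

// checkCompatible compares the settings recorded in a backup with those of
// the target cluster and returns an error that lists every mismatch.
// Settings that are absent from the backup are not checked.
func checkCompatible(backup, target ClusterSettings) error {
	var mismatches []string
	for name, want := range backup {
		got, ok := target[name]
		switch {
		case !ok:
			mismatches = append(mismatches,
				fmt.Sprintf("%s: backup=%q, target=<unset>", name, want))
		case got != want:
			mismatches = append(mismatches,
				fmt.Sprintf("%s: backup=%q, target=%q", name, want, got))
		}
	}
	if len(mismatches) == 0 {
		return nil
	}
	// Sort so that the error message is deterministic.
	sort.Strings(mismatches)
	return fmt.Errorf("incompatible cluster settings: %s",
		strings.Join(mismatches, "; "))
}

func main() {
	// The key below is a made-up placeholder for a collation-related setting.
	backup := ClusterSettings{"new_collation_enabled": "true"}
	target := ClusterSettings{"new_collation_enabled": "false"}

	if err := checkCompatible(backup, target); err != nil {
		fmt.Println("restore aborted before any data was written:", err)
		return
	}
	fmt.Println("settings are compatible, starting restore")
}
```

Collecting all mismatches before returning, rather than stopping at the first one, lets the user fix every incompatible setting in one pass. Any further cluster-wide settings identified in this thread would simply become additional keys in the map.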
overvenus pushed a commit to overvenus/br-1 that referenced this issue on Dec 29, 2020
overvenus added a commit that referenced this issue on Mar 3, 2021
* restore: write index kv pairs and data kv pairs to separate engine (#132) * restore: write index kvs and data kvs to seperate engine * restore: use single engine file to store index * restore: make index engine limited by table concurrency * restore: modify checkpoint proto * restore: implement checkpoint for index engine file * tests: add failpoint for CheckpointStatusIndexImported * Support CSV (#111) * loader: recognize CSV files * *: generalize the parser interface * mydump: added a CSV parser * mydump,restore: enable CSV parser * tests: added integration test for CSV * mydumper: added test case for empty CSV * config: improved description * restore: fixed a compile error on Go 1.12 * update version of tidb to latest release-2.1 (#138) update tidb version to latest release2.1, and add a test case for issue https://github.com/pingcap/tidb/issues/9532 * tidb-lightning-ctl: added --import-engine and --cleanup-engine commands (#125) * *: parameter type fix (#141) * remove changelog and get change logs from release (#146) * test: fix cleanup script to enable re-run integration test locally (#142) Currently, cleanup script in integration test does not work, if the developer wants to re-run integration test in his/her local environment, he/she has to clean /tmp/lightning_test_result manually * restore: fix #140 lightning breaks on tidb-master (#147) * fix deadlock and goroutine leak in chunk restore (#149) * restore: eager release index engine worker after imported (#150) * checkpoint: fix checkpoint not updated in some scenario (#151) Test `checkpoint_error_destroy` in integration got error randomly, both in local environment and JenkinsCI. Lightning could exit before all checkpoints are saved, as the `WaitGroup` is not always added before this line https://github.com/pingcap/tidb-lightning/blob/master/lightning/restore/restore.go#L204 returns 1. 
After `RestoreController.Run` is returned, no more messages will be sent to `rc.saveCpCh`, so we can close it safely and ensure all `saveCp` are consumed 2. remove some unused code * Revert "fix deadlock and goroutine leak in chunk restore (#149)" (#152) This reverts commit 852c5e954877692035efe5c3e44ddcf8f915d90b. * restore: add local checksum log (#153) * restore: add local checksum log * Update lightning/restore/restore.go Co-Authored-By: lonng <[email protected]> * tests: add kill lightning test in checkpoint_chunks case (#158) use gofail to control lightning to kill itself after one chunk is imported * *: parse the data source directly into data and skip the KV encoder (#145) * *: parse the data source directly into data and skip the KV encoder This skips the more complex pingcap/parser, and speeds up parsing speed by 50%. We have also refactored the KV delivery mechanism to use channels directly, and revamped metrics: - Make the metrics about engines into its own `engines` counter. The `tables` counter is exclusively about tables now. - Removed `block_read_seconds`, `block_read_bytes`, `block_encode_seconds` since the concept of "block" no longer applies. Replaced by the equivalents named `row_***`. - Removed `chunk_parser_read_row_seconds` for being overlapping with `row_read_seconds`. - Changed `block_deliver_bytes` into a histogram vec, with kind=index or kind=data. Introduced `block_deliver_kv_pairs`. * tests,restore: prevent spurious error in checkpoint_chunks test Only kill Lightning if the whole chunk is imported exactly. The chunk checkpoint may be recorded before a chunk is fully written, and this will hit the failpoint more than 5 times. * kv: use composed interface to simplify some types * kv: properly handle the SQL mode * common: disable IsContextCanceledError() when log level = debug This helps debugging some mysterious cancellation where the log is inhibited. Added IsReallyContextCanceledError() for code logic affected by error type. 
* restore: made some log more detailed * restore: made the SlowDownImport failpoint apply to index engines too * restore: do not open a write stream when there are no KV pairs to send * tests: ensure we drop the checkpoints DB before re-run * mydump: fixed various off-by-one errors in the CSV parser * *: rename `!IsContextCanceledError` to `ShouldLogError` * *: addressed comments * restore: zero the checksums and column permutations on initialization * *: addressed comments * tests: add back a missing license header * tests: improve a comment. * tests: fix a test failure due to conflict between #145 and #158 (#159) * tests: fix a test failure due to conflict between #145 and #158 * restore: apply the row count limit to failpoint KillIfImportedChunk too * restore: give priority to small tables for importing (#156) Put the large table in the front of the slice which can avoid large table take a long time to import and block small table to release index worker. * config: Allow overriding some config from command line (#157) * config: remove deprecated -compact, -switch-mode flags from tidb-lightning They can still be called from tidb-lightning-ctl. * config: search also conf/tidb-lightning.toml for default -config path Fallback to standard default if -config is not supplied * config: provide command line arguments for some common options * config: stop searching for tidb-lightning.toml if --config is unspecified * go.mod: upgrade dependencies, esp TiDB -> v3.0.0-beta.1 (#160) * Fix interpretation of integers in a BIT column (#161) * tests: fix existing test failure * mydump: fixed conversion of integers into bits We need to create a special branch for integers, since casting 123 and '123' into BIT type behave differently. Also fixed handling of 0x/0b bit strings since Ragel doesn't recognize '+' in a regex -_-. 
* mydump: store description of `token` in an array instead of switch cases * tests: test behavior of integers for ENUM and SET types as well * *: replace gofail with new failpoint implementation (#165) * *: use pingcap/log (zap) for logging (#162) * *: use pingcap/log (zap) for logging Some redundant logs (e.g. logging about the same thing inside and outside a function) are removed. The {QueryRow,Transact,Exec}WithRetry functions are revamped to include the logger. * common,config: addressed comments * *: addressed comments * restore,verification: addressed comments * main: sync log before exit + update failpoint dep (#168) * kv: fix handling of column default values (#170) * kv: fix handling of column default values * if the column is AUTO_INCREMENT, fill in with row_id (assume it is missing for the entire table instead of just a few values) * if the column has DEFAULT, fill in that value * tests: ensure DEFAULT CURRENT_TIMESTAMP works * tests,restore: re-enable the exotic_filenames test (#172) * config: reduce default table-concurrency from 8 to 6 (#175) Ensures table + index <= the default max-open-engine which is 8. * Support table routing rules (merging sharded tables) (#95) * config: added [[routes]] config * mydump, restore: support table routing * tests: added test case for table routing * mydump: ensure rerouted schemas will not be created * config: allows routes to be case-sensitive * restore: replace CREATE TABLE -> IF NOT EXISTS using tidb/parser * mydump/loader_test: add unit test for route() and refactor * tests: TiDB doesn't support `DECIMAL(20, 0)`. * tests: workaround pingcap/parser#310 * tests: removes the emoji from a test database name (#179) The character is too exotic and breaks TiDB and some old git. 
* *: fix failpoint-ctl path, unify failpoint runtime and ctl to same version (#180) * restore: fix the potential null pointer exception when logging progress (#178) If no files are completely imported within the first 5 minutes, we get `finished == 0` and the logger will try to log a nil field and crashes. * kv,restore: log which value caused conversion failure (#154) * kv,restore: log row content and column info on failure * restore: log a warning if a column is missing * restore: retry if deliver KVs to importer (#176) * config: automatically discover tidb.pd-addr and tidb.port if not provided (#173) * config: automatically discover tidb.pd-addr and tidb.port if not provided * config: error if port/pd-addr is still wrong after adjust * config: use tidb/config.Config instead of a custom struct * config: remove recognition of AdvertiseAddress TiDB team says it is useless and TiDB won't work with port-forwarding. * *: added linters (#183) * Makefile: added `make check` to perform static linting Moved the failpoint tool into `tools/bin` for consistency. Renamed the phony targets `failpoint-{enable,disable}` to `failpoint_{enable,disable}` for consistency. Renamed the `install-failpoint` target to the real path name. * *: fix gofmt suggestions * go.mod,Makefile: unify the two failpoint versions Added a script to ensure they never differ. 
* Makefile: fixed the scope of golangci-lint * common: improve unit test coverage of 'common' package (#186) * common: improve unit test coverage of 'kv' package (#187) * Make the parsers stricter, and improve unit test coverage of `mydump` package (#185) * lightning: improve unit test coverage of 'lightning' package (#188) * README: update coverage status badge (#189) * config: improve unit test coverage of 'config' package (#192) Signed-off-by: Lonng <[email protected]> * Add unit tests for kv/importer and restore/checkpoints, plus some bug fixes (#191) * kv: add unit test for importer (based on a mocked gRPC client) * restore: remove the unnecessary SHOW CREATE TABLE calls We can now reconstruct the table info directly from the HTTP reply, so the SHOW CREATE TABLE results are now useless. Better drop them. * restore: add unit test for tidb.go * common: ensure sqlmock errors are not retryable * restore: fix error where checkpoint status of index engine is not updated Also, made the WholeTableEngineID constant public. * restore: rename a confusing variable * restore: fix bug where --checkpoint-error-destroy=all skips index engine * restore: prevent NPE when getting missing table from file checkpoint * restore: add unit tests for checkpoints * kv: address importer comments * kv: also exposes the mock Importer constructor to other tests * Add unit tests for 'restore.go' (TableRestore and chunkRestore) (#193) * go.mod: update dependencies (#197) * config,lightning: Implements server mode (#198) * test: speed up TestGetJSON Force a shorter timeout on the HTTP client, so that accessing `http://not-exists` won't take 30 seconds. * config,lightning: implement "server mode" In "Server Mode" Lightning will wait for tasks submitted via the HTTP API `POST /tasks`, and will keep running until Ctrl+C. Multiple tasks are executed sequentially. The config is split into "Global config" and "Task config", which shares the same structure for compatibility and simplicity. 
The pprof-port setting has been deprecated in favor of status-addr, for compatibility with other tools. * lightning,config: cover some of the new code * lightning: added `GET /tasks` API to get number of queued tasks * *: addressed comments * config,lightning: use a linked hash map to store queued configs Changed /task to return JSON. This is to prepare for an API removing a queued task, and also to remove the artificial task queue size limit. * config: change TaskID to record the current timestamp * go.mod: update dependencies (#200) * Post restore config fix (#202) * fix the mistake in lightning config file * fix typo * remote the "also" in description * Introduce a basic web interface (#199) * restore,checkpoints: move checkpoints into its own package This allows both the "restore" package to import the "web" package, and allow the "web" package to use "checkpoints", without leading to circular dependency. * verification: implemented json.Marshaler for KVChecksum * *: expose the current import progress to HTTP interface * common: added "Pauser" synchronization primitive * lightning: allows status address to reliably use port 0 for testing * config: ensure AllIDs() return a deterministic order * lightning,restore: support pausing, moving and deleting tasks through HTTP Also fixed some goroutine leaks and crashes after canceling. * common: fixed the bug where checksum is not cancelable * config: added configlist.{MoveToFront, MoveToBack} * web,lightning: added a web interface * web: explain the web interface * web: added OpenAPI (Swagger) spec of the HTTP API * common: avoid double-close a channel The channel may be double-closed given this sequence: 0. [B] p.Pause() 1. [A] p.Wait(ctx), run until the select 2. [B] p.Resume(), run until the for loop 3. [C] cancel the ctx 4. [A] continue from select, and close the channel 5. [B] continue the for loop, using the old copy of waiters, it will close the channel again, causing double-close error. 
We just avoid closing the waiter when ctx expired. * common: added a test to check for contended pause/resume flip * common: fixed a potential race condition * verification: change JSON field of checksum from cksum to checksum * web: document the OpenAPI def and why we don't support webpack-dev-server Fixed a potential typing error (see TypeStrong/atom-typescript#1053). * config: prevents task ID conflict which may happen with a coarse clock * restore: prevent encodeLoop panicking if deliverResult is closed? * checkpoints,lightning: address comments * Update README.md (#207) * Improve errors and logs on syntax error / conversion failure (#201) * mydump: ensure syntax error won't log more than 256 bytes of content * kv: log only the affected column on cast failure Also annotate the returned error by the failed value * restore: annotate encode failure with the current file being processed * checkpoints: reduce log level of missing checkpoint file from warn to info * tests,web: exclude these directories from the Go module (#209) * lightning/restore: fix ColumnPermutation calculation (#210) Signed-off-by: Lonng <[email protected]> * build(deps): bump lodash from 4.17.11 to 4.17.14 in /web (#213) * build(deps): bump lodash from 4.17.11 to 4.17.14 in /web Bumps [lodash](https://github.com/lodash/lodash) from 4.17.11 to 4.17.14. - [Release notes](https://github.com/lodash/lodash/releases) - [Commits](https://github.com/lodash/lodash/compare/4.17.11...4.17.14) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <[email protected]> * web: execute `npm audit fix` * web: update corresponding Go code * *: adjust solution for TOOL-1420 and add a test case (#214) Move the ToLower operation from (*TableRestore).initializeColumns() to (*ChunkParser).ReadRow(). In fact this is the same as the CSV parser. 
* tests: add test case for simple partitioned tables (#206) * checkpoints: remove node_id field and rename the schema on keep-after-success (#208) * lightning: reduce the chance of spurious error in web server * checkpoints: remove node_id We intend to separate checkpoints from multiple nodes into different schemas instead. Since one node now owns the entire schema, when removing all checkpoints we just drop the entire schema. We also changed the file driver to delete the file instead of just emptying the content. * lightning: set the task ID even outside server mode * checkpoints,restore: move the checkpoints database on keep-after-success The schema is renamed as `*.{taskID}.bak`. * checkpoints: addressed comments * config: attempt to solve TOOL-1405 and modify old test cases (#217) * remove outdated target of Makefile * check unused toml keys * remove outdated toml config keys that block * config: improve readablity from review suggestions * config: improve readability * Update lightning/config/config.go Co-Authored-By: kennytm <[email protected]> * restore: fix gc life time not recovered after table restore (#218) * restore: fix gc life time not recovered after table restore * empty commit to refresh cla * address comment * address comment * address comment * fix ci * *: abstract the Importer communication into an interface (#215) * common: allow retry on ErrWriteConflictInTiDB (mysql error 8005) * *: perform SwitchMode and Compact directly through ImportSSTService This allows us to perform these actions without relying on Importer. 
* *: abstracted *kv.Importer into kv.Backend * common,kv: fix comments * kv,restore: addressed comments * restore: fix comments * restore: update and restore GCLifeTime once when parallel (#220) * restore: update and restore GCLifeTime once when parallel * Update lightning/restore/restore.go Co-Authored-By: amyangfei <[email protected]> * Update lightning/restore/restore.go Co-Authored-By: amyangfei <[email protected]> * Update lightning/restore/restore.go Co-Authored-By: amyangfei <[email protected]> * restore: fix bug in reviewing * restore: call ObtainGCLifeTime closer to use * restore: improve readabiilty * restore: reduce struct copy and pointer deref * restore: fix bug in reviewing * Update lightning/restore/restore.go Co-Authored-By: kennytm <[email protected]> * restore: improve readability * restore: refine mock expectation order * restore: adjust detail * *: support MySQL backend (#221) * *: support MySQL backend * config: use a constant for backend and checkpoint.driver * kv: address comments * backend: rename `kv` package to `backend` package * config: always skip the system databases (#225) * backend: update mysql backend to tidb backend (#228) * update mysql backend to tidb backend * fix test * lightning/common: add unit test (#229) * add unit test * add unit test * Update lightning/common/util_test.go Co-Authored-By: kennytm <[email protected]> * Update lightning/common/util_test.go Co-Authored-By: kennytm <[email protected]> * mock: update rpc endpoint (#226) * cmd: do not exit(1) if failed to sync log (#230) * mock: update rpc endpoint (#234) * backend: dynamically calculate the maximum auto-inc ID (#227) This makes sure the final AUTO_INCREMENT value is exactly the maximum value of the auto-inc column. Fix #222. 
* checkpoint: fix empty map become nil after unmarshall (#237) * checkpoint: fix empty map become nil after unmarshall * checkpoint: unify code style * checkpoint: remove hard-coded path * config: increase default concurrency (#244) * backend/tidb: use REPLACE INTO or INSERT IGNORE INTO to provide idempotent insertion (#243) * *: use fixed timestamp to ensure import stability wrt CURRENT_TIMESTAMP (#235) * *: update dependencies (tidb -> 3.0.4) (#246) * improve the log when encountering invalid checkpoint (#247) * restore: improve the log when encountering invalid checkpoint * config: fix typo in CLI * config: document the new `[tikv-importer] on-duplicate` setting * config: adding `[tidb] max-allowed-packet` config (#248) this allows sending rows > 4 MiB when using TiDB backend * Fix +incompatible suffix not allowed error reported by go mod (#249) * allow use a separate pauser instead of global pauser (#251) * Add password as command line argument (#253) * Add password as command line argument * Adds -tidb-pwd to config tests * Changes tidb-pwd to tidb-password cli flag * metrics: copy the grafana board into this repo (#256) * metrics: copy the grafana board into this repo * metrics: add the alert rule too * *: Update dependencies and fix unit test on Windows (#254) * go.mod: update Go dependencies * *: changes code so unit test can be run on Windows mainly replaces `path.Join` to `filepath.Join` * web: update web interface dependencies * tests/exotic_filenames: generate the directory content at runtime this should make git behave better on windows * config: synchronize actual default value with the toml file (#255) * config: synchronize actual default value with the toml file * config: expose --check-requirements in command line * lightning: ensure the web interface still works outside server mode (#259) * Upgrade TiDB to 4.0.0-beta, and recognize @@tidb_row_format_version = 2 (#268) * *: remove the deprecated kvencoder.KvPair Create our own struct at `common` 
package and use those. * go.mod: update dependencies (TiDB -> 4.0.0-beta) * backend: support using row format v2 * backend: still needs to implement (*transaction).Get() * Support TLS; Reduce the need of config.toml in integration tests (#270) * *: go fmt * *: support TLS * tests: enable TLS for all components in the integration test * tests: specify TLS and most default arguments via command line refactored the tests so only essential settings remained in config.toml * config: the default csv.null should be a capital \N not small \n * security: clone the http.DefaultTransport rather than shallow copy * tests: break PD retry loop * backend: fix unit test failure * tests: replace curl by wget The `curl` on CI is too old to handle ECC keys. But `wget` somehow works. * tests: fix test failure * *: fix comments * backend: define a reusable BufStore when creating a new session (#274) * Update Chinese doc url (#276) * Update chinese doc url Update chinese doc url * update English doc link Co-Authored-By: kennytm <[email protected]> Co-authored-by: kennytm <[email protected]> * Set session var for every new conn (#280) Note the previous `setSessionConcurrencyVars` will set one connection from the db connection pool, so we can't make sure the session will take affect later using the `sql.DB`. * lightning: split large csv file if possible (#272) * lightning: split large csv file if possible * gofmt * gofmt * unit test * add unit test * tiny change * tiny refine * fix ci * remove useless code * fix ci * fix ci * address comments * go fmt for all * address comment * correct the estimateChunkCount Co-authored-by: kennytm <[email protected]> * send a batch of kv in encodeLoop (#279) * send a batch of kv in encodeLoop * regine test * address comment * Fix wrongly pass nil columnNames. Should get column names after `parser.ReadRow()` that it's setted while parser the row. 
* Update lightning/restore/restore.go Co-Authored-By: kennytm <[email protected]> Co-authored-by: kennytm <[email protected]> Co-authored-by: Jiahao Huang <[email protected]>
* Replace CSV Ragel parser by a hand-written parser copied from encoding/csv (#275)

The Ragel-based CSV parser is much slower than the standard encoding/csv parser. As explained in #111, the parser used Ragel to reuse the existing framework for simplicity. However, as more and more customers start to use Lightning with CSV input, it is time to optimize it.

The new parser was inspired by encoding/csv, but almost nothing remained except the recordBuffer/fieldIndexes members, which reduce the number of allocations. We cannot reuse encoding/csv because:

- encoding/csv does not allow us to track the current read position
- encoding/csv does not recognize backslash-escaped fields (required for MySQL-generated CSV)
- encoding/csv does not support disabling quoting

Implementing these makes the new parser still slower than encoding/csv, but much faster than the original one.

| Parser | TPC-C "CUSTOMER" row |
| --- | --- |
| encoding/csv | 1898 ns/op |
| Original parser | 6930 ns/op |
| New parser | 2426 ns/op |

So the new parser takes 35% of the time of the original parser, and encoding/csv takes 80% of the time of the new parser. We can investigate how to squeeze out the remaining 20% later.
* backend: fix issue 282 (#283) * backend: fix all wrong escape of '\Z' as '\x26' (which is '&') * tests: try to workaround spurious failure of check_requirements * tests: make the lightning exit detection more precise * tests: fix test * optimize the performance of lightning (#281) * lightning: split large csv file if possible * gofmt * gofmt * unit test * add unit test * tiny change * tiny refine * fix ci * remove useless code * fix ci * fix ci * address comments * go fmt for all * Replace CSV Ragel parser by a hand-written parser copied from encoding/csv Conflicts: lightning/mydump/csv_parser_generated.go * fix conflict * update * update again * send a batch of kv in encodeLoop * use sync.Pool * Close channel instead of push one entry. * Use copy instead append * Fix test and failpoint version * Reuse slice of record This expected to avoid about 3.5% of alloc_objects alloc_objects: Total: 773496750 773873722 (flat, cum) 7.18% 177 . . parser.fieldIndexes = parser.fieldIndexes[:0] 178 . . 179 . . isEmptyLine := true ... 225 386621314 386621314 str := string(parser.recordBuffer) // Convert to string once to batch allocations 226 386875436 386875436 dst := make([]string, len(parser.fieldIndexes)) * Use pool for mutation This take most alloc in WriteRows: ROUTINE ======================== github.com/pingcap/tidb-lightning/lightning/backend.(*importer).WriteRows in /Users/huangjiahao/go/src/github.com/pingcap/tidb-lightning/lightning/backend/importer.go 797370418 980241246 (flat, cum) 9.09% of Total . . 155: kvs := rows.(kvPairs) ... ... . . 192: for i, pair := range kvs { 772641868 772641868 193: mutations[i] = &kv.Mutation{ . . 194: Op: kv.Mutation_Put, . . 195: Key: pair.Key, . . 196: Value: pair.Val, . . 197: } . . 198: } * Set GC percent as 500 default Lightning allocates too many transient objects and heap size is small, so garbage collections happen too frequently and lots of time is spent in GC component. 
In a test of loading the table `order_line.csv` of 14k TPCC. The time need of `encode kv data and write` step reduce from 52m4s to 37m30s when change GOGC from 100 to 500, the total time needed reduce near 15m too. The cost of this is the memory of lightnin at runtime grow from about 200M to 700M, but it's acceptable. So we set the gc percentage as 500 default to reduce the GC frequency instead of 100. * Remove MaxKVPairs in Mydump has been move to Importer part * Remove outdate code * Update tidb version For https://github.com/pingcap/tidb/commit/495f8b74382fb31924b4948374c0dbba6f2d87cd disable UpdateDeltaForTable if TxnCtx is nil * Address comment * Remain append Co-authored-by: xuhuaiyu <[email protected]> * Some SwitchMode improvements (#287) * backend: ignore Unimplemented error in SwitchMode and Compact * tidb-lightning-ctl: added --fetch-mode subcommand * go.mod,web: update dependencies (#289) * go.mod,web: update dependencies * *: fix unit test failure * Support store version format generated by `git describe --tags` (#295) * Support store version format: git describe --tags Signed-off-by: Tong Zhigao <[email protected]> * add tests Signed-off-by: Tong Zhigao <[email protected]> * Warn for single large file, change switch mode log level to info (#315) * warn single large file, chan switch mode log level to info * address comment * restore: fix typo (#304) * print lightning log to local file (#313) * save logs in local file, print only necessary info * split error stack info and err info in two lines * *: avoid accessing internal ports when backend=tidb (#312) * config: remove strict mode from default SQL mode (#316) * config: remove strict mode from default SQL mode * tests: fixed typo * check table id when loading checkpoint (#317) * add tableID checkpoint check * update tidb-tools to latest (#319) * support alter random && update tidb dependency to latest (#324) * update tidb * rebase auto random column * backend: add local kv storage backend to get rid 
of importer (#326) * backend: add local backend, try to move importer's sort kv and split & scatter into lightning * address comment * use sync.Map * write to every peer * add retry on write and ingest * fix split region * udpate leveldb config * use badger * use pebble * update pebble config * restrict concurrency * use workerPool to restrict concurrency for all engines * make range concurrency as config * flush db at close engine * wip: use sstable to split range * fast split ranges by encode key and file size * fix split bug * use bigEndian to split range * fix split region bug & ingest lost write meta bug * add send-kv-pairs config * fix duplicate write error * fix checksum in small table * use go routine to write tikv and ingest * fix * update sort * update sort * fix * fix * fix retry * update * fix concurrency bug * fix retry bug * fix iter.Next * try fix checksum mismatch * fix deadlock * update * fix * optimize memory usage * add checkpoint for local mode * fix checkpoint * do not write chunk checkpoint in local mode * fix remote checkpoint * fix checkpoint * manually destroy checkpoint * only flush index if checkpoint is on * remove some useless code * format code * fix unit test * fix test * fix * fix tls * fix review comment * add c comment for local.Close * add some comment * fix local backend checkpoint * fix unit test * refine some test with local backend * address comment * fix close engine * checkpoint integrateion_test for local backend * try fix * return nil if engine not exist in CloseEngine and ImportEngine * address comment * test localbackend checkpoint * adjust config to save coverage * fix review comments * change ParseIndexKey method * remove saveCpChan channel buf * add test to save coverage * test ingest failed * inject failpoint before dataengine importer * fix review comments * fix test format Co-authored-by: luancheng <[email protected]> * make lightning compatible with allow_auto_random_explicit_insert (#328) * fix system error when 
tidb not support allow_auto_random_explicit_insert * address comment * config: update example config file (#331) * update example config * fix comment Co-authored-by: kennytm <[email protected]> * Fix test cases on release-3.0 (#330) * tests: allow running a single test * tests: record the cluster version * tests: skip 'local' backend tests if cluster is below v4.0.0 * Jenkinsfile: move the CI script into git * Jenkinsfile: run integration tests in parallel * tests: allow-auto-random is no longer experimental * tests: skip local backend better * Jenkinsfile: moved the file elsewhere * optimize parse csv and local backend write tikv (#334) * optimize parse csv and local backend write tikv * fix * remove IndexAnyUtf8 because the implement buggy for csv parser * update pebble and options * fix checkpoint cleanup (#336) * lightning: fix web page not showing when not using server mode (#337) Co-authored-by: Ian <[email protected]> * config,mydumper: replace black-white-list by table-filter (#332) * optimize encoder and adjust some config (#338) * optimize local backend * update some config * update batch size * fix type in config.toml and tidy go.mod * update tools failpoint * fix session * update test config * reset default batch size for not-local backend * reset batch-size for example toml * log: fix log file path (#345) * fix log path * support special log path '-' for stdout * fix local backend index split range (#347) * add log for environment http proxy setting (#340) * add log for environment http proxy setting * fix comments Co-authored-by: kennytm <[email protected]> * do not always change auto increment id (#348) * server: check open file ulimit for local backend (#343) * check open file ulimit for local backend * fix comment and add a test * fix tests * remove useless comments * fix Co-authored-by: Neil Shen <[email protected]> * restore: do not rebase auto-id or generate auto row id for table with common handle (#349) * update for common handle * update * 
remove parentless * Fix verbose log message for shell (#352) * local: fix batch split retry alway failed error (#356) * fix split * only retry failed keys Co-authored-by: 3pointer <[email protected]> * backend: fix handling of empty binary literals (#357) Co-authored-by: glorv <[email protected]> Co-authored-by: Ian <[email protected]> * add log when execute statement failed (#359) * check checkpoint schema (#354) Co-authored-by: kennytm <[email protected]> * parser: fix csv parse header with empty line (#364) * fix csv header * fix infinite loop * add a test * restore: fix missing colum infos when restore from checkpoint (#362) * fix missing colum infos when restore from checkpoint * add a test * fix * fix * add failpoint for minDeliveryBytes * fix type * fix * update * save column permutation when update chunk checkpoint * fix test * fix review comments * update * local: fix import with common handle (#367) * fix common handle * update * update tidb * fix test * fix test * add new line * enable cluster index for test common handle * reset global variable after common handle test * add check version for common handle test * restore: don't switch mode in tidb backend (#368) * avoid run switch mode in tidb backend * fix comments * restore: support split csv source file with header (#363) * restore: support file level routing (#366) * support file level routings * deprecate table route * add router * set default config * update default regex * set default rule enable of no rule is set and update example config file * remote compression test * fix parse compression * update example config * save file meta to checkpoint * remove size from checkpoint * sort source data files by sort key * fix checkpoint * use fileInfo instead of SourceFileMeta * replace file routing in chunk restore by file meta * remove useless fileRegions * revert some test package * update import * revert changes in loader_test * update imports * add some comments and change type name * fix field 
extractor to math go regex definition and add some tests * fix test router pattern * simplify pattern check and support files.path * resolve comments * return error if applyed error is invalid * remove useless code and update import * tidb_tools: update dependency (#371) Co-authored-by: glorv <[email protected]> * restore: check header columns (#372) * check csv header columns * resolve comments * fix test * fix unit test * web: update dependencies (#374) * backend: update committs from unix timestamp to pd tso (#379) * update pd dependencies (#380) * local: return error if write to tikv returns no leader info (#381) Co-authored-by: kennytm <[email protected]> * encoder: check string value for tidb encoder (#378) * valid string value for tidb encoder * don't check valid utf8 for blob types * check string according to charset * fix test and remove useless print Co-authored-by: kennytm <[email protected]> * Fix running unit tests on Windows (#375) * backend: make the rlimit check unix-only * mydump: always use '/' as file path separator Co-authored-by: 3pointer <[email protected]> * checkpoint: verify checkpoint when resume from checkpoint (#376) * test: change integration test script to allow run tests in parallel (#382) * backend: split and ingest region size more precise (#369) * wait checkpoint finished if exit before success (#386) * restore: support restore from s3 (#361) * support s3 storage * make filepath compatible with file url * update br * update transaction * update * update br * upda * fix tests * fix tests * fix unit tests * update * adjust test config * revert some redundant changes * fix * add a s3 integration test * fix tests * resolve comments * fix unit test * use c.Mkdir instead of os.TempDir for unit test * update br * backend: fix sample when split region size is small (#387) * fix sample for small value * fix local sample * disable cluster index * add a sleep after set global variables * decrease SlowDownImport sleep time * fix test * 
backend: use peer address as grpc addr for tiflash store (#392) * fix tiflash and add a test * update * use store.PeerAddress for tiflash * wait for tiflash longer * fix test * update tests README and set longer wait time for tiflash replica * loader: fix store.WalkDir return inaccurate file size for soft link source files (#394) * update br and fix move checkpoints * add integration test for soft link source file * fix test * fix test * use default config * fix unit test * fix chunk checkpoint may reset offset and row id (#395) * make tiflash test more stable (#397) * test: fix integration test for 3.x version (#390) * skip run with local backend for v3.x * add cluster index test for all 3 backends * fix * add check kv pairs count for common handle test * fix test checkpoint_error_destroy * fix common handle test * remove useless comments * fix common handle test * fix test * test: make start tiflash optional in integration test (#398) * make start tiflash optional for v3.0.0 cluster * fix local backend * restore: support restore apache parquet format source files (#373) * backend: fix load partition table with local backend (#402) * fix load partition table with local backend * fix log and typo * fix comment * fix integration test * fix parenthesis * lightning: support dynamically modifying the log level (#393) change the log level through the HTTP API Co-authored-by: 山岚 <[email protected]> * Check Lightning version when reusing checkpoint (#383) * checkpoints: check lightning version too * mydump: decrease the log level of file-route related logs * support new collation for kv encoder (#407) * support new collation * add unit test * remove useless code * fix multi task * add a comment * add a integration for new collation * fix test * resolve comments * mydump: support multi bytes csv delimiter and separator (#406) * more flexible csv * fix config and add unit test * remove useless code * fix unit test * use empty string for default quote * update comments in 
tidb-lightning.toml for separator and delimiter Co-authored-by: kennytm <[email protected]> * backend: always retry ingest and get region if it's retryble (#405) * retry get region if not region leader available * alway retry if get region return nil * always retry if get region returns nil Co-authored-by: 山岚 <[email protected]> Co-authored-by: kennytm <[email protected]> * Add license scan report and status (#399) Signed off by: fossabot <[email protected]> Co-authored-by: glorv <[email protected]> Co-authored-by: kennytm <[email protected]> * mydump: fix infinite loop in ExportStatement when Read() returns non-EOF (#414) * lightning: start the HTTP server when receiving SIGUSR1 (#415) Co-authored-by: glorv <[email protected]> * backend/tidb: fix issue 410 (#412) * config: fix error on `-d 'C:\Windows\Path'` (#411) * local: fix infinity loop in retry get region (#418) * fix infinity loop in retry get region * update br and parquet-go to fix #416 * update failpoint * update mock ExternalStorage * backend: speed up uploading by open multi TCP connections (#400) * backend: use uncached gRPC channels * backend: use connect pool of gRPCs. * backend: add a conns pool to local backend * backend: remove some unused logs * local: make connpool private * local: make init mutex... 
* local: address comments * post-restore: add optional level for post-restore operations (#421) * add optional level for opst-restore operations * trim leading and suffix '" * use UnmarshalTOML to unmarshal post restore op level * resolve comments and fix unit test * backend/local: do not retry epochNotMatch error when ingest sst (#419) * do not retry epochNotMatch error when ingest sst * add retry ingest for 'Raft raft: proposal dropped' error in ingest * change some retryable error log level from Error to Warn * fix nextKey * add a comment for nextKey * fix comment and add a unit test * wrap time.Sleep in select Co-authored-by: kennytm <[email protected]> * restore: disable some pd scheduler during restore (#408) * disable some pd scheduler during restore * fix br interface * update failpoint for tools * resolve comments * update br * fix context cancel cause restore scheduler failed error * update br * log: simplify some warn log and do retry write for epoch not match error (#425) * simplify some warn log * fix retry * fix comment * remove useless log * fix test (#426) * backend: fix a bug about wrong column info (#420) * fix a bug about wrong column info * add test Co-authored-by: kennytm <[email protected]> * restore: better estimate task remain time progress log (#377) * optimize progress * update Co-authored-by: 3pointer <[email protected]> Co-authored-by: kennytm <[email protected]> * checksum: use gc ttl api for checksum gc safepoint in v4.0 cluster (#396) * use gc ttl for checksum * fix snapshot ts * resolve a comment * fix unit test * resolve comments * add a unit test * split checksum into a separated file and fix comment * udpate * backend/tidb: add rebase auto id for tidb backend (#428) * add rebase autoid for tidb backend * add fetch auto id and a unit test * avoiding create checksum manager for tidb backend * fix unit test * reset the change auto id code since we can depend the logic in tidb side * also rebase auto random id * fix auto random * fix 
sql * fix sql * don't disable pd schedulers for import backend * simplify the codes * fix test * fix test * update mock * fix autoid for v4.0.0 (#430) * make: hide go.mod to resolve cyclic dependency with tidb (#439) * config: support encode PostOpLevel and Duration as input (#441) * config: support unmarshall number as PostOpLevel * implement MarshalText instead * Update lightning/config/config_test.go Co-authored-by: kennytm <[email protected]> * address comment Co-authored-by: kennytm <[email protected]> * restore: fix several bugs related to column permutations (#437) * fix multi error in column permutation * add unit test * add a integration test * rename test db * change log * dep: update uuid dependency to latest google/uuid (#452) * dep: update satori/go.uuid to latest * fix tests * change to google/uuid * fix build * try fix test * get familiar with google/uuid * address comment * tidb-lightning-ctl: change default of -d to 'noop://' (#453) also add noop:// to supported storage types (to represent an empty store) * restore: fix the bug that gc life time ttl does not take effect (#448) * fix gc ttl loop * resolve comment and add tests * config: filter out all system schemas by default (#459) * backend: fix auto random default value for primary key (#457) * fix auto generate auto random primary key column * fix default for auto random primary key * fix test * use prev row id for auto random and add a test * replace chunck with session opt * fix * fix * mydumper: fix parquet data parser (#435) * fix parquet * reorder imports * fix test * use empty collation * fix a error and add more test cases * add pointer type tests * resolve comments Co-authored-by: kennytm <[email protected]> * backend/local: use range properties to optimize region range estimate (#422) * use range propreties to estimate region range * post-restore: add optional level for post-restore operations (#421) * add optional level for opst-restore operations * trim leading and suffix '" * use 
UnmarshalTOML to unmarshal post restore op level * resolve comments and fix unit test * backend/local: do not retry epochNotMatch error when ingest sst (#419) * do not retry epochNotMatch error when ingest sst * add retry ingest for 'Raft raft: proposal dropped' error in ingest * change some retryable error log level from Error to Warn * fix nextKey * add a comment for nextKey * fix comment and add a unit test * wrap time.Sleep in select Co-authored-by: kennytm <[email protected]> * update * use range properties to optimze region range estimate * update pebble * change the default value for batch-size * add unit tests and reslove comments * add a comment to range properties test * add a comment * add a test for range property with pebble * rename const variable Co-authored-by: kennytm <[email protected]> * fix pd service id is empty (#460) * fix s3 parquet reader (#461) Co-authored-by: Neil Shen <[email protected]> * fix service gc ttl again (#465) * mydumper: verify file routing config (#470) * fix file routing * remove useless line * remove redundant if check * config: allow four byte-size config to be specified using human-readable units ("100 GiB") (#471) * Makefile: add `make finish-prepare` action * config: accept human-readable size for most byte-related config e.g. 
allow `region-split-size = '96M'` in additional to `= 100663296` (known issue: these values' precisions will be truncated to 53 bits instead of supporting all 63 bits) * restore: reduce chance of spurious errors from TestGcTTLManagerSingle Co-authored-by: glorv <[email protected]> * test: change double type syntax (#474) * restore: add `glue.Glue` interface and other function (#456) * save my work * add notes * save work * save work * fix unit test * remove tidbMgr in RestoreController * remove some comments * remove some comments * change logger in SQLWithRetry * revert replace log.Logger to *zap.Logger * replace tab to space * try another port to fix CI * remove some comment * *: more glue * report info to host TiDB * fix CI * address comment * address comment * rename a method in interface * save work * try fix CI * could work * change ctx usage * try fix CI * try fix CI * refine function interface * refine some fucntion interface * debug CI * address comment * remove debug log * address comment * glue: add GlueCheckpointDB and remove external TiDB usage (#478) * save my work add notes save work save work fix unit test remove tidbMgr in RestoreController remove some comments remove some comments change logger in SQLWithRetry revert replace log.Logger to *zap.Logger dep: update uuid dependency to latest google/uuid (#452) * dep: update satori/go.uuid to latest * fix tests * change to google/uuid * fix build * try fix test * get familiar with google/uuid * address comment tidb-lightning-ctl: change default of -d to 'noop://' (#453) also add noop:// to supported storage types (to represent an empty store) replace tab to space try another port to fix CI remove some comment *: more glue restore: fix the bug that gc life time ttl does not take effect (#448) * fix gc ttl loop * resolve comment and add tests fix CI report info to host TiDB config: filter out all system schemas by default (#459) backend: fix auto random default value for primary key (#457) * fix auto 
generate auto random primary key column * fix default for auto random primary key * fix test * use prev row id for auto random and add a test * replace chunck with session opt * fix * fix mydumper: fix parquet data parser (#435) * fix parquet * reorder imports * fix test * use empty collation * fix a error and add more test cases * add pointer type tests * resolve comments Co-authored-by: kennytm <[email protected]> address comment backend/local: use range properties to optimize region range estimate (#422) * use range propreties to estimate region range * post-restore: add optional level for post-restore operations (#421) * add optional level for opst-restore operations * trim leading and suffix '" * use UnmarshalTOML to unmarshal post restore op level * resolve comments and fix unit test * backend/local: do not retry epochNotMatch error when ingest sst (#419) * do not retry epochNotMatch error when ingest sst * add retry ingest for 'Raft raft: proposal dropped' error in ingest * change some retryable error log level from Error to Warn * fix nextKey * add a comment for nextKey * fix comment and add a unit test * wrap time.Sleep in select Co-authored-by: kennytm <[email protected]> * update * use range properties to optimze region range estimate * update pebble * change the default value for batch-size * add unit tests and reslove comments * add a comment to range properties test * add a comment * add a test for range property with pebble * rename const variable Co-authored-by: kennytm <[email protected]> fix pd service id is empty (#460) fix s3 parquet reader (#461) Co-authored-by: Neil Shen <[email protected]> fix service gc ttl again (#465) address comment mydumper: verify file routing config (#470) * fix file routing * remove useless line * remove redundant if check rename a method in interface save work try fix CI could work change ctx usage try fix CI try fix CI refine function interface refine some fucntion interface debug CI address comment config: allow 
four byte-size config to be specified using human-readable units ("100 GiB") (#471) * Makefile: add `make finish-prepare` action * config: accept human-readable size for most byte-related config e.g. allow `region-split-size = '96M'` in additional to `= 100663296` (known issue: these values' precisions will be truncated to 53 bits instead of supporting all 63 bits) * restore: reduce chance of spurious errors from TestGcTTLManagerSingle Co-authored-by: glorv <[email protected]> remove debug log test: change double type syntax (#474) address comment checkpoint: add glue checkpoint resolve cycle import expose Retry refine change interface to cope with TiDB fix SQL string fix SQL adjust interface to embedded in TiDB could import now reduce TLS restore: add `glue.Glue` interface and other function (#456) * save my work * add notes * save work * save work * fix unit test * remove tidbMgr in RestoreController * remove some comments * remove some comments * change logger in SQLWithRetry * revert replace log.Logger to *zap.Logger * replace tab to space * try another port to fix CI * remove some comment * *: more glue * report info to host TiDB * fix CI * address comment * address comment * rename a method in interface * save work * try fix CI * could work * change ctx usage * try fix CI * try fix CI * refine function interface * refine some fucntion interface * debug CI * address comment * remove debug log * address comment modify code add comment refine some code * address comment * add some comments * fix CI and change CREATE TABLE * *: replace context.Backend with app context (#468) * replace context.Backend with app context * remove tls.GetJSON * rename function name * fix test Co-authored-by: lance6716 <[email protected]> * restore: wait sub task finish before exit (#485) * wait sub task finish before exit * add a comment * mydumper: convert parquet columns to lower case (#479) * convert parquet columns to lower case * simplify the for loop Co-authored-by: lance6716 
<[email protected]> * test: fix an unstable integration test (#492) * mydumper: optimize parquet reader performance (#482) * get parquet row count faster * add a log * convert parquet columns to lower case * make calculate file regions in parallel * load full content to memory for small files * fix * fix * add file size in file meta * update comment * update * fix dead lock when met error * rename all Size in chunk checkpoint to FileSize * reset * restore: let the tikv checksum manager respect the DistSQLScanConcurrency (#483) Co-authored-by: 3pointer <[email protected]> * support restore view (#417) * support restore view * make router compatible * fix * don't genearte test files in run.sh * execute multi create table stmt in serial * fix unit test * fix test * resolve comments Co-authored-by: 3pointer <[email protected]> * backend/local: more robust range retry strategy (#476) * retry write&ingest range * more robust retry ranges * fix * simplify code and add a log Co-authored-by: lance6716 <[email protected]> Co-authored-by: kennytm <[email protected]> * backend/local: batch split region with batch limit (#487) * kill tiflash by name (#499) * .github: let challenge-bot recognize the default SIG (#498) Co-authored-by: lance6716 <[email protected]> * backend: support stored generated columns in local/importer backends (#505) * tests: fixed an invalid failpoint * backend: support stored generated expressions * tests: add generated columns test case * backend/local: check and return iter.Error when pebble is not valid (#497) * check and return iter.Error when pebble is not valid * replace errors.Annotatef with errors.Annotate * fix * add error check for iter.Last() Co-authored-by: lance6716 <[email protected]> * backend,restore: duplicate more important system variables from downstream (#508) * backend,restore: duplicate more important system variables from downstream * backend: fix test failure * tests: enhance the gencol test case to include sysvar-dep exprs * 
tests/auto_random_default: relax the check * tests/generated_columns: add a retry loop to ensure sys vars are changed * backend: fix invalid gencol sort algorithm * tests/generated_columns: disable the week test since tidb is buggy Co-authored-by: glorv <[email protected]> * backend/local: set pebble db max file limit (#501) * set pebble db max file limit * fix test * fix imports * reslove review comments * add GetSystemRLimit for windows * mydump: fix issue 519 (#521) * backend/local: remove useless and buggy truncate key (#516) * restore: apply adjust max-pending-peer-count when stop pd schedulers (#517) * backend/local: fix next key (#523) * fix next key * fix integration test * backend: import planner/core package to initialize expression.RewriteAstExpr (#526) * *: add some error description (#527) * test: fix unstable integration test auto_random_default (#529) * post-process: support run table analyze after all tables are finished (#509) * batch split with limit * batch split with limit * update * add log with split region failed * set batch split size to 2048 * add delay to retry split region * set outer loop retry split regions to a bigger value * update * add retry for region scatter * update br * wait some time before retry scatter region * add start/end key to log if scan region failed * update br * fix session * work around a panic * fix unit test * support analyze at last * fix * fix * fix * better naming and add some comments Co-authored-by: lance6716 <[email protected]> * encode retry split key (#531) Signed-off-by: glorv <[email protected]> * restore: fix error lost in create schema (#530) * mydumper: update br to apply auto retry s3 read error (#533) * Sort index rather than insert it into skiplist (#520) * use large writebatch for index engine Signed-off-by: Little-Wallace <[email protected]> * pass more data Signed-off-by: Little-Wallace <[email protected]> * fix mock Signed-off-by: Little-Wallace <[email protected]> * fix data race 
Signed-off-by: Little-Wallace <[email protected]> * fix test Signed-off-by: Little-Wallace <[email protected]> * fix max file opens Signed-off-by: Little-Wallace <[email protected]> * fix sst path bug Signed-off-by: Little-Wallace <[email protected]> * close datawriter Signed-off-by: Little-Wallace <[email protected]> * fix err get Signed-off-by: Little-Wallace <[email protected]> * flush after index writer close Signed-off-by: Little-Wallace <[email protected]> * less memory Signed-off-by: Little-Wallace <[email protected]> * fix property Signed-off-by: Little-Wallace <[email protected]> * rever deliverLoop encoding Signed-off-by: Little-Wallace <[email protected]> * add test Signed-off-by: Little-Wallace <[email protected]> * fix test Signed-off-by: Little-Wallace <[email protected]> * refactor to reduce copy memory Signed-off-by: Little-Wallace <[email protected]> * move function position Signed-off-by: Little-Wallace <[email protected]> * fix test Signed-off-by: Little-Wallace <[email protected]> * fix sstDir Signed-off-by: Little-Wallace <[email protected]> * fix fmt Signed-off-by: Little-Wallace <[email protected]> * clear writebatch Signed-off-by: Little-Wallace <[email protected]> * do not create sst writer if keys is too small Signed-off-by: Little-Wallace <[email protected]> * revert indexEngineID Signed-off-by: Little-Wallace <[email protected]> * revert irrelevant changes Signed-off-by: Little-Wallace <[email protected]> * fix comment Signed-off-by: Little-Wallace <[email protected]> * close when err occurs Signed-off-by: Little-Wallace <[email protected]> * add mock method call Signed-off-by: Little-Wallace <[email protected]> * *: redact log and error messages, add log-redact parameter (#538) * add --redact-log parameter and redact sensitive log * remove sensitive info in error * mydumper: do not remove more than 1 sep if trim last sep is true (#535) * restore: add error retry for checksum by tikv (#537) * add error retry for checksum by tikv * 
resolve comments * add retryable error check * post-process: allow run checksum at last and restrict the number of checksum jobs (#540) * restore: don't change TiDB config to support lightning via SQL (#545) * restore: check row value count to avoid unexpected encode result (#528) * check row value count to avoid unexpected encode result * check the '_tidb_row_id' field * resolve comments * fix issue related to '_tidb_rowid' and move column count to tidb encoder * add tidb_opt_write_row_id session var * fix test * resolve comments * update tidb to apply tidb#22062 * fix test * restore: Try to create tables in parallel (#502) * restore: try to create tables in parallel * glue: fix error condition test for db close * restore, glue: remove duplicate db pool implementation * restore: try to prevent db connection too early * restore: try to prevent db connection too early * restore: try to make restore schema run parallel totally * restore: remove impl&test of tidb#InitSchema * restore: make restore schema run parallelly * restore: remove db connection control * restore: a little change of restore schema schedule * wip * restore: keep restore schema job hold the same session(DB connection) * restore: fix log message error * restore: remove purpose array for `restoreSchemaWorker` more details https://github.com/pingcap/tidb-lightning/pull/502#discussion_r542213517 * restore: remove useless sql mode set code for more https://github.com/pingcap/tidb-lightning/pull/502#discussion_r542207095 * restore: restore view statements run after database|table created for more: https://github.com/pingcap/tidb-lightning/pull/502#discussion_r542218427 * restore: interrupt job producing when error happens * util: add SQLDriver interface * restore: make sure single restore schema job vs. 
single db session * restore: run restore view schema statements in txn * glue: add checkpoints.Session implementation(sqlConnSession) * restore: close whole database connections after restore schema done * restore: revert remove of `InitSchema` for more: https://github.com/pingcap/tidb-lightning/pull/502#discussion_r542208980 * glue: return a new error when sqlConnSesson.CommitTxn called * Revert "util: add SQLDriver interface" This reverts commit 3e2cc16b1037ea4dafdfc2cf0156ed4951f7a64f. * glue: update GetSession(context.Context) for Glue interface * glue: disable more methods of sqlConnSession * restore: disable implicit initiation of `sync.WaitGroup` * restore: cancel nil error throw when restore schema done * restore: replace session map to pool * restore: keep restore table statements ordered * restore: assign single session to `restoreSchemaWorker#doJob`'s goroutine * restore: add log for `restoreSchema` * restore: add quit case when error thrown blocked * restore: Improve the robustness of concurrency pattern * restore: fix channel send/recv logic to avoid blocked forever occurs. 
* restore: `sync.WaitGroup#Add` first when `restoreSchemaWorker#appendJob` * restore: add impl of `schemaStmtType#String` * restore: avoid to wait whole jobs done forever when goroutine of `doJob` exit unnormal * restore: call cancel function when `makeJobs` exit * restore: a few improvement * test: add unit tests of `RestoreController#restoreSchema()` Co-authored-by: lance6716 <[email protected]> Co-authored-by: glorv <[email protected]> * Compatible for disk quota (#543) * compatible for disk quota Signed-off-by: Little-Wallace <[email protected]> * unlock when err occur Signed-off-by: Little-Wallace <[email protected]> * fix delete Signed-off-by: Little-Wallace <[email protected]> * fix method name Signed-off-by: Little-Wallace <[email protected]> Co-authored-by: kennytm <[email protected]> Co-authored-by: glorv <[email protected]> * *: add method to check whether need local SST of table (#491) * backend/local: skip split regions if engine total size is smaller than region size (#524) * config: change redact log parameter name (#547) * change redact log parameter name * address comment * update lightning.toml * backend/tidb: temporarily disable the strict-mode value check in tidb backend (#551) * temporarily disable the strict-mode value check in tidb backend since it's buggy * add link to related issue * test: fix invalid failpoint and integration test (#510) * test: fix s3 integration test (#555) * grafana dashboards support multiple cluster (#556) * metrics: use tidb_cluster label get variable values (#559) Signed-off-by: zhengjiajin <[email protected]> * restore: add importing progress and optimize the accuracy of restore progress (#506) * backend/local: fallback retryIngest to retryWrite (#554) * backend: implement disk quota (#493) * common: copied the GetStorageSize function from DM * common: recognize multierr in IsRetryableError() * restore: refactor runPeriodicActions * config: …
Feature Request
Describe your feature request related problem:
For example, if the backup cluster and the restore target TiDB cluster have different collation configurations, br restore fails, as expected. However, the user sees only a checksum failure in the log and cannot tell that it was caused by the mismatched collation configuration.
Describe the feature you'd like:
br should fail fast in all similar scenarios, that is, whenever the restore target's settings are incompatible with the backup.
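To make the request concrete, here is a minimal sketch of what such a pre-restore check could look like. It is not br's actual code: the `ClusterSettings` type, the `CheckCompatible` function, and the `new_collation_enabled` key are hypothetical names used only for illustration, and it assumes the relevant settings were recorded at backup time and can be read from the target cluster before any data is written.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ClusterSettings maps a cluster-wide setting name to its value.
// Hypothetical type for illustration; br does not necessarily model
// settings this way.
type ClusterSettings map[string]string

// CheckCompatible compares the settings recorded at backup time with those of
// the restore target. It returns an error that names every mismatched
// setting, so the restore can abort before any data is written instead of
// failing later with an unexplained checksum mismatch.
func CheckCompatible(backup, target ClusterSettings) error {
	var mismatches []string
	for name, want := range backup {
		got, ok := target[name]
		switch {
		case !ok:
			mismatches = append(mismatches,
				fmt.Sprintf("%s: backup=%q, target=<unset>", name, want))
		case got != want:
			mismatches = append(mismatches,
				fmt.Sprintf("%s: backup=%q, target=%q", name, want, got))
		}
	}
	if len(mismatches) == 0 {
		return nil
	}
	// Sort so the error message is deterministic regardless of map order.
	sort.Strings(mismatches)
	return fmt.Errorf("incompatible cluster settings: %s",
		strings.Join(mismatches, "; "))
}
```

Calling `CheckCompatible` at the start of the restore and returning its error immediately would give the user the name of the offending setting rather than a checksum failure at the end. Which settings belong in the check (collation is the only one named here) is still an open question.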
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Migration Strategy: