Analyze table timeout in lightning #447
Comments
There are some strange lines in the TiKV log showing that TiKV found several extremely large regions that need to be split:
Lightning has ingested a very large number of SST files into this region.
But the ingested ranges have no overlap:
From the Lightning log, we can see that the related engine started importing and splitting regions quite early:
There are also no "partial ingest finish" or "ingest failed" log entries for this engine range, and from the TiKV log, we can see the region
So I think the root cause is that Lightning v4.0.7 doesn't support disabling schedulers. When an engine import is quite slow, the split regions are automatically merged again after 1 hour because they are still empty. Lightning then ingests the whole range into the newly merged region.
So I think this issue should be fixed by #408. @shuijing198799, could you please retry this test with the newest Lightning?
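To make the suspected sequence concrete, here is a minimal toy model of it. This is only a sketch, not Lightning or PD code: the `Region` class, `merge_empty_regions`, `ingest`, the integer keys, and the 96 MB SST size are all hypothetical, and the one-hour threshold is simply taken from the comment above.

```python
from dataclasses import dataclass

# Assumption: empty regions become merge candidates once this long has passed
# since they were split (the "1 hour" mentioned in the comment above).
SPLIT_MERGE_INTERVAL_S = 3600


@dataclass
class Region:
    start: int        # inclusive start key (toy integer keys)
    end: int          # exclusive end key
    split_at: float   # time the region was created by a split, in seconds
    size_mb: int = 0  # data ingested so far


def merge_empty_regions(regions, now):
    """Merge adjacent regions that are still empty and older than the interval."""
    merged = []
    for r in sorted(regions, key=lambda x: x.start):
        prev = merged[-1] if merged else None
        mergeable = (
            prev is not None
            and prev.size_mb == 0
            and r.size_mb == 0
            and now - prev.split_at >= SPLIT_MERGE_INTERVAL_S
            and now - r.split_at >= SPLIT_MERGE_INTERVAL_S
        )
        if mergeable:
            merged[-1] = Region(prev.start, r.end, min(prev.split_at, r.split_at))
        else:
            merged.append(r)
    return merged


def ingest(regions, key_ranges, sst_size_mb):
    """Add each SST's size to the region that contains its start key."""
    for start, _end in key_ranges:
        for r in regions:
            if r.start <= start < r.end:
                r.size_mb += sst_size_mb
                break


def pre_split():
    # The engine range [0, 400) is split into four empty regions at t=0.
    return [Region(i, i + 100, split_at=0.0) for i in range(0, 400, 100)]


ssts = [(0, 100), (100, 200), (200, 300), (300, 400)]  # non-overlapping ranges

# Fast import: ingest happens before the merge interval has elapsed.
fast = merge_empty_regions(pre_split(), now=600)
ingest(fast, ssts, sst_size_mb=96)

# Slow import: over an hour passes while the scheduler is still enabled.
slow = merge_empty_regions(pre_split(), now=4000)
ingest(slow, ssts, sst_size_mb=96)

print([r.size_mb for r in fast])
print([r.size_mb for r in slow])
```

In the fast case each pre-split region receives one SST. In the slow case the four empty regions have collapsed into one before ingest, so every SST lands in the same region even though the ingested ranges do not overlap, which matches the oversized regions seen in the TiKV log.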
There are two bugs in lightning that together will cause this issue:
With Lightning v4.0.7, this issue occurred in 3 out of 4 restores of 100G of data using 8C and 200G EBS. With v4.0.8 and the same data, the problem did not occur in the next 3 restores. @glorv has also given the root cause for this issue, so we can be confident that it has been fixed.
Bug Report
Please answer these questions before submitting your issue. Thanks!
I used Lightning to restore 100G of data from S3. At the end of the restore process, Lightning reported an analyze timeout error.
What did you expect to see?
restore successful
What did you see instead?
nothing
Versions of the cluster
TiDB-Lightning version (run tidb-lightning -V):
Operation Log
lightning (2).log
Configuration of the cluster and the task
tidb-lightning.toml for TiDB-Lightning if possible
tikv-importer.toml for TiKV-Importer if possible
inventory.ini if deployed by Ansible