*: TiFlash support in Reorganize Partition | tidb-test=pr/2069 #40715
Conversation
…nto reorg-part-tiflash-and-update-table
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review. The full list of commands accepted by this bot can be found here. Reviewers can indicate their review by submitting an approval review.
I have manually tested this PR together with pingcap/tiflash#6428, and the data syncs to TiFlash in the new partitions as soon as they are added to the AddingDefinitions when going to StateDeleteOnly.
@@ -2373,20 +2398,25 @@ func (w *worker) onReorganizePartition(d *ddlCtx, t *meta.Meta, job *model.Job)
// For available state, the new added partition should wait its replica to
// be finished, otherwise the query to this partition will be blocked.
count := tblInfo.TiFlashReplica.Count
- needRetry, err := checkPartitionReplica(count, addingDefinitions, d)
+ needRetry, err := checkPartitionReplica(count, count, addingDefinitions, d)
So if one partition replica is not available, the DDL will block and wait? Why did we change this from 1 replica to all?
I wanted to avoid the situation where a table has 3 replicas but, after the DDL, the new partitions only have 1, without any warnings etc.
If that is OK and expected, then I can remove the check for all replicas and keep it checking only one replica.
@hehechen I think we can take partition progress into account in the current code base. That means, given a 3-replica table with 1 partition, if only 1 replica is available so far, we can see available=true and progress=0.3333.
If I am right, then both the original "1 replica" policy and the "all replicas" policy in this PR are acceptable, while the best way would be to somehow reuse the function in ddl_tiflash.go. So what do you think about this?
I think it is OK to keep checking only one replica, because when only one of the three replicas is ready, the progress will become 0.33.
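To make the trade-off concrete, here is a minimal, self-contained Go sketch of the two policies discussed above. It is purely illustrative: the type and both functions are hypothetical stand-ins, not TiDB's real checkPartitionReplica or progress-reporting code.

```go
package main

import "fmt"

// partitionReplicaStatus is a hypothetical stand-in for the per-partition
// replica information; it is not the real TiDB type.
type partitionReplicaStatus struct {
	PartitionID       int64
	AvailableReplicas uint64
}

// checkReplicas sketches the "wait for needCount replicas" policy:
// needCount == 1 returns as soon as any replica of each new partition is
// ready; needCount == total waits for all of them.
func checkReplicas(needCount uint64, parts []partitionReplicaStatus) (needRetry bool) {
	for _, p := range parts {
		if p.AvailableReplicas < needCount {
			return true // at least one new partition is not ready yet
		}
	}
	return false
}

// progress mirrors the observation above: with 3 configured replicas and
// only 1 available, availability can be true while progress is 1/3.
func progress(totalReplicas uint64, parts []partitionReplicaStatus) float64 {
	var available uint64
	for _, p := range parts {
		available += p.AvailableReplicas
	}
	return float64(available) / float64(totalReplicas*uint64(len(parts)))
}

func main() {
	parts := []partitionReplicaStatus{{PartitionID: 100, AvailableReplicas: 1}}
	fmt.Println(checkReplicas(1, parts))      // false: one replica is enough
	fmt.Println(checkReplicas(3, parts))      // true: still waiting for all 3
	fmt.Printf("%.4f\n", progress(3, parts))  // 0.3333
}
```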
@@ -1813,6 +1822,22 @@ func (w *worker) onDropTablePartition(d *ddlCtx, t *meta.Meta, job *model.Job) (
return ver, errors.Trace(err)
}
}
+ if tblInfo.TiFlashReplica != nil {
We delete every id in physicalTableIDs from AvailablePartitionIDs.
Could you explain a bit more? I didn't get the idea here...
I could not find that dropped partitions were ever removed from AvailablePartitionIDs, so I added it here, since when dropping (or when cleaning up after a failed add or reorganize partition) those ids should no longer be in that list.
Not sure if this explains it?
Or how is the AvailablePartitionIDs list handled in this case?
I think I got your point, and I think it makes sense.
However, I am wondering why the system could work before this and what will change after it. I have not modified this code, and I found it already exists in release-5.4, which is a very old version, so it has actually worked for a long time.
IMO, it works because this is a drop action: a partition is dropped and will not be accessed afterwards. However, its data will not be deleted until the GC worker removes it. During that time gap, if we must access the partition, it is actually still available in TiFlash. And if we do a flashback, it will be available again (maybe some tests could be added here). So the author wrote it like this.
I think it would be better if we could ask the author for a confirmation.
> I think it would be better if we could ask the author for a confirmation.

@crazycs520 Could you please take a look?
I think it's reasonable to delete the dropped partition id from TiFlashReplica.AvailablePartitionIDs here. You can find similar logic in the onTruncateTablePartition function too.
I didn't delete the dropped partition id from tblInfo.TiFlashReplica.AvailablePartitionIDs in the onDropTablePartition function before because it didn't affect correctness. I think this was my previous oversight, even though it didn't affect correctness.
BTW, I think the name of the variable physicalTableIDs here is misleading: physicalTableIDs actually only contains the ids of the partitions being dropped, not all partitions' IDs.
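As an illustration of the fix being discussed, here is a minimal Go sketch of filtering the dropped partition ids out of the available list. The helper name and shape are hypothetical and simplified; it is not the PR's actual onDropTablePartition code.

```go
package main

import "fmt"

// removeDroppedPartitionIDs returns availableIDs with every id in droppedIDs
// filtered out. droppedIDs plays the role of the (misleadingly named)
// physicalTableIDs slice in the PR: only the partitions being dropped, not
// all of the table's partitions.
func removeDroppedPartitionIDs(availableIDs, droppedIDs []int64) []int64 {
	dropped := make(map[int64]struct{}, len(droppedIDs))
	for _, id := range droppedIDs {
		dropped[id] = struct{}{}
	}
	kept := availableIDs[:0] // filter in place, reusing the backing array
	for _, id := range availableIDs {
		if _, ok := dropped[id]; !ok {
			kept = append(kept, id)
		}
	}
	return kept
}

func main() {
	available := []int64{101, 102, 103, 104}
	dropping := []int64{102, 104}
	fmt.Println(removeDroppedPartitionIDs(available, dropping)) // [101 103]
}
```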
…nto reorg-part-tiflash
if err != nil {
	// need to rollback, since we tried to register the new
	// partitions before!
	return convertAddTablePartitionJob2RollbackJob(d, t, job, err, tblInfo)
}
// Try for 10 rounds (in case of transient TiFlash issues)
Why try for 10 rounds instead of tidb_ddl_error_count_limit?
It is just a number taken from thin air. My idea is that we should not block too long on TiFlash issues, but if it takes some time, we will skip the wait and add an entry to the error log, so that the DDL can continue at least with the TiKV data. I would assume that if TiFlash cannot replicate the new empty regions, then it will also have problems with replicating the current tables' regions.
@hehechen what would you suggest? I can remove this, so it will fail after tidb_ddl_error_count_limit errors. Or should it wait for tidb_ddl_error_count_limit errors and then proceed (including resetting the count)?
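As a rough illustration of the "give up after N rounds but let the DDL continue" idea described above, here is a hypothetical Go sketch. It is not the PR's actual code (which works through job.ErrorCount and checkPartitionReplica); the function and its callback are made up for illustration.

```go
package main

import (
	"fmt"
	"log"
)

const maxReplicaWaitRounds = 10 // the "number taken from thin air" above

// waitForReplicasOrSkip retries the replica check up to maxReplicaWaitRounds
// times; after that it logs the problem and lets the DDL continue with TiKV
// data only, rather than failing the job via tidb_ddl_error_count_limit.
func waitForReplicasOrSkip(round int, replicasReady func() (bool, error)) (retry bool, err error) {
	ready, err := replicasReady()
	if err != nil {
		return false, err // a hard error still fails the job (and may trigger rollback)
	}
	if ready {
		return false, nil
	}
	if round >= maxReplicaWaitRounds {
		log.Println("TiFlash replicas still unavailable; continuing DDL with TiKV data only")
		return false, nil // skip the wait instead of blocking the DDL forever
	}
	return true, nil // ask the DDL framework to retry this step later
}

func main() {
	notReady := func() (bool, error) { return false, nil }
	for round := 0; ; round++ {
		retry, err := waitForReplicasOrSkip(round, notReady)
		if err != nil {
			fmt.Println("failing job:", err)
			return
		}
		if !retry {
			fmt.Println("proceeded at round", round) // proceeded at round 10
			return
		}
	}
}
```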
> It is just a number taken from thin air. My idea is that we should not block too long on TiFlash issues, but if it takes some time, we will skip the wait and add an entry to the error log, so that the DDL can continue at least with the TiKV data. I would assume that if TiFlash cannot replicate the new empty regions, then it will also have problems with replicating the current tables' regions. @hehechen what would you suggest? I can remove this, so it will fail after tidb_ddl_error_count_limit errors. Or should it wait for tidb_ddl_error_count_limit errors and then proceed (including resetting the count)?

If job.ErrorCount > 10, will it return convertAddTablePartitionJob2RollbackJob and then onDropTablePartition (Line 2243)?
> If job.ErrorCount > 10, will it return convertAddTablePartitionJob2RollbackJob and then onDropTablePartition (Line 2243)?

@hehechen Yes.
Did we ever encounter a case where 'we got some problems when creating the TiFlash replica, which blocked the DDL and made it retry many times'?
There is an alternative PR, without this change of logic: #42082. Maybe we should merge that one instead, to get all the tests in, and wait with extending the logic until we actually see any issues?
Only add the tests instead, in #42082.
What problem does this PR solve?
Implementing Reorganize Partition specifics for TiFlash
Issue Number: close #38535
Problem Summary:
What is changed and how it works?
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.