
*: TiFlash support in Reorganize Partition | tidb-test=pr/2069 #40715

Closed

Conversation

mjonss (Contributor) commented Jan 18, 2023

What problem does this PR solve?

Implementing Reorganize Partition specifics for TiFlash

Issue Number: close #38535

Problem Summary:

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

mjonss and others added 30 commits December 20, 2022 20:36
ti-chi-bot (Member) commented Jan 18, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • tangenta

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 18, 2023
@mjonss mjonss changed the title *: TiFlash support in Reorganize Partition *: TiFlash support in Reorganize Partition | tidb-test=pr/2069 Jan 18, 2023
@ti-chi-bot ti-chi-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 19, 2023
mjonss (Contributor, Author) commented Jan 20, 2023

I have manually tested this PR together with pingcap/tiflash#6428, and the data syncs to TiFlash in the new partitions as soon as they are added to the AddingDefinitions when going to StateDeleteOnly.

@@ -2373,20 +2398,25 @@ func (w *worker) onReorganizePartition(d *ddlCtx, t *meta.Meta, job *model.Job)
 // For available state, the new added partition should wait its replica to
 // be finished, otherwise the query to this partition will be blocked.
 count := tblInfo.TiFlashReplica.Count
-needRetry, err := checkPartitionReplica(count, addingDefinitions, d)
+needRetry, err := checkPartitionReplica(count, count, addingDefinitions, d)
Member:

So if one partition replica is not available, then the DDL will block and wait? Why do we make the change from 1 to all?

mjonss (Contributor, Author):

I wanted to avoid the situation where a table has 3 replicas and, after the DDL, the new partitions only have 1, without any warnings etc.
If that is OK and expected, then I can remove the check for all replicas and keep it to only checking for one replica.

Member:

@hehechen I think we can take partition progress into account in the current code base. That means, given a 3-replica table with 1 partition, if only 1 replica is available, we can see available=true and progress=0.3333.
If I am right, then I think both the original "1 replica" policy and the "all replicas" policy in this PR are acceptable, while the best way would be to somehow reuse the function in ddl_tiflash.go. So what do you think about this?

Contributor:

I think it is OK to keep checking only one replica, because when only one of the three replicas is ready, the progress will become 0.33.
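
For illustration, a minimal self-contained Go sketch of the two policies debated above: the original "at least one replica" check versus the "all replicas" check implied by checkPartitionReplica(count, count, ...). The type and function names (partitionReplicaStatus, needRetryForReplicas) are invented for this sketch and are not the real TiDB implementation.

package main

import "fmt"

// partitionReplicaStatus records how many TiFlash replicas of one newly added
// partition have finished syncing.
type partitionReplicaStatus struct {
	PartitionID       int64
	AvailableReplicas uint64
}

// needRetryForReplicas reports whether the DDL step should keep waiting.
// minAvailable = 1 models the original "one replica is enough" policy;
// minAvailable = total replica count models the "all replicas" policy.
func needRetryForReplicas(minAvailable uint64, parts []partitionReplicaStatus) bool {
	for _, p := range parts {
		if p.AvailableReplicas < minAvailable {
			return true // some new partition is not ready yet, wait and retry
		}
	}
	return false
}

func main() {
	// A 3-replica table where only 1 replica of the new partition has synced,
	// i.e. the "available=true, progress=0.33" situation from the review.
	parts := []partitionReplicaStatus{{PartitionID: 100, AvailableReplicas: 1}}

	fmt.Println(needRetryForReplicas(1, parts)) // false: proceed while progress shows 0.33
	fmt.Println(needRetryForReplicas(3, parts)) // true: block until all replicas have synced
}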

@@ -1813,6 +1822,22 @@ func (w *worker) onDropTablePartition(d *ddlCtx, t *meta.Meta, job *model.Job) (
 		return ver, errors.Trace(err)
 	}
 }
+if tblInfo.TiFlashReplica != nil {
CalvinNeo (Member) commented Jan 20, 2023:

We delete every id in physicalTableIDs from AvailablePartitionIDs.
Could you explain a bit more? I didn't get the idea here...

mjonss (Contributor, Author):

I could not find that dropped partitions were ever removed from the AvailablePartitionIDs, so I added it here, since when dropping (or cleaning up after a failed add or reorganize partition) those IDs should no longer be in that list.
Not sure if this explains it?
Or how is the AvailablePartitionIDs list otherwise handled in this case?

Member:

I think I got your point, and I think it makes sense.
However, I am wondering why the system could work before this and what will change after this. I have not modified this code, and I found it already exists in release-5.4, which is a very old version, so it has actually worked for a long time.
IMO, it can work because this is a drop action: a partition is dropped and will not be accessed afterwards. However, its data will not be deleted until the GC worker removes it. During that time gap, if we must access the partition, it is actually still available in TiFlash. And if we do a flashback, it will be available again (maybe some tests could be added here). So that is why the author wrote it like this.
I think it would be better if we could ask the author for a confirmation.

Member:

I think it would be better if we could ask the author for a confirmation.

@crazycs520 Could you please take a look?

Contributor:

I think it's reasonable to delete the dropped partition IDs from TiFlashReplica.AvailablePartitionIDs here. You can find similar logic in the onTruncateTablePartition function too.

I didn't delete the dropped partition IDs from tblInfo.TiFlashReplica.AvailablePartitionIDs in the onDropTablePartition function before, because it didn't affect correctness. I think this was an oversight on my part, even if it didn't affect correctness.

BTW, I think the name of the variable physicalTableIDs here is misleading: physicalTableIDs actually only contains the IDs of the partitions being dropped, not all partitions' IDs.
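
For context, a minimal self-contained Go sketch of the clean-up being discussed: removing the IDs of the partitions being dropped from TiFlashReplica.AvailablePartitionIDs. The struct below only mirrors the fields relevant here and is not the real model definition; removeDroppedPartitionIDs is an invented helper name for this sketch.

package main

import "fmt"

// tiFlashReplicaInfo mirrors only the fields needed for this sketch.
type tiFlashReplicaInfo struct {
	Count                 uint64
	AvailablePartitionIDs []int64
}

// removeDroppedPartitionIDs filters the partitions being dropped out of the
// available-partition list, so they no longer count as available in TiFlash.
func removeDroppedPartitionIDs(replica *tiFlashReplicaInfo, droppingIDs []int64) {
	if replica == nil {
		return
	}
	dropping := make(map[int64]struct{}, len(droppingIDs))
	for _, id := range droppingIDs {
		dropping[id] = struct{}{}
	}
	kept := replica.AvailablePartitionIDs[:0]
	for _, id := range replica.AvailablePartitionIDs {
		if _, ok := dropping[id]; !ok {
			kept = append(kept, id)
		}
	}
	replica.AvailablePartitionIDs = kept
}

func main() {
	replica := &tiFlashReplicaInfo{Count: 3, AvailablePartitionIDs: []int64{100, 101, 102}}
	removeDroppedPartitionIDs(replica, []int64{101}) // partition 101 is being dropped
	fmt.Println(replica.AvailablePartitionIDs)       // [100 102]
}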

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jan 20, 2023
	if err != nil {
		// need to rollback, since we tried to register the new
		// partitions before!
		return convertAddTablePartitionJob2RollbackJob(d, t, job, err, tblInfo)
	}
	// Try for 10 rounds (in case of transient TiFlash issues)
Contributor:

Why try for 10 rounds instead of tidb_ddl_error_count_limit?

mjonss (Contributor, Author) commented Jan 31, 2023:

It is just a number taken from thin air. My idea is that we should not block too long on TiFlash issues, but if it takes some time, we will skip the wait and add an entry to the error log, so that the DDL can continue at least with the TiKV data. I would assume that if TiFlash cannot replicate the new empty regions, then it will also have problems replicating the current table's regions.
@hehechen what would you suggest? I can remove this, so it will fail after tidb_ddl_error_count_limit. Or should it wait for tidb_ddl_error_count_limit errors and then proceed (including resetting the count)?

Contributor:


If job.ErrorCount > 10, will it return convertAddTablePartitionJob2RollbackJob and then onDropTablePartition (line 2243)?

Member:

If job.ErrorCount > 10, will it return convertAddTablePartitionJob2RollbackJob and then onDropTablePartition (line 2243)?

@hehechen Yes.

Did we ever encounter a case where "we got some problems when creating the TiFlash replica, which blocks the DDL and makes it retry a lot of times"?

mjonss (Contributor, Author):

There is an alternative PR without this change of logic: #42082. Maybe we should merge that one instead, to get all the tests in, and wait with extending the logic until we actually see any issues?
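
For illustration, a self-contained Go sketch of the bounded-wait idea from this thread: retry waiting for TiFlash for a fixed number of rounds, then log and let the DDL continue with TiKV data only, instead of rolling the job back after tidb_ddl_error_count_limit errors. ddlJob, handleReplicaWait, and maxTiFlashWaitRounds are invented names for this sketch, not the real DDL worker code.

package main

import (
	"fmt"
	"log"
)

// ddlJob mimics only the error counter of a real DDL job.
type ddlJob struct {
	ErrorCount int64
}

const maxTiFlashWaitRounds = 10 // the "number taken from thin air"

// handleReplicaWait returns true while we are still willing to wait for the
// TiFlash replica; after maxTiFlashWaitRounds it logs and lets the DDL
// continue without the replica being ready.
func handleReplicaWait(job *ddlJob, replicaReady bool) bool {
	if replicaReady {
		return false
	}
	if job.ErrorCount >= maxTiFlashWaitRounds {
		log.Printf("TiFlash replica not ready after %d rounds, continuing without waiting", maxTiFlashWaitRounds)
		return false
	}
	job.ErrorCount++ // the framework would normally bump this on a retryable error
	return true
}

func main() {
	job := &ddlJob{}
	rounds := 0
	for handleReplicaWait(job, false) { // the replica never becomes ready in this demo
		rounds++
	}
	fmt.Println("stopped waiting after", rounds, "rounds") // stopped waiting after 10 rounds
}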

mjonss (Contributor, Author) commented May 8, 2023

Only add the tests instead, in #42082.

@mjonss mjonss closed this May 8, 2023
Labels
  • release-note-none: Denotes a PR that doesn't merit a release note.
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.
  • status/LGT1: Indicates that a PR has LGTM 1.