Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update sync-diff-inspector to perform row-level comparison when table checksums match #784

Closed

Conversation

michaelmdeng
Copy link
Contributor

@michaelmdeng michaelmdeng commented May 4, 2024

What problem does this PR solve?

Current behavior is to perform row-level data comparison only when table checksums don't match and when user requests DML generation. Given #634, there are possible checksum collisions that cause sync-diff-inspector to return false negatives where it considers two different tables to be equal. In these scenarios, sync-diff-inspector will not compare rows and will mistakenly determine tables to be equal.

A checksum can only identify true negatives, ie. two tables are certainly different, and cannot determine true positives, ie. two tables are certainly equal. Thus a more desirable behavior is to use the checksum to remove the need for more expensive row-level data comparison when a simpler/faster checksum can already tell us the tables are different.

However, in the common case that the user wants to use sync-diff-inspector to generate DML statements when the tables are different, we need to perform row-level comparison anyway.

Issue Number: close #634

What is changed and how it works?

Thus, we should perform row-level data comparison if user requests DML generation or if the checksums match. The only case where we shouldn't compare row data is when the user does not desire DML and if the checksums don't match (confirming tables are different w/out data check).

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Generate checksum collision

create table test_1.test(id int primary key,b blob);
insert into test_1.test values (1, x'FF00'), (2, x'00FF');
create table test_2.test(id int primary key,b blob);
insert into test_2.test values (1, x'00FF'), (2, x'FF00');

Confirm that current state considers tables equal. Confirm changed state considers tables different

Code changes

  • Has exported function/method change
  • Has exported variable/fields change
  • Has interface methods change
  • Has persistent data change

Side effects

  • Possible performance regression
  • Increased code complexity
  • Breaking backward compatibility

This change will run row-level data comparison in more cases than previous, which can be more resource-intensive than simply checksumming. However this is in exchange for more correctness in cases where checksum collisions occur.

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation
  • Need to be included in the release note

Copy link

ti-chi-bot bot commented May 4, 2024

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link

ti-chi-bot bot commented May 4, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign joccau for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@CLAassistant
Copy link

CLAassistant commented May 4, 2024

CLA assistant check
All committers have signed the CLA.

Copy link

ti-chi-bot bot commented May 4, 2024

Welcome @michaelmdeng!

It looks like this is your first PR to pingcap/tidb-tools 🎉.

I'm the bot to help you request reviewers, add labels and more, See available commands.

We want to make sure your contribution gets all the attention it needs!



Thank you, and welcome to pingcap/tidb-tools. 😃

@lyzx2001
Copy link

lyzx2001 commented May 7, 2024

/retest

@lyzx2001
Copy link

lyzx2001 commented May 7, 2024

Although this pr can fix the false positive issue, the row-level comparison will cause huge performance regression, as it abandons the distributed computing in tikv.

We do not prefer to fix an issue with very low probability to happen at the cost of performance regression for most cases.

For now, we may still use this pr #707 to solve the problem.

@michaelmdeng
Copy link
Contributor Author

Although this pr can fix the false positive issue, the row-level comparison will cause huge performance regression, as it abandons the distributed computing in tikv.

We do not prefer to fix an issue with very low probability to happen at the cost of performance regression for most cases.

For now, we may still use this pr #707 to solve the problem.

Ack, I've opened #787 as an alternative approach using the checksum improvements you reference.

@michaelmdeng
Copy link
Contributor Author

Closing in favor of #787

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BIT_ XOR function has a collision example that is easy to occur
4 participants