log-backup: use the span tree(instead of the naïve algorithm) to calculate the checkpoint #39122

Merged
merged 17 commits into from
Nov 29, 2022


YuJuncen
Contributor

What problem does this PR solve?

Issue Number: close #N/A

Problem Summary:
Previously, we used a naive algorithm to advance the checkpoint:
it simply maintained a mapping from timestamps to ranges, which would probably store too many redundant items.

What is changed and how it works?

This PR adds a new data structure named spans.ValueSorted.
It contains a non-overlapping, full-range tree as the "primary key", plus a binary search tree that uses the timestamp as the key and the range as the value, for indexing.
The primary tree has some characteristics:

  • FULL: The union of all ranges it contains is the full range (the universal set).
  • NON-OVERLAPPED: No two ranges in it overlap.

To preserve these characteristics, the interface it provides is limited: when we insert ranges into it, it "merges" those new ranges into its current state.
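
The two characteristics above can be stated as a small check. This is only a sketch under assumptions, not the code from this PR: the Span type, the isFullAndNonOverlapped helper, string keys, and the convention that an empty End means "inf" are all hypothetical names chosen for illustration.

```go
package main

import "fmt"

// Span is a hypothetical half-open key range [Start, End) with a checkpoint.
// Assumption for this sketch: an empty End stands for "inf".
type Span struct {
	Start, End string
	Checkpoint uint64
}

// isFullAndNonOverlapped reports whether spans, sorted by Start, cover the
// whole key space from "" to inf with no gap and no overlap.
func isFullAndNonOverlapped(spans []Span) bool {
	if len(spans) == 0 || spans[0].Start != "" {
		return false // FULL requires the first span to start at "".
	}
	for i := 0; i+1 < len(spans); i++ {
		// Only the last span may extend to inf, and no span may be empty.
		if spans[i].End == "" || spans[i].Start >= spans[i].End {
			return false
		}
		// A gap here would break FULL; an overlap would break NON-OVERLAPPED.
		if spans[i].End != spans[i+1].Start {
			return false
		}
	}
	return spans[len(spans)-1].End == "" // FULL requires reaching inf.
}

func main() {
	ok := []Span{{"", "d", 42}, {"d", "h", 43}, {"h", "", 41}}
	gap := []Span{{"", "d", 42}, {"e", "", 41}}
	fmt.Println(isFullAndNonOverlapped(ok), isFullAndNonOverlapped(gap))
}
```

Because every insertion is merged into the existing spans rather than added as an independent entry, a structure built this way never needs to run such a check at runtime; the sketch is only meant to pin down what the invariants say.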

An example here:

Each flush generates a new checkpoint: usually, it updates the checkpoint of the ranges covered by the regions it flushed. In more detail, you can picture the key space as a line:

|___________________________________________________________________________|
""                                                                        inf

Then the checkpoint c of each range can be represented as:

|___________________________________________________________________________|
^-----------------^-----------------^-----------------^---------------------^
|      c = 42     |      c = 43     |     c = 45      |      c = 41         |

When querying the checkpoint of some range, we simply take the minimum over all the ranges it intersects.

|___________________________________________________________________________|
^-----------------^-----------------^-----------------^---------------------^
|      c = 42     |      c = 43     |     c = 45      |      c = 41         |
    ^--------------------------^
    |   c = min(42, 43) = 42   |
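
The query in the diagram above can be sketched as follows. This is an illustrative linear scan, not the tree lookup from this PR; Span, overlaps, and minCheckpoint are hypothetical names, keys are plain strings, and an empty End is assumed to mean "inf".

```go
package main

import "fmt"

// Span is a hypothetical half-open key range [Start, End) with a checkpoint.
// Assumption for this sketch: an empty End stands for "inf".
type Span struct {
	Start, End string
	Checkpoint uint64
}

// overlaps reports whether [aStart, aEnd) and [bStart, bEnd) intersect.
func overlaps(aStart, aEnd, bStart, bEnd string) bool {
	return (bEnd == "" || aStart < bEnd) && (aEnd == "" || bStart < aEnd)
}

// minCheckpoint returns the minimal checkpoint among all spans that
// intersect [start, end). The bool is false if nothing intersects.
func minCheckpoint(spans []Span, start, end string) (uint64, bool) {
	var lowest uint64
	found := false
	for _, s := range spans {
		if !overlaps(s.Start, s.End, start, end) {
			continue
		}
		if !found || s.Checkpoint < lowest {
			lowest, found = s.Checkpoint, true
		}
	}
	return lowest, found
}

func main() {
	spans := []Span{{"", "d", 42}, {"d", "h", 43}, {"h", "m", 45}, {"m", "", 41}}
	// The queried range touches the c = 42 and c = 43 spans only.
	fmt.Println(minCheckpoint(spans, "b", "f")) // prints: 42 true
}
```

A tree keyed by start key would let the scan begin at the first intersecting span instead of visiting every span; the linear loop is used here only to keep the sketch short.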

A flush updates the checkpoint of a subrange. When that happens, we fill every point in the subrange with max(new_checkpoint, old_checkpoint). We call this step Merge, for example:

|___________________________________________________________________________|
^-----------------^-----------------^-----------------^---------------------^
|      c = 42     |      c = 43     |     c = 45      |      c = 41         |
                       ^--------------------------^
                       |      merge(c = 44)       |

This would give:

|___________________________________________________________________________|
^-----------------^----^------------^-------------^---^---------------------^
|      c = 42     | 43 |   c = 44   |     c = 45      |      c = 41         |
                                    |-------------|
                                    Unchanged, because 44 < 45.
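
The Merge step illustrated above can be sketched on a sorted slice. This is a minimal illustration under assumptions, not the implementation in spans.ValueSorted: Span, overlaps, and merge are hypothetical names, keys are plain strings, an empty End is assumed to mean "inf", and adjacent spans with equal checkpoints are not coalesced.

```go
package main

import "fmt"

// Span is a hypothetical half-open key range [Start, End) with a checkpoint.
// Assumption for this sketch: an empty End stands for "inf".
type Span struct {
	Start, End string
	Checkpoint uint64
}

// overlaps reports whether [aStart, aEnd) and [bStart, bEnd) intersect.
func overlaps(aStart, aEnd, bStart, bEnd string) bool {
	return (bEnd == "" || aStart < bEnd) && (aEnd == "" || bStart < aEnd)
}

// merge fills every point of [start, end) with max(ts, old checkpoint).
// spans must be sorted, full, and non-overlapped; so is the result.
func merge(spans []Span, start, end string, ts uint64) []Span {
	out := make([]Span, 0, len(spans)+2)
	for _, s := range spans {
		// Untouched spans, and spans already at or past ts, stay as they are.
		if !overlaps(s.Start, s.End, start, end) || s.Checkpoint >= ts {
			out = append(out, s)
			continue
		}
		lo := s.Start // start of the part that gets the new checkpoint
		if start > lo {
			lo = start
		}
		hi := s.End // end of the part that gets the new checkpoint
		if end != "" && (hi == "" || end < hi) {
			hi = end
		}
		if s.Start < lo { // left remainder keeps the old checkpoint
			out = append(out, Span{s.Start, lo, s.Checkpoint})
		}
		out = append(out, Span{lo, hi, ts})
		if hi != s.End { // right remainder keeps the old checkpoint
			out = append(out, Span{hi, s.End, s.Checkpoint})
		}
	}
	return out
}

func main() {
	spans := []Span{{"", "d", 42}, {"d", "h", 43}, {"h", "m", 45}, {"m", "", 41}}
	for _, s := range merge(spans, "f", "k", 44) {
		fmt.Printf("[%q, %q) c = %d\n", s.Start, s.End, s.Checkpoint)
	}
}
```

With these inputs the c = 43 span is split at the start of the merged range, and the c = 45 span is left unchanged because 44 < 45, mirroring the diagram.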

Check List

Tests

  • Unit test
  • [] Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot
Member

ti-chi-bot commented Nov 14, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • Leavrth
  • joccau

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Nov 14, 2022
@YuJuncen YuJuncen added skip-issue-check Indicates that a PR no need to check linked issue. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Nov 14, 2022
Signed-off-by: hillium <[email protected]>
@ti-chi-bot ti-chi-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Nov 14, 2022
br/pkg/streamhelper/advancer.go Outdated Show resolved Hide resolved
@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Nov 25, 2022
Signed-off-by: hillium <[email protected]>
Member

@joccau joccau left a comment


reset LGTM

br/pkg/streamhelper/spans/utils.go Show resolved Hide resolved
br/pkg/streamhelper/spans/sorted.go Show resolved Hide resolved
Member

@joccau joccau left a comment


LGTM

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Nov 29, 2022
@YuJuncen
Contributor Author

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 7ba6318

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Nov 29, 2022
@joccau
Member

joccau commented Nov 29, 2022

/run-mysql-test

@YuJuncen
Contributor Author

/run-mysql-test

@ti-chi-bot ti-chi-bot merged commit 52d137d into pingcap:master Nov 29, 2022
@sre-bot
Contributor

sre-bot commented Nov 29, 2022

TiDB MergeCI notify

🔴 Bad News! New failing [1] after this pr merged.
These new failed integration tests seem to be caused by the current PR, please try to fix these new failed integration tests, thanks!

CI Name | Result | Duration | Compare with Parent commit
idc-jenkins-ci-tidb/sqllogic-test-2 | 🟥 failed 1, success 27, total 28 | 6 min 42 sec | New failing
idc-jenkins-ci-tidb/integration-common-test | 🟢 all 17 tests passed | 25 min | Existing passed
idc-jenkins-ci/integration-cdc-test | 🟢 all 40 tests passed | 19 min | Existing passed
idc-jenkins-ci-tidb/common-test | 🟢 all 11 tests passed | 14 min | Existing passed
idc-jenkins-ci-tidb/sqllogic-test-1 | 🟢 all 26 tests passed | 7 min 14 sec | Existing passed
idc-jenkins-ci-tidb/integration-ddl-test | 🟢 all 6 tests passed | 6 min 28 sec | Existing passed
idc-jenkins-ci-tidb/mybatis-test | 🟢 all 1 tests passed | 6 min 15 sec | Existing passed
idc-jenkins-ci-tidb/tics-test | 🟢 all 1 tests passed | 6 min 3 sec | Existing passed
idc-jenkins-ci-tidb/integration-compatibility-test | 🟢 all 1 tests passed | 3 min 53 sec | Existing passed
idc-jenkins-ci-tidb/plugin-test | 🟢 build success, plugin test success | 4 min | Existing passed

Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. skip-issue-check Indicates that a PR no need to check linked issue. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
5 participants