tikv: refine commit backoff slow log #11757

Merged
merged 6 commits into from
Aug 26, 2019

lysu
Contributor

@lysu lysu commented Aug 16, 2019

What problem does this PR solve?

  1. We don't know how much of a slow commit's time was spent in backoff.

for _, batch1 := range batches {

This loop forks or clones a new backoff for each batch so that key groups can be handled in parallel, and the time slept in those forked backoffs is missing from the commit slow log.

  2. Even when we know a commit is slow because of backoff, we don't know why it backed off.

What is changed and how it works?

  • record the delta sleep time of forked backoffs in the parent backoff, so that the backoff time reported for a commit operation is accurate
  • add a backoffTypes field to the commit slow log
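
A minimal sketch of the idea behind these two changes, not the actual code in store/tikv. The `backoffer` type, its `fork` and `record` methods, and the millisecond counter are all hypothetical names chosen for illustration; the only point is that a forked backoff pushes its sleep time and backoff type up to its parent, so the parent sees the total after the parallel batches finish.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"sync/atomic"
	"time"
)

// backoffer is a hypothetical, simplified stand-in for the TiKV client's
// backoff object. Only the bookkeeping discussed in this PR is modeled.
type backoffer struct {
	parent       *backoffer
	totalSleepMs int64 // updated atomically, shared across goroutines

	mu    sync.Mutex
	types map[string]struct{} // set of backoff types seen so far
}

func newBackoffer() *backoffer {
	return &backoffer{types: make(map[string]struct{})}
}

// fork returns a child backoffer for one batch goroutine.
func (b *backoffer) fork() *backoffer {
	c := newBackoffer()
	c.parent = b
	return c
}

// record accounts for one backoff of the given type and duration on this
// backoffer and on every ancestor, so the root sees the full picture.
// A real backoffer would also sleep here; the sketch only does bookkeeping.
func (b *backoffer) record(typ string, d time.Duration) {
	for cur := b; cur != nil; cur = cur.parent {
		atomic.AddInt64(&cur.totalSleepMs, d.Milliseconds())
		cur.mu.Lock()
		cur.types[typ] = struct{}{}
		cur.mu.Unlock()
	}
}

func (b *backoffer) sleepMs() int64 {
	return atomic.LoadInt64(&b.totalSleepMs)
}

// backoffTypes returns the distinct backoff types in sorted order.
func (b *backoffer) backoffTypes() []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := make([]string, 0, len(b.types))
	for t := range b.types {
		out = append(out, t)
	}
	sort.Strings(out)
	return out
}

func main() {
	root := newBackoffer()
	var wg sync.WaitGroup
	// Each batch gets its own forked backoffer, as in the batch loop.
	for _, typ := range []string{"tikvRPC", "regionMiss"} {
		wg.Add(1)
		go func(typ string) {
			defer wg.Done()
			child := root.fork()
			child.record(typ, 50*time.Millisecond)
		}(typ)
	}
	wg.Wait()
	fmt.Println(root.sleepMs(), root.backoffTypes())
}
```

Without the walk up to the parent in `record`, the root would report zero sleep time even though every batch backed off, which is the gap described above.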

After these changes, the slow log looks like this:

# Time: 2019-08-16T16:32:02.281654197+08:00
# Txn_start_ts: 410502908375465985
# User: [email protected]
# Conn_ID: 1
# Query_time: 0.132051478
# Prewrite_time: 0.101346878 Commit_time: 0.030348021 Get_commit_ts_time: 0.000101586 Commit_backoff_time: 0.1 Backoff_types: [tikvRPC regionMiss] Local_latch_wait_time: 2.288e-06 Write_keys: 1 Write_size: 23 Prewrite_region: 2
# DB: test
# Is_internal: false
# Digest: db305e24a5f7a0ae2c46a1770efa5150459b121ac3dc1ce39ba5645a4f8521ff
# Num_cop_tasks: 0
# Cop_proc_avg: 0 Cop_proc_p90: 0 Cop_proc_max: 0 Cop_proc_addr: 
# Cop_wait_avg: 0 Cop_wait_p90: 0 Cop_wait_max: 0 Cop_wait_addr: 
# Succ: false
insert into t VALUES(1);

This insert is slow because prewrite is slow, and backoff accounts for most of the prewrite time (Commit_backoff_time is 0.1s out of a Prewrite_time of about 0.101s).

The backoff types that caused it are also recorded:

Commit_backoff_time: 0.1 Backoff_types: [tikvRPC regionMiss]
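
For anyone who wants to pull these two fields out of a slow log line, here is a small sketch. It assumes the field layout shown in the sample above; `parseCommitBackoff` is a hypothetical helper written for this example, not something that exists in TiDB.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

var (
	// Patterns assume the "Name: value" layout of the sample log above.
	backoffTimeRe  = regexp.MustCompile(`Commit_backoff_time: (\S+)`)
	backoffTypesRe = regexp.MustCompile(`Backoff_types: \[([^\]]*)\]`)
)

// parseCommitBackoff extracts the commit backoff time (in seconds) and the
// list of backoff types from one slow log line. Backoff_types is treated as
// optional, since it may be absent in logs written before this change.
func parseCommitBackoff(line string) (float64, []string, error) {
	m := backoffTimeRe.FindStringSubmatch(line)
	if m == nil {
		return 0, nil, fmt.Errorf("field Commit_backoff_time not found")
	}
	secs, err := strconv.ParseFloat(m[1], 64)
	if err != nil {
		return 0, nil, err
	}
	var types []string
	if t := backoffTypesRe.FindStringSubmatch(line); t != nil {
		types = strings.Fields(t[1])
	}
	return secs, types, nil
}

func main() {
	line := "# Prewrite_time: 0.101346878 Commit_time: 0.030348021 " +
		"Commit_backoff_time: 0.1 Backoff_types: [tikvRPC regionMiss] Write_keys: 1"
	secs, types, err := parseCommitBackoff(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(secs, types)
}
```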

Check List

Tests

  • Manual test

Code changes

  • log record change

Side effects

  • n/a

Related changes

  • Need to cherry-pick to the release branch


@lysu
Contributor Author

lysu commented Aug 16, 2019

/run-all-tests

@codecov

codecov bot commented Aug 16, 2019

Codecov Report

Merging #11757 into master will decrease coverage by 0.0921%.
The diff coverage is 87.2727%.

@@               Coverage Diff               @@
##            master     #11757        +/-   ##
===============================================
- Coverage   81.558%   81.4658%   -0.0922%     
===============================================
  Files          435        435                
  Lines        94095      94156        +61     
===============================================
- Hits         76742      76705        -37     
- Misses       11867      11963        +96     
- Partials      5486       5488         +2

@lysu
Contributor Author

lysu commented Aug 16, 2019

/run-all-tests

@@ -96,7 +97,7 @@ type twoPhaseCommitter struct {
 // We use it to guarantee GC worker will not influence any active txn. The value
 // should be less than GC life time.
 maxTxnTimeUse uint64
-detail *execdetails.CommitDetails
+detail unsafe.Pointer
Contributor

Why change to unsafe.Pointer?

Contributor Author

Previously, the test failed because of a race condition at https://github.com/pingcap/tidb/pull/11757/files#diff-499c236856cd9ce3300d3f5ccde41a23R441. After this PR, detail is accessed from forked goroutines.
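
To illustrate why a field read from several goroutines might be stored as an unsafe.Pointer, here is a rough sketch of the usual pattern with sync/atomic. It is not the code from 2pc.go: `commitDetails`, `committer`, and the accessor names are hypothetical stand-ins, and the real struct has more fields.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"unsafe"
)

// commitDetails is a hypothetical stand-in for execdetails.CommitDetails.
type commitDetails struct {
	CommitBackoffTime int64 // nanoseconds, updated atomically
}

// committer is a hypothetical stand-in for twoPhaseCommitter.
type committer struct {
	// detail holds a *commitDetails. It is only read and written through
	// atomic.LoadPointer / atomic.StorePointer, so concurrent access from
	// forked goroutines does not trip the race detector.
	detail unsafe.Pointer
}

func (c *committer) getDetail() *commitDetails {
	return (*commitDetails)(atomic.LoadPointer(&c.detail))
}

func (c *committer) setDetail(d *commitDetails) {
	atomic.StorePointer(&c.detail, unsafe.Pointer(d))
}

// addBackoff adds backoff time if a detail struct has been attached.
func (c *committer) addBackoff(ns int64) {
	if d := c.getDetail(); d != nil {
		atomic.AddInt64(&d.CommitBackoffTime, ns)
	}
}

func main() {
	c := &committer{}
	c.setDetail(&commitDetails{})

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.addBackoff(25) // each batch goroutine reports its backoff
		}()
	}
	wg.Wait()
	fmt.Println(atomic.LoadInt64(&c.getDetail().CommitBackoffTime))
}
```

A mutex around a plain `*commitDetails` would also remove the race; the atomic pointer just keeps the read path lock-free.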

@lysu
Contributor Author

lysu commented Aug 21, 2019

/run-all-tests

@lysu lysu requested review from jackysp and tiancaiamao and removed request for tiancaiamao August 21, 2019 06:03
Member

@jackysp jackysp left a comment

LGTM

@jackysp
Member

jackysp commented Aug 21, 2019

PTAL @crazycs520

@lysu
Contributor Author

lysu commented Aug 21, 2019

/run-all-tests

store/tikv/2pc.go (outdated, resolved)
Contributor

@crazycs520 crazycs520 left a comment

Rest LGTM

Contributor

@crazycs520 crazycs520 left a comment

LGTM

@lysu lysu added the status/can-merge label Aug 26, 2019
@sre-bot
Contributor

sre-bot commented Aug 26, 2019

/run-all-tests

@sre-bot sre-bot merged commit fce8d61 into pingcap:master Aug 26, 2019
@sre-bot
Contributor

sre-bot commented Aug 26, 2019

cherry pick to release-3.0 failed

@sre-bot
Contributor

sre-bot commented Aug 26, 2019

cherry pick to release-2.1 failed

@you06
Contributor

you06 commented Aug 26, 2019

/run-cherry-picker

@sre-bot
Contributor

sre-bot commented Aug 26, 2019

cherry pick to release-3.0 failed

@you06
Contributor

you06 commented Aug 26, 2019

/run-cherry-picker

@sre-bot
Contributor

sre-bot commented Aug 26, 2019

cherry pick to release-3.0 failed

@you06
Contributor

you06 commented Aug 26, 2019

/run-cherry-picker

@sre-bot
Contributor

sre-bot commented Aug 26, 2019

cherry pick to release-3.0 failed

@sre-bot
Contributor

sre-bot commented Apr 7, 2020

It seems that, not for sure, we failed to cherry-pick this commit to release-2.1. Please comment '/run-cherry-picker' to try to trigger the cherry-picker if we did fail to cherry-pick this commit before. @lysu PTAL.

Labels
component/tikv, status/can-merge, type/usability
5 participants