*: fix resource leak in select for update when 'tidb_low_resolution_tso' is set #57012

Merged: 2 commits, Nov 6, 2024

tiancaiamao
Contributor

What problem does this PR solve?

Issue Number: close #55468

Problem Summary:

The DATA RACE is caused by a resource leak.

When tidb_low_resolution_tso is on, executing "select * from low_resolution_tso for update"
returns an error, but the executor is not closed, so its background goroutine leaks.
The leaked goroutine still references the session ctx,
so when the next SQL statement is executed, the two race on accessing the session ctx.
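
The leak described above can be reproduced in miniature. The following is only a sketch under stated assumptions: fakeExecutor, execStmt, workers, and errRejected are hypothetical names invented for illustration and are not TiDB code. It shows an executor whose Open starts a background goroutine that only Close stops, and an error path that either forgets or remembers to call Close.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// liveWorkers counts background goroutines that have not exited yet.
var liveWorkers int32

// workers reports how many background goroutines are still alive.
func workers() int32 { return atomic.LoadInt32(&liveWorkers) }

// fakeExecutor is a hypothetical stand-in for an executor that starts a
// background goroutine in Open and only stops it in Close.
type fakeExecutor struct {
	done chan struct{}
	wg   sync.WaitGroup
}

func (e *fakeExecutor) Open() {
	e.done = make(chan struct{})
	e.wg.Add(1)
	atomic.AddInt32(&liveWorkers, 1)
	go func() {
		defer e.wg.Done()
		defer atomic.AddInt32(&liveWorkers, -1)
		// In the real bug, a goroutine like this keeps using the session ctx.
		<-e.done
	}()
}

// Close stops the background goroutine and waits for it to exit.
func (e *fakeExecutor) Close() error {
	close(e.done)
	e.wg.Wait()
	return nil
}

var errRejected = errors.New("statement rejected after the executor was opened")

// execStmt opens the executor and then rejects the statement.
// With closeOnError=false it mirrors the leak; with true it mirrors the fix.
func execStmt(closeOnError bool) error {
	e := &fakeExecutor{}
	e.Open()
	if closeOnError {
		_ = e.Close()
	}
	return errRejected
}

func main() {
	base := workers()
	_ = execStmt(false)
	fmt.Println("goroutines leaked by the unfixed path:", workers()-base)

	base = workers()
	_ = execStmt(true)
	fmt.Println("goroutines leaked by the fixed path:", workers()-base)
}
```

In this toy version the leaked goroutine only blocks on a channel. In the scenario reported here it keeps touching shared session state, which is what the race detector flags.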

What changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-triage-completed release-note-none Denotes a PR that doesn't merit a release note. labels Oct 30, 2024
@ti-chi-bot ti-chi-bot bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Oct 30, 2024

tiprow bot commented Oct 30, 2024

Hi @tiancaiamao. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

codecov bot commented Oct 30, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 58.2548%. Comparing base (cf5a617) to head (b27c914).
Report is 96 commits behind head on master.

Additional details and impacted files
@@                Coverage Diff                @@
##             master     #57012         +/-   ##
=================================================
- Coverage   73.2316%   58.2548%   -14.9768%     
=================================================
  Files          1650       1812        +162     
  Lines        455646     659689     +204043     
=================================================
+ Hits         333677     384301      +50624     
- Misses       101479     250300     +148821     
- Partials      20490      25088       +4598     
Flag          Coverage Δ
integration   40.6466% <0.0000%> (?)
unit          73.6004% <50.0000%> (+1.0748%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components    Coverage Δ
dumpling      52.9478% <ø> (ø)
parser        ∅ <ø> (∅)
br            62.8158% <ø> (+16.8342%) ⬆️

@@ -598,6 +598,7 @@ func (a *ExecStmt) Exec(ctx context.Context) (_ sqlexec.RecordSet, err error) {

 	if a.isSelectForUpdate {
 		if sctx.GetSessionVars().UseLowResolutionTSO() {
+			terror.Log(exec.Close(e))
Contributor

Should the exec.Close(e) also be called at L618 before returning errors?

Contributor Author

Yes, it should be.
Updated.

Contributor

@cfzjywxk cfzjywxk Oct 31, 2024

Alternatively, if exec.Close(e) is idempotent, would it be better to put this logic in the function's defer block, so that the executor is closed there whenever it has been opened?
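
For readers unfamiliar with the pattern being suggested, here is a minimal sketch of closing in a defer block on every error return. It is not the TiDB implementation: stubExecutor, execWithDefer, and errNotAllowed are hypothetical, and the sketch assumes Close is idempotent, which is the precondition the question raises.

```go
package main

import (
	"errors"
	"fmt"
)

// stubExecutor is hypothetical. Its Close is idempotent by construction:
// a second call is a no-op.
type stubExecutor struct {
	opened bool
	closed bool
}

func (e *stubExecutor) Open() error {
	e.opened = true
	return nil
}

func (e *stubExecutor) Close() error {
	if e.closed {
		return nil
	}
	e.closed = true
	return nil
}

var errNotAllowed = errors.New("statement not allowed")

// execWithDefer closes the executor in a single place: the deferred function
// runs on every return and closes e only if an error is being returned after
// the executor was opened.
func execWithDefer(e *stubExecutor, reject bool) (err error) {
	defer func() {
		if err != nil && e.opened {
			_ = e.Close()
		}
	}()

	if err = e.Open(); err != nil {
		return err
	}
	if reject {
		// No explicit Close here; the deferred function handles it.
		return errNotAllowed
	}
	// On success the caller owns the executor and closes it later.
	return nil
}

func main() {
	rejected := &stubExecutor{}
	err := execWithDefer(rejected, true)
	fmt.Println("error:", err, "closed:", rejected.closed)

	accepted := &stubExecutor{}
	err = execWithDefer(accepted, false)
	fmt.Println("error:", err, "closed:", accepted.closed)
}
```

The named return value err is what lets the deferred function see which path is being taken. The trade-off is that any helper that already closes the executor on its own error path will now trigger a second Close, hence the idempotency requirement.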

Contributor Author

Let's just do the ad-hoc fix for this bug first.
I'm afraid of introducing bugs by changing too much, although a larger change might reduce potential bugs and the maintenance burden.
@lcwangchao is doing the refactoring work; I'd like to hear his opinion.

Collaborator

There seems to be no better way to make the close more elegant while keeping the code change minimal. The functions that follow, handlePessimisticSelectForUpdate and handleNoDelay, may close or replace the executor internally. To make the code clearer, we would have to redesign those implementations, introduce more assumptions, or make executor.Close idempotent...
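
One of the options mentioned, an idempotent Close, could look roughly like the following. This is a sketch only: closer, strictExecutor, and onceCloser are hypothetical names and do not correspond to any TiDB type. It wraps a Close that must not run twice in sync.Once so that repeated calls become harmless.

```go
package main

import (
	"fmt"
	"sync"
)

// closer is the minimal interface this sketch needs.
type closer interface {
	Close() error
}

// strictExecutor is hypothetical. Calling its Close twice would panic,
// because closing an already closed channel panics in Go.
type strictExecutor struct {
	done chan struct{}
}

func (s *strictExecutor) Close() error {
	close(s.done)
	return nil
}

// onceCloser makes any closer safe to close repeatedly: the inner Close runs
// at most once and later calls return the recorded result.
type onceCloser struct {
	inner closer
	once  sync.Once
	err   error
}

func (o *onceCloser) Close() error {
	o.once.Do(func() { o.err = o.inner.Close() })
	return o.err
}

func main() {
	e := &onceCloser{inner: &strictExecutor{done: make(chan struct{})}}
	fmt.Println("first close:", e.Close())
	fmt.Println("second close:", e.Close())
}
```

A wrapper like this does not address the other concern raised here, namely helpers that replace the executor: the caller would still need to close whichever executor is current at the time of the error.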

Contributor

@tiancaiamao @lcwangchao
So shall we merge this fix first and file a refactoring or enhancement issue for the more elegant solution?

Contributor Author

agree!

@ti-chi-bot ti-chi-bot bot added approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Nov 1, 2024

ti-chi-bot bot commented Nov 6, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cfzjywxk, lcwangchao

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [cfzjywxk,lcwangchao]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Nov 6, 2024

ti-chi-bot bot commented Nov 6, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-11-01 01:54:14.357942623 +0000 UTC m=+573967.197098171: ☑️ agreed by cfzjywxk.
  • 2024-11-06 07:59:07.06026467 +0000 UTC m=+1027859.899420216: ☑️ agreed by lcwangchao.

@tiancaiamao
Contributor Author

/test build

tiprow bot commented Nov 6, 2024

@tiancaiamao: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test build

@ti-chi-bot ti-chi-bot bot merged commit af930be into pingcap:master Nov 6, 2024
23 checks passed
@tiancaiamao tiancaiamao deleted the issue55468 branch November 7, 2024 01:11
@ti-chi-bot ti-chi-bot bot added the needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. label Nov 14, 2024
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-8.5: #57367.

Labels
approved lgtm needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

DATA RACE in the stmtctx.(*StatementContext)
4 participants