Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor,server: re-implement the kill statement by checking the Next() function #10841

Merged
merged 8 commits into from
Jun 20, 2019

Conversation

tiancaiamao
Copy link
Contributor

What problem does this PR solve?

Re-implement the kill statement in a better way.

In #9844, the implementation of kill is changed to use executor.Close. Some goroutine may be calling executor.Next while another one is calling the executor.Close. It's not thread-safe, and panic sometimes (you can imagine that one of them is using e.result while another one set e.result to nil).

What is changed and how it works?

In this commit, I introduce a Next() function that wraps the e.Next() function, and ctx.Done() is checked in the new Next() function.
In this way, we can make sure kill signal could be checked as long as Next() is called, so we can stop an executor eventually.

Check List

Tests

It's not easy to write unit tests for the kill statement. I tried to make a 'black hole' table (which generates fake data forever when it's read) to support long-running transactions for kill, but it involves many more changes and those changes are irrelevant. So I decide to leave tests along until I find some better ways.

Code changes

  • Has exported function/method change

Side effects

  • Possible performance regression

Related changes

  • Need to cherry-pick to the release branch

@tiancaiamao
Copy link
Contributor Author

PTAL @lysu @qw4990 @XuHuaiyu

@tiancaiamao
Copy link
Contributor Author

/run-all-tests

@codecov
Copy link

codecov bot commented Jun 18, 2019

Codecov Report

Merging #10841 into master will decrease coverage by 0.0229%.
The diff coverage is 85.4166%.

@@               Coverage Diff               @@
##             master     #10841       +/-   ##
===============================================
- Coverage   80.8814%   80.8585%   -0.023%     
===============================================
  Files           419        419               
  Lines         88668      88661        -7     
===============================================
- Hits          71716      71690       -26     
- Misses        11726      11742       +16     
- Partials       5226       5229        +3

@qw4990 qw4990 requested review from qw4990, lysu and XuHuaiyu and removed request for qw4990 and lysu June 18, 2019 14:02
@jackysp jackysp added type/bugfix This PR fixes a bug. priority/release-blocker This issue blocks a release. Please solve it ASAP. labels Jun 18, 2019
zz-jason
zz-jason previously approved these changes Jun 19, 2019
Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

executor/executor.go Outdated Show resolved Hide resolved
executor/executor.go Outdated Show resolved Hide resolved
Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this mechanism, we need to make sure the following rules hold:

  1. context should be passed from the root executor to the deep leaf executor in the tree.
  2. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.
  3. Close() should be safe to call after the upper layer called Cancel().

@zz-jason zz-jason dismissed their stale review June 19, 2019 01:12

by mistake...

@lonng
Copy link
Contributor

lonng commented Jun 19, 2019

Can we use an atomic variable to determine whether the Executor was been canceled? I think it's more effective than check context.Done() in every iteration.

@tiancaiamao
Copy link
Contributor Author

Can we use an atomic variable to determine whether the Executor was been canceled? I think it's more effective than check context.Done() in every iteration.

Use many different mechanisms to implement the same thing makes the code less maintainable.
And another drawback of checking atomic variable is that you can't handle block waiting conditions:

select {
    case <-wait:
    for atomic is false ? {
    }
}

@lonng

@tiancaiamao
Copy link
Contributor Author

  1. context should be passed from the root executor to the deep leaf executor in the tree.

No problem... passing context is the golang idiom

  1. context.Done() should be checked in every background goroutine for every parallel executor. And once the upper layer calls Cancel(), all the background goroutines should stop.

I don't mean to check every background goroutine any more. It's impractical as we have proved: difficult to maintain and easy to cause goroutine leak.
The parallel background goroutine will not check the ctx.Done any more, instead, it handles those errors Next() returned.
In this way, we can guarantee the cancel of the execution within two Next call.
Maybe we still need some check of ctx.Done in the parallel executor goroutine (if the Next itself involves too many computing), but I don't want to make the ctx.Done check diffuse.

  1. Close() should be safe to call after the upper layer called Cancel().

Cancel will not cancel the background goroutine now ... it just makes Next return error.
So the release of resources should be safe.

@lonng
Copy link
Contributor

lonng commented Jun 19, 2019

@tiancaiamao I mean to set an atomic variable to 1 when killing an executor and check the atomic variable while every iteration.

@tiancaiamao
Copy link
Contributor Author

I know an atomic variable is more effective, but it's not the only factor we should take into consideration @lonng

@qw4990 qw4990 self-requested a review June 19, 2019 03:08
@qw4990
Copy link
Contributor

qw4990 commented Jun 19, 2019

Can we guarantee that after canceling an Executor by this way, the Close() function of this Executor must be called by some goroutine, otherwise resources leak would happen?

@tiancaiamao
Copy link
Contributor Author

Close will be called in the session's goroutine, this is the same as before (I mean before PR #9844)
Calling Close even when executor.Next meets error, is a prerequisite.
@qw4990

@lysu
Copy link
Contributor

lysu commented Jun 19, 2019

there are maybe many duplicated check ctx.Done in Next() calls?

for example:

   limit - unparallel project - tablescan

check ctx.Done seems be not well- -... check in limit then go to project quickly but need check again...

and high frequent select ctx.Done make pressure to schedule, maybe just check a shared variable in wrapped Next method is better?(there are no block-logic in wrapped Next)

maybe need a benchmark- -?

@tiancaiamao tiancaiamao changed the title executor,server: re-implement the kill statement by checking ctx.Done in the Next() function executor,server: re-implement the kill statement by checking the Next() function Jun 19, 2019
@tiancaiamao
Copy link
Contributor Author

Now I change the PR to use an atomic flag @zz-jason @lysu @lonng @qw4990

@CLAassistant
Copy link

CLAassistant commented Jun 19, 2019

CLA assistant check
All committers have signed the CLA.

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@XuHuaiyu XuHuaiyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority/release-blocker This issue blocks a release. Please solve it ASAP. sig/execution SIG execution type/bugfix This PR fixes a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants