Add a Backoff Handler utils #499

Merged

merged 2 commits into kubernetes:master from the backoff branch on Nov 8, 2018

Conversation

@freehan (Contributor) commented Sep 27, 2018

Extract the backoff retry logic from batchSyncer and add 2 utils that can be reused by different syncers, plus unit tests:

  • backoffHandler calculates exponential backoff delays.
  • backoffRetryHandler triggers retryFunc after the back off delay.

These utils are building blocks for the new transactionSyncer.
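
For orientation, here is a rough sketch of the shape of the two utils. It is illustrative, not the merged code: backoffHandler and its methods are taken from the diff excerpts below, while the retry handler's method set is inferred from the review discussion and may not match the final implementation exactly.

```go
package syncers

import (
	"fmt"
	"time"
)

// Sentinel error returned once retries are exhausted (named in the diff below).
var MaxRetryError = fmt.Errorf("maximum retry exceeded")

// backoffHandler calculates exponential backoff delays.
type backoffHandler interface {
	// NextRetryDelay returns the delay for the next retry, or MaxRetryError
	// once the maximum number of retries has been exceeded.
	NextRetryDelay() (time.Duration, error)
	// ResetRetryDelay resets the retry count and delay after a successful sync.
	ResetRetryDelay()
}

// retryHandler triggers retryFunc after the backoff delay; backoffRetryHandler
// in the diff is the concrete type. The method set here is inferred from the
// review discussion below.
type retryHandler interface {
	Run()
	Retry() error
	Reset()
}
```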

cc: @agau4779

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 27, 2018
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 27, 2018
@freehan freehan force-pushed the backoff branch 3 times, most recently from fbe0648 to 5b73089, on October 2, 2018 17:30
@freehan freehan mentioned this pull request Oct 2, 2018
var MaxRetryError = fmt.Errorf("maximum retry exceeded")

// backoffHanlder handles delays for back off retry
type backoffHanlder interface {
Member:

spelling

}

// boundedExponentialBackendOffHandler is a back off handler that implements bounded exponential back off.
type boundedExponentialBackendOffHandler struct {
Member:

"bounded" is probably extraneous (I would think every implementation has a bound)

return handler.lastRetryDelay, nil
}

func (handler *boundedExponentialBackendOffHandler) ResetRetryDelay() {
Member:

why not just create a new one instead of reset?

"time"
)

var MaxRetryError = fmt.Errorf("maximum retry exceeded")
Member:

ErrRetriesExceeded

// NextRetryDelay returns the delay for next retry or error if maximum number of retries exceeded.
NextRetryDelay() (time.Duration, error)
// ResetRetryDelay resets the retry delay
ResetRetryDelay()
Member:

Alternative: construct a new backoffHandler instead of reset

Contributor Author:

That basically pushes more load onto the Golang GC. Imagine I need to create a new backoffHandler object after every successful NEG sync.

Member:

Efficiency issues due to GC are usually not a concern with a modern GC, unless you are creating an incredible number of objects.

Member:

Not having reset() usually makes it easier to reason about the object.

Contributor Author:

Imagine there are multiple go routines running NEG API calls. When they finish successfully, every one of them will call Reset; otherwise, they would each need to create duplicate objects.

}

// NextRetryDelay returns the next back off delay for retry.
func (handler *boundedExponentialBackOffHandler) NextRetryDelay() (time.Duration, error) {
Member:

We should probably add jitter. Does the code vendored in from client-go offer any reusable bits?

Contributor Author:

Added jitter.
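
For readers following along, a minimal sketch of what an exponential backoff step with jitter can look like, using wait.Jitter from the vendored k8s.io/apimachinery code; the merged NextRetryDelay may differ in detail.

```go
package syncers

import (
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// nextDelayWithJitter is illustrative only: double the previous delay, clamp
// it to [min, max], then add up to 10% random jitter so that syncers retrying
// at the same moment do not all wake up together.
func nextDelayWithJitter(last, min, max time.Duration) time.Duration {
	d := last * 2
	if d < min {
		d = min
	}
	if d > max {
		d = max
	}
	return wait.Jitter(d, 0.1)
}
```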

@freehan freehan force-pushed the backoff branch 2 times, most recently from 79b87b7 to c3f0147, on October 15, 2018 21:06
@freehan (Contributor Author) commented Oct 15, 2018

Addressed the comments. Ready for review.

@agau4779 (Contributor) left a comment:

Do the tests pass if we run them with -parallel?

handler.lock.Lock()
defer handler.lock.Unlock()
handler.retryCount += 1
if handler.maxRetries > 0 && handler.retryCount > handler.maxRetries {
Contributor:

nit - don't need the > 0

Contributor Author:

> 0 is for the case where one does not want to have a maxRetries limit
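
In other words, an annotated re-quote of the guard above; the return values shown here are illustrative, not necessarily the merged code.

```go
handler.retryCount += 1
// A non-positive maxRetries is treated as "no retry limit", which is why the
// > 0 check is kept: the cap is only enforced when maxRetries was configured.
if handler.maxRetries > 0 && handler.retryCount > handler.maxRetries {
	return handler.lastRetryDelay, MaxRetryError
}
```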


func TestStartAndStopBackoffRetryHandler(t *testing.T) {
helper := &retryHandlerTestHelper{}
handler := NewDelayRetryHandler(helper.incrementCount, NewExponentialBackendOffHandler(15, testMinRetryDelay, testMaxRetryDelay))
Contributor:

This and line 88 - could we put the max retries in a variable at the top? const testMaxRetries = 10?

Contributor Author:

ack

retryFunc func()
}

func NewDelayRetryHandler(retryFunc func(), backoff backoffHandler) *backoffRetryHandler {
@MrHohn (Member) commented Oct 26, 2018:

To make sure I understand it correctly:

  • Each NEG can get one retry handler.
  • handler.Run() starts before any NE operation for this NEG is made.
  • Each NEG can trigger multiple go routines that handle NE operations in parallel. Some of them may fail at different times.
  • All failed operations will attempt to call handler.Retry(). Conceptually all calls to handler.Retry() will be merged into one until the next retry runs.
  • Each retryFunc() can trigger go routines that execute NE operations necessary for the entire NEG based on its current state.

Contributor Author:

Yes.

One more thing: successful operations will call Reset().

If some succeed and some fail, the successful operations will not be retried; eventually only the failed ones are retried, with exponential back off.
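
To make that lifecycle concrete, a hedged usage sketch: Run/Retry/Reset and the constructors come from this PR, but their exact signatures and the delay values are illustrative, and the operation and sync callbacks are placeholders.

```go
package syncers

import "time"

// exampleUsage is illustrative only. ops are the parallel NE operations for
// one NEG; syncNEG re-drives the whole sync based on the NEG's current state.
func exampleUsage(ops []func() error, syncNEG func()) {
	// One retry handler per NEG, started before any NE operation runs.
	handler := NewDelayRetryHandler(syncNEG,
		NewExponentialBackendOffHandler(15, 5*time.Second, 5*time.Minute))
	go handler.Run()

	for _, op := range ops {
		go func(op func() error) {
			if err := op(); err != nil {
				// Failed operations ask for a retry; concurrent Retry calls
				// are conceptually merged into one delayed retryFunc run.
				handler.Retry()
				return
			}
			// Successful operations reset the backoff so the next failure
			// starts again from the minimum delay.
			handler.Reset()
		}(op)
	}
}
```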

@freehan (Contributor Author) commented Oct 31, 2018

Ready for another round

@MrHohn (Member) commented Oct 31, 2018

The retry handler part LGTM.

@freehan (Contributor Author) commented Nov 2, 2018

Can I get this merged?

@freehan freehan mentioned this pull request Nov 2, 2018
@freehan (Contributor Author) commented Nov 5, 2018

Ping

@MrHohn (Member) commented Nov 8, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2018
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: freehan, MrHohn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 411b66d into kubernetes:master Nov 8, 2018