Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make EBS controllerexpansion idempotent #552

Conversation

gnufied
Copy link
Contributor

@gnufied gnufied commented Aug 28, 2020

Fixes #498

This PR contains following fixes:

  • When describeVolume returns volume size to be greater than requested size, we also verify if volume has any modifications pending. I have observed that, AWS can set targetSize on volume even with pending volume modifications. This is the root cause of EBS resize flake.
  • When volume modifications check return volume resizing to be completed, we also verify volume size by describing volume. This is to ensure that - we minimize impact of eventual consistency by making two different API calls.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 28, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 28, 2020
@coveralls
Copy link

coveralls commented Aug 28, 2020

Pull Request Test Coverage Report for Build 1219

  • 44 of 60 (73.33%) changed or added relevant lines in 1 file are covered.
  • 48 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.3%) to 80.27%

Changes Missing Coverage Covered Lines Changed/Added Lines %
pkg/cloud/cloud.go 44 60 73.33%
Files with Coverage Reduction New Missed Lines %
pkg/cloud/cloud.go 2 79.18%
pkg/util/util.go 4 89.74%
pkg/driver/controller.go 42 88.94%
Totals Coverage Status
Change from base Build 1209: 0.3%
Covered Lines: 1489
Relevant Lines: 1855

💛 - Coveralls

@gnufied
Copy link
Contributor Author

gnufied commented Aug 28, 2020

/retest

2 similar comments
@gnufied
Copy link
Contributor Author

gnufied commented Aug 29, 2020

/retest

@gnufied
Copy link
Contributor Author

gnufied commented Aug 30, 2020

/retest

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 1, 2020
@gnufied gnufied force-pushed the make-controller-expansion-idempotent branch 2 times, most recently from bb38bad to a98b32a Compare September 9, 2020 15:32
@gnufied
Copy link
Contributor Author

gnufied commented Sep 10, 2020

/retest

@gnufied
Copy link
Contributor Author

gnufied commented Sep 10, 2020

/assign @bertinatto @jsafrane

pkg/cloud/cloud.go Outdated Show resolved Hide resolved
if latestMod != nil && modFetchError == nil {
state := aws.StringValue(latestMod.ModificationState)
if state == ec2.VolumeModificationStateModifying {
return oldSizeGiB, fmt.Errorf("volume %q is still being expanded to size %d", volumeID, newSizeGiB)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to make this function idempotent, we shouldn't return an error if the volume modification is in progress, no?

It feels like it should return success if the volume is already being modified to newSizeGiB, but the volume is being modified to a value different than newSizeGiB, then it's OK to return an error?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that doesn't seem right either, then it would return that resize has succeeded which is a lie.

attach/controllerpublish behaviour is to wait for the attach to complete if it's found to be in progress. The same could be done here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep - was just about to push a change that made the whole RPC call block until operation is finished.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that doesn't seem right either, then it would return that resize has succeeded which is a lie.

That's a good point, but currently we're doing the same thing in the first call (because it considers the volume resized once the operation is in the Optimizing state). So if Optimizing mean success for the first call, it should also mean success for the subsequent calls.

attach/controllerpublish behaviour is to wait for the attach to complete if it's found to be in progress. The same could be done here

+1

Yep - was just about to push a change that made the whole RPC call block until operation is finished.

By waiting for the modification to be in the Completed state? If so, can/should we change the behaviour in-tree as well, to be consistent with the CSI driver?

Copy link
Contributor Author

@gnufied gnufied Sep 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright modified so that each subsequent resize call will now wait for previous resize to finish before returning.

By waiting for the modification to be in the Completed state? If so, can/should we change the behaviour in-tree as well, to be consistent with the CSI driver?

No not "Completed" state. I reverted the code to "optimizing" or "completed" check, because the main reason this was breaking was because describeVolume can return updated size before volume has really finished resizing. "optimized" vs "completed" state had no impact on the outcome.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

double checking, doc says size can take a few seconds after volume is in optimizing state, and I guess it does not matter because even if fs resize is attempted prematurely, the fs resize will just get retried and must eventually succeed.

"Size changes usually take a few seconds to complete and take effect after a volume is in the Optimizing state. "
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-volume-modifications.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't return size in fs resize, it just blindly tries to resize to the full disk size, whatever it may be at the time. So is it possible that we won't fs expand to the intended size if we don't wait?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shoot you are right.

I am probably misreading the doc, the correct meaning must be that "optimizing" means the size change has already taken effect and only the performance change is still in progress.

Copy link
Contributor Author

@gnufied gnufied Sep 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Size changes usually take a few seconds to complete and take effect after a volume is in the Optimizing state.

tbh - the wording here is imprecise in EBS docs. We have been told by EBS engineers that volume is "ready-to-use
after it enters "optimizing" state but if we wait for "VolumeModificationComplete" state then, it can take really long time to complete resize operation and afaict there is no intermediate state between "optimizing" and "complete" (although docs do imply that there is a intermediate state where size changes take effect few seconds after volume enters "optimizing" state).

pkg/cloud/cloud.go Show resolved Hide resolved
@bertinatto
Copy link
Member

/assign @leakingtapan @wongma7

@wongma7
Copy link
Contributor

wongma7 commented Sep 11, 2020

/retest

When returning successful for volume expansion requests we should
verify both volume size reported via DescribeVolume and pending volume
modifications requests
@gnufied gnufied force-pushed the make-controller-expansion-idempotent branch from a98b32a to bbf2ce6 Compare September 14, 2020 20:29
m, err := c.getLatestVolumeModification(ctx, volumeID)
if err != nil {
return 0, err
m, modFetchError := c.getLatestVolumeModification(ctx, volumeID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is checking the ModificationState immediately after calling ModifyVolume still necessary, won't waitForVolumeSize do the same thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed most of this code. Still keeping some of the check in VolumeModification object after modification, so as we can return success immediately if volume was previously expnaded to same size already..

@gnufied gnufied force-pushed the make-controller-expansion-idempotent branch from 9baf3f2 to 68686b1 Compare September 21, 2020 19:22
@wongma7
Copy link
Contributor

wongma7 commented Sep 21, 2020

lgtm
/retest

@gnufied
Copy link
Contributor Author

gnufied commented Sep 22, 2020

/retest

Now we are preventing such errors by checking volume
modifications first
@gnufied gnufied force-pushed the make-controller-expansion-idempotent branch from 68686b1 to 0f5ece6 Compare September 22, 2020 18:35
@gnufied
Copy link
Contributor Author

gnufied commented Sep 22, 2020

/retest

1 similar comment
@gnufied
Copy link
Contributor Author

gnufied commented Sep 22, 2020

/retest

@wongma7
Copy link
Contributor

wongma7 commented Sep 22, 2020

@bertinatto this look good to u?

Copy link
Member

@bertinatto bertinatto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 23, 2020
@k8s-ci-robot k8s-ci-robot merged commit e95bb1d into kubernetes-sigs:master Sep 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Attach volume fails after resizing completed
8 participants