
Add support for multi-writer PD #415

Merged
merged 13 commits into kubernetes-sigs:master on Aug 12, 2020

Conversation

@sschmitt (Contributor)

/kind feature

/cc @msau42

What this PR does / why we need it:
Adds multi-writer support (currently alpha in GCP).

Special notes for your reviewer:
There's a bit of code duplication due to alpha copies of certain methods. I found it challenging to reduce the duplication, but I'd be open to suggestions on how to refactor. On the other hand, it might just be a temporary situation until the APIs move to beta/GA.

Does this PR introduce a user-facing change?:

Add support for multi-writer raw block devices.

@k8s-ci-robot (Contributor)

Welcome @sschmitt!

It looks like this is your first PR to kubernetes-sigs/gcp-compute-persistent-disk-csi-driver 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/gcp-compute-persistent-disk-csi-driver has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 23, 2019
@k8s-ci-robot (Contributor)

Hi @sschmitt. Thanks for your PR.

I'm waiting for a kubernetes-sigs or kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 23, 2019
@msau42 (Contributor) commented Oct 23, 2019

/ok-to-test
/assign @davidz627

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 23, 2019
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 24, 2019
@@ -288,6 +297,66 @@ func (cloud *CloudProvider) insertRegionalDisk(ctx context.Context, volKey *meta
	return nil
}

func (cloud *CloudProvider) insertRegionalAlphaDisk(ctx context.Context, volKey *meta.Key, diskType string, capBytes int64, capacityRange *csi.CapacityRange, replicaZones []string, snapshotID, diskEncryptionKmsKey string, multiWriter bool) error {
	diskToCreateAlpha := &computealpha.Disk{
@davidz627 (Contributor) commented Oct 24, 2019
Thanks for doing this. It looks pretty good as-is, but I think there's also a way to de-dupe some of this copy-pasted code.

Assuming zonal here:

  1. Have one cloud.insertDisk that takes in all the parameters.
  2. Create a v1 disk with logic up to line 318 here.
  3. Make insertOp an interface{} type (this part is kind of nasty, suggestions welcome):

if multiWriter {
  alphadisk := convertV1DiskToV1AlphaDisk(disk) // you have to write this
  insertOp = cloud.alphaService.Insert(alphadisk)
} else {
  insertOp = cloud.service.Insert(disk)
}

Then the rest of the error handling logic is shared.

  4. Type-assert on insertOp and call the respective waitForOp on it.
  5. Share the other logic.

This would de-dupe most of the code and only have branching for the actual GCE API Insert call and the WaitForOp. Let me know what you think. Happy to discuss more.

/cc @msau42 @misterikkit
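
A minimal sketch of this proposal, for concreteness. The helper names (convertV1DiskToAlphaDisk, waitForZonalOp, waitForZonalAlphaOp) are assumptions for illustration, not the driver's confirmed API; the clients are the standard google.golang.org/api compute/v1 and compute/v0.alpha services:

// Sketch only: branch on multiWriter for the insert, share the error
// handling, then type-switch on the operation for the wait.
func (cloud *CloudProvider) insertZonalDiskSketch(ctx context.Context, zone string, disk *compute.Disk, multiWriter bool) error {
	var insertOp interface{}
	var err error
	if multiWriter {
		// MultiWriter exists only on the alpha Disk struct, so route
		// through the alpha service.
		alphaDisk := convertV1DiskToAlphaDisk(disk) // hypothetical converter
		alphaDisk.MultiWriter = true
		insertOp, err = cloud.alphaService.Disks.Insert(cloud.project, zone, alphaDisk).Context(ctx).Do()
	} else {
		insertOp, err = cloud.service.Disks.Insert(cloud.project, zone, disk).Context(ctx).Do()
	}
	if err != nil {
		return err // shared "alreadyExists" handling would live here
	}
	// Branch again only for the wait, via a type switch on the operation.
	switch op := insertOp.(type) {
	case *compute.Operation:
		return cloud.waitForZonalOp(ctx, op.Name, zone)
	case *computealpha.Operation:
		return cloud.waitForZonalAlphaOp(ctx, op.Name, zone)
	}
	return nil
}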

@sschmitt (Contributor, Author)

Thanks for the review. I notice the waitForOp methods only refer to op.Name. I wonder if it's okay to call the non-alpha operations API to check on the status of an alpha operation. My thinking there is that we're not leveraging alpha features of the operations API.
I'm not sure, so I'll err on the side of caution here. The solution you recommended works just fine.

@davidz627 (Contributor)

Ack, I didn't notice that. We could just grab the op name then and avoid the whole type-assertion thing.

@davidz627 (Contributor)

Wait... looked at it again, and we still might need to, since we're doing the op get via svc.ZoneOperations.Get(project, zone, op.Name).Context(ctx).Do(), and that's the v1 service. If we have an alpha-service op, maybe we can't "find it" unless we use a v1alpha.ZoneOperations call? Could you verify?

@sschmitt (Contributor, Author)

David, I ran some tests and found that the v1 operations API is able to get the status of beta and alpha operations.

This can easily be tested using the in-browser API Explorer:
https://cloud.google.com/compute/docs/reference/rest/beta/disks/insert
https://cloud.google.com/compute/docs/reference/rest/v1/zoneOperations/get

I think there also might be some opportunity to reduce the duplication between the Zonal and Regional methods, but I'll skip that for now. You can let me know your thoughts there. I'm happy to make additional changes.

In the meantime I'll address the rest of your comments and add another commit.

@davidz627 (Contributor)

Let's keep this change focused on adding the multi-writer stuff; feel free to open up a fix to reduce the zonal+regional duplication afterwards if you're interested. That would be really cool 👍

Ack on the ops. Let's de-dupe them, and make sure to add a comment in the code with that exact finding (so someone doesn't come in and waste cycles trying to "fix" it later).
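
A sketch of the converged version, under the same assumed names as above: branch only on the insert, record just the operation name, and share a single v1 wait, with the finding recorded as a code comment as requested:

func (cloud *CloudProvider) insertZonalDiskDeduped(ctx context.Context, zone string, diskToCreate *compute.Disk, multiWriter bool) error {
	var err error
	var opName string
	if multiWriter {
		alphaDisk := convertV1DiskToAlphaDisk(diskToCreate) // hypothetical converter
		alphaDisk.MultiWriter = true
		op, insertErr := cloud.alphaService.Disks.Insert(cloud.project, zone, alphaDisk).Context(ctx).Do()
		err = insertErr
		if op != nil {
			opName = op.Name
		}
	} else {
		op, insertErr := cloud.service.Disks.Insert(cloud.project, zone, diskToCreate).Context(ctx).Do()
		err = insertErr
		if op != nil {
			opName = op.Name
		}
	}
	if err != nil {
		return err // shared "alreadyExists" handling goes here
	}
	// NOTE: the v1 operations API can report the status of alpha and beta
	// operations (verified with the API Explorer above), so a single v1
	// wait on opName suffices -- don't "fix" this by adding an alpha wait.
	return cloud.waitForZonalOp(ctx, opName, zone)
}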

@davidz627 (Contributor) left a comment

Mostly looks good, just some comments about code de-dupe and maybe some missed pieces.

Could you please write some E2E tests for this too? You can see examples in:
test/e2e/tests/single_zone_e2e_test.go
and run them locally with:
test/run-e2e-local.sh

insertOp, err := cloud.alphaService.RegionDisks.Insert(cloud.project, volKey.Region, diskToCreateAlpha).Context(ctx).Do()
if err != nil {
	if IsGCEError(err, "alreadyExists") {
		disk, err := cloud.GetDisk(ctx, volKey)
@davidz627 (Contributor)

Does GetDisk need to be changed as well? If we get a disk that is multi-writer, what happens (since the field is only on the alpha struct)?

@davidz627 (Contributor)

You may be able to leverage CloudDisk here (originally created for a similar differing-API problem, for RePD).

@sschmitt (Contributor, Author)

Yes, GetDisk has to be updated. CloudDisk looks to be the way to go.
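One plausible shape for that wrapper, sketched under the assumption that a CloudDisk holds whichever struct the fetch produced; the driver's actual definition may differ:

// Assumed imports:
//   compute "google.golang.org/api/compute/v1"
//   computealpha "google.golang.org/api/compute/v0.alpha"
type CloudDisk struct {
	diskV1    *compute.Disk
	diskAlpha *computealpha.Disk
}

func (d *CloudDisk) GetMultiWriter() bool {
	// MultiWriter exists only on the alpha struct; a disk fetched via
	// the v1 API reports as non-multi-writer.
	if d.diskAlpha != nil {
		return d.diskAlpha.MultiWriter
	}
	return false
}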

		if err != nil {
			return err
		}
		err = cloud.ValidateExistingDisk(ctx, disk, diskType,
@davidz627 (Contributor)

Validation might also need to be changed to check whether the multiWriter field is equal.

@@ -346,14 +346,19 @@ func TestCreateVolumeArguments(t *testing.T) {
		},
	},
	{
-		name: "fail with MULTI_NODE_MULTI_WRITER capability",
+		name: "success with block/MULTI_NODE_MULTI_WRITER capabilities",
@davidz627 (Contributor)

What about fs/MULTI_NODE_MULTI_WRITER? Should we fail?

@mattcary (Contributor)

@nikhilkathare Any details here?

@nikhilkathare (Contributor) commented Jul 20, 2020

@mattcary: Hi Matt, fs/MULTI_NODE_MULTI_WRITER is handled above as the mount/MULTI_NODE_MULTI_WRITER capability.
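
To make the distinction concrete, a hedged sketch of the capability check being described: MULTI_NODE_MULTI_WRITER is honored for block access and rejected for mount (filesystem) access. The function name and error text are illustrative, not the driver's exact code.

// Assumed imports: "errors" and
//   csi "github.com/container-storage-interface/spec/lib/go/csi"
func getMultiWriterFromCapabilities(vcs []*csi.VolumeCapability) (bool, error) {
	for _, c := range vcs {
		if c.GetAccessMode().GetMode() == csi.VolumeCapability_AccessMode_MULTI_NODE_MULTI_WRITER {
			if c.GetBlock() != nil {
				// Block + MULTI_NODE_MULTI_WRITER: create the disk
				// with the alpha multi-writer flag set.
				return true, nil
			}
			// Mount + MULTI_NODE_MULTI_WRITER: not supported.
			return false, errors.New("MULTI_NODE_MULTI_WRITER is only supported for block access type")
		}
	}
	return false, nil
}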

@msau42 (Contributor) commented Oct 24, 2019

@davidz627 I think the E2E tests are going to require the boskos projects to be whitelisted before we can merge this.

@sschmitt (Contributor, Author)

> Mostly looks good, just some comments about code de-dupe and maybe some missed pieces.
>
> Could you please write some E2E tests for this too? You can see examples in:
> test/e2e/tests/single_zone_e2e_test.go
> and run them locally with:
> test/run-e2e-local.sh

@davidz627 E2E is a little tricky because this feature is still in Alpha. I also notice that the E2E tests are hardcoded for us-central1. Multi-writer PD is currently only available in us-east1-a in whitelisted projects.

Should I write E2E tests and leave them commented out or skip them?

@davidz627 (Contributor) commented Oct 29, 2019

> I also notice that the E2E tests are hardcoded for us-central1

This could probably be changed pretty easily - please do so unless it seems like it would be significant additional investment.

> in whitelisted projects.

I will take a look at how to resolve this and try get the projects all whitelisted - I will update this PR when I know more.

> Should I write E2E tests and leave them commented out or skip them?

Yes, have them skipped for now. However, you should be able to run the tests in your own whitelisted project[s].

Thanks!
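
A sketch of what a skipped case might look like under the repo's Ginkgo-style E2E tests; the test name and body are assumptions, not the PR's actual test:

It("Should create and delete multi-writer disk", func() {
	// Multi-writer PD is alpha and limited to whitelisted projects and
	// specific zones (e.g. us-east1-a), so keep this skipped in CI and
	// run it manually in a whitelisted project.
	Skip("multi-writer PD requires a whitelisted project")
})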

@sschmitt (Contributor, Author)

@davidz627 Perfect. Thanks for the guidance.

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 4, 2019
@@ -182,15 +182,22 @@ func (cloud *FakeCloudProvider) ValidateExistingDisk(ctx context.Context, resp *
		return fmt.Errorf("disk already exists with incompatible type. Need %v. Got %v",
			diskType, respType[len(respType)-1])
	}

	// We are assuming here that a multiWriter disk could be used as non-multiWriter
	if multiWriter && !resp.GetMultiWriter() {
@davidz627 (Contributor)

How about the other way around?

@sschmitt (Contributor, Author)

The other way around would be when the existing disk is enabled for multi-writer but the user didn't ask for that capability. I assume here that the user would be okay with that.

It's actually quite challenging to check the opposite, because the user might not have alpha API access. I suppose we could first try alpha and, if that fails, fall back to v1.

@@ -298,7 +305,7 @@ func (gceCS *GCEControllerServer) ControllerPublishVolume(ctx context.Context, r
		PublishContext: nil,
	}

-	_, err = gceCS.CloudProvider.GetDisk(ctx, volKey)
+	_, err = gceCS.CloudProvider.GetDisk(ctx, volKey, gce.V1)
@davidz627 (Contributor)

Why is this definitely V1? Couldn't it be an alpha multi-writer disk that we want to get here?

@sschmitt (Contributor, Author)

The v1 API returns disks that use alpha features; it just doesn't carry any information about those features. Here the API call is used to detect the existence of the disk: the disk itself is thrown away and only the error code is parsed, so I didn't see a need to use anything other than v1.
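
Roughly what that pattern looks like around the GetDisk call in the diff; the gRPC status handling here is an assumed sketch, not the driver's verbatim code:

// Only existence is probed, so the v1 view suffices even for alpha
// multi-writer disks; the returned disk is discarded.
_, err = gceCS.CloudProvider.GetDisk(ctx, volKey, gce.V1)
if err != nil {
	if gce.IsGCEError(err, "notFound") {
		return nil, status.Errorf(codes.NotFound, "Could not find disk %v: %v", volKey.Name, err)
	}
	return nil, status.Errorf(codes.Internal, "Unknown get disk error: %v", err)
}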

alphaDiskToCreate := convertV1DiskToAlphaDisk(diskToCreate)
alphaDiskToCreate.MultiWriter = multiWriter
insertOp, err = cloud.alphaService.RegionDisks.Insert(cloud.project, volKey.Region, alphaDiskToCreate).Context(ctx).Do()
if err == nil {
@davidz627 (Contributor)

nit: if insertOp != nil instead?

var (
	err        error
	opName     string
	apiVersion = V1
@davidz627 (Contributor)

Let's call this gceAPIVersion to disambiguate.

@sschmitt (Contributor, Author)

@davidz627 Are there any further concerns? You had a question here; I wasn't sure if you saw my response.


@@ -257,7 +257,11 @@ func testLifecycleWithVerify(volID string, volName string, instance *remote.Inst
	if secondMountVerify != nil {
		// Mount disk somewhere else
		secondPublishDir := filepath.Join("/tmp/", volName, "secondmount")
@mattcary (Contributor)

Sorry if I'm missing something from the test setup, but will this test multizonal shared PD?

@nikhilkathare (Contributor)

Shared PD is not yet supported on multizone. This fix was added to handle the testLifecycleWithVerify function with useBlock=true.

@mattcary (Contributor)

Thanks for clarifying.

@nikhilkathare (Contributor)

/assign @mattcary

@mattcary (Contributor)

Thanks for resurrecting this PR, Nikhil. It looks good to me, just a couple of general questions:

  1. The multi-writer PD API is beta now, but I think continuing to work through the alpha API here makes sense in case there are new alpha-only features we end up having to use. But maybe not; WDYT?

  2. I believe that in beta only SCSI is supported by PD, but by GA it will be NVMe only. Do you see that causing any problems?

@nikhilkathare (Contributor)

Hi Matt, thanks for your review. Responses to your questions:

  1. Agreed. Let us go ahead and use the alpha APIs for now. Beta API changes can be done later as required.
  2. We don't see any changes required for NVMe, but if necessary we can make them after testing on NVMe PDs when they become available.

@mattcary (Contributor)

> Hi Matt, thanks for your review. Responses to your questions:
>
>   1. Agreed. Let us go ahead and use the alpha APIs for now. Beta API changes can be done later as required.
>   2. We don't see any changes required for NVMe, but if necessary we can make them after testing on NVMe PDs when they become available.

Cool, thanks for the info.

@mattcary (Contributor)

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 29, 2020
@nikhilkathare (Contributor)

/cc @msau42 @mattcary We're not clear whether anything is pending from our side. Could you let us know when this PR will be approved and merged?

@msau42 (Contributor) commented Aug 6, 2020

@nikhilkathare this lgtm! Thanks for picking this up! One last thing: would you be able to squash the commits to reduce some of the "merge branch" and "address comments" commits? If you haven't tried it before, I'd suggest practicing in another branch before doing it here. Take a look here for pointers on how to do the squashing.

@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Aug 9, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 9, 2020
@nikhilkathare (Contributor)

/cc @msau42 @mattcary Thanks for reviewing the code. I tried to squash the commits using rebase, but I could only squash the final five commits, which were mine. When I tried to squash further, the rebase errored out with conflicts while applying commits and exited for manual resolution. Because the commits in this PR were made over a long period, many other commits landed in between, and the squash can't resolve them. Should we create a new branch and bring the change over from this branch, or is there a better approach? Let us know which approach you'd like us to take.

@msau42 (Contributor) commented Aug 10, 2020

Hi @nikhilkathare, can you try just squashing your own commits then? It's not a big deal if we can't get it to work; it's mainly to clean up the commit listing a little bit.

@nikhilkathare (Contributor)

@msau42 Thanks for the quick response. Yes, I have squashed my five changes into one and pushed; this removed the lgtm tag from the PR. No further squashing can be done, as there is a merge (containing other commits) between our PR commits.

@msau42 (Contributor) commented Aug 11, 2020

/lgtm
/approve

Thank you so much!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 11, 2020
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: msau42, sschmitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 11, 2020
@msau42 (Contributor) commented Aug 11, 2020

/retest

2 similar comments
@msau42 (Contributor) commented Aug 11, 2020

/retest

@msau42 (Contributor) commented Aug 12, 2020

/retest

@k8s-ci-robot k8s-ci-robot merged commit 58011dc into kubernetes-sigs:master Aug 12, 2020