Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add force detach #477

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 35 additions & 0 deletions csi.proto
Original file line number Diff line number Diff line change
Expand Up @@ -788,6 +788,17 @@ message ControllerUnpublishVolumeRequest {
// This field is OPTIONAL. Refer to the `Secrets Requirements`
// section on how to use this field.
map<string, string> secrets = 3 [(csi_secret) = true];

// Indicates SP MUST make the volume inaccessible to the node or nodes
// it is being unpublished from. Any attempt to read or write data
// to a volume from a node that has been fenced MUST NOT succeed,
// even if the volume remains staged and/or published on the node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are a couple of pods accessing the volume on same node, we would have a state where more than one publish and just one stage. Considering the nodeunpublish request arrives for a particular volume handle and if its fenced on stage level, its not the desired outcome we intended here with this enhancement/spec, Isnt it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the normal set of NodeUnpublish and NodeUnstage calls still have to be made for each volume on each node to return to the original state. The trigger for the quarantined state is the ControllerUnpublish call with the fence flag. The above statement just means that, without communicating with the node, the SP is expected to render the volume inaccessible to the node regardless of the node's state, and the node will be forced to clean up under conditions where it can't access the volume. The only way the node will be able to access the volume again is for the node-side cleanup to complete and then for a subsequent ControllerPublish to happen for that node again.

// CO MUST NOT set this field to true unless SP has the
// UNPUBLISH_FENCE controller capability.
// The SP MAY make the volume inaccessible even when this field is
// false.
// This is an OPTIONAL field.
bool fence = 4 [(alpha_field) = true];
}

message ControllerUnpublishVolumeResponse {
Expand Down Expand Up @@ -1079,6 +1090,10 @@ message ControllerServiceCapability {
// SINGLE_NODE_SINGLE_WRITER and/or SINGLE_NODE_MULTI_WRITER are
// supported, in order to permit older COs to continue working.
SINGLE_NODE_MULTI_WRITER = 13 [(alpha_enum_value) = true];

// Indicates the SP supports the
// ControllerUnpublishVolume.fence field.
UNPUBLISH_FENCE = 14 [(alpha_enum_value) = true];
}

Type type = 1;
Expand Down Expand Up @@ -1316,6 +1331,13 @@ message NodeUnstageVolumeRequest {
// system/filesystem, but, at a minimum, SP MUST accept a max path
// length of at least 128 bytes.
string staging_target_path = 2;

// Indicates that the SP should prefer to successfully unstage the
// volume, even if data loss would occur as a result.
// CO MUST NOT set this field to true unless SP has the
// FORCE_UNPUBLISH node capability.
// This in an OPTIONAL field.
bool force = 3 [(alpha_field) = true];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what driver will do differently with force is set to true or false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Force would suggest that a "umount -f" would be acceptable rather than a "umount". Generally umount will not succeed if buffered data would be lost because the underlying block device is unwritable. The -f flag tells umount to not worry about this and just kill the mount.
There can be similar checks at lower levels that might ordinarily fail if they can't be done "safely" (i.e. without losing data) that should be skipped in the presence of the force flag.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It case it's not clear, the REASON for this is because if you've already cut off the node from the storage and moved the workload elsewhere (because the CO falsely detected the node was down and used the fence operation), then flushing the buffered data would result in corruption, so it's correct to discard it, but the SP cannot know this fact, so the flag gives the CO a way to communicate it. Ordinarily SPs are expected to fail unpublish/unstage if they can't be done safely because not failing in those cases could cause data corruption for a different reason.

}

message NodeUnstageVolumeResponse {
Expand Down Expand Up @@ -1400,6 +1422,13 @@ message NodeUnpublishVolumeRequest {
// system/filesystem, but, at a minimum, SP MUST accept a max path
// length of at least 128 bytes.
string target_path = 2;

// Indicates that the SP should prefer to successfully unpublish the
// volume, even if data loss would occur as a result.
// CO MUST NOT set this field to true unless SP has the
// FORCE_UNPUBLISH node capability.
// This in an OPTIONAL field.
bool force = 3 [(alpha_field) = true];
}

message NodeUnpublishVolumeResponse {
Expand Down Expand Up @@ -1527,6 +1556,12 @@ message NodeServiceCapability {
// with provided volume group identifier during node stage
// or node publish RPC calls.
VOLUME_MOUNT_GROUP = 6 [(alpha_enum_value) = true];

// Indicates that the Node Plugin supports the
// NodeUnpublishVolume.force field. Also indicates that the
// Node Plugin supports the NodeUnstageVolume.force field if
// it also has the STAGE_UNSTAGE_VOLUME capability.
FORCE_UNPUBLISH = 7 [(alpha_enum_value) = true];
}

Type type = 1;
Expand Down
Loading