Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Access modes for single node single/multi writer #476

Conversation

chrishenzie
Copy link
Contributor

@chrishenzie chrishenzie commented Apr 26, 2021

Access Modes

In the CSI spec there are conflicting definitions of the SINGLE_NODE_WRITER access mode. By definition, SINGLE_NODE_WRITER means "Can only be published once as read/write on a single node, at any given time." The problem is how this access mode is used during NodePublishVolume, which is typically where volume mounting is performed.

The CSI spec defines that when NodePublishVolume is called a second time for a volume with a non-MULTI_NODE access mode and with a different target path, the plugin should return FAILED_PRECONDITION. For CSI plugins that strictly adhere to the spec, this guarantees that a volume can only be mounted to a single target path, which means SINGLE_NODE_WRITER restricts access to a single pod on a single node. This behavior conflicts with the original definition. Due to this conflict, we do not have an access mode that represents multiple writers on the same node. See #465 (comment) for more details.

The proposed solution for this issue is to introduce two new access modes: SINGLE_NODE_SINGLE_WRITER and SINGLE_NODE_MULTI_WRITER. These access modes exist to replace SINGLE_NODE_WRITER and disambiguate the number of writers on a single node.

Capabilities

Included in this change are capability bits in the controller and node services to indicate that a driver understands these new access modes. This approach was chosen in order to preserve backwards compatibility. When a client calls a CSI driver that supports the SINGLE_NODE_MULTI_WRITER capability, the client should replace any uses of the SINGLE_NODE_WRITER access mode with the appropriate SINGLE_NODE_{SINGLE,MULTI}_WRITER access mode depending on their use case.

@chrishenzie chrishenzie force-pushed the single-node-access-modes branch 2 times, most recently from 62b8ef5 to 5156940 Compare April 28, 2021 22:54
spec.md Outdated
| | T1=T2, P1=P2 | T1=T2, P1!=P2 | T1!=T2, P1=P2 | T1!=T2, P1!=P2 |
|--------------------------------|-----------------|----------------|---------------------|---------------------|
| SINGLE_NODE_SINGLE_WRITER | OK (idempotent) | ALREADY_EXISTS | FAILED_PRECONDITION | FAILED_PRECONDITION |
| Non SINGLE_NODE_SINGLE_WRITER | OK (idempotent) | ALREADY_EXISTS | OK | OK |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did Non SINGLE_NODE_SINGLE_WRITER meant -> SINGLE_NODE_MULTI_WRITER ? We are saying above that - this table applies for SINGLE_NODE_MULTI_WRITER but we don't use that word in the table - so it is bit confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I went back and forth on the wording here. I had a row for SINGLE_NODE_MULTI_WRITER but realized I would probably also need a row for "Everything else" but it would be treated in the same way as SINGLE_NODE_MULTI_WRITER so I combined them into the "Non" row. I think having a row for this to make it explicit is a good idea, I'll add this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@chrishenzie chrishenzie force-pushed the single-node-access-modes branch 2 times, most recently from 1bbe297 to b17438c Compare May 7, 2021 17:01
@chrishenzie chrishenzie force-pushed the single-node-access-modes branch 2 times, most recently from d77872e to eaeb3d3 Compare May 10, 2021 22:01
@chrishenzie
Copy link
Contributor Author

@bswartz Would you be able to take a look at this? Thank you!

spec.md Outdated
|---------------------------------------|-----------------|----------------|---------------------|---------------------|
| SINGLE_NODE_SINGLE_WRITER | OK (idempotent) | ALREADY_EXISTS | FAILED_PRECONDITION | FAILED_PRECONDITION |
| SINGLE_NODE_MULTI_WRITER | OK (idempotent) | ALREADY_EXISTS | OK | OK |
| Non SINGLE_NODE_{SINGLE,MULTI}_WRITER | OK (idempotent) | ALREADY_EXISTS | OK | OK |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This catch all isn't clear to me. It suggests that everything else should succeed, but I think that multi node should succeed and non-multi-node should not. Either we should expand this second table to cover all the cases, or explicitly refer back to the first table in the 3rd row.

Also don't forget that even for SPs that support the new capability, older COs could continue to use SINGLE_NODE_WRITER and SPs are required to continue to support the old semantics forever alongside the new ones (unless we go through format deprecation).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, expanded the table to include the MULTI_NODE and Non MULTI_NODE in the above table, and left a note requiring continued support for SINGLE_NODE_WRITER

@chrishenzie chrishenzie force-pushed the single-node-access-modes branch 2 times, most recently from e6f921a to 2595556 Compare May 12, 2021 01:31
spec.md Outdated
@@ -887,6 +887,14 @@ message VolumeCapability {
// Can be published as read/write at multiple nodes
// simultaneously.
MULTI_NODE_MULTI_WRITER = 5;

// Can only be published once as read/write at a single worklad on
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo in word worklad.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


// Indicates the SP supports the SINGLE_NODE_MULTI_WRITER access
// mode.
SINGLE_NODE_MULTI_WRITER = 13 [(alpha_enum_value) = true];
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this exists as a control-plane capability?

Copy link
Contributor Author

@chrishenzie chrishenzie May 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COs require this capability in order to determine which access mode to supply for controller service operations that take VolumeCapability, like CreateVolume(). This capability will be used in the same way as the capability on the node service.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrishenzie just to understand in bit more depth, but for COs with controller service operations, its just about knowing SINGLE_NODE or MULTI_NODE , isnt it ? in that sense, writer thing ( single or multiple) is not relevant for CO's control plane operations ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the number of writers here does not make a difference.

@@ -2232,6 +2244,17 @@ The following table shows what the Plugin SHOULD return when receiving a second
| MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | OK | OK |
| Non MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | FAILED_PRECONDITION | FAILED_PRECONDITION|

NOTE: If the Plugin supports the `SINGLE_NODE_MULTI_WRITER` capability, use the following table instead for what the Plugin SHOULD return when receiving a second `NodePublishVolume` on the same volume on the same node:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think we might have a problem here - can you query control-plane capabilities from node service?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The node service has an equivalent capability bit that is used for determining which access mode to supply for node operations that take VolumeCapability.

@bswartz
Copy link
Contributor

bswartz commented May 12, 2021

/lgtm

| Non MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | FAILED_PRECONDITION | FAILED_PRECONDITION |

The `SINGLE_NODE_SINGLE_WRITER` and `SINGLE_NODE_MULTI_WRITER` access modes are intended to replace the `SINGLE_NODE_WRITER` access mode in order to clarify the number of writers for a volume on a single node. Plugins MUST continue to support `SINGLE_NODE_WRITER` to allow older COs to continue working.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to explicitly deprecate SINGLE_NODE_WRITER at some point? I guess we need to wait for new modes to go GA before we can deprecate older one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a good idea, but there are no deprecation plans as of yet.

@gnufied
Copy link
Contributor

gnufied commented May 12, 2021

lgtm

spec.md Outdated
| MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | OK | OK |
| Non MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | FAILED_PRECONDITION | FAILED_PRECONDITION |

The `SINGLE_NODE_SINGLE_WRITER` and `SINGLE_NODE_MULTI_WRITER` access modes are intended to replace the `SINGLE_NODE_WRITER` access mode in order to clarify the number of writers for a volume on a single node. Plugins MUST continue to support `SINGLE_NODE_WRITER` to allow older COs to continue working.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plugins MUST continue to support SINGLE_NODE_WRITER to allow older COs to continue working.

Can you detail what "support" means here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, here is the new wording

Plugins MUST continue to accept and allow use of the SINGLE_NODE_WRITER access mode to support older COs to continue working.

@@ -2232,6 +2244,17 @@ The following table shows what the Plugin SHOULD return when receiving a second
| MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | OK | OK |
| Non MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | FAILED_PRECONDITION | FAILED_PRECONDITION|

NOTE: If the Plugin supports the `SINGLE_NODE_MULTI_WRITER` capability, use the following table instead for what the Plugin SHOULD return when receiving a second `NodePublishVolume` on the same volume on the same node:

| | T1=T2, P1=P2 | T1=T2, P1!=P2 | T1!=T2, P1=P2 | T1!=T2, P1!=P2 |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can you add an explanation before or after these tables explaining what T1, T2, P1, P2 are in this context (I realize existing table doesn't do this, but let's fix it while we're here).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. There was a little section on that below the table. I moved it to the top of the first table so it's the first thing the user sees.


// Can be published as read/write at multiple workloads on a
// single node simultaneously.
SINGLE_NODE_MULTI_WRITER = 7 [(alpha_enum_value) = true];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with this change, I am trying to reason on MULTI_NODE_SINGLE_WRITER. There are two possible interoperation

  1. only a single pod from a node can access it,
  2. multiple pods on the same single node can access it

From the comments of this mode, it seems refer to 2. Is that the intention? Do we want to support both? @saad-ali

// Can be published at multiple nodes simultaneously. Only one of
// the node can be used as read/write. The rest will be readonly.

Copy link
Contributor Author

@chrishenzie chrishenzie May 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think because the name mentions multiple nodes, it's saying this volume is attachable to multiple machines and readable by multiple machines, but only one workload on one of the machines has write access.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't see it mentioned "one workload on one of the machines has write access" in the doc?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm 99% sure that Kubernetes never uses the MULTI_NODE_SINGLE_WRITER access mode for anything, so while it suffers from the same vagueness issues as other single writer access modes, nobody actually experiences problems related to this. Ideally we could sharpen up whether the single writer for that access mode is a single workload or all the workloads on a single node, but I don't see that as a priority.

Copy link
Member

@saad-ali saad-ali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

mod nit

@ddebroy can you PTAL?

spec.md Outdated
| MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | OK | OK |
| Non MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | FAILED_PRECONDITION | FAILED_PRECONDITION |

The `SINGLE_NODE_SINGLE_WRITER` and `SINGLE_NODE_MULTI_WRITER` access modes are intended to replace the `SINGLE_NODE_WRITER` access mode to clarify the number of writers for a volume on a single node. Plugins MUST continue to accept and allow use of the `SINGLE_NODE_WRITER` access mode to support older COs to continue working.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The `SINGLE_NODE_SINGLE_WRITER` and `SINGLE_NODE_MULTI_WRITER` access modes are intended to replace the `SINGLE_NODE_WRITER` access mode to clarify the number of writers for a volume on a single node. Plugins MUST continue to accept and allow use of the `SINGLE_NODE_WRITER` access mode to support older COs to continue working.
The `SINGLE_NODE_SINGLE_WRITER` and `SINGLE_NODE_MULTI_WRITER` access modes are intended to replace the `SINGLE_NODE_WRITER` access mode to clarify the number of writers for a volume on a single node. Plugins MUST continue to accept and allow use of the `SINGLE_NODE_WRITER` access mode to permit older COs to continue working.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed this.

Copy link
Contributor

@ddebroy ddebroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one suggestion for a potential clarification. Otherwise LGTM.

spec.md Outdated
@@ -2227,12 +2239,23 @@ If this RPC failed, or the CO does not know if it failed or not, it MAY choose t
This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has MULTI_NODE capability (i.e., `access_mode` is either `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the the volume has MULTI_NODE capability bit in the sentence above be amended to also include SINGLE_NODE_MULTI_WRITER?

Copy link
Contributor Author

@chrishenzie chrishenzie May 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is a good idea. Added, here is the amended sentence. Is this understandable?

This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different target_path and/or other arguments if the volume has MULTI_NODE or SINGLE_NODDE_MULTI_WRITER (see second table) access modes. Example MULTI_NODE access modes include MULTI_NODE_READER_ONLY, MULTI_NODE_SINGLE_WRITER or MULTI_NODE_MULTI_WRITER.

@chrishenzie chrishenzie force-pushed the single-node-access-modes branch 2 times, most recently from 7456c03 to fb7cd94 Compare May 26, 2021 15:55
@jdef
Copy link
Member

jdef commented May 26, 2021

thanks for putting this together, looking now..

spec.md Outdated
@@ -2224,15 +2236,26 @@ If the volume corresponding to the `volume_id` has already been published at the

If this RPC failed, or the CO does not know if it failed or not, it MAY choose to call `NodePublishVolume` again, or choose to call `NodeUnpublishVolume`.

This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has MULTI_NODE capability (i.e., `access_mode` is either `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`).
This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has MULTI_NODE or SINGLE_NODDE_MULTI_WRITER (see second table) access modes. Example MULTI_NODE access modes include `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has MULTI_NODE or SINGLE_NODDE_MULTI_WRITER (see second table) access modes. Example MULTI_NODE access modes include `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`.
This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has MULTI_NODE or SINGLE_NODE_MULTI_WRITER (see second table) access modes. Example MULTI_NODE access modes include `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this


// Can only be published once as read/write at a single workload
// on a single node, at any given time.
SINGLE_NODE_SINGLE_WRITER = 6 [(alpha_enum_value) = true];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CO's using the experimental capability SINGLE_NODE_MULTI_WRITER should probably avoid specifying SINGLE_NODE_WRITER, yes? if for no other reason than for clarity. if so, please mention that here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, COs should not use SINGLE_NODE_WRITER in this case. Added this to the description.


// Can be published as read/write at multiple workloads on a
// single node simultaneously.
SINGLE_NODE_MULTI_WRITER = 7 [(alpha_enum_value) = true];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chhsia0 any initial thoughts re: Mesos compatibility?

Copy link
Contributor

@chhsia0 chhsia0 Jun 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems okay at a first glance. At least this is the same as how Mesos shared persistent volumes work: http://mesos.apache.org/documentation/latest/shared-resources/

BTW I'm no longer actively involved in Mesos development; this is just my personal opinion.

spec.md Outdated
| MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | OK | OK |
| Non MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | FAILED_PRECONDITION | FAILED_PRECONDITION |

The `SINGLE_NODE_SINGLE_WRITER` and `SINGLE_NODE_MULTI_WRITER` access modes are intended to replace the `SINGLE_NODE_WRITER` access mode to clarify the number of writers for a volume on a single node. Plugins MUST continue to accept and allow use of the `SINGLE_NODE_WRITER` access mode to permit older COs to continue working.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The `SINGLE_NODE_SINGLE_WRITER` and `SINGLE_NODE_MULTI_WRITER` access modes are intended to replace the `SINGLE_NODE_WRITER` access mode to clarify the number of writers for a volume on a single node. Plugins MUST continue to accept and allow use of the `SINGLE_NODE_WRITER` access mode to permit older COs to continue working.
The `SINGLE_NODE_SINGLE_WRITER` and `SINGLE_NODE_MULTI_WRITER` access modes are intended to replace the `SINGLE_NODE_WRITER` access mode to clarify the number of writers for a volume on a single node. Plugins MUST accept and allow use of the `SINGLE_NODE_WRITER` access mode (subject to the processing rules above), when either `SINGLE_NODE_SINGLE_WRITER` and/or `SINGLE_NODE_MULTI_WRITER` are supported, in order to permit older COs to continue working.

^ is this more precise? or do I not understand what the goal is?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this should probably be called out somewhere in the generated .proto too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is useful, thank you.

  • I called this out on the SINGLE_NODE_MULTI_WRITER access mode capabilities in the node and controller services
  • I updated the SINGLE_NODE_MULTI_WRITER capability to explain it implies support for either or both access modes
  • The node service capability description contains "(subject to the processing rules for NodePublishVolume)", I left it out for the controller service capability description since these access modes are also used in other RPCs like CreateVolume where these rules don't apply

@chrishenzie chrishenzie force-pushed the single-node-access-modes branch 2 times, most recently from 0faba95 to 43405ec Compare May 26, 2021 21:43
spec.md Outdated
@@ -2224,15 +2246,26 @@ If the volume corresponding to the `volume_id` has already been published at the

If this RPC failed, or the CO does not know if it failed or not, it MAY choose to call `NodePublishVolume` again, or choose to call `NodeUnpublishVolume`.

This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has MULTI_NODE capability (i.e., `access_mode` is either `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`).
This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has MULTI_NODE or SINGLE_NODE_MULTI_WRITER (see second table) access modes. Example MULTI_NODE access modes include `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has MULTI_NODE or SINGLE_NODE_MULTI_WRITER (see second table) access modes. Example MULTI_NODE access modes include `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`.
This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has MULTI_NODE or SINGLE_NODE_MULTI_WRITER (see second table) access modes.
Example MULTI_NODE access modes include `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`.

^ nit: one sentence per line, please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

spec.md Outdated
| MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | OK | OK |
| Non MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | FAILED_PRECONDITION | FAILED_PRECONDITION |

The `SINGLE_NODE_SINGLE_WRITER` and `SINGLE_NODE_MULTI_WRITER` access modes are intended to replace the `SINGLE_NODE_WRITER` access mode to clarify the number of writers for a volume on a single node. Plugins MUST accept and allow use of the `SINGLE_NODE_WRITER` access mode (subject to the processing rules above), when either `SINGLE_NODE_SINGLE_WRITER` and/or `SINGLE_NODE_MULTI_WRITER` are supported, in order to permit older COs to continue working.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The `SINGLE_NODE_SINGLE_WRITER` and `SINGLE_NODE_MULTI_WRITER` access modes are intended to replace the `SINGLE_NODE_WRITER` access mode to clarify the number of writers for a volume on a single node. Plugins MUST accept and allow use of the `SINGLE_NODE_WRITER` access mode (subject to the processing rules above), when either `SINGLE_NODE_SINGLE_WRITER` and/or `SINGLE_NODE_MULTI_WRITER` are supported, in order to permit older COs to continue working.
The `SINGLE_NODE_SINGLE_WRITER` and `SINGLE_NODE_MULTI_WRITER` access modes are intended to replace the `SINGLE_NODE_WRITER` access mode to clarify the number of writers for a volume on a single node.
Plugins MUST accept and allow use of the `SINGLE_NODE_WRITER` access mode (subject to the processing rules above), when either `SINGLE_NODE_SINGLE_WRITER` and/or `SINGLE_NODE_MULTI_WRITER` are supported, in order to permit older COs to continue working.

one sentence per line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

| | T1=T2, P1=P2 | T1=T2, P1!=P2 | T1!=T2, P1=P2 | T1!=T2, P1!=P2 |
|---------------------------------------|-----------------|----------------|---------------------|---------------------|
| SINGLE_NODE_SINGLE_WRITER | OK (idempotent) | ALREADY_EXISTS | FAILED_PRECONDITION | FAILED_PRECONDITION |
| SINGLE_NODE_MULTI_WRITER | OK (idempotent) | ALREADY_EXISTS | OK | OK |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For many volume types - I suspect they may not support consuming same volume with different capability on same node. For example - would they allow mounting volume with different mount flags or would they allow consuming volume both as block and fs volume at same time?

So I am bit doubtful that if "P1!=P2" and "T1!=T2" - node-publish will be okay (assuming driver really enforces supplied capabilities). I suspect this will lead to drivers ignoring certain capabilities and allowing node-publish anyways. It may be okay, but strictly speaking they are not spec compliant then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We currently make the clarification that Pn refers to other arguments passed to NodePublishVolume except secrets. We could extend this to also include volume_capability. If we want to play it safe we could make this change just for the new table. Does this sound reasonable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the except secrets wording does not make it clear, if calling NodePublish multiple times on same node has to be done with same secret or secrets can be different. I always found that wording to be ambiguous, so when we add volume_capability to it - does it mean the capabilities has to be same? If yes - may be we should add more fields to the table. Something like:

T1!=T2, P1!=P2, Cap1!=Cap2 -> FAILED_PRECONDITION
T1!=T2, P1!=P2, Cap1==Cap2 -> OK

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed this in the CSI standup, we're adding a sentence to make this explicit to avoid adding another column and more rows to the table. Added this sentence:

COs SHOULD NOT call NodePublishVolume a second time with different volume_capability. If this happens, the Plugin SHOULD return FAILED_PRECONDITION.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bswartz for review on this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I like it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I think having explicit wording about capability helps. +1

@@ -420,6 +420,18 @@ message VolumeCapability {
// Can be published as read/write at multiple nodes
// simultaneously.
MULTI_NODE_MULTI_WRITER = 5;

// Can only be published once as read/write at a single workload
Copy link
Contributor

@humblec humblec May 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrishenzie can we mount this same volume as READONLY in the same node ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Access modes are meant to be a floor/minimum level of access requested. If a user wants a volume to have read-only ceiling/maximum level of access, that is enforced at NodePublish time using the readonly flag, which is completely orthagonal to access modes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, let me rephrase the clarification which I am looking for, suppose a SP driver declared only SINGLE_NODE_SINGLE_WRITER, can we still publish a volume in READONLY ? because minimum level of access is "allowed writing on single node" in this case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the choice whether to ControllerPublish or NodePublish any given volume readonly ALWAYS lies with the CO. What the access mode means is that the SP MUST support write access if the CO chooses to request it. Access mode is a promise that the SP makes to the CO, but the CO doesn't have to take advantage of it.

| Non MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | FAILED_PRECONDITION | FAILED_PRECONDITION |

The `SINGLE_NODE_SINGLE_WRITER` and `SINGLE_NODE_MULTI_WRITER` access modes are intended to replace the `SINGLE_NODE_WRITER` access mode to clarify the number of writers for a volume on a single node.
Plugins MUST accept and allow use of the `SINGLE_NODE_WRITER` access mode (subject to the processing rules above), when either `SINGLE_NODE_SINGLE_WRITER` and/or `SINGLE_NODE_MULTI_WRITER` are supported, in order to permit older COs to continue working.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need update NodePublish errors? Currently it says:

| Exceeds capabilities | 9 FAILED_PRECONDITION | Indicates that the CO has exceeded the volume's capabilities because the volume does not have MULTI_NODE capability. | Caller MAY choose to call ValidateVolumeCapabilities to validate the volume capabilities, or wait for the volume to be unpublished on the node. |

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to introduce another error if possible. Since the condition is capabilities being exceeded, I think this description could be generalized to any time a CO requests capabilities not supported by the volume.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use the error description for "Exceeds capabilities" used for ControllerExpandVolume and NodeExpandVolume which is more general:

Indicates that CO has specified capabilities not supported by the volume.

Also updated this for NodeStageVolume since we used the old wording there as well.

@saad-ali
Copy link
Member

saad-ali commented Jun 4, 2021

@jdef @humblec @gnufied PTAL

I'd like to merge and cut the next release on Monday.

Copy link
Member

@jdef jdef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM mod comments

spec.md Outdated
@@ -887,6 +887,18 @@ message VolumeCapability {
// Can be published as read/write at multiple nodes
// simultaneously.
MULTI_NODE_MULTI_WRITER = 5;

// Can only be published once as read/write at a single workload
// on a single node, at any given time. Should be used instead of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// on a single node, at any given time. Should be used instead of
// on a single node, at any given time. SHOULD be used instead of

caps .. make it an official recommendation, yes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for the heads up

spec.md Outdated
SINGLE_NODE_SINGLE_WRITER = 6 [(alpha_enum_value) = true];

// Can be published as read/write at multiple workloads on a
// single node simultaneously. Should be used instead of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// single node simultaneously. Should be used instead of
// single node simultaneously. SHOULD be used instead of

spec.md Outdated
@@ -2224,15 +2246,30 @@ If the volume corresponding to the `volume_id` has already been published at the

If this RPC failed, or the CO does not know if it failed or not, it MAY choose to call `NodePublishVolume` again, or choose to call `NodeUnpublishVolume`.

This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has MULTI_NODE capability (i.e., `access_mode` is either `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`).
This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has `MULTI_NODE` or `SINGLE_NODE_MULTI_WRITER` (see second table) access modes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has `MULTI_NODE` or `SINGLE_NODE_MULTI_WRITER` (see second table) access modes.
This RPC MAY be called by the CO multiple times on the same node for the same volume with a possibly different `target_path` and/or other arguments if the volume supports either `MULTI_NODE_...` or `SINGLE_NODE_MULTI_WRITER` access modes (see second table).

spec.md Outdated
@@ -2224,15 +2246,30 @@ If the volume corresponding to the `volume_id` has already been published at the

If this RPC failed, or the CO does not know if it failed or not, it MAY choose to call `NodePublishVolume` again, or choose to call `NodeUnpublishVolume`.

This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has MULTI_NODE capability (i.e., `access_mode` is either `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`).
This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has `MULTI_NODE` or `SINGLE_NODE_MULTI_WRITER` (see second table) access modes.
Example `MULTI_NODE` access modes include `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Example `MULTI_NODE` access modes include `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`.
The possible `MULTI_NODE_...` access modes are `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`.

spec.md Outdated
This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has MULTI_NODE capability (i.e., `access_mode` is either `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`).
This RPC MAY be called by the CO multiple times on the same node for the same volume with possibly different `target_path` and/or other arguments if the volume has `MULTI_NODE` or `SINGLE_NODE_MULTI_WRITER` (see second table) access modes.
Example `MULTI_NODE` access modes include `MULTI_NODE_READER_ONLY`, `MULTI_NODE_SINGLE_WRITER` or `MULTI_NODE_MULTI_WRITER`.
COs SHOULD NOT call `NodePublishVolume` a second time with different `volume_capability`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
COs SHOULD NOT call `NodePublishVolume` a second time with different `volume_capability`.
COs SHOULD NOT call `NodePublishVolume` a second time with a different `volume_capability`.

spec.md Outdated
The following table shows what the Plugin SHOULD return when receiving a second `NodePublishVolume` on the same volume on the same node:

(`Tn`: target path of the n-th `NodePublishVolume`, `Pn`: other arguments of the n-th `NodePublishVolume` except `secrets`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(`Tn`: target path of the n-th `NodePublishVolume`, `Pn`: other arguments of the n-th `NodePublishVolume` except `secrets`)
(**T<sub>n</sub>**: target path of the n<sup>th</sup> `NodePublishVolume`; **P<sub>n</sub>**: other arguments of the n<sup>th</sup> `NodePublishVolume` except `secrets`)

spec.md Outdated
|---------------------------------------|-----------------|----------------|---------------------|---------------------|
| SINGLE_NODE_SINGLE_WRITER | OK (idempotent) | ALREADY_EXISTS | FAILED_PRECONDITION | FAILED_PRECONDITION |
| SINGLE_NODE_MULTI_WRITER | OK (idempotent) | ALREADY_EXISTS | OK | OK |
| MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | OK | OK |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| MULTI_NODE | OK (idempotent) | ALREADY_EXISTS | OK | OK |
| MULTI_NODE_... | OK (idempotent) | ALREADY_EXISTS | OK | OK |

spec.md Outdated
@@ -2313,7 +2350,7 @@ The CO MUST implement the specified error recovery behavior when it encounters t
|-----------|-----------|-------------|-------------------|
| Volume does not exist | 5 NOT_FOUND | Indicates that a volume corresponding to the specified `volume_id` does not exist. | Caller MUST verify that the `volume_id` is correct and that the volume is accessible and has not been deleted before retrying with exponential back off. |
| Volume published but is incompatible | 6 ALREADY_EXISTS | Indicates that a volume corresponding to the specified `volume_id` has already been published at the specified `target_path` but is incompatible with the specified `volume_capability` or `readonly` flag. | Caller MUST fix the arguments before retrying. |
| Exceeds capabilities | 9 FAILED_PRECONDITION | Indicates that the CO has exceeded the volume's capabilities because the volume does not have MULTI_NODE capability. | Caller MAY choose to call `ValidateVolumeCapabilities` to validate the volume capabilities, or wait for the volume to be unpublished on the node. |
| Exceeds capabilities | 9 FAILED_PRECONDITION | Indicates that CO has specified capabilities not supported by the volume. | Caller MAY choose to call `ValidateVolumeCapabilities` to validate the volume capabilities, or wait for the volume to be unpublished on the node. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| Exceeds capabilities | 9 FAILED_PRECONDITION | Indicates that CO has specified capabilities not supported by the volume. | Caller MAY choose to call `ValidateVolumeCapabilities` to validate the volume capabilities, or wait for the volume to be unpublished on the node. |
| Exceeds capabilities | 9 FAILED_PRECONDITION | Indicates that the CO has specified capabilities not supported by the volume. | Caller MAY choose to call `ValidateVolumeCapabilities` to validate the volume capabilities, or wait for the volume to be unpublished on the node. |

@chrishenzie
Copy link
Contributor Author

@jdef Done, addressed comments. Thank you for the review!

// Can only be published once as read/write at a single workload
// on a single node, at any given time. SHOULD be used instead of
// SINGLE_NODE_WRITER for COs using the experimental
// SINGLE_NODE_MULTI_WRITER capability.
Copy link
Contributor

@humblec humblec Jun 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrishenzie this should be SINGLE_NODE_SINGLE_WRITER ? or we are kind of defaulting SINGLE_NODE_MULTI_WRITER as a replacement for SINGLE_NODE_WRITER ? if latter is the case, please ignore the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both SINGLE_NODE_SINGLE_WRITER and SINGLE_NODE_MULTI_WRITER are replacements for SINGLE_NODE_WRITER depending on the number of intended writers. The capability SINGLE_NODE_MULTI_WRITER implies support for SINGLE_NODE_MULTI_WRITER and/or SINGLE_NODE_SINGLE_WRITER. The reason for naming the capability like so was because it was the access mode that was missing from the CSI spec.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The capability SINGLE_NODE_MULTI_WRITER implies support for SINGLE_NODE_MULTI_WRITER and/or SINGLE_NODE_SINGLE_WRITER -> yeah, this aligns with my thought too or what I meant with replacement as a default in above comment, so we are good to mention this over over here ! Thanks @chrishenzie. 👍

@humblec
Copy link
Contributor

humblec commented Jun 7, 2021

lgtm from end. Thanks 👍

@saad-ali
Copy link
Member

saad-ali commented Jun 7, 2021

LGTM mod comments

lgtm from end. Thanks 👍

Thanks folks. Mergining this PR.

@saad-ali saad-ali merged commit 0cedd1b into container-storage-interface:master Jun 7, 2021
@chrishenzie chrishenzie deleted the single-node-access-modes branch June 7, 2021 19:13
@saad-ali saad-ali mentioned this pull request Jun 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants