KEP-3698: Multi-Network requirements
Showing 10 changed files with 395 additions and 0 deletions.
375 changes: 375 additions & 0 deletions
keps/sig-network/3698-multi-network-requirements/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,375 @@ | ||
# KEP-3698: Multi-Network Requirements | ||
|
||
<!-- toc --> | ||
- [Summary](#summary) | ||
- [Motivation](#motivation) | ||
- [Goals](#goals) | ||
- [Non-Goals](#non-goals) | ||
- [Proposal](#proposal) | ||
- [Personas](#personas) | ||
- [Network Administrator](#network-administrator) | ||
- [User](#user) | ||
- [Terminology](#terminology) | ||
- [User Stories](#user-stories) | ||
- [Story #1](#story-1) | ||
- [Story #1a](#story-1a) | ||
- [Story #2](#story-2) | ||
- [Story #3](#story-3) | ||
- [Story #4](#story-4) | ||
- [Story #5](#story-5) | ||
- [Story #6](#story-6) | ||
- [Story #7](#story-7) | ||
- [Story #8](#story-8) | ||
- [Story #9](#story-9) | ||
- [Requirements](#requirements) | ||
- [Phase I (base API and reference in Pod)](#phase-i-base-api-and-reference-in-pod) | ||
- [Phase II (scheduler, kubelet and API probing)](#phase-ii-scheduler-kubelet-and-api-probing) | ||
- [Phase III (basic Kubernetes features integration)](#phase-iii-basic-kubernetes-features-integration) | ||
- [Phase IV (extended functionality)](#phase-iv-extended-functionality) | ||
- [Risks and Mitigations](#risks-and-mitigations) | ||
- [Design Details](#design-details) | ||
- [Graduation Criteria](#graduation-criteria) | ||
- [Alpha](#alpha) | ||
- [Beta](#beta) | ||
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) | ||
- [Feature Enablement and Rollback](#feature-enablement-and-rollback) | ||
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) | ||
- [Monitoring Requirements](#monitoring-requirements) | ||
- [Dependencies](#dependencies) | ||
- [Implementation History](#implementation-history) | ||
- [Drawbacks](#drawbacks) | ||
- [Alternatives](#alternatives) | ||
<!-- /toc --> | ||
|
||
## Summary | ||
|
||
Today, Kubernetes networking is straightforward: the main requirement is to | ||
enable connectivity between the Pods in the cluster. That simple approach | ||
satisfies most Kubernetes customers, but it is not sufficient for cases where | ||
more complex networking is required: | ||
* some applications leverage different isolated networks, exposed through | ||
different interfaces | ||
* some leverage performance oriented interfaces (e.g. AF_XDP, memif, SR-IOV), | ||
besides the regular management interface | ||
* others require support for specific protocols not yet supported in Kubernetes | ||
|
||
For such requirements we need a solution that allows a user to express this in | ||
the pod specification. This can be done by an external solution leveraging | ||
annotations, but having it incorporated in Kubernetes would be a much cleaner | ||
and safer approach that would allow better compatibility and consistency for | ||
workloads with these needs. | ||
|
||
This KEP is the entry-level document for the whole Multi-Network endeavor. Here | ||
we will define a set of requirements to follow. Additionally, we will introduce | ||
a phased approach where each phase will have its own KEP with detailed design. | ||
|
||
## Motivation | ||
|
||
We want to have a common API allowing us to define a catalog of different | ||
networks in the Kubernetes cluster. It would allow attaching a pod to one or | ||
several networks via a given type of interface depending on its connectivity | ||
or performance needs. | ||
|
||
### Goals | ||
|
||
Define user stories and requirements for the Multi-Network effort in Kubernetes. | ||
|
||
### Non-Goals | ||
|
||
Define API and implementation. | ||
|
||
## Proposal | ||
|
||
### Personas | ||
|
||
#### Network Administrator | ||
|
||
With this proposal we will introduce a new “persona”, called **Network | ||
Administrator**. This role will be responsible for defining and managing | ||
“Networks” (name TBD, referred to as “Object” in the Requirements section) that | ||
properly describe the infrastructure available for the cluster, namespace and | ||
workloads. This persona can define which users can “attach” to a specific | ||
“Network”. | ||
|
||
#### User | ||
|
||
The User is the consumer of Networks (name TBD, referred to as “Object” in the | ||
Requirements section), referencing them in their workloads. Users usually will | ||
not create or remove Networks on their own. | ||
|
||
### Terminology | ||
|
||
* **Default network** - This is the initial cluster-wide Pod networking provided | ||
during cluster creation that is available to the Pod when no additional | ||
networking configuration is provided. | ||
* **Primary network** - This is the network inside the Pod whose interface is | ||
used for the default gateway. | ||
|
||
### User Stories | ||
|
||
All user stories represent the type of use cases the multi-networking API should | ||
be able to support. References to technologies or exact products do not | ||
indicate that this API will directly support them. The set of requirements | ||
derived from these use cases is the final indicator of what will be covered | ||
by this API and effort. | ||
|
||
#### Story #1 | ||
As a Cloud Native Network Function (CNF) vendor I require additional | ||
interfaces to be provisioned into the Kubernetes Pod. Each of these interfaces | ||
has to be in an isolated network for regulatory compliance. The isolation has to | ||
be done at Layer 2. | ||
|
||
<p align="center"> | ||
<img src="mn-story-1.png?raw=true" alt="multi-network story 1 network L2 isolation"/> | ||
</p> | ||
|
||
#### Story #1a | ||
As a Cloud Native Network Function (CNF) vendor I require additional interfaces | ||
to be provisioned into the Kubernetes Pod. Each of these interfaces has to be in | ||
an isolated network for regulatory compliance. The isolation has to be done at | ||
Layer 3. | ||
|
||
<p align="center"> | ||
<img src="mn-story-1a.png?raw=true" alt="multi-network story 1a network L3 isolation"/> | ||
</p> | ||
|
||
#### Story #2 | ||
As a Cloud Native Network Function (CNF) vendor I require a HW-based interface | ||
(e.g. SR-IOV VF) to be provisioned to my workload Pod. I need to leverage that HW | ||
for performance purposes (high bandwidth, low latency) in a way that my user-space | ||
application (e.g. DPDK-based) can use. The VF will not use the standard netdev | ||
kernel module. The Pod’s scheduling to the nodes should be based on hardware | ||
availability (e.g. devicePlugin or some other way). | ||
|
||
<p align="center"> | ||
<img src="mn-story-2.png?raw=true" alt="multi-network story 2 HW interface"/> | ||
</p> | ||
|
||
#### Story #3 | ||
I have implemented my Kubernetes cluster networking using a virtual switch. In | ||
this implementation I am capable of creating isolated Networks. I need a means | ||
to express which Network my workloads connect to. | ||
|
||
<p align="center"> | ||
<img src="mn-story-3.png?raw=true" alt="multi-network story 3 virtual switch isolation"/> | ||
</p> | ||
|
||
#### Story #4 | ||
As a provider of a Virtual Machine-based compute platform that runs on top of | ||
Kubernetes and KubeVirt, I require multi-tenancy. The isolation has to be | ||
achieved at Layer 2 for security reasons. | ||
|
||
<p align="center"> | ||
<img src="mn-story-4.png?raw=true" alt="multi-network story 4 Kubevirt VMs"/> | ||
</p> | ||
|
||
#### Story #5 | ||
As a platform operator I need to connect my on-premise networks to my workload | ||
Pods. I need to have the ability to represent these networks in my Kubernetes | ||
cluster in such a way that I can easily use them in my workloads. | ||
|
||
<p align="center"> | ||
<img src="mn-story-5.png?raw=true" alt="multi-network story 5 On-Premise Network representation"/> | ||
</p> | ||
|
||
#### Story #6 | ||
As a Kubernetes cluster administrator I wish to isolate workloads based on | ||
namespaces and network access by assigning a different default Network to a | ||
Namespace. I do not want the tenants to change their manifests for that purpose. | ||
Those workloads should have the same level of Kubernetes functionality: | ||
Services, NetworkPolicies, access to the Kubernetes API. I also wish to support | ||
“profiles” for a Namespace, where I can not only change the default Network | ||
but also define a set of Networks assigned to a given Namespace, to which Pods | ||
created in that Namespace are automatically attached. | ||
|
||
<p align="center"> | ||
<img src="mn-story-6.png?raw=true" alt="multi-network story 6 namespace networks profiles"/> | ||
</p> | ||
|
||
#### Story #7 | ||
As a “Power User” with admin privileges I wish to have the ability to modify my | ||
Pod network namespace without any restrictions. I am aware that by doing this I | ||
might break established contracts for Kubernetes features. | ||
|
||
#### Story #8 | ||
As a provider of a Virtual Machine-based compute platform that runs on top of | ||
Kubernetes and KubeVirt, I need the ability to add/remove Network Interfaces | ||
to/from existing VMs without re-creating the VM. This would not be applicable to | ||
the Primary Interface of the Pod. | ||
|
||
<p align="center"> | ||
<img src="mn-story-8.png?raw=true" alt="multi-network story 8 network interface hot-plug"/> | ||
</p> | ||
|
||
#### Story #9 | ||
As an infrastructure administrator I wish to support IPv6 SLAAC in the | ||
Kubernetes Pods. SLAAC's nature is to dynamically assign IP addresses and change | ||
them over time. Those changes should be reflected in the Pod status. This would | ||
not be applicable to the Primary Interface of the Pod. | ||
|
||
### Requirements | ||
|
||
The requirements below are divided into phases. Each phase will produce a | ||
separate KEP with a detailed design for its set of requirements. Each of these | ||
KEPs will have to take all of the requirements, from every phase, into | ||
consideration when created. | ||
|
||
#### Phase I (base API and reference in Pod) | ||
1. This effort shall not change the behavior of existing clusters | ||
2. We need to introduce an “Object” that represents the existing | ||
infrastructure’s networking | ||
3. “Object” is decoupled from the network definition/description created by the | ||
network administrator | ||
4. The “Object” shall not itself define any implementation-specific | ||
parameters | ||
5. “Object” shall provide an option to define: | ||
* IPAM mode: external/internal | ||
* List of route prefixes - optional and not forced on the implementations | ||
6. “Object” can reference implementation-specific parameters | ||
7. “Object” is the consumer-facing declaration for workloads | ||
8. Cluster “Default” “Object” is the network the cluster has been created with | ||
9. Cluster “Default” “Object” cannot be changed or removed | ||
10. Pods shall reference the “Object” when trying to attach to specific | ||
networks | ||
* A Pod has to specify all the “Objects” that it wishes to attach to, | ||
including the Cluster “Default” “Object” | ||
11. Workloads can access an “Object” when “attach” RBAC exists at attachment time | ||
* “attach” RBAC is a new verb that allows referencing API objects | ||
12. The Pod reference to Networks is optional; when NOT specified, the Pod | ||
connects to the “Default” “Object” (the network the cluster has been created with) | ||
13. Pod shall be able to provide additional configuration on how it attaches to | ||
a network | ||
* Identify what “Object” is the Primary network | ||
* Optional parameters: MAC address, IP address, speed, MTU, interface name etc. | ||
14. All Pods connected to a specific network (represented by the “Object”) must | ||
have connectivity to each other over that network within the Cluster | ||
* A Pod connected to a specific network (represented by the “Object”) may or | ||
may not have cross connectivity to different networks (represented by | ||
other “Objects”) | ||
15. Pods attached to a network are connected to each other in a manner defined | ||
by the “Object” implementation | ||
16. Basic network interface information for each attachment will be exposed to | ||
the Pod at runtime (via e.g. environment variables, downward API etc.) | ||
|
||
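The Phase I requirements can be illustrated with a small Go sketch. This KEP deliberately does not define an API, so everything below is hypothetical: the type names `Network` and `Attachment`, their fields, the `DefaultNetworkName` constant, and the `ResolveAttachments` helper are assumptions made only for illustration. The sketch shows requirement 5 (IPAM mode and optional route prefixes), requirement 6 (a reference to implementation-specific parameters), requirement 10 (an explicit list is used exactly as given), and requirement 12 (an empty list falls back to the default network). Rejecting more than one primary attachment is an additional assumption, not a stated requirement.

```go
package main

import (
	"errors"
	"fmt"
)

// IPAMMode mirrors requirement 5: IPAM is either external or internal.
type IPAMMode string

const (
	IPAMExternal IPAMMode = "external"
	IPAMInternal IPAMMode = "internal"
)

// Network is a hypothetical stand-in for the "Object"; its real name and
// schema are left TBD by this KEP.
type Network struct {
	Name          string
	IPAM          IPAMMode
	RoutePrefixes []string // optional (requirement 5)
	ParametersRef string   // optional pointer to implementation-specific parameters (requirement 6)
}

// Attachment is a hypothetical per-Pod reference to a Network
// (requirements 10 and 13).
type Attachment struct {
	NetworkName   string
	Primary       bool
	InterfaceName string // optional
}

// DefaultNetworkName is an assumed name for the Cluster "Default" "Object".
const DefaultNetworkName = "default"

// ResolveAttachments returns the attachments a Pod ends up with.
// An empty reference list resolves to the default network (requirement 12).
// A non-empty list is used as-is, so the default network is only attached
// when it is listed explicitly (requirement 10).
func ResolveAttachments(refs []Attachment, known map[string]Network) ([]Attachment, error) {
	if len(refs) == 0 {
		return []Attachment{{NetworkName: DefaultNetworkName, Primary: true}}, nil
	}
	primaries := 0
	for _, ref := range refs {
		if _, ok := known[ref.NetworkName]; !ok {
			return nil, fmt.Errorf("unknown network %q", ref.NetworkName)
		}
		if ref.Primary {
			primaries++
		}
	}
	// Assumption: at most one attachment can provide the default gateway.
	if primaries > 1 {
		return nil, errors.New("at most one attachment may be marked primary")
	}
	return refs, nil
}

func main() {
	known := map[string]Network{
		DefaultNetworkName: {Name: DefaultNetworkName, IPAM: IPAMInternal},
		"blue":             {Name: "blue", IPAM: IPAMExternal, RoutePrefixes: []string{"10.10.0.0/16"}},
	}
	attachments, err := ResolveAttachments(nil, known)
	fmt.Println(attachments, err)
}
```

The “attach” RBAC check from requirement 11 is intentionally left out of this sketch; it would run before the references are resolved.
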
#### Phase II (scheduler, kubelet and API probing) | ||
17. Networks represented by “Object” can be selectively (per Node) available in | ||
the Cluster. This does NOT apply to the Cluster “Default” “Object” | ||
18. Kubelet network-based probing is optional for the “Object” connections to | ||
Pod | ||
19. “Object” connections to a Pod are optionally able to connect to the Kubernetes | ||
API - Pod connections via a non-default Pod network do not require access | ||
to the Kubernetes API | ||
20. The Kubernetes API can optionally reach “Object” connections to a | ||
Pod - Kubernetes API access to the Pod via a non-default Pod network is not | ||
required | ||
|
||
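Requirement 17 implies that scheduling has to take per-Node network availability into account. The following Go sketch shows one way such a filter could look. It is not a scheduler plugin and does not use any Kubernetes library: the `FeasibleNodes` function, the `DefaultNetworkName` constant, and the map-based representation of node availability are all hypothetical. The only behavior taken from the requirement is that the Cluster “Default” “Object” is treated as available on every Node.

```go
package main

import (
	"fmt"
	"sort"
)

// DefaultNetworkName is an assumed name for the Cluster "Default" "Object".
const DefaultNetworkName = "default"

// FeasibleNodes returns, in sorted order, the nodes on which every network
// requested by the Pod is available. nodeNetworks maps a node name to the set
// of non-default networks present on that node. The default network is
// skipped because requirement 17 exempts it from per-Node availability.
func FeasibleNodes(podNetworks []string, nodeNetworks map[string]map[string]bool) []string {
	var feasible []string
	for node, available := range nodeNetworks {
		ok := true
		for _, name := range podNetworks {
			if name == DefaultNetworkName {
				continue
			}
			if !available[name] {
				ok = false
				break
			}
		}
		if ok {
			feasible = append(feasible, node)
		}
	}
	sort.Strings(feasible)
	return feasible
}

func main() {
	nodes := map[string]map[string]bool{
		"node-a": {"blue": true},
		"node-b": {"blue": true, "red": true},
	}
	fmt.Println(FeasibleNodes([]string{DefaultNetworkName, "blue", "red"}, nodes))
}
```
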
#### Phase III (basic Kubernetes features integration) | ||
21. “Object” connections to a Pod are optionally able to provide Service and | ||
NetworkPolicy functionality | ||
|
||
#### Phase IV (extended functionality) | ||
22. Have the capability to override the Cluster “Default” “Object” at the namespace | ||
level | ||
23. Have the capability to add/remove attachments to/from running Pods | ||
24. Have the capability to add/remove IPs on running Pods’ network attachments | ||
|
||
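Requirement 22, together with Story #6, describes a default network that can differ per namespace without tenants changing their manifests. A minimal Go sketch of that lookup follows. The `EffectiveDefault` function, the `DefaultNetworkName` constant, and the idea of storing overrides in a plain map are assumptions for illustration only; how the override would actually be expressed is left to the Phase IV KEP.

```go
package main

import "fmt"

// DefaultNetworkName is an assumed name for the Cluster "Default" "Object".
const DefaultNetworkName = "default"

// EffectiveDefault returns the network a Pod in the given namespace attaches
// to when it references no network itself. A namespace-level override, if
// present and non-empty, wins over the cluster-wide default (requirement 22).
func EffectiveDefault(namespace string, overrides map[string]string) string {
	if name, ok := overrides[namespace]; ok && name != "" {
		return name
	}
	return DefaultNetworkName
}

func main() {
	overrides := map[string]string{"tenant-a": "tenant-a-net"}
	fmt.Println(EffectiveDefault("tenant-a", overrides))
	fmt.Println(EffectiveDefault("tenant-b", overrides))
}
```
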
### Risks and Mitigations | ||
|
||
N/A | ||
|
||
## Design Details | ||
|
||
N/A | ||
|
||
### Graduation Criteria | ||
|
||
#### Alpha | ||
|
||
- Approval from subproject owners + KEP reviewers | ||
|
||
#### Beta | ||
|
||
N/A | ||
|
||
## Production Readiness Review Questionnaire | ||
|
||
### Feature Enablement and Rollback | ||
|
||
###### How can this feature be enabled / disabled in a live cluster? | ||
|
||
N/A | ||
|
||
###### Does enabling the feature change any default behavior? | ||
|
||
No | ||
|
||
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? | ||
|
||
N/A | ||
|
||
###### What happens if we reenable the feature if it was previously rolled back? | ||
|
||
N/A | ||
|
||
###### Are there any tests for feature enablement/disablement? | ||
|
||
N/A | ||
|
||
### Rollout, Upgrade and Rollback Planning | ||
|
||
###### How can a rollout or rollback fail? Can it impact already running workloads? | ||
|
||
N/A | ||
|
||
###### What specific metrics should inform a rollback? | ||
|
||
N/A | ||
|
||
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? | ||
|
||
N/A | ||
|
||
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? | ||
|
||
No | ||
|
||
### Monitoring Requirements | ||
|
||
###### How can an operator determine if the feature is in use by workloads? | ||
|
||
N/A | ||
|
||
###### How can someone using this feature know that it is working for their instance? | ||
|
||
N/A | ||
|
||
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? | ||
|
||
N/A | ||
|
||
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? | ||
|
||
N/A | ||
|
||
###### Are there any missing metrics that would be useful to have to improve observability of this feature? | ||
|
||
N/A | ||
|
||
### Dependencies | ||
|
||
###### Does this feature depend on any specific services running in the cluster? | ||
|
||
No | ||
|
||
## Implementation History | ||
|
||
N/A | ||
|
||
## Drawbacks | ||
|
||
N/A | ||
|
||
## Alternatives | ||
|
||
The alternative is to not define a unified API, in which case this feature will | ||
live on as just an add-on to Kubernetes. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
title: Multi-Network requirements | ||
kep-number: 3698 | ||
authors: | ||
- "@mskrocki" | ||
owning-sig: sig-network | ||
status: provisional | ||
creation-date: 2022-12-19 | ||
reviewers: | ||
- "@thockin" | ||
- "@khenidak" | ||
- "@danwinship" | ||
- "@aojea" | ||
- "@bowei" | ||
- "@hbagdi" | ||
- "@jpeach" | ||
approvers: | ||
- "@thockin" | ||
|
||
# The target maturity stage in the current dev cycle for this KEP. | ||
stage: alpha |