From 2bede532196bb267e79681b9247235489c5d49f5 Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Thu, 11 Jan 2024 13:28:42 -0700 Subject: [PATCH 01/28] init of kni kep --- .../000-k8s-network-interface/README.md | 807 ++++++++++++++++++ .../000-k8s-network-interface/kep.yaml | 46 + 2 files changed, 853 insertions(+) create mode 100644 keps/sig-network/000-k8s-network-interface/README.md create mode 100644 keps/sig-network/000-k8s-network-interface/kep.yaml diff --git a/keps/sig-network/000-k8s-network-interface/README.md b/keps/sig-network/000-k8s-network-interface/README.md new file mode 100644 index 00000000000..7018679e98f --- /dev/null +++ b/keps/sig-network/000-k8s-network-interface/README.md @@ -0,0 +1,807 @@ + +# KEP-NNNN: Your short, descriptive title + + + + + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [User Stories (Optional)](#user-stories-optional) + - [Story 1](#story-1) + - [Story 2](#story-2) + - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional) + - [Risks and Mitigations](#risks-and-mitigations) +- [Design Details](#design-details) + - [Test Plan](#test-plan) + - [Prerequisite testing updates](#prerequisite-testing-updates) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) + - [Graduation Criteria](#graduation-criteria) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) + - [Feature Enablement and Rollback](#feature-enablement-and-rollback) + - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) + - [Monitoring Requirements](#monitoring-requirements) + - [Dependencies](#dependencies) + - [Scalability](#scalability) + - 
[Troubleshooting](#troubleshooting) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) +- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) + + +## Release Signoff Checklist + + + +Items marked with (R) are required *prior to targeting to a milestone / release*. + +- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) +- [ ] (R) KEP approvers have approved the KEP status as `implementable` +- [ ] (R) Design details are appropriately documented +- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) + - [ ] e2e Tests for all Beta API Operations (endpoints) + - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) + - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free +- [ ] (R) Graduation criteria is in place + - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) +- [ ] (R) Production readiness review completed +- [ ] (R) Production readiness review approved +- [ ] "Implementation History" section is up-to-date for milestone +- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes + + + +[kubernetes.io]: https://kubernetes.io/ +[kubernetes/enhancements]: https://git.k8s.io/enhancements +[kubernetes/kubernetes]: https://git.k8s.io/kubernetes +[kubernetes/website]: https://git.k8s.io/website + +## Summary + + + +## Motivation + + + +### Goals 
+ + + +### Non-Goals + + + +## Proposal + + + +### User Stories (Optional) + + + +#### Story 1 + +#### Story 2 + +### Notes/Constraints/Caveats (Optional) + + + +### Risks and Mitigations + + + +## Design Details + + + +### Test Plan + + + +[ ] I/we understand the owners of the involved components may require updates to +existing tests to make this code solid enough prior to committing the changes necessary +to implement this enhancement. + +##### Prerequisite testing updates + + + +##### Unit tests + + + + + +- ``: `` - `` + +##### Integration tests + + + + + +- : + +##### e2e tests + + + +- : + +### Graduation Criteria + + + +### Upgrade / Downgrade Strategy + + + +### Version Skew Strategy + + + +## Production Readiness Review Questionnaire + + + +### Feature Enablement and Rollback + + + +###### How can this feature be enabled / disabled in a live cluster? + + + +- [ ] Feature gate (also fill in values in `kep.yaml`) + - Feature gate name: + - Components depending on the feature gate: +- [ ] Other + - Describe the mechanism: + - Will enabling / disabling the feature require downtime of the control + plane? + - Will enabling / disabling the feature require downtime or reprovisioning + of a node? + +###### Does enabling the feature change any default behavior? + + + +###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? + + + +###### What happens if we reenable the feature if it was previously rolled back? + +###### Are there any tests for feature enablement/disablement? + + + +### Rollout, Upgrade and Rollback Planning + + + +###### How can a rollout or rollback fail? Can it impact already running workloads? + + + +###### What specific metrics should inform a rollback? + + + +###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? + + + +###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? 
+ + + +### Monitoring Requirements + + + +###### How can an operator determine if the feature is in use by workloads? + + + +###### How can someone using this feature know that it is working for their instance? + + + +- [ ] Events + - Event Reason: +- [ ] API .status + - Condition name: + - Other field: +- [ ] Other (treat as last resort) + - Details: + +###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? + + + +###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? + + + +- [ ] Metrics + - Metric name: + - [Optional] Aggregation method: + - Components exposing the metric: +- [ ] Other (treat as last resort) + - Details: + +###### Are there any missing metrics that would be useful to have to improve observability of this feature? + + + +### Dependencies + + + +###### Does this feature depend on any specific services running in the cluster? + + + +### Scalability + + + +###### Will enabling / using this feature result in any new API calls? + + + +###### Will enabling / using this feature result in introducing new API types? + + + +###### Will enabling / using this feature result in any new calls to the cloud provider? + + + +###### Will enabling / using this feature result in increasing size or count of the existing API objects? + + + +###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs? + + + +###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? + + + +###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)? + + + +### Troubleshooting + + + +###### How does this feature react if the API server and/or etcd is unavailable? + +###### What are other known failure modes? + + + +###### What steps should be taken if SLOs are not being met to determine the problem? 
+ +## Implementation History + + + +## Drawbacks + + + +## Alternatives + + + +## Infrastructure Needed (Optional) + + diff --git a/keps/sig-network/000-k8s-network-interface/kep.yaml b/keps/sig-network/000-k8s-network-interface/kep.yaml new file mode 100644 index 00000000000..320730ffd5a --- /dev/null +++ b/keps/sig-network/000-k8s-network-interface/kep.yaml @@ -0,0 +1,46 @@ +title: k8s-network-interface +kep-number: NNNN +authors: + - "@mikezappa87" +owning-sig: sig-network +participating-sigs: + - sig-network +status: provisional +creation-date: 2024-01-11 +reviewers: + - TBD +approvers: + - TBD + +see-also: + - "/keps/sig-aaa/1234-we-heard-you-like-keps" + - "/keps/sig-bbb/2345-everyone-gets-a-kep" +replaces: + - "/keps/sig-ccc/3456-replaced-kep" + +# The target maturity stage in the current dev cycle for this KEP. +stage: alpha + +# The most recent milestone for which work toward delivery of this KEP has been +# done. This can be the current (upcoming) milestone, if it is being actively +# worked on. +latest-milestone: "v1.30" + +# The milestone at which this feature was, or is targeted to be, at each stage. 
+milestone: + alpha: "v1.30" + beta: "v1.31" + stable: "v1.32" + +# The following PRR answers are required at alpha release +# List the feature gate name and the components for which it must be enabled +feature-gates: + - name: kni + components: + - kubelet + - cri-api +disable-supported: true + +# The following PRR answers are required at beta release +metrics: + - my_feature_metric From 14eeea2ec2b237c46d345a3e4a66cf7efc9f45ab Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Wed, 17 Jan 2024 10:23:11 -0700 Subject: [PATCH 02/28] update issue number --- .../README.md | 0 .../kep.yaml | 2 +- 2 files changed, 1 insertion(+), 1 deletion(-) rename keps/sig-network/{000-k8s-network-interface => 4410-k8s-network-interface}/README.md (100%) rename keps/sig-network/{000-k8s-network-interface => 4410-k8s-network-interface}/kep.yaml (98%) diff --git a/keps/sig-network/000-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md similarity index 100% rename from keps/sig-network/000-k8s-network-interface/README.md rename to keps/sig-network/4410-k8s-network-interface/README.md diff --git a/keps/sig-network/000-k8s-network-interface/kep.yaml b/keps/sig-network/4410-k8s-network-interface/kep.yaml similarity index 98% rename from keps/sig-network/000-k8s-network-interface/kep.yaml rename to keps/sig-network/4410-k8s-network-interface/kep.yaml index 320730ffd5a..58a1fdbaac6 100644 --- a/keps/sig-network/000-k8s-network-interface/kep.yaml +++ b/keps/sig-network/4410-k8s-network-interface/kep.yaml @@ -1,5 +1,5 @@ title: k8s-network-interface -kep-number: NNNN +kep-number: 4410 authors: - "@mikezappa87" owning-sig: sig-network From eae3341015c52ede846a872bf52512b8b1b36e7c Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Wed, 24 Jan 2024 13:59:16 -0700 Subject: [PATCH 03/28] WIP: KNI KEP --- .../4410-k8s-network-interface/README.md | 146 ++++++++++++++++++ .../4410-k8s-network-interface/kep.yaml | 10 +- 2 files changed, 151 insertions(+), 5 
deletions(-) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 7018679e98f..15fa69d24e9 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -173,6 +173,8 @@ updates. [documentation style guide]: https://github.com/kubernetes/community/blob/master/contributors/guide/style-guide.md --> +This proposal is to design and implement the KNI [Kubernetes Networking Interface] or better known as Kubernetes Networking reImagined. KNI is a foundational network API that is specific to Kubernetes. KNI will provide the users the ability to make Kubernetes networking completely pluggable. + ## Motivation +1. Design and implement the KNI-API +2. Provide documentation, examples, troubleshooting and FAQ's for KNI. + * we should provide a example network runtime +3. Provide an API that is flexible for experimentation and opinionated use cases + * example extradata map[string] string +4. Provide integration with on premise or cloud systems to provide network status +5. Provide an API that provides networks available on the node +6. Determine the reference implementation +7. Establish feature parity with current [ADD, DEL] +8. Decouple Node and Pod network setup +9. Ensure that the network runtime is consolidated inside of a Pod +10. Design a cool looking t-shirt + ### Non-Goals +1. Any changes to the kube-scheduler +2. Any specific implementation other than the reference implementation. However we should ensure the KNI-API is flexible enough to support + ## Proposal +The proposal of this KEP is to design and implement the KNI-API and make necessary changes to the CRI-API and container runtimes. 
The scope should be kept to a minimum and we should target feature parity which will include the following:
+
+AttachNetwork
+DetachNetwork
+QueryPodNetwork
+QueryNetworkStatus
+QueryNodeNetworks
+
 ### User Stories (Optional)
+
+
 #### Story 1
 #### Story 2
@@ -254,6 +282,124 @@ required) or even code snippets. If there's any ambiguity about HOW your
 proposal will be implemented, this is the place to discuss them.
 -->
+### Draft KNI-API (used in POC)
+
+We will review with community and take feedback
+
+```
+service KNI {
+  rpc AttachNetwork(AttachNetworkRequest) returns (AttachNetworkResponse) {} //MVP
+  rpc DetachNetwork(DetachNetworkRequest) returns (DetachNetworkResponse) {} //MVP
+  rpc QueryPodNetwork(QueryPodNetworkRequest) returns (QueryPodNetworkResponse) {} //MVP
+  rpc SetupNodeNetwork(SetupNodeNetworkRequest) returns (SetupNodeNetworkResponse) {}
+  rpc QueryNodeNetworks(QueryNodeNetworksRequest) returns (QueryNodeNetworksResponse) {}
+}
+
+message AttachNetworkRequest {
+  string name = 1;
+  string id = 2;
+  string namespace = 3;
+  Isolation isolation = 4;
+  DNSConfig dns_config = 5;
+  repeated PortMapping port_mappings = 6;
+  map<string, string> labels = 7;
+  map<string, string> annotations = 8;
+  map<string, string> extradata = 9;
+}
+
+message AttachNetworkResponse {
+  map<string, IPConfig> ipconfigs = 1;
+  map<string, string> extradata = 2;
+}
+
+message DetachNetworkRequest {
+  string name = 1;
+  string id = 2;
+  string namespace = 3;
+  Isolation isolation = 4;
+  map<string, string> labels = 5;
+  map<string, string> annotations = 6;
+  map<string, string> extradata = 7;
+}
+
+message DetachNetworkResponse {
+}
+
+message IPConfig {
+  repeated string ip = 1;
+  string mac = 2;
+  map<string, string> extradata = 3;
+}
+
+message Network {
+  string name = 1;
+  bool ready = 2;
+  map<string, string> extradata = 3;
+}
+
+message Isolation {
+  string path = 1;
+  string type = 2; //Network Namespace, kernel, …
+  map<string, string> extradata = 3;
+}
+
+// DNSConfig specifies the DNS servers and search domains of a sandbox.
+message DNSConfig {
+  // List of DNS servers of the cluster.
+  repeated string servers = 1;
+  // List of DNS search domains of the cluster.
+  repeated string searches = 2;
+  // List of DNS options. See https://linux.die.net/man/5/resolv.conf
+  // for all available options.
+  repeated string options = 3;
+}
+
+enum Protocol {
+  TCP = 0;
+  UDP = 1;
+  SCTP = 2;
+}
+
+// PortMapping specifies the port mapping configurations of a sandbox.
+message PortMapping {
+  // Protocol of the port mapping.
+  Protocol protocol = 1;
+  // Port number within the container. Default: 0 (not specified).
+  int32 container_port = 2;
+  // Port number on the host. Default: 0 (not specified).
+  int32 host_port = 3;
+  // Host IP.
+  string host_ip = 4;
+}
+
+message QueryNodeNetworksRequest {
+
+}
+
+message QueryNodeNetworksResponse {
+  repeated Network networks = 1;
+  map<string, string> extradata = 2;
+}
+
+message SetupNodeNetworkRequest {
+
+}
+
+message SetupNodeNetworkResponse {
+
+}
+
+message QueryPodNetworkRequest {
+  string name = 1;
+  string id = 2;
+  string namespace = 3;
+}
+
+message QueryPodNetworkResponse {
+  map<string, IPConfig> ipconfigs = 1;
+  map<string, string> extradata = 2;
+}
+```
 ### Test Plan
-# KEP-NNNN: Your short, descriptive title
-
-
+# KEP-4410: Kubernetes Networking reImagined
+Kubernetes networking has traditionally been challenging to understand for users
+interacting with the Kubernetes API, and there has been considerable flexibility
+in how Container Network Interfaces (CNIs) set up networking within clusters.
+This has resulted in a scenario where things like pod networking (including pod
+to pod networking) is opaque to users, with different implementations taking
+markedly different approaches. This fragmentation and issues with the API have
+negatively impacted adoption in sectors such as telecommunications. Our goal is
+to transform Kubernetes networking by making networks and their components
+actual resources within the Kubernetes API. This will allow for the development
+of shared functionalities and their integration into the API.
We anticipate that +this new approach will enhance support for areas that are currently struggling, +facilitate the development and promotion of common features, and better define +and accommodate advanced functionalities and potential areas for expansion. ### Goals From 217f1c35162c2e070db6c2c17a525d7d155f7581 Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Fri, 26 Jan 2024 10:14:27 -0700 Subject: [PATCH 07/28] change ordering of goals --- .../4410-k8s-network-interface/README.md | 28 ++++++++----------- 1 file changed, 11 insertions(+), 17 deletions(-) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index c3289770bdc..30895aa2c11 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -192,18 +192,18 @@ List the specific goals of the KEP. What is it trying to achieve? How will we know that this has succeeded? --> -1. Design and implement the KNI-API -2. Provide documentation, examples, troubleshooting and FAQ's for KNI. +1. Design a cool looking t-shirt +2. Design and implement the KNI-API +3. Provide documentation, examples, troubleshooting and FAQ's for KNI. * we should provide a example network runtime -3. Provide an API that is flexible for experimentation and opinionated use cases +4. Provide an API that is flexible for experimentation and opinionated use cases * example extradata map[string] string -4. Provide integration with on premise or cloud systems to provide network status -5. Provide an API that provides networks available on the node -6. Determine the reference implementation -7. Establish feature parity with current [ADD, DEL] -8. Decouple Node and Pod network setup -9. Ensure that the network runtime is consolidated inside of a Pod -10. Design a cool looking t-shirt +5. Provide integration with on premise or cloud systems to provide network status +6. 
Provide an API that provides networks available on the node +7. Determine the reference implementation +8. Establish feature parity with current [ADD, DEL] +9. Decouple Node and Pod network setup +10. Ensure that the network runtime is consolidated inside of a Pod ### Non-Goals @@ -226,13 +226,7 @@ The "Design Details" section below is for the real nitty-gritty. --> -The proposal of this KEP is to design and implement the KNI-API and make necessary changes to the CRI-API and container runtimes. The scope should be kept to a minimum and we should target feature parity which will include the following: - -AttachNetwork -DetachNetwork -QueryPodNetwork -QueryNetworkStatus -QueryNodeNetworks +The proposal of this KEP is to design and implement the KNI-API and make necessary changes to the CRI-API and container runtimes. The scope should be kept to a minimum and we should target feature parity. ### User Stories (Optional) From 64eca47a1785d9f30fefb92aa3fb1c1519bf188f Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Fri, 26 Jan 2024 23:53:57 -0700 Subject: [PATCH 08/28] update goals and summary --- .../4410-k8s-network-interface/README.md | 35 ++++++++++++++----- 1 file changed, 27 insertions(+), 8 deletions(-) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 30895aa2c11..845f10f0d1b 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -167,7 +167,7 @@ updates. [documentation style guide]: https://github.com/kubernetes/community/blob/master/contributors/guide/style-guide.md --> -This proposal is to design and implement the KNI [Kubernetes Networking Interface] or better known as Kubernetes Networking reImagined. KNI is a foundational network API that is specific to Kubernetes. KNI will provide the users the ability to make Kubernetes networking completely pluggable. 
+This proposal is to design and implement the KNI [Kubernetes Networking Interface] or better known as Kubernetes Networking reImagined. KNI will create a Network resource and provide an API that will provide network status, availability, how to attach a pod to a network, detach the pod from the network and update a pods network. ## Motivation @@ -195,15 +195,21 @@ know that this has succeeded? 1. Design a cool looking t-shirt 2. Design and implement the KNI-API 3. Provide documentation, examples, troubleshooting and FAQ's for KNI. - * we should provide a example network runtime + * we should provide a example network runtime and easy starter project 4. Provide an API that is flexible for experimentation and opinionated use cases * example extradata map[string] string -5. Provide integration with on premise or cloud systems to provide network status -6. Provide an API that provides networks available on the node -7. Determine the reference implementation -8. Establish feature parity with current [ADD, DEL] -9. Decouple Node and Pod network setup -10. Ensure that the network runtime is consolidated inside of a Pod +5. Provide an API to report on a networks status +6. Provide an API to provide network metrics such as available IP addresses +7. Provide an API that provides networks available on the node +8. Provide an API that will attach a one or more networks to a pod +9. Provide an API that will detach a one or more network from a pod +10. Provide an API that will update a network attachment of a pod +11. Determine the reference implementation +12. Establish feature parity with current CNI [ADD, DEL] +13. We should decouple the Pod and Node Network setup (The reporting of this could be different statuses?) +14. Provide the ability to run garbage collection to ensure no resources are left behind +15. We will provide the ability to identify the IP address family without parsing the value (such as a field) +16. 
Make a design that is backwards compatible with the CNI ### Non-Goals @@ -241,8 +247,19 @@ bogged down. #### Story 1 +As a cluster operator, I need the ability to determine my network(s) is ready so that my pods come up with a working network. + #### Story 2 +As a cluster operator, I need the ability to determine what networks are available on my node so that upstream components can ensure the pod is scheduled on the appropriate node. + +#### Story 3 + +As a Kubernetes developer, I need the ability to have extension points for pod network setup, teardown and update so that I can support future Kubernetes networking features with either reducing the changes to core kubernetes or eliminating them + +#### Story 4 + + ### Notes/Constraints/Caveats (Optional) +"Network Readiness" is an implementation detail. We need to provide this RPC to the user. + ### Risks and Mitigations -"Network Readiness" is an implementation detail. We need to provide this RPC to the user. +The specifics of "Network Readiness" is an implementation detail. We need to provide this RPC to the user. + +We should consider the trade offs to using a Native K8s Network object or CRD's. +Using a native object would allow passing a slice of network type to AttachNetwork + +Since the network runtime can be run separated from the container runtime, you can package everything into a pod and not need to have binaries on disk. This allows the CNI plugins to be isolated in the pod and the pod will never need to mount /opt/cni/bin or /etc/cni/net.d. This offers a potentially more ability to control execution. Keep in mind CNI is the implementation however when this is used chaining is still available. ### Risks and Mitigations @@ -296,7 +307,9 @@ proposal will be implemented, this is the place to discuss them. ### Draft KNI-API (used in POC) -We will review with community and take feedback +We will review with community and take feedback. + +This is the version where we don't have a native k8s network object. 
``` service KNI { From 17a0fa61e7101919d3c53312dd74e6f1946a6b6a Mon Sep 17 00:00:00 2001 From: Mike Zappa Date: Mon, 29 Jan 2024 21:21:25 -0700 Subject: [PATCH 10/28] Update keps/sig-network/4410-k8s-network-interface/README.md Co-authored-by: Shane Utt --- keps/sig-network/4410-k8s-network-interface/README.md | 11 +---------- 1 file changed, 1 insertion(+), 10 deletions(-) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 527a0688bea..9938306c1cb 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -198,16 +198,7 @@ know that this has succeeded? * we should provide a example network runtime and easy starter project 4. Provide an API that is flexible for experimentation and opinionated use cases * example extradata map[string] string -5. Provide an API to report on a networks status -6. Provide an API to provide network metrics such as available IP addresses -7. Provide an API that provides networks available on the node -8. Provide a K8s resource to define a Network - * this should provide the usable cidr block(s) per node - * this should provide the needed information to the controller to reconcile to establish pod to pod networking -8. Provide an API that will create a network on a node -9. Provide an API that will delete a network off a node -8. Provide an API that will attach a one or more networks to a pod -9. Provide an API that will detach a one or more network from a pod +5. Provide APIs for the creation, configuration and management of networks for `Pods`. 10. Provide an API that will update a network attachment of a pod 11. Determine the reference implementation 12. 
Establish feature parity with current CNI [ADD, DEL] From d547f62aebce312dfb8445558a1c4e4bc49a9b44 Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Mon, 29 Jan 2024 20:38:40 -0700 Subject: [PATCH 11/28] update with shane comments --- .../4410-k8s-network-interface/README.md | 20 ++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 9938306c1cb..0118d37a162 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -199,13 +199,14 @@ know that this has succeeded? 4. Provide an API that is flexible for experimentation and opinionated use cases * example extradata map[string] string 5. Provide APIs for the creation, configuration and management of networks for `Pods`. -10. Provide an API that will update a network attachment of a pod -11. Determine the reference implementation -12. Establish feature parity with current CNI [ADD, DEL] -13. We should decouple the Pod and Node Network setup (The reporting of this could be different statuses?) -14. Provide the ability to run garbage collection to ensure no resources are left behind -15. We will provide the ability to identify the IP address family without parsing the value (such as a field) -16. Make a design that is backwards compatible with the CNI +6. Provide an API that will update a network attachment of a pod +7. Determine the reference implementation +8. Establish feature parity with current CNI [ADD, DEL] +9. We should decouple the Pod and Node Network setup (The reporting of this could be different statuses?) +10. Provide the ability to run garbage collection to ensure no resources are left behind +11. We will provide the ability to identify the IP address family without parsing the value (such as a field) +12. Make a design that is backwards compatible with the CNI +13. 
Guarantee the network is setup and in a healthy state before containers are started (ephemeral, init, regular) ### Non-Goals @@ -216,7 +217,6 @@ and make progress. 1. Any changes to the kube-scheduler 2. Any specific implementation other than the reference implementation. However we should ensure the KNI-API is flexible enough to support -3. Changes to the Pod specification ## Proposal @@ -266,6 +266,8 @@ Go in to as much detail as necessary here. This might be a good place to talk about core concepts and how they relate. --> +Changes to the pod specification will require hard evidence. + The specifics of "Network Readiness" is an implementation detail. We need to provide this RPC to the user. We should consider the trade offs to using a Native K8s Network object or CRD's. @@ -300,7 +302,7 @@ proposal will be implemented, this is the place to discuss them. We will review with community and take feedback. -This is the version where we don't have a native k8s network object. +This is the version where we don't have a native k8s network object aka a 'message PodNetworkRequest/Response' for the proto ``` service KNI { From abc4210cb38afc74b1634efe3d599469d0b31d16 Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Tue, 30 Jan 2024 09:50:31 -0700 Subject: [PATCH 12/28] add create network --- .../4410-k8s-network-interface/README.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 0118d37a162..4aee62bb3c5 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -306,6 +306,9 @@ This is the version where we don't have a native k8s network object aka a 'messa ``` service KNI { + // Pre RunPodSandbox - for kata + rpc CreateNetwork(CreateNetworkRequest) returns (CreateNetworkResponse) {} //MVP + // Post RunPodSandbox, Pre CreateContainers rpc 
AttachNetwork(AttachNetworkRequest) returns (AttachNetworkResponse) {} //MVP
   rpc DetachNetwork(DetachNetworkRequest) returns (DetachNetworkResponse) {} //MVP
   rpc QueryPodNetwork(QueryPodNetworkRequest) returns (QueryPodNetworkResponse) {} //MVP
@@ -313,6 +316,16 @@ service KNI {
   rpc QueryNodeNetworks(QueryNodeNetworksRequest) returns (QueryNodeNetworksResponse) {}
 }
+message CreateNetworkRequest {
+  // optional for kubelet to generate the network namespace path
+  string netns_path = 1;
+}
+
+message CreateNetworkResponse {
+  string netns_path = 1;
+  map<string, string> extradata = 2;
+}
+
 message AttachNetworkRequest {
   string name = 1;
   string id = 2;
From cefc7c9a271f1733a2dcabda0a0c1648d56519f0 Mon Sep 17 00:00:00 2001
From: Shane Utt
Date: Tue, 30 Jan 2024 13:08:16 -0500
Subject: [PATCH 13/28] chore: cleanup template text and blank space

---
 .../4410-k8s-network-interface/README.md | 908 +-----------------
 1 file changed, 7 insertions(+), 901 deletions(-)

diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md
index 4aee62bb3c5..f461d372f05 100644
--- a/keps/sig-network/4410-k8s-network-interface/README.md
+++ b/keps/sig-network/4410-k8s-network-interface/README.md
@@ -1,77 +1,12 @@
-
 # KEP-4410: Kubernetes Networking reImagined
-
+> **NOTE**: for the initial PR we've removed a lot of the templated text and
+> aimed to keep this first iteration small and easier to consume. We are only
+> focusing on the "What" and "Why" (e.g. motivation, goals, user stories) for
+> this iteration so that we can build consensus on those first before we add
+> any of the "How".
-- [Release Signoff Checklist](#release-signoff-checklist)
 - [Summary](#summary)
 - [Motivation](#motivation)
   - [Goals](#goals)
@@ -80,93 +15,10 @@ tags, and then generate with `hack/update-toc.sh`.
- [User Stories (Optional)](#user-stories-optional) - [Story 1](#story-1) - [Story 2](#story-2) - - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional) - - [Risks and Mitigations](#risks-and-mitigations) -- [Design Details](#design-details) - - [Test Plan](#test-plan) - - [Prerequisite testing updates](#prerequisite-testing-updates) - - [Unit tests](#unit-tests) - - [Integration tests](#integration-tests) - - [e2e tests](#e2e-tests) - - [Graduation Criteria](#graduation-criteria) - - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) - - [Version Skew Strategy](#version-skew-strategy) -- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) - - [Feature Enablement and Rollback](#feature-enablement-and-rollback) - - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) - - [Monitoring Requirements](#monitoring-requirements) - - [Dependencies](#dependencies) - - [Scalability](#scalability) - - [Troubleshooting](#troubleshooting) -- [Implementation History](#implementation-history) -- [Drawbacks](#drawbacks) -- [Alternatives](#alternatives) -- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) -## Release Signoff Checklist - - - -Items marked with (R) are required *prior to targeting to a milestone / release*. 
- -- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) -- [ ] (R) KEP approvers have approved the KEP status as `implementable` -- [ ] (R) Design details are appropriately documented -- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) - - [ ] e2e Tests for all Beta API Operations (endpoints) - - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) - - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free -- [ ] (R) Graduation criteria is in place - - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) -- [ ] (R) Production readiness review completed -- [ ] (R) Production readiness review approved -- [ ] "Implementation History" section is up-to-date for milestone -- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] -- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes - - - -[kubernetes.io]: https://kubernetes.io/ -[kubernetes/enhancements]: https://git.k8s.io/enhancements -[kubernetes/kubernetes]: https://git.k8s.io/kubernetes -[kubernetes/website]: https://git.k8s.io/website - ## Summary - - This proposal is to design and implement the KNI [Kubernetes Networking Interface] or better known as Kubernetes Networking reImagined. KNI will create a Network resource and provide an API that will provide network status, availability, how to attach a pod to a network, detach the pod from the network and update a pods network. 
## Motivation @@ -187,11 +39,6 @@ and accommodate advanced functionalities and potential areas for expansion. ### Goals - - 1. Design a cool looking t-shirt 2. Design and implement the KNI-API 3. Provide documentation, examples, troubleshooting and FAQ's for KNI. @@ -210,37 +57,14 @@ know that this has succeeded? ### Non-Goals - - 1. Any changes to the kube-scheduler 2. Any specific implementation other than the reference implementation. However we should ensure the KNI-API is flexible enough to support ## Proposal - - The proposal of this KEP is to design and implement the KNI-API and make necessary changes to the CRI-API and container runtimes. The scope should be kept to a minimum and we should target feature parity. -### User Stories (Optional) - - - - +### User Stories #### Story 1 @@ -254,17 +78,7 @@ As a cluster operator, I need the ability to determine what networks are availab As a Kubernetes developer, I need the ability to have extension points for pod network setup, teardown and update so that I can support future Kubernetes networking features with either reducing the changes to core kubernetes or eliminating them -#### Story 4 - - -### Notes/Constraints/Caveats (Optional) - - +### Notes/Constraints/Caveats Changes to the pod specification will require hard evidence. @@ -274,711 +88,3 @@ We should consider the trade offs to using a Native K8s Network object or CRD's. Using a native object would allow passing a slice of network type to AttachNetwork Since the network runtime can be run separated from the container runtime, you can package everything into a pod and not need to have binaries on disk. This allows the CNI plugins to be isolated in the pod and the pod will never need to mount /opt/cni/bin or /etc/cni/net.d. This offers a potentially more ability to control execution. Keep in mind CNI is the implementation however when this is used chaining is still available. 
- -### Risks and Mitigations - - - -## Design Details - - - -### Draft KNI-API (used in POC) - -We will review with community and take feedback. - -This is the version where we don't have a native k8s network object aka a 'message PodNetworkRequest/Response' for the proto - -``` -service KNI { - // Pre RunPodSandbox - for kata - rpc CreateNetwork(CreateNetworkRequest) returns (CreateNetworkResponse) {} //MVP - // Post RunPodSandbox, Pre CreateContainers - rpc AttachNetwork(AttachNetworkRequest) returns (AttachNetworkResponse) {} //MVP - rpc DetachNetwork(DetachNetworkRequest) returns (DetachNetworkResponse) {} //MVP - rpc QueryPodNetwork(QueryPodNetworkRequest) returns (QueryPodNetworkResponse) {} //MVP - rpc SetupNodeNetwork(SetupNodeNetworkRequest) returns (SetupNodeNetworkResponse) {} - rpc QueryNodeNetworks(QueryNodeNetworksRequest) returns (QueryNodeNetworksResponse) {} -} - -message CreateNetworkRequest { - // optional for kubelet to generate the network namespace path - string netns_path = 1; -} - -message CreateNetworkResponse { - string netns_path = 1; - map<string, string> extradata = 2; -} - -message AttachNetworkRequest { - string name = 1; - string id = 2; - string namespace = 3; - Isolation isolation = 4; - DNSConfig dns_config = 5; - repeated PortMapping port_mappings = 6; - map<string, string> labels = 7; - map<string, string> annotations = 8; - map<string, string> extradata = 9; -} - -message AttachNetworkResponse { - map<string, IPConfig> ipconfigs = 1; - map<string, string> extradata = 2; -} - -message DetachNetworkRequest { - string name = 1; - string id = 2; - string namespace = 3; - Isolation isolation = 4; - map<string, string> labels = 5; - map<string, string> annotations = 6; - map<string, string> extradata = 7; -} - -message DetachNetworkResponse { -} - -message IPConfig { - repeated string ip = 1; - string mac = 2; - map<string, string> extradata = 3; -} - -message Network { - string name = 1; - bool ready = 2; - map<string, string> extradata = 3; -} - -message Isolation { - string path = 1; - string type = 2; //Network Namespace, kernel, … - map<string, string> extradata = 3; -} - -// DNSConfig specifies the DNS servers and
search domains of a sandbox. -message DNSConfig { - // List of DNS servers of the cluster. - repeated string servers = 1; - // List of DNS search domains of the cluster. - repeated string searches = 2; - // List of DNS options. See https://linux.die.net/man/5/resolv.conf - // for all available options. - repeated string options = 3; -} - -enum Protocol { - TCP = 0; - UDP = 1; - SCTP = 2; -} - -// PortMapping specifies the port mapping configurations of a sandbox. -message PortMapping { - // Protocol of the port mapping. - Protocol protocol = 1; - // Port number within the container. Default: 0 (not specified). - int32 container_port = 2; - // Port number on the host. Default: 0 (not specified). - int32 host_port = 3; - // Host IP. - string host_ip = 4; -} - -message QueryNodeNetworksRequest { - -} - -message QueryNodeNetworksResponse { - repeated Network networks = 1; - map<string, string> extradata = 2; -} - -message SetupNodeNetworkRequest { - -} - -message SetupNodeNetworkResponse { - -} - -message QueryPodNetworkRequest { - string name = 1; - string id = 2; - string namespace = 3; -} - -message QueryPodNetworkResponse { - map<string, IPConfig> ipconfigs = 1; - map<string, string> extradata = 2; -} -``` -### Test Plan - - - -[ ] I/we understand the owners of the involved components may require updates to -existing tests to make this code solid enough prior to committing the changes necessary -to implement this enhancement. - -##### Prerequisite testing updates - - - -##### Unit tests - - - - - -- ``: `` - `` - -##### Integration tests - - - - - -- : - -##### e2e tests - - - -- : - -### Graduation Criteria - - - -### Upgrade / Downgrade Strategy - - - -### Version Skew Strategy - - - -## Production Readiness Review Questionnaire - - - -### Feature Enablement and Rollback - - - -###### How can this feature be enabled / disabled in a live cluster?
- - - -- [ ] Feature gate (also fill in values in `kep.yaml`) - - Feature gate name: - - Components depending on the feature gate: -- [ ] Other - - Describe the mechanism: - - Will enabling / disabling the feature require downtime of the control - plane? - - Will enabling / disabling the feature require downtime or reprovisioning - of a node? - -###### Does enabling the feature change any default behavior? - - - -###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? - - - -###### What happens if we reenable the feature if it was previously rolled back? - -###### Are there any tests for feature enablement/disablement? - - - -### Rollout, Upgrade and Rollback Planning - - - -###### How can a rollout or rollback fail? Can it impact already running workloads? - - - -###### What specific metrics should inform a rollback? - - - -###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? - - - -###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? - - - -### Monitoring Requirements - - - -###### How can an operator determine if the feature is in use by workloads? - - - -###### How can someone using this feature know that it is working for their instance? - - - -- [ ] Events - - Event Reason: -- [ ] API .status - - Condition name: - - Other field: -- [ ] Other (treat as last resort) - - Details: - -###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? - - - -###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? - - - -- [ ] Metrics - - Metric name: - - [Optional] Aggregation method: - - Components exposing the metric: -- [ ] Other (treat as last resort) - - Details: - -###### Are there any missing metrics that would be useful to have to improve observability of this feature? 
- - - -### Dependencies - - - -###### Does this feature depend on any specific services running in the cluster? - - - -### Scalability - - - -###### Will enabling / using this feature result in any new API calls? - - - -###### Will enabling / using this feature result in introducing new API types? - - - -###### Will enabling / using this feature result in any new calls to the cloud provider? - - - -###### Will enabling / using this feature result in increasing size or count of the existing API objects? - - - -###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs? - - - -###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? - - - -###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)? - - - -### Troubleshooting - - - -###### How does this feature react if the API server and/or etcd is unavailable? - -###### What are other known failure modes? - - - -###### What steps should be taken if SLOs are not being met to determine the problem? - -## Implementation History - - - -## Drawbacks - - - -## Alternatives - - - -## Infrastructure Needed (Optional) - - From 1f05981bbbc4e6774415e9781d9192f04a835af4 Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Tue, 30 Jan 2024 11:25:53 -0700 Subject: [PATCH 14/28] support vm/kata --- keps/sig-network/4410-k8s-network-interface/README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index f461d372f05..e8493debe6e 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -54,6 +54,7 @@ and accommodate advanced functionalities and potential areas for expansion. 11. 
We will provide the ability to identify the IP address family without parsing the value (such as a field) 12. Make a design that is backwards compatible with the CNI 13. Guarantee the network is setup and in a healthy state before containers are started (ephemeral, init, regular) +14. Provide support for Kata and other virtualized runtimes ### Non-Goals From e7704864f8f2d4ad8385a565c0bf2fe74627cf7e Mon Sep 17 00:00:00 2001 From: Shane Utt Date: Tue, 30 Jan 2024 14:00:57 -0500 Subject: [PATCH 15/28] docs: another pass at the kni kep goals --- .../4410-k8s-network-interface/README.md | 25 ++++++++----------- 1 file changed, 11 insertions(+), 14 deletions(-) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index e8493debe6e..6c72140eb92 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -40,21 +40,18 @@ and accommodate advanced functionalities and potential areas for expansion. ### Goals 1. Design a cool looking t-shirt -2. Design and implement the KNI-API +2. Provide Kubernetes APIs for the creation, configuration and management of networks (e.g. `Pod` networks) 3. Provide documentation, examples, troubleshooting and FAQ's for KNI. - * we should provide a example network runtime and easy starter project -4. Provide an API that is flexible for experimentation and opinionated use cases - * example extradata map[string] string -5. Provide APIs for the creation, configuration and management of networks for `Pods`. -6. Provide an API that will update a network attachment of a pod -7. Determine the reference implementation -8. Establish feature parity with current CNI [ADD, DEL] -9. We should decouple the Pod and Node Network setup (The reporting of this could be different statuses?) -10. Provide the ability to run garbage collection to ensure no resources are left behind -11. 
We will provide the ability to identify the IP address family without parsing the value (such as a field) -12. Make a design that is backwards compatible with the CNI -13. Guarantee the network is setup and in a healthy state before containers are started (ephemeral, init, regular) -14. Provide support for Kata and other virtualized runtimes +4. Establish feature parity with current CNI [ADD, DEL] +5. Handle support levels like Gateway API (e.g. "core" and "extended") +6. Handle implementation-specific use cases through extension points +7. Decouple the Pod and Node Network setup +8. Simplify/enable triggering garbage collection to ensure no resources are left behind +9. Provide the ability to identify the IP address family without parsing the value (such as a field) +10. Provide as much backwards-compatibility with CNI as is feasible +11. Guarantee the network is setup and in a healthy state before containers are started (ephemeral, init, regular) +12. Provide support for Kata and other virtualized runtimes +13. Provide a reference implementation ### Non-Goals From 8a33b316e341f0eada1fbf4aadc054ff8299b095 Mon Sep 17 00:00:00 2001 From: Shane Utt Date: Thu, 1 Feb 2024 14:50:41 -0500 Subject: [PATCH 16/28] docs: add goal about Pod network ns APIs Signed-off-by: Shane Utt --- keps/sig-network/4410-k8s-network-interface/README.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 6c72140eb92..d3c4b429b0a 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -50,8 +50,9 @@ and accommodate advanced functionalities and potential areas for expansion. 9. Provide the ability to identify the IP address family without parsing the value (such as a field) 10. Provide as much backwards-compatibility with CNI as is feasible 11. 
Guarantee the network is setup and in a healthy state before containers are started (ephemeral, init, regular) -12. Provide support for Kata and other virtualized runtimes -13. Provide a reference implementation +12. If feasible, provide API awareness of Pod network namespaces (e.g. interface names) +13. Provide support for Kata and other virtualized runtimes +14. Provide a reference implementation ### Non-Goals From 0c3fb89088970763e8ec4c01febe6c4f48fbffeb Mon Sep 17 00:00:00 2001 From: Shane Utt Date: Thu, 1 Feb 2024 14:51:04 -0500 Subject: [PATCH 17/28] docs: add a user story for network ns goals to KNI KEP Signed-off-by: Shane Utt --- keps/sig-network/4410-k8s-network-interface/README.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index d3c4b429b0a..80bb6dfebc2 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -77,6 +77,13 @@ As a cluster operator, I need the ability to determine what networks are availab As a Kubernetes developer, I need the ability to have extension points for pod network setup, teardown and update so that I can support future Kubernetes networking features with either reducing the changes to core kubernetes or eliminating them +#### Story 4 + +As a tool which manages eBPF programs on a Kubernetes cluster (bpfman, +inspektorgadget), I would like to be able to see the network interfaces of a +`Pod` via the Kubernetes API so that I can attach TC/XDP network programs to +those interfaces based on knowing the Pod name. + ### Notes/Constraints/Caveats Changes to the pod specification will require hard evidence. 
From 17baf99d5f1cc3bfaa9fe9f120cdd6a482a09a16 Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Fri, 2 Feb 2024 15:41:54 -0700 Subject: [PATCH 18/28] update motivation --- .../sig-network/4410-k8s-network-interface/README.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 80bb6dfebc2..24132c4f0c5 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -28,9 +28,11 @@ interacting with the Kubernetes API, and there has been considerable flexibility in how Container Network Interfaces (CNIs) set up networking within clusters. This has resulted in a scenario where things like pod networking (including pod to pod networking) is opaque to users, with different implementations taking -markedly different approaches. This fragmentation and issues with the API have -negatively impacted adoption in sectors such as telecommunications. Our goal is -to transform Kubernetes networking by making networks and their components +markedly different approaches. This fragmentation has spread networking across +all layers of the stack which include k8s components like kube-proxy, netpol agents, +container runtime with CNI plugins and low level runtimes like kata and issues +with the API have negatively impacted adoption in sectors such as telecommunications. +Our goal is to transform Kubernetes networking by making networks and their components actual resources within the Kubernetes API. This will allow for the development of shared functionalities and their integration into the API. 
We anticipate that this new approach will enhance support for areas that are currently struggling, @@ -65,6 +67,8 @@ The proposal of this KEP is to design and implement the KNI-API and make necessa ### User Stories +We are constantly adding these user stories, please join the community sync to discuss. + #### Story 1 As a cluster operator, I need the ability to determine my network(s) is ready so that my pods come up with a working network. @@ -86,6 +90,8 @@ those interfaces based on knowing the Pod name. ### Notes/Constraints/Caveats +Additional Information/Diagrams: https://docs.google.com/document/d/1Gz7iNtJNMI-zKJhaOcI3aflPCx3etJ01JMxzbtvruKk/edit?usp=sharing + Changes to the pod specification will require hard evidence. The specifics of "Network Readiness" is an implementation detail. We need to provide this RPC to the user. From 1bfd49bc25ee4cfe3bdc4aa36dfbfaf4a5d0788b Mon Sep 17 00:00:00 2001 From: Mike Zappa Date: Fri, 2 Feb 2024 19:01:38 -0700 Subject: [PATCH 19/28] Update keps/sig-network/4410-k8s-network-interface/kep.yaml Co-authored-by: Shane Utt --- keps/sig-network/4410-k8s-network-interface/kep.yaml | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/keps/sig-network/4410-k8s-network-interface/kep.yaml b/keps/sig-network/4410-k8s-network-interface/kep.yaml index 80c5b0fa0c6..607873f06a1 100644 --- a/keps/sig-network/4410-k8s-network-interface/kep.yaml +++ b/keps/sig-network/4410-k8s-network-interface/kep.yaml @@ -9,7 +9,9 @@ participating-sigs: status: provisional creation-date: 2024-01-11 reviewers: - - @shaneutt + - @aojea + - @danwinship + - @thockin approvers: see-also: From 49e56141e180725ba57c385d6aa3b53e36d8f062 Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Wed, 7 Feb 2024 11:28:20 -0700 Subject: [PATCH 20/28] update kep goals per discussions --- .../4410-k8s-network-interface/README.md | 38 ++++++++++++------- 1 file changed, 24 insertions(+), 14 deletions(-) diff --git 
a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 24132c4f0c5..00696115bc4 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -41,20 +41,30 @@ and accommodate advanced functionalities and potential areas for expansion. ### Goals -1. Design a cool looking t-shirt -2. Provide Kubernetes APIs for the creation, configuration and management of networks (e.g. `Pod` networks) -3. Provide documentation, examples, troubleshooting and FAQ's for KNI. -4. Establish feature parity with current CNI [ADD, DEL] -5. Handle support levels like Gateway API (e.g. "core" and "extended") -6. Handle implementation-specific use cases through extension points -7. Decouple the Pod and Node Network setup -8. Simplify/enable triggering garbage collection to ensure no resources are left behind -9. Provide the ability to identify the IP address family without parsing the value (such as a field) -10. Provide as much backwards-compatibility with CNI as is feasible -11. Guarantee the network is setup and in a healthy state before containers are started (ephemeral, init, regular) -12. If feasible, provide API awareness of Pod network namespaces (e.g. interface names) -13. Provide support for Kata and other virtualized runtimes -14. Provide a reference implementation +- Design a cool looking t-shirt +- Provide Kubernetes APIs for the creation, configuration and management of interfaces +- Provide documentation, examples, troubleshooting and FAQ's for KNI. +- Establish feature parity with current CNI [ADD, DEL] +- Handle support levels like Gateway API (e.g. 
"core" and "extended") +- Handle implementation-specific use cases through extension points +- Decouple the Pod and Node Network setup +- Provide garbage collection to ensure no resources created during pod setup such as Linux bridges, ebpf programs, +allocated IP addresses are left behind after pod deletion +- Improve the current IP handling for pods (PodIP) to be handle multiple IP addresses and +a field to identify the IP address family (IPV4 vs IPV6) +- Provide backwards compatibility for the existing CNI approach and migration a path to fully adopt KNI +- Guarantee the network is setup and in a healthy state before containers are started (ephemeral, init, regular) +- If feasible, provide API awareness of Pod network namespaces (e.g. interface names) +- Provide a uniform approach for network setup/teardown for both virtualized (kata) and non-virtualized (runc) +runtimes including kubevirt. This could eliminate the high and low level runtimes from the networking path +- Provide a reference implementation of the KNI network runtime +- Provide the ability to have all the dependencies packaged in the container image (no more CNI binaries in the host file system) +..- No more downloading CNI binaries via initContainers/Mounting /etc/cni/net.d or /opt/cni/bin +- Provide the ability to use native k8s resources for configuration such as a ConfigMap's instead of configuration files in host file system +- Provide an API to indicate network readiness for the node (no more files on disk) +- Eliminate the need to exec binaries and replace with gRPC +- Make troubleshooting easier by having logs accessible via kubectl logs +- Improve network pod startup time ### Non-Goals From 34d21b758ecfccf04e55b71bd1a841013d95c97a Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Wed, 7 Feb 2024 11:32:18 -0700 Subject: [PATCH 21/28] update kep goals per discussions --- keps/sig-network/4410-k8s-network-interface/README.md | 1 + 1 file changed, 1 insertion(+) diff --git 
a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 00696115bc4..8895546fe39 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -65,6 +65,7 @@ runtimes including kubevirt. This could eliminate the high and low level runtime - Eliminate the need to exec binaries and replace with gRPC - Make troubleshooting easier by having logs accessible via kubectl logs - Improve network pod startup time +- Provide the ability to prevent additional scheduling of pods if IPAM is out of IP addresses without evicting running pods ### Non-Goals From d6d9a5c462dbf8a217ea626931d69987d77e3cd2 Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Wed, 7 Feb 2024 11:44:02 -0700 Subject: [PATCH 22/28] update kep goals per discussions --- keps/sig-network/4410-k8s-network-interface/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 8895546fe39..ad1e70fdbe4 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -44,7 +44,7 @@ and accommodate advanced functionalities and potential areas for expansion. - Design a cool looking t-shirt - Provide Kubernetes APIs for the creation, configuration and management of interfaces - Provide documentation, examples, troubleshooting and FAQ's for KNI. -- Establish feature parity with current CNI [ADD, DEL] +- KNI should provide the API's required to establish feature parity with current CNI [ADD, DEL] - Handle support levels like Gateway API (e.g. 
"core" and "extended") - Handle implementation-specific use cases through extension points - Decouple the Pod and Node Network setup From 3177ee4e6d572c8a7457fc663e22bd1918b6f993 Mon Sep 17 00:00:00 2001 From: Mike Zappa Date: Mon, 12 Feb 2024 08:51:16 -0700 Subject: [PATCH 23/28] Update keps/sig-network/4410-k8s-network-interface/README.md --- keps/sig-network/4410-k8s-network-interface/README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index ad1e70fdbe4..c9167ea12f8 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -19,7 +19,8 @@ ## Summary -This proposal is to design and implement the KNI [Kubernetes Networking Interface] or better known as Kubernetes Networking reImagined. KNI will create a Network resource and provide an API that will provide network status, availability, how to attach a pod to a network, detach the pod from the network and update a pods network. +KNI or Kubernetes Networking Interface, is an effort to take a second look at Kubernetes networking and evaluate what are the pain points, and what can we improve. At its core, KNI will be a foundational network API specific for Kubernetes that will provide flexibility and extensibility to solve basic and the most advanced and opinionated networking use cases. 
+ ## Motivation From 61281b5cb672231ab415d9e84af96f1ce69ad8be Mon Sep 17 00:00:00 2001 From: Mike Zappa Date: Wed, 14 Feb 2024 11:03:28 -0700 Subject: [PATCH 24/28] Update keps/sig-network/4410-k8s-network-interface/README.md --- .../sig-network/4410-k8s-network-interface/README.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index c9167ea12f8..68dfdb18f70 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -23,6 +23,18 @@ KNI or Kubernetes Networking Interface, is an effort to take a second look at Ku ## Motivation +Kubernetes networking is an area of complexity and multiple layers which has created several challenges and areas of improvement. These challenges include deployment of the CNI plugins, troubleshooting networking issues and development of new functionality. + +Currently networking happens in three layers of the stack, Kubernetes itself by means of kube-proxy or another controller based solution, the container runtime with network namespace creation and CNI plugins and the OCI runtime which does additional network setup for kernel isolated pods. All of this communication happens through non network specific APIs which to the reader of the code makes it hard to determine where ‘networking’ is happening. Having networking in several layers presents an issue when needing to troubleshoot issues as one needs to check several areas and some cannot be done via kubectl logs such as the CNI execution logs. This becomes more of an effort as multiple uncoordinated processes are making changes to the same resource, the network namespace of either the root or pod. The KNI aims at reducing the complexity by consolidating the networking into a single layer and having a uniform process for both namespaced and kernel isolated pods through a gRPC API. 
Leveraging gRPC will allow users the ability to migrate away from the current execution model that the CNI currently leverages. + +The next challenge is the deployment of the CNI plugins that provide the network setup and teardown of the pod. The idiomatic way to deploy a workload in Kubernetes is that everything should be in a pod; however the current approach leaves files in the host filesystem such as CNI binaries and CNI configuration. These files are usually downloaded via the init container of the pod after binding, which increases the time for the pod to get to a running state. Since all existing K8s network plugins are running as daemonsets we will take this approach as well, where all the dependencies are packaged into the container image thus adopting a well known approach. This will have added benefits of the network pod startup being much faster as nothing should need to be downloaded. + +Another area that KNI will improve is ‘network readiness’. Currently the container runtime is involved with providing both network and runtime readiness with the Status CRI RPC. The container runtime defines the network as ‘ready’ by the presence of the CNI network configuration in the host file system. The more recent CNI specification does include a status verb, however this is still bound by the current limitations, files on disk and execution model. The KNI will provide an RPC that can be implemented so the kubelet will call the KNI via gRPC. + +KNI aims to help the community and other proposals in the Kubernetes ecosystem. We will do this by providing necessary information via the gRPC service. We should be the API that provides the “what networks are available on this node” so that another effort can make the kube-scheduler aware of networks. We should also provide IPAM status, as a common issue is that the IPAM runs out of assignable IP addresses and pods are no longer able to be scheduled on that node until intervention.
We should provide visibility into this so that we can indicate “no more pods” as setting the node to not ready will evict the healthy pods. While the future state of KNI could aim to propose changes to the kube-scheduler, it's not a part of our initial work and instead should try to assist other efforts such as DRA/device plugin to provide the information they need. + +The community may ask for more features, as we are taking a bold approach to reimagining Kubernetes networking by reducing the amount of layers involved in networking. We should prioritize feature parity with the current CNI model and then capture future work. KNI aims to be the foundational network api that is specific for Kubernetes and should make troubleshooting easier, deploying more friendly and innovate faster while reducing the need to make changes to core Kubernetes. + Kubernetes networking has traditionally been challenging to understand for users interacting with the Kubernetes API, and there has been considerable flexibility From 1c3107bd89e94470c8a265286841cd15eb43a360 Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Thu, 15 Feb 2024 10:54:28 -0700 Subject: [PATCH 25/28] update kep and temp remove user stories --- .../4410-k8s-network-interface/README.md | 52 +++---------------- 1 file changed, 8 insertions(+), 44 deletions(-) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 68dfdb18f70..5a4e153a0fb 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -21,8 +21,8 @@ KNI or Kubernetes Networking Interface, is an effort to take a second look at Kubernetes networking and evaluate what are the pain points, and what can we improve. At its core, KNI will be a foundational network API specific for Kubernetes that will provide flexibility and extensibility to solve basic and the most advanced and opinionated networking use cases. 
- ## Motivation + Kubernetes networking is an area of complexity and multiple layers which has created several challenges and areas of improvement. These challenges include deployment of the CNI plugins, troubleshooting networking issues and development of new functionality. Currently networking happens in three layers of the stack, Kubernetes itself by means of kube-proxy or another controller based solution, the container runtime with network namespace creation and CNI plugins and the OCI runtime which does additional network setup for kernel isolated pods. All of this communication happens through non network specific APIs which to the reader of the code makes it hard to determine where ‘networking’ is happening. Having networking in several layers presents an issue when needing to troubleshoot issues as one needs to check several areas and some cannot be done via kubectl logs such as the CNI execution logs. This becomes more of an effort as multiple uncoordinated processes are making changes to the same resource, the network namespace of either the root or pod. The KNI aims at reducing the complexity by consolidating the networking into a single layer and having a uniform process for both namespaced and kernel isolated pods through a gRPC API. Leveraging gRPC will allow users the ability to migrate away from the current execution model that the CNI currently leverages. @@ -35,38 +35,18 @@ KNI aims to help the community and other proposals in the Kubernetes ecosystem. The community may ask for more features, as we are taking a bold approach to reimagining Kubernetes networking by reducing the amount of layers involved in networking. We should prioritize feature parity with the current CNI model and then capture future work. KNI aims to be the foundational network api that is specific for Kubernetes and should make troubleshooting easier, deploying more friendly and innovate faster while reducing the need to make changes to core Kubernetes. 
- -Kubernetes networking has traditionally been challenging to understand for users -interacting with the Kubernetes API, and there has been considerable flexibility -in how Container Network Interfaces (CNIs) set up networking within clusters. -This has resulted in a scenario where things like pod networking (including pod -to pod networking) is opaque to users, with different implementations taking -markedly different approaches. This fragmentation has spread networking across -all layers of the stack which include k8s components like kube-proxy, netpol agents, -container runtime with CNI plugins and low level runtimes like kata and issues -with the API have negatively impacted adoption in sectors such as telecommunications. -Our goal is to transform Kubernetes networking by making networks and their components -actual resources within the Kubernetes API. This will allow for the development -of shared functionalities and their integration into the API. We anticipate that -this new approach will enhance support for areas that are currently struggling, -facilitate the development and promotion of common features, and better define -and accommodate advanced functionalities and potential areas for expansion. - ### Goals - Design a cool looking t-shirt - Provide Kubernetes APIs for the creation, configuration and management of interfaces - Provide documentation, examples, troubleshooting and FAQ's for KNI. - KNI should provide the API's required to establish feature parity with current CNI [ADD, DEL] -- Handle support levels like Gateway API (e.g. 
"core" and "extended") -- Handle implementation-specific use cases through extension points - Decouple the Pod and Node Network setup - Provide garbage collection to ensure no resources created during pod setup such as Linux bridges, ebpf programs, allocated IP addresses are left behind after pod deletion - Improve the current IP handling for pods (PodIP) to be handle multiple IP addresses and a field to identify the IP address family (IPV4 vs IPV6) - Provide backwards compatibility for the existing CNI approach and migration a path to fully adopt KNI -- Guarantee the network is setup and in a healthy state before containers are started (ephemeral, init, regular) - If feasible, provide API awareness of Pod network namespaces (e.g. interface names) - Provide a uniform approach for network setup/teardown for both virtualized (kata) and non-virtualized (runc) runtimes including kubevirt. This could eliminate the high and low level runtimes from the networking path @@ -74,11 +54,11 @@ runtimes including kubevirt. 
This could eliminate the high and low level runtime - Provide the ability to have all the dependencies packaged in the container image (no more CNI binaries in the host file system) ..- No more downloading CNI binaries via initContainers/Mounting /etc/cni/net.d or /opt/cni/bin - Provide the ability to use native k8s resources for configuration such as a ConfigMap's instead of configuration files in host file system -- Provide an API to indicate network readiness for the node (no more files on disk) +- Provide an API to indicate network readiness for the node (no more CNI network configuration files in host file system) - Eliminate the need to exec binaries and replace with gRPC - Make troubleshooting easier by having logs accessible via kubectl logs - Improve network pod startup time -- Provide the ability to prevent additional scheduling of pods if IPAM is out of IP addresses without evicting running pods +- Provide an API to prevent additional scheduling of pods if IPAM is out of IP addresses without evicting running pods ### Non-Goals @@ -93,26 +73,13 @@ The proposal of this KEP is to design and implement the KNI-API and make necessa We are constantly adding these user stories, please join the community sync to discuss. -#### Story 1 - -As a cluster operator, I need the ability to determine my network(s) is ready so that my pods come up with a working network. - -#### Story 2 - -As a cluster operator, I need the ability to determine what networks are available on my node so that upstream components can ensure the pod is scheduled on the appropriate node. 
- -#### Story 3 - -As a Kubernetes developer, I need the ability to have extension points for pod network setup, teardown and update so that I can support future Kubernetes networking features with either reducing the changes to core kubernetes or eliminating them +### Notes/Constraints/Caveats -#### Story 4 +## Constraints -As a tool which manages eBPF programs on a Kubernetes cluster (bpfman, -inspektorgadget), I would like to be able to see the network interfaces of a -`Pod` via the Kubernetes API so that I can attach TC/XDP network programs to -those interfaces based on knowing the Pod name. +1. Guarantee the pod interface is setup and in a healthy state before containers are started (ephemeral, init, regular) -### Notes/Constraints/Caveats +## Notes Additional Information/Diagrams: https://docs.google.com/document/d/1Gz7iNtJNMI-zKJhaOcI3aflPCx3etJ01JMxzbtvruKk/edit?usp=sharing @@ -120,7 +87,4 @@ Changes to the pod specification will require hard evidence. The specifics of "Network Readiness" is an implementation detail. We need to provide this RPC to the user. -We should consider the trade offs to using a Native K8s Network object or CRD's. -Using a native object would allow passing a slice of network type to AttachNetwork - -Since the network runtime can be run separated from the container runtime, you can package everything into a pod and not need to have binaries on disk. This allows the CNI plugins to be isolated in the pod and the pod will never need to mount /opt/cni/bin or /etc/cni/net.d. This offers a potentially more ability to control execution. Keep in mind CNI is the implementation however when this is used chaining is still available. +Since the network runtime can be run separated from the container runtime, you can package everything into a pod and not need to have binaries on disk. This allows the CNI plugins to be isolated in the pod and the pod will never need to mount /opt/cni/bin or /etc/cni/net.d. 
This offers a potentially more ability to control execution. Keep in mind CNI is the implementation however when this is used chaining is still available. \ No newline at end of file From 2081e13895995c946f7368da35c10face8ad43ed Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Thu, 15 Feb 2024 11:09:43 -0700 Subject: [PATCH 26/28] update goals --- .../4410-k8s-network-interface/README.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 5a4e153a0fb..17b718d5b12 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -38,27 +38,28 @@ The community may ask for more features, as we are taking a bold approach to rei ### Goals - Design a cool looking t-shirt -- Provide Kubernetes APIs for the creation, configuration and management of interfaces +- Provide a RPC for the Attachment and Detachment of interface[s] for a Pod +- Provide a RPC for the Querying of Pod network information (interfaces, network namespace path, ip addresses, routes, ...) +- Provide a RPC to prevent additional scheduling of pods if IPAM is out of IP addresses without evicting running pods +- Provide a RPC to indicate network readiness for the node (no more CNI network configuration files in host file system) +- Provide a RPC to provide the user the ability to query what networks are on a node +- KNI should provide the RPC's required to establish feature parity with current CNI [ADD, DEL] - Provide documentation, examples, troubleshooting and FAQ's for KNI. 
-- KNI should provide the API's required to establish feature parity with current CNI [ADD, DEL] - Decouple the Pod and Node Network setup - Provide garbage collection to ensure no resources created during pod setup such as Linux bridges, ebpf programs, allocated IP addresses are left behind after pod deletion - Improve the current IP handling for pods (PodIP) to be handle multiple IP addresses and a field to identify the IP address family (IPV4 vs IPV6) - Provide backwards compatibility for the existing CNI approach and migration a path to fully adopt KNI -- If feasible, provide API awareness of Pod network namespaces (e.g. interface names) - Provide a uniform approach for network setup/teardown for both virtualized (kata) and non-virtualized (runc) runtimes including kubevirt. This could eliminate the high and low level runtimes from the networking path - Provide a reference implementation of the KNI network runtime - Provide the ability to have all the dependencies packaged in the container image (no more CNI binaries in the host file system) ..- No more downloading CNI binaries via initContainers/Mounting /etc/cni/net.d or /opt/cni/bin - Provide the ability to use native k8s resources for configuration such as a ConfigMap's instead of configuration files in host file system -- Provide an API to indicate network readiness for the node (no more CNI network configuration files in host file system) - Eliminate the need to exec binaries and replace with gRPC -- Make troubleshooting easier by having logs accessible via kubectl logs +- Make troubleshooting easier by having network runtime logs accessible via kubectl logs - Improve network pod startup time -- Provide an API to prevent additional scheduling of pods if IPAM is out of IP addresses without evicting running pods ### Non-Goals From cd3f4b24b98bc3d1ce82f145826d2be8784a8ebb Mon Sep 17 00:00:00 2001 From: Michael Zappa Date: Thu, 15 Feb 2024 11:35:13 -0700 Subject: [PATCH 27/28] update goal --- 
keps/sig-network/4410-k8s-network-interface/README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 17b718d5b12..5c6a0acc00c 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -43,6 +43,7 @@ The community may ask for more features, as we are taking a bold approach to rei - Provide a RPC to prevent additional scheduling of pods if IPAM is out of IP addresses without evicting running pods - Provide a RPC to indicate network readiness for the node (no more CNI network configuration files in host file system) - Provide a RPC to provide the user the ability to query what networks are on a node +- Consolidate K8s networking to a single layer without involving the container/oci runtimes - KNI should provide the RPC's required to establish feature parity with current CNI [ADD, DEL] - Provide documentation, examples, troubleshooting and FAQ's for KNI. - Decouple the Pod and Node Network setup From 9d2ee295138fce1bdda2266e8c05601adbeebdd4 Mon Sep 17 00:00:00 2001 From: Shane Utt Date: Wed, 21 Feb 2024 16:10:45 -0500 Subject: [PATCH 28/28] docs: add options for KNI controllers to KNI KEP This adds a section to the KNI KEP for "ongoing considerations" where we can put ideas and concerns that don't need to be resolved just yet, but are good to come back to as we progress. It then adds an idea for using Kubernetes controllers as an alternative to gRPC APIs for some of the KNI implementation to the ongoing considerations. 
Signed-off-by: Shane Utt --- .../4410-k8s-network-interface/README.md | 38 ++++++++++++++++++- 1 file changed, 37 insertions(+), 1 deletion(-) diff --git a/keps/sig-network/4410-k8s-network-interface/README.md b/keps/sig-network/4410-k8s-network-interface/README.md index 5c6a0acc00c..a7afd299193 100644 --- a/keps/sig-network/4410-k8s-network-interface/README.md +++ b/keps/sig-network/4410-k8s-network-interface/README.md @@ -89,4 +89,40 @@ Changes to the pod specification will require hard evidence. The specifics of "Network Readiness" is an implementation detail. We need to provide this RPC to the user. -Since the network runtime can be run separated from the container runtime, you can package everything into a pod and not need to have binaries on disk. This allows the CNI plugins to be isolated in the pod and the pod will never need to mount /opt/cni/bin or /etc/cni/net.d. This offers a potentially more ability to control execution. Keep in mind CNI is the implementation however when this is used chaining is still available. \ No newline at end of file +Since the network runtime can be run separated from the container runtime, you can package everything into a pod and not need to have binaries on disk. This allows the CNI plugins to be isolated in the pod and the pod will never need to mount /opt/cni/bin or /etc/cni/net.d. This offers a potentially more ability to control execution. Keep in mind CNI is the implementation however when this is used chaining is still available. + +## Ongoing Considerations + +### KNI Implementations PULL instead of PUSH? + +The original KNI POC provides a gRPC API for callbacks which (in the POC) are +added to the Kubelet during `Pod` tasks to callout to the KNI implementation to +get `Pod` networking configured. This is pretty straightforward, and the +initial POC actually showed very good performance characteristics, but it has +a couple of potential downsides: + +1. 
the synchronous nature of callbacks makes it harder to avoid deadlocks +2. in some extremely heavy use cases with lots of `Pods` rapidly deploying and + tearing down, this could be a potential scalability bottleneck. + +Additionally, we intend to create Kubernetes APIs for networks and their +configurations, which means that the Kubelet and other components would operate +as something of a middleman consuming Kubernetes APIs via watch mechanisms and +converting them to gRPC calls, and then being responsible for the status, +etc. + +As such we've been actively considering whether it might make sense for at +least some of the functionality in KNI (such as domain/namespace +creation/deletion, and then interface attachment/detachment) to be done by the KNI +directly via the KNI Kubernetes APIs. + +For a simplified example, a KNI implementation might watch `Pods` and wait for +kubelet to reach a state (via the status) where it indicates it's ready to hand +off for network setup. The KNI implementation does its work, and then updates +the `Pod` indicating the network setup is complete and the `Pod`'s containers +are then created. + +There are downsides to this approach as well, one in particular being that it +makes the provision of hookpoints for the KNI a lot more complicated for the +implementations. For now we've added this to our ongoing considerations section +as something to come back to, discuss and review.
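
The hand-off described in the simplified example above could be sketched roughly as follows. This is only an illustration under stated assumptions: the API server is replaced by an in-memory stand-in, and the condition names `AwaitingNetwork` and `NetworkReady`, the `FakeAPIServer` class and the `attach_interface` callback are all hypothetical and not defined by this KEP.

```python
# Sketch only: an in-memory stand-in for the API server, not a real client.
# All condition names below are hypothetical and not defined by the KEP.
from dataclasses import dataclass, field
from queue import Queue

AWAITING_NETWORK = "AwaitingNetwork"  # hypothetical: kubelet is ready to hand off
NETWORK_READY = "NetworkReady"        # hypothetical: KNI finished network setup


@dataclass
class Pod:
    name: str
    conditions: set = field(default_factory=set)
    containers_created: bool = False


class FakeAPIServer:
    """Stores Pods and emits a watch event on every status update."""

    def __init__(self):
        self.pods = {}
        self.events = Queue()

    def update(self, pod):
        self.pods[pod.name] = pod
        self.events.put(pod.name)


def kni_reconcile(api, pod_name, attach_interface):
    """KNI side: act only on Pods the kubelet has handed off."""
    pod = api.pods[pod_name]
    if AWAITING_NETWORK in pod.conditions and NETWORK_READY not in pod.conditions:
        attach_interface(pod)              # the implementation does its work
        pod.conditions.add(NETWORK_READY)
        api.update(pod)                    # report completion via the Pod status


def kubelet_reconcile(api, pod_name):
    """Kubelet side: create containers only after the network is reported ready."""
    pod = api.pods[pod_name]
    if NETWORK_READY in pod.conditions and not pod.containers_created:
        pod.containers_created = True


def run(api, attach_interface):
    """Drain pending watch events, letting both sides react to each one."""
    while not api.events.empty():
        name = api.events.get()
        kni_reconcile(api, name, attach_interface)
        kubelet_reconcile(api, name)
```

In this sketch neither side calls the other directly: each reacts to status changes on the `Pod`, which is the property that would avoid synchronous callbacks. A real implementation would presumably rely on the Kubernetes watch machinery rather than a local queue, and the exact status fields would need to be agreed on as part of the API design.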