
Should we need to add QoS policy in Egress feature? #2766

Closed
Jexf opened this issue Sep 13, 2021 · 13 comments · Fixed by #5425
Labels
kind/design Categorizes issue or PR as related to design. triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@Jexf
Member

Jexf commented Sep 13, 2021

Describe what you are trying to solve
The Egress feature is a CRD API that manages external access from Pods in a cluster, redirecting SNAT traffic to centralized Egress Nodes. Some Pods may consume too much Egress SNAT bandwidth (for example, third-party data copies), which can cause network congestion on Egress Nodes. Kubernetes does provide a Pod bandwidth limitation function, but it only sets a unified limit on a Pod's ingress and egress traffic; QoS cannot be configured separately for Egress SNAT traffic.

Maybe we can add a QoS function to the Egress feature, with bandwidth or pps limitations.

Describe the solution you have in mind

1. Maybe we can use OVS meters to implement a general Kubernetes QoS feature, such as Pod ingress/egress bandwidth/pps limitations and Pod-to-Service bandwidth/pps limitations.
But I have compared OVS meters with tc, and OVS meters are not as good: the rate jitter with OVS meters is larger than with tc.

2. Use OVS meters to implement Egress QoS only, adding bandwidth or pps limitations just for Egress SNAT packets.
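
For reference, a minimal sketch of how an OVS meter can enforce such a limit, following the usual ovs-ofctl examples (bridge name, meter ID, flow match, and rates are all illustrative):

# Create a meter that drops traffic exceeding 10 Mbps
# (use pktps instead of kbps for a packets-per-second limit).
ovs-ofctl -O OpenFlow13 add-meter br-int meter=1,kbps,band=type=drop,rate=10000
# Attach the meter to a flow so matching traffic is rate-limited.
ovs-ofctl -O OpenFlow13 add-flow br-int "in_port=1,actions=meter:1,normal"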

@Jexf Jexf added the kind/design Categorizes issue or PR as related to design. label Sep 13, 2021
@antoninbas
Contributor

@Jexf can you clarify the difference between solutions 1. and 2.? They both mention using OVS meters for the implementation.

@tnqn / @wenqiq any thoughts on this? That sounds like a good differentiating feature to me (and maybe the Egress Status could even report some information about bandwidth usage if this is technically possible based on the implementation method?).

@antoninbas antoninbas added the triage/needs-information Indicates an issue needs more information in order to work on it. label Sep 20, 2021
@Jexf
Member Author

Jexf commented Sep 22, 2021

@antoninbas Thanks for the reply.
Solution 1:
Currently the Kubernetes community uses annotations to add QoS configuration. This not only requires creating an independent ifb device for each Pod, but also requires restarting the Pod for a modified QoS configuration to take effect.
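
For context, this is the existing annotation-based approach (handled by the CNI bandwidth plugin; the values here are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: bandwidth-limited-pod
  annotations:
    kubernetes.io/ingress-bandwidth: 10M
    kubernetes.io/egress-bandwidth: 10M
spec:
  containers:
    - name: app
      image: nginx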

Maybe we can use a CRD to store the QoS configuration and use meters to implement the QoS function:

  1. Use a CRD to store the QoS configuration, with flexible Namespace and Pod label selectors for the Pods that need a QoS policy; maybe we can even implement Namespace-level QoS. A CRD example:
apiVersion: crd.antrea.io/v1alpha1
kind: Qos
metadata:
  name: qos-sample-1
spec:
  appliedTo:
    - podSelector:
        matchLabels:
          app: qospolicy
      namespaceSelector:
        matchLabels:
          namespace: qospolicy
  ingress:
    rate: 10000
    burst: 2000
    type: mbps
  egress:
    rate: 10000
    burst: 2000
    type: mpps
  2. Use meters to implement the QoS function; there is no need for an independent ifb device per Pod (redirecting traffic to the ifb device may cause performance loss), and meters can also implement pps limits.
  3. We can also reuse the meter QoS pipeline to implement the Egress SNAT QoS function.

Solution 2:
Only use meters to implement the Egress SNAT QoS function, i.e. only add bandwidth or pps limitations for Egress SNAT packets.

@wenqiq
Contributor

wenqiq commented Sep 24, 2021

Good idea. About solution 1, I think a more appropriate name for the CRD kind would be Bandwidth instead of kind: Qos?
IMO, I prefer solution 1; however, the more conventional way to limit bandwidth on Pods in the k8s community is through annotations.
If we want to add a QoS function for the Egress feature, I think solution 2 is more appropriate for now.

@tnqn
Member

tnqn commented Sep 24, 2021

The idea sounds good to me.

  1. Will the QoS apply to the off-cluster egress traffic of all Pods selected by the Egress (or by the QoS, if it's another CRD), or is it per Pod?
  2. Given that Kubernetes already provides a way to specify the total ingress and egress bandwidth of a Pod, I feel we could just add a QoS that applies to off-cluster Egress traffic only. It could be an optional field of the Egress CRD.
  3. For @antoninbas's idea about reporting bandwidth usage in the Egress Status, I think the answer to 1 matters. If it's per-Pod QoS, it may be harder to report. But I feel etcd is generally not suitable for storing frequently updated data, which is why metrics-server and Antrea's NetworkPolicy stats use an aggregated apiserver to serve metric data.

@Jexf
Member Author

Jexf commented Sep 26, 2021

Good idea. About solution 1, I think a more appropriate name for the CRD kind would be Bandwidth instead of kind: Qos?
IMO, I prefer solution 1; however, the more conventional way to limit bandwidth on Pods in the k8s community is through annotations.
If we want to add a QoS function for the Egress feature, I think solution 2 is more appropriate for now.

Thanks for your reply. The current Kubernetes QoS function is quite limited; if we use a new CRD to define and store bandwidth limitation info, we can implement a series of functions:

  1. Namespace-level bandwidth limitation: we can add a scope field to the Bandwidth CRD, where the value shared means the applied Pods share the same meter QoS rule, and standalone means each Pod uses its own meter QoS rule. So we can use scope: shared for Namespace-level bandwidth limitation. A CRD example:
apiVersion: crd.antrea.io/v1alpha1
kind: Bandwidth
metadata:
  name: bandwidth-sample-for-namespace
spec:
  appliedTo:
    - podSelector:
        matchLabels:
          app: qospolicy
      namespaceSelector:
        matchLabels:
          namespace: qospolicy
  ingress:
    rate: 100Mbps
    burst: 100Mbps
    scope: shared
  egress:
    rate: 100Mbps
    burst: 100Mbps
    scope: shared
  2. pps limitation. A CRD example:
apiVersion: crd.antrea.io/v1alpha1
kind: Bandwidth
metadata:
  name: bandwidth-sample-for-namespace
spec:
  appliedTo:
    - podSelector:
        matchLabels:
          app: qospolicy
      namespaceSelector:
        matchLabels:
          namespace: qospolicy
  ingress:
    rate: 100Mpps
    burst: 100Mpps
    scope: standalone
  egress:
    rate: 100Mpps
    burst: 100Mpps
    scope: standalone
  3. Solve a series of problems with the annotation approach and with redirecting traffic to the ifb device.

@Jexf
Member Author

Jexf commented Sep 26, 2021

@tnqn Thanks for your reply. Yes, we can just add an optional field to the Egress CRD for Egress bandwidth limitation, and it would not conflict with the Bandwidth limitation CRD. Maybe we don't need to add a meter QoS rule for each Pod; sharing one meter QoS rule among the applied Pods may be simpler.
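
A rough sketch of what such an optional field might look like on the Egress CRD (the bandwidth field and its names are illustrative here, not a final API):

apiVersion: crd.antrea.io/v1alpha2
kind: Egress
metadata:
  name: egress-with-qos
spec:
  appliedTo:
    podSelector:
      matchLabels:
        app: web
  egressIP: 10.10.0.8
  # Hypothetical optional field: all Pods selected by this Egress share
  # one OVS meter that rate-limits their SNAT traffic.
  bandwidth:
    rate: 100Mbps
    burst: 200Mbps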

@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 26, 2021
@tnqn tnqn removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 4, 2022
@github-actions
Contributor

github-actions bot commented Apr 5, 2022

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 5, 2022
@github-actions github-actions bot closed this as completed Jul 4, 2022
@tnqn tnqn reopened this Jul 4, 2022
@tnqn tnqn removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 4, 2022
@github-actions
Contributor

github-actions bot commented Oct 3, 2022

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 3, 2022
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 2, 2023
@GraysonWu
Contributor

Hi @Jexf,

Sorry to see this excellent feature you proposed was blocked by some technical issues before. The existing implementation allowed different Egresses to share an EgressIP, making it challenging to identify, on the EgressIP Node, which exact Egress packets belong to.
However, we recently had another round of review and discussion on it, and we basically reached an agreement that, typically, users want different EgressIPs to differentiate Pods when creating different Egress objects. We could assume/ensure that an Egress with QoS has its own dedicated EgressIP, which makes things much more manageable.

I am reaching out to inquire about any updates you may have on this item. Specifically, what is your opinion of the new agreement? Are you still interested in working on this item, and do you have the bandwidth for it? Or have you already implemented a similar feature in your own fork/project?
Please let me know if you have any questions or ideas to share. Thank you so much for your time and consideration.

BTW, I also sent an email to you via [email protected], which was reported as undeliverable lol.

@GraysonWu GraysonWu reopened this Mar 17, 2023
@github-actions github-actions bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 18, 2023
@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 17, 2023
@ColonelBundy

(Quoting @Jexf's Bandwidth CRD proposal from the Sep 26, 2021 comment above.)

I like this example. However, wouldn't it be better to follow the same standard as ACNP and ANP, namely a separate CRD for namespaced and cluster-scoped resources?

Also, would this limit bandwidth within the namespace, or just traffic originating from outside the namespace? Personally, I would love to be able to specify both: there are scenarios where you may or may not have a fast private link between your nodes, and limiting each namespace to something sensible would be great. In cases where that's not a problem, it would be great to limit only public traffic.

To expand even further, imagine a scenario where you host services outside of the current cluster and may or may not wish to impose limits on traffic to/from those services; a CIDR selector to exclude/include IPs from the limiter would then also fit nicely (see the hypothetical sketch below).
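
Purely to illustrate that idea, a hypothetical shape for such a selector (none of these fields exist today; exceptCIDRs is invented for illustration):

apiVersion: crd.antrea.io/v1alpha1
kind: Bandwidth
metadata:
  name: limit-public-traffic-only
spec:
  appliedTo:
    - namespaceSelector:
        matchLabels:
          namespace: qospolicy
  egress:
    rate: 100Mbps
    scope: shared
    # Hypothetical: exclude traffic to these destinations from the limit,
    # e.g. a fast private link to services outside the cluster.
    exceptCIDRs:
      - 10.0.0.0/8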

@antoninbas antoninbas removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 11, 2023
@luolanzone
Contributor

PR #5425 addresses this issue.
