Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add network flows export support proposal #581

Merged
merged 1 commit into from
Mar 1, 2021

Conversation

rcarrillocruz
Copy link
Contributor

No description provided.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 11, 2021
enhancements/network/netflow.md Outdated Show resolved Hide resolved
enhancements/network/netflow.md Outdated Show resolved Hide resolved
@russellb
Copy link
Member

/assign @russellb

@rcarrillocruz rcarrillocruz changed the title WIP Add Netflow support proposal Add Netflow support proposal Feb 2, 2021
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 2, 2021
@rcarrillocruz rcarrillocruz changed the title Add Netflow support proposal Add network flows export support proposal Feb 2, 2021
@russellb russellb removed the request for review from sttts February 2, 2021 20:54
enhancements/network/netflow.md Outdated Show resolved Hide resolved
enhancements/network/netflow.md Outdated Show resolved Hide resolved
enhancements/network/netflow.md Outdated Show resolved Hide resolved
enhancements/network/netflow.md Outdated Show resolved Hide resolved
enhancements/network/netflow.md Outdated Show resolved Hide resolved
enhancements/network/netflow.md Outdated Show resolved Hide resolved
enhancements/network/netflow.md Outdated Show resolved Hide resolved
@rcarrillocruz rcarrillocruz force-pushed the add_netflow branch 7 times, most recently from ad4b14d to 95d9bdc Compare February 10, 2021 11:13
@rcarrillocruz
Copy link
Contributor Author

/retest

@rcarrillocruz rcarrillocruz force-pushed the add_netflow branch 2 times, most recently from 89abe2e to de7baf5 Compare February 12, 2021 16:13
Copy link
Member

@russellb russellb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is looking good. I just had a few minor comments

enhancements/network/netflow.md Outdated Show resolved Hide resolved
enhancements/network/netflow.md Outdated Show resolved Hide resolved
enhancements/network/netflow.md Outdated Show resolved Hide resolved
@rcarrillocruz rcarrillocruz force-pushed the add_netflow branch 4 times, most recently from 7085a0f to 06c8928 Compare February 17, 2021 09:00
@rcarrillocruz
Copy link
Contributor Author

/assign @danwinship

Copy link
Contributor

@danwinship danwinship left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically good and maybe all of my comments are better addressed in the actual implementation PRs rather than in the enhancement PR?
/lgtm
/hold
if you want to merge it as-is just cancel the hold

enhancements/network/netflow.md Outdated Show resolved Hide resolved
exportNetworkFlowsCollectorPort: 2056
- exportNetworkFlowsCollectorIP: 172.30.54.103
exportNetworkFlowsCollectorPort: 2056
exportNetworkFlowsProtocol: netflow
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how much I should be arguing about the API details here rather than in the eventual API PR, but...

  1. Not sure this belongs under ovnKubernetesConfig? Even though we only support it in ovn-kubernetes currently, these are generic protocols which could be supported by any network plugin. eg, like how we have kubeProxyConfig as a top-level rather than inside openshiftSDNConfig. Not totally sure about this one...

  2. You probably don't need Config in the name? Just exportNetworkFlows explains what that section is doing

  3. Get rid of the naming redundancy in the subfields. eg, consider what accessing this would look like from golang code:

     ip := defaultNetwork.exportNetworkFlowsConfig.exportNetworkFlowsCollectors[0].exportNetworkFlowsCollectorIP
    

    You can just have collectors and ip/port

  4. Kubernetes API enums always have initial caps and then proper case. So, NetFlow, IPFIX, SFlow

  5. Is there any reason we allow multiple collectors of a single type but not multiple collectors of different types? If you put the protocol, ip, and port together, then it would be more flexible, and also slightly simpler to write out in the case where you only have a single collector (which I think is the common case?):

     exportNetworkFlows:
       - protocol: NetFlow
         ip: 172.30.158.150
         port: 2056
    
  6. The big feature of sFlow over NetFlow is that it lets you export only a sample of the traffic rather than all the traffic. Any plans to allow configuring that, now or later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Right, I thought about that. Thing is I saw IPSEC within ovnKubernetesConfig and conceptually it's not really an OVN-K related feature, it could also be leveraged by other plugin. Given that it's the latest thing added, I just cargo culted. I don't have a strong opinion.
  2. I cargo culted HybridOverlayConfig and IPSecConfig
  3. Yeah :D , I was actually looking for this name bikeshedding feedback. I see hybrid overlay options prepended with HybridOverlay, thus I just cargo culted. But I agree, probably sounds better not to prepend, since it's a dict it's clear those options apply to 'export network flows' .
  4. I'd rather accept any case and just canonicalize as we do with other options. Note that sflow is usually written as "sFlow" per https://sflow.org/ .
  5. Here you get me off-guard. The docs don't clarify if you can have multiple collector protocols defined, I'd need to test. I take you suggest having something like "endpoint: 172.30.158.140:2056"
  6. Yes it does, however I was thinking of going light with exposing per-protocol options. In others proposals you were against having toggles on stuff to avoid end-user overconfigurability, e.g. the network diagnostics interval, so I thought of just hardcoding them and if there's end-user demand/RFE then expose . But I'm cool with exposing sflow sampling and polling params now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this belongs under ovnKubernetesConfig?

I guess another counter-argument is that even if we supported exactly the same flow collecting options in openshift-sdn, we could do that by just adding the same ExternalNetworkFlowsConfig struct to openshiftSDNConfig too. OK.

I cargo culted HybridOverlayConfig and IPSecConfig

which maybe we should have done differently...

exportNetworkFlowsConfig struck me as particularly awkward though. If you had said networkFlowExportConfig I probably wouldn't have said anything... I guess "verb phrase" = OK, "noun phrase + Config" = OK, "verb phrase + Config" = 🙁

I see hybrid overlay options prepended with HybridOverlay, thus I just cargo culted.

HybridOverlayVXLANPort is definitely wrong and I think I even complained about that

I'd rather accept any case and just canonicalize as we do with other options.

We mostly don't do that any more; we only originally did that for the network plugin type because the original capitalization was wrong (OpenshiftSDN rather than OpenShiftSDN) and we fixed it right before 4.1 code freeze and didn't want to break anything. But then this led to other bugs because while CNO handles either variant, stuff in other components didn't (eg, MCO deciding whether to enable system OVS) and so random things would break if you had used the wrong case. In the end, it's easier to just require everyone to get the string correct (and use validation to only accept the correct spelling/casing).

The docs don't clarify if you can have multiple collector protocols defined

oh, hm, I just assumed you could since they're totally separate tables...

In others proposals you were against having toggles on stuff to avoid end-user overconfigurability

It's true. I guess we can add more options later if we need them. My vague understanding though was that pretty much the main reason anyone used sFlow or IPFIX rather than NetFlow was to get sampling.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, it's just I'm hardcoding the sampling/polling values, from my WIP branch:

https://github.com/openshift/cluster-network-operator/pull/982/files#diff-d1edeaf791eef7719efb5bb733f48f3a201cc1000ce6f8d09a917be2accc9cb4R86

 ovs-vsctl -- --id=@sflow create sflow agent=ovn-k8s-mp0 targets={{ .ExportNetworkFlowsCollectors }} header=128 sampling=64 polling=10 -- set bridge br-int sflow=@sflow

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danwinship please take a look, I put up an alternative based slightly on your suggestions.
Since there can be either netflow, ipfix or sflow exports, have those as dicts and as values for those dicts a list of strings encompassing the IP and the port , i.e. no need to add yet another key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end, I'm making the exportNetworkFlows option top-level, I'll check on runtime OVN-K is used.
Cos otherwise, we will just keep piling on ovnKubernetesConfig and it won't be sustainable.

enhancements/network/netflow.md Outdated Show resolved Hide resolved
enhancements/network/netflow.md Show resolved Hide resolved
enhancements/network/netflow.md Outdated Show resolved Hide resolved
@openshift-ci-robot openshift-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. labels Feb 22, 2021
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Feb 23, 2021
@rcarrillocruz rcarrillocruz force-pushed the add_netflow branch 12 times, most recently from 3f76058 to f0a3bf7 Compare February 23, 2021 10:14
@rcarrillocruz
Copy link
Contributor Author

/retest

@rcarrillocruz
Copy link
Contributor Author

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 24, 2021
Copy link
Contributor

@danwinship danwinship left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to just merge this and say we can address the other issues in a later follow-up, but then I remembered that I'm not an approver in enhancements anyway

serviceNetwork:
- 172.30.0.0/16
exportNetworkFlows:
netflow:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

netFlow, sFlow, ipfix. Sticking to the standard rules makes it easier for people to remember the right capitalization to use when creating/editing the struct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

- 172.30.54.103:2056
sflow:
- 172.30.158.150:6343
- 172.30.54.103:6343
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above you describe sflow as an "optional dict", but here it's an array, not a dict, and in particular, there's nowhere under sflow where you could add sampling/polling config. You'd need something like

sFlow:
  endpoints:
    - 172.30.158.150:2056
    - 172.30.54.103:2056

so you can add other fields later too. (I don't know if "endpoint" is the right terminology here.)

But then, would you necessarily want the same sampling config for every endpoint?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, the description needs fixing, it should be an array just like in the example.

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, russellb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@danwinship
Copy link
Contributor

oh, Russell approved it.
/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 1, 2021
@openshift-merge-robot openshift-merge-robot merged commit 1f7fa50 into openshift:master Mar 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants