Add network flows export support proposal
rcarrillocruz committed Feb 2, 2021
1 parent 0689995 commit 75e201b
Showing 1 changed file with 107 additions and 0 deletions.
107 changes: 107 additions & 0 deletions enhancements/network/netflow.md
@@ -0,0 +1,107 @@
---
title: OpenShift Network Flow export and collection support
authors:
- "@rcarrillocruz"
reviewers:
- "@russellb"
approvers:
- TBD
creation-date: 2021-01-11
last-updated: 2021-01-11
status: provisional
---

# OpenShift NetFlow support

## Release Signoff Checklist

- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)

## Summary

OVS (Open vSwitch) supports exporting network flow data using protocols such as NetFlow, IPFIX and sFlow.
OpenShift uses OVN/OVS for the OVN-Kubernetes network plugin.
OpenShift should have the ability to export cluster traffic using any of these protocols.

## Motivation

* Customers need to be able to get network flows out of OpenShift to be consumed by their collectors.
* Customers need to be able to see a log of the network traffic that flows through an OpenShift cluster.
* Customers need to get metrics and monitoring data of the OpenShift cluster network traffic to make decisions on capacity and/or security issues.

### Goals

- Add support for exporting, from OpenShift, all the network flow protocols supported by OVS.
- Add an optional collector that consumes the network flows data and enriches it with Kubernetes metadata.
- Add Prometheus metrics from the data collected by the optional collector for visualization and alerting purposes.

### Non-Goals

- This document only applies to OVN-Kubernetes, not OpenShift SDN.
- This document will not discuss storage solutions for flow logs. Typically, customers store the NetFlow data in a data warehouse system (e.g. ClickHouse) or a search and analytics system (e.g. Elasticsearch). Which option should be used, and whether we should manage that solution, is out of scope for this document.

## Proposal

OVS supports NetFlow v5, IPFIX and sFlow.
NetFlow v5 does not support IPv6, so it should not be considered as the default protocol.
sFlow is the most performant protocol because it is a sampling protocol: it exports a sample of the packets seen and periodically polls interface counters. The drawback is that it does not capture all packets.
IPFIX is capable of capturing all traffic at line rate and supports IPv6, so it should be the default when no protocol is specified.

The Cluster Network Operator (CNO) should expose:
* An *exportNetworkFlows* option to enable the export of network flows data from OVS bridges
* An *exportNetworkFlowsProtocol* option to specify which protocol should be used. Valid values should be *netflowV5*, *ipfix* and *sflow*, with *ipfix* as the default.
* An *exportNetworkFlowsCollectorIP* option to specify the IP of the collector that will consume the flow data
* An *exportNetworkFlowsCollectorPort* option to specify the port of the collector that will consume the flow data
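
To make the shape of these options concrete, here is a minimal sketch of how they could be modeled and validated. The class name, the snake_case field names, and the choice to leave export disabled by default are assumptions made for illustration; only the four option names, the valid protocol values, and the *ipfix* default come from this proposal.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Optional

# Values proposed in this document for exportNetworkFlowsProtocol.
VALID_PROTOCOLS = ("netflowV5", "ipfix", "sflow")


@dataclass(frozen=True)
class ExportNetworkFlowsConfig:
    """Hypothetical in-memory form of the four proposed CNO options."""

    export_network_flows: bool = False      # assumption: disabled unless requested
    protocol: str = "ipfix"                 # default proposed in this document
    collector_ip: Optional[str] = None      # None means "no collector IP provided"
    collector_port: Optional[int] = None

    def validate(self) -> None:
        """Raise ValueError if any option holds an unusable value."""
        if self.protocol not in VALID_PROTOCOLS:
            raise ValueError(f"unsupported protocol: {self.protocol!r}")
        if self.collector_ip is not None:
            ip_address(self.collector_ip)   # raises ValueError when malformed
        if self.collector_port is not None and not 1 <= self.collector_port <= 65535:
            raise ValueError(f"invalid port: {self.collector_port}")
```

Rejecting bad values before anything is pushed to OVS keeps a typo in the protocol name from silently disabling export.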

Under the covers, these options will cause the CNO to run an ovs-vsctl command against br-int, the bridge that containers are connected to in OVN-Kubernetes, and its management port ovn-k8s-mp0.
As an example, if sFlow were enabled, this is the command that would be executed:

`ovs-vsctl -- --id=@sflow create sflow agent=ovn-k8s-mp0 target="\"${exportNetworkFlowsCollectorIP}:${exportNetworkFlowsCollectorPort}\"" header=128 sampling=64 polling=10 -- set bridge br-int sflow=@sflow`
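
The same command can be assembled per protocol, roughly as sketched below. The function name is hypothetical. The sFlow arguments mirror the command above; the IPFIX and NetFlow variants are assumptions modeled on the `create ... targets=` examples in the ovs-vsctl manual and carry no tuning parameters. An IPv4 collector address is assumed.

```python
import shlex
from typing import List

BRIDGE = "br-int"        # integration bridge named in this proposal
AGENT = "ovn-k8s-mp0"    # management port used as the sFlow agent interface


def build_ovs_vsctl_args(protocol: str, collector_ip: str, collector_port: int) -> List[str]:
    """Return the ovs-vsctl argv that enables flow export on br-int."""
    # ovs-vsctl parses the target as a quoted string, so the double quotes
    # are part of the argument itself (the shell form escapes them).
    target = f'"{collector_ip}:{collector_port}"'
    if protocol == "sflow":
        table = "sflow"
        columns = [f"agent={AGENT}", f"target={target}",
                   "header=128", "sampling=64", "polling=10"]
    elif protocol == "ipfix":
        table = "ipfix"
        columns = [f"targets={target}"]       # assumption: no extra IPFIX tuning
    elif protocol == "netflowV5":
        table = "netflow"
        columns = [f"targets={target}"]       # assumption: no extra NetFlow tuning
    else:
        raise ValueError(f"unsupported protocol: {protocol!r}")
    return ["ovs-vsctl", "--", f"--id=@{table}", "create", table, *columns,
            "--", "set", "bridge", BRIDGE, f"{table}=@{table}"]


if __name__ == "__main__":
    # Print a shell-quoted form of the sFlow command for inspection.
    print(shlex.join(build_ovs_vsctl_args("sflow", "10.0.0.5", 6343)))
```

Building an argument list rather than a shell string avoids a second layer of quoting when the command is executed without a shell.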

If *exportNetworkFlowsCollectorIP* is not provided, we deploy a collector based on the open-source GoFlow project, expose it with a service, and set the service ClusterIP as its value.
There are many open-source NetFlow collectors. Just to name a few:

In order to provide better context of network flows data, we should enrich the flows collected with Kubernetes metadata.
The flows contain fields for source IP address and destination IP address. We would query the Kubernetes API for those IPs and enrich the flows when applicable with:

* K8SSrcPod
* K8SDstPod
* K8SSrcPodNamespace
* K8SDstPodNamespace
* K8SSrcNode (when the source address is a node IP)
* K8SDstNode (when the destination address is a node IP)
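
A minimal sketch of that enrichment step follows. It assumes the pod and node IPs have already been fetched from the Kubernetes API into two dictionaries, and it assumes the flow record is a dict with `SrcAddr` and `DstAddr` keys; those two key names, `PodRef`, and `enrich_flow` are hypothetical. Only the `K8S*` field names come from the list above.

```python
from typing import Dict, NamedTuple


class PodRef(NamedTuple):
    """Pod identity attached to a flow (hypothetical helper type)."""
    name: str
    namespace: str


def enrich_flow(flow: dict,
                pods_by_ip: Dict[str, PodRef],
                nodes_by_ip: Dict[str, str]) -> dict:
    """Return a copy of flow with K8S* fields added where an IP is known."""
    enriched = dict(flow)
    # "SrcAddr"/"DstAddr" are assumed field names for the flow's IP addresses.
    for side, ip_field in (("Src", "SrcAddr"), ("Dst", "DstAddr")):
        ip = flow.get(ip_field)
        pod = pods_by_ip.get(ip)
        if pod is not None:
            enriched[f"K8S{side}Pod"] = pod.name
            enriched[f"K8S{side}PodNamespace"] = pod.namespace
        elif ip in nodes_by_ip:
            enriched[f"K8S{side}Node"] = nodes_by_ip[ip]
    return enriched


if __name__ == "__main__":
    pods = {"10.128.0.7": PodRef("web-1", "shop")}
    nodes = {"192.168.1.10": "worker-0"}
    print(enrich_flow({"SrcAddr": "10.128.0.7", "DstAddr": "192.168.1.10"}, pods, nodes))
```

Addresses that match neither a pod nor a node (for example, external traffic) are left without `K8S*` fields, which matches the "when applicable" wording above.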

This enrichment could be done either by modifying the collector source code or by using an adapter container pattern, where the collector writes the collected flows to a shared volume and a Go Kubernetes client consumes them, enriches them, and writes the enriched flows to stdout or a file.
TBD explain the adapter container pattern.
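
One possible shape of the adapter side is sketched here in Python for brevity, although the proposal envisions a Go client. It assumes the collector emits one JSON object per line, which is an assumption about the collector output format and not something this document specifies; `run_adapter` and its parameters are hypothetical names.

```python
import io
import json
from typing import Callable, TextIO


def run_adapter(source: TextIO, sink: TextIO,
                enrich: Callable[[dict], dict]) -> int:
    """Copy flows from source to sink, enriching each one.

    Returns the number of flows written. Blank or malformed lines are
    skipped so that one bad record does not stop the adapter.
    """
    written = 0
    for line in source:
        line = line.strip()
        if not line:
            continue
        try:
            flow = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(flow, dict):
            continue
        sink.write(json.dumps(enrich(flow), sort_keys=True) + "\n")
        written += 1
    return written


if __name__ == "__main__":
    # Stand-ins for the shared-volume file and the adapter's stdout.
    collected = io.StringIO('{"SrcAddr": "10.128.0.7"}\nnot json\n')
    out = io.StringIO()
    run_adapter(collected, out, lambda f: {**f, "K8SSrcPod": "web-1"})
    print(out.getvalue(), end="")
```

Passing the enrichment function in as a parameter keeps the file handling separate from the Kubernetes lookups, so either side can change without touching the other.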

## Design Details

### Test Plan

- Unit tests for the feature
- e2e tests covering the feature

### Graduation Criteria

From Tech Preview to GA

##### Tech Preview -> GA

- Ensure OpenShift can export network flows data from OVS to an endpoint
- Ensure the OpenShift bundled collector can log the network flows data exported from OVS
- Ensure Prometheus has metrics derived from the optional collector's data

### Upgrade / Downgrade Strategy

N/A

### Version Skew Strategy

N/A

## Implementation History
