feat: add support for traffic router plugins #2573

Merged
merged 50 commits into from
Mar 2, 2023
a8a254a
feat: add support for traffic router plugins
zachaller Feb 9, 2023
71e9099
finish up refactor
zachaller Feb 9, 2023
03946ef
Merge branch 'master' of github.com:argoproj/argo-rollouts into feat-…
zachaller Feb 9, 2023
49226f1
codegen
zachaller Feb 9, 2023
8696475
update docs
zachaller Feb 9, 2023
01782dc
rename config field
zachaller Feb 10, 2023
92f762d
refactor tests
zachaller Feb 13, 2023
839fe98
add godocs
zachaller Feb 14, 2023
cb38f17
add docs on creating plugins
zachaller Feb 15, 2023
8229a64
rename config fields
zachaller Feb 15, 2023
c3543ba
Change New function to Init for tr
zachaller Feb 15, 2023
98dc4c5
Change New function to Init for metrics
zachaller Feb 15, 2023
0803f41
docs update
zachaller Feb 15, 2023
3262778
docs update
zachaller Feb 15, 2023
e82fac5
codegen
zachaller Feb 15, 2023
2f7b4c2
change repo name
zachaller Feb 16, 2023
15f0386
Merge branch 'master' of github.com:argoproj/argo-rollouts into feat-…
zachaller Feb 16, 2023
f7793d8
small docs changes
zachaller Feb 17, 2023
4f2ede4
Merge branch 'master' of github.com:argoproj/argo-rollouts into feat-…
zachaller Feb 17, 2023
831c7b1
fix bad merge comments
zachaller Feb 18, 2023
3704b14
Merge branch 'master' of github.com:argoproj/argo-rollouts into feat-…
zachaller Feb 20, 2023
96ea4fe
remove metric passing from metrics plugin on Init method
zachaller Feb 20, 2023
ef9213f
fix mutex
zachaller Feb 21, 2023
dfc22f1
wrap errors
zachaller Feb 21, 2023
8681fe5
rename
zachaller Feb 21, 2023
736f39d
docs change
zachaller Feb 22, 2023
ebafaea
Merge branch 'master' of github.com:argoproj/argo-rollouts into feat-…
zachaller Feb 26, 2023
2b2b2e9
codegen
zachaller Feb 26, 2023
0f95691
some updates to docs
zachaller Feb 27, 2023
119762c
change plugin to plugins for tr
zachaller Feb 27, 2023
39329de
change plugin to plugins for tr
zachaller Feb 27, 2023
a1fca7e
refactor naming for metric plugins
zachaller Feb 27, 2023
360a31a
lint
zachaller Feb 27, 2023
52d3183
change handshake
zachaller Feb 27, 2023
1b6c165
more renames
zachaller Feb 27, 2023
bea36f2
change handshake
zachaller Feb 27, 2023
f01510b
add err context
zachaller Feb 28, 2023
2b36cf1
lint
zachaller Feb 28, 2023
dd4c0c3
Merge branch 'master' of github.com:argoproj/argo-rollouts into feat-…
zachaller Feb 28, 2023
c2f6014
small docs change
zachaller Feb 28, 2023
f58e22e
docs update from pr review
zachaller Feb 28, 2023
b4f0114
updates from review
zachaller Mar 1, 2023
67df14a
change config map format
zachaller Mar 1, 2023
8ab2ff8
update docs
zachaller Mar 1, 2023
0483713
add context to error
zachaller Mar 1, 2023
c151c40
add context to error
zachaller Mar 1, 2023
a6c7a66
add context to errors ans well as wrap the *bool returned by verifiy …
zachaller Mar 1, 2023
d04af88
update docs for new interface type
zachaller Mar 1, 2023
b31516b
Merge branch 'master' of github.com:argoproj/argo-rollouts into feat-…
zachaller Mar 1, 2023
c4a6bcb
change error wraping for init
zachaller Mar 1, 2023
4 changes: 4 additions & 0 deletions Makefile
@@ -286,3 +286,7 @@ checksums:
build-sample-metric-plugin-debug:
	go build -gcflags="all=-N -l" -o metric-plugin test/cmd/sample-metrics-plugin/main.go

.PHONY: build-sample-traffic-plugin-debug
build-sample-traffic-plugin-debug:
	go build -gcflags="all=-N -l" -o traffic-plugin test/cmd/sample-trafficrouter-plugin/main.go

31 changes: 13 additions & 18 deletions docs/analysis/plugins.md
@@ -1,11 +1,11 @@
# Metric Plugins

!!! important Available since v1.5
!!! important Available since v1.5 - Status: Alpha

Argo Rollouts supports getting analysis metrics via a 3rd-party plugin system. This allows users to extend the capabilities of Rollouts
to support metric providers that are not natively supported. Rollouts uses a plugin library called
[go-plugin](https://github.com/hashicorp/go-plugin) to do this. You can find a sample plugin
here: [sample-rollouts-metric-plugin](https://github.com/argoproj-labs/sample-rollouts-metric-plugin)
here: [rollouts-sample_prometheus-metric-plugin](https://github.com/argoproj-labs/rollouts-sample_prometheus-metric-plugin)

## Using a Metric Plugin

@@ -14,28 +14,24 @@ into the rollouts controller container. The second method is to use a HTTP(S) se

### Mounting the plugin executable into the rollouts controller container

To use this method, you will need to build or download the plugin executable and then mount it into the rollouts controller container.
The plugin executable must be mounted into the rollouts controller container at the path specified by the `--metric-plugin-location` flag.

There are a few ways to mount the plugin executable into the rollouts controller container. Some of these will depend on your
particular infrastructure. Here are a few methods:

* Using an init container to download the plugin executable
* Using a Kubernetes volume mount with a shared volume such as NFS, EBS, etc.
* Building the plugin into the rollouts controller container

Then you can use the configmap to point to the plugin executable. Example:
Then you can use the configmap to point to the plugin executable file location. Example:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: argo-rollouts-config
data:
plugins: |-
metrics:
- name: "prometheus" # name the plugin uses to find this configuration, it must match the name required by the plugin
pluginLocation: "file://./my-custom-plugin" # supports http(s):// urls and file://
metricProviderPlugins: |-
- name: "argoproj-labs/sample-prometheus" # name of the plugin; it must match the name required by the plugin so it can find its configuration
location: "file://./my-custom-plugin" # supports http(s):// urls and file://
```

### Using a HTTP(S) server to host the plugin executable
@@ -49,11 +45,10 @@ kind: ConfigMap
metadata:
name: argo-rollouts-config
data:
plugins: |-
metrics:
- name: "prometheus" # name the plugin uses to find this configuration, it must match the name required by the plugin
pluginLocation: "https://github.com/argoproj-labs/sample-rollouts-metric-plugin/releases/download/v0.0.3/metric-plugin-linux-amd64" # supports http(s):// urls and file://
pluginSha256: "08f588b1c799a37bbe8d0fc74cc1b1492dd70b2c" #optional sha256 checksum of the plugin executable
metricProviderPlugins: |-
- name: "argoproj-labs/sample-prometheus" # name of the plugin; it must match the name required by the plugin so it can find its configuration
location: "https://github.com/argoproj-labs/rollouts-sample_prometheus-metric-plugin/releases/download/v0.0.4/metric-plugin-linux-amd64" # supports http(s):// urls and file://
sha256: "dac10cbf57633c9832a17f8c27d2ca34aa97dd3d" #optional sha256 checksum of the plugin executable
```

## Some words of caution
@@ -66,13 +61,13 @@ the server hosting the plugin is available again.

Argo Rollouts will download the plugin at startup only once but if the pod is deleted it will need to download the plugin again on next startup. Running
Argo Rollouts in HA mode can help a little with this situation because each pod will download the plugin at startup. So if a single pod gets
deleted during a server outage, the other pods will still be able to take over because there will already be a plugin executable available to it. However,
it is up to you to define your risk for and decide how you want to install the plugin executable.
deleted during a server outage, the other pods will still be able to take over because there will already be a plugin executable available to it. It is the
responsibility of the Argo Rollouts administrator to define the plugin installation method considering the risks of each approach.

## List of Available Plugins (alphabetical order)

#### Add Your Plugin Here
* If you have created a plugin, please submit a PR to add it to this list.
#### [sample-rollouts-metric-plugin](https://github.com/argoproj-labs/sample-rollouts-metric-plugin)
#### [rollouts-sample_prometheus-metric-plugin](https://github.com/argoproj-labs/rollouts-sample_prometheus-metric-plugin)
* This is just a sample plugin that can be used as a starting point for creating your own plugin.
It is not meant to be used in production. It is based on the built-in prometheus provider.
73 changes: 73 additions & 0 deletions docs/features/traffic-management/plugins.md
@@ -0,0 +1,73 @@
# Traffic Router Plugins

!!! important Available since v1.5 - Status: Alpha

Argo Rollouts supports traffic routing via a 3rd-party plugin system. This allows users to extend the capabilities of Rollouts
to support traffic routers that are not natively supported. Rollouts uses a plugin library called
[go-plugin](https://github.com/hashicorp/go-plugin) to do this. You can find a sample plugin
here: [rollouts-sample_nginx-trafficrouter-plugin](https://github.com/argoproj-labs/rollouts-sample_nginx-trafficrouter-plugin)

## Using a Traffic Router Plugin

There are two methods of installing and using an Argo Rollouts plugin. The first method is to mount the plugin executable
into the rollouts controller container. The second method is to use an HTTP(S) server to host the plugin executable.

### Mounting the plugin executable into the rollouts controller container

There are a few ways to mount the plugin executable into the rollouts controller container. Some of these will depend on your
particular infrastructure. Here are a few methods:

* Using an init container to download the plugin executable
* Using a Kubernetes volume mount with a shared volume such as NFS, EBS, etc.
* Building the plugin into the rollouts controller container

Then you can use the configmap to point to the plugin executable file location. Example:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argo-rollouts-config
data:
  trafficRouterPlugins: |-
    - name: "argoproj-labs/sample-nginx" # name of the plugin; it must match the name required by the plugin so it can find its configuration
      location: "file://./my-custom-plugin" # supports http(s):// urls and file://
```
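
The `location` value is distinguished by its URL scheme. As a rough illustration only (this is not the controller's actual code), the Go sketch below classifies a location string by scheme. The function name `locationKind` and its `"local"`/`"download"` return values are hypothetical and exist only for this example.

```go
package main

import (
	"fmt"
	"net/url"
)

// locationKind is a hypothetical helper that classifies a plugin location
// by its URL scheme: file:// means an executable already on disk, while
// http(s):// means the executable has to be downloaded first.
func locationKind(location string) (string, error) {
	u, err := url.Parse(location)
	if err != nil {
		return "", fmt.Errorf("invalid plugin location %q: %w", location, err)
	}
	switch u.Scheme {
	case "file":
		return "local", nil
	case "http", "https":
		return "download", nil
	default:
		return "", fmt.Errorf("unsupported scheme %q in plugin location %q", u.Scheme, location)
	}
}

func main() {
	for _, loc := range []string{
		"file:///tmp/argo-rollouts/traffic-plugin",
		"https://example.com/traffic-plugin-linux-amd64", // placeholder URL
		"ftp://example.com/traffic-plugin",
	} {
		kind, err := locationKind(loc)
		fmt.Println(loc, kind, err)
	}
}
```

Rejecting unknown schemes up front keeps a typo in the configmap from turning into a confusing failure later in startup.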

### Using a HTTP(S) server to host the plugin executable

Argo Rollouts supports downloading the plugin executable from an HTTP(S) server. To use this method, you will need to
configure the controller via the `argo-rollouts-config` configmap and set `location` to an http(s) url. Example:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argo-rollouts-config
data:
  trafficRouterPlugins: |-
    - name: "argoproj-labs/sample-nginx" # name of the plugin; it must match the name required by the plugin so it can find its configuration
      location: "https://github.com/argoproj-labs/rollouts-sample_nginx-trafficrouter-plugin/releases/download/v0.0.1/metric-plugin-linux-amd64" # supports http(s):// urls and file://
      sha256: "08f588b1c799a37bbe8d0fc74cc1b1492dd70b2c" # optional sha256 checksum of the plugin executable
```
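
To make the optional `sha256` field more concrete, here is a minimal Go sketch of how a downloaded executable could be checked against a configured checksum. It is an illustration under stated assumptions, not the controller's implementation: the helper name `verifySha256` is hypothetical, and treating an empty checksum as "skip verification" is an assumption based on the field being described as optional.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// verifySha256 is a hypothetical helper that compares the SHA-256 digest of
// data with the expected hex string. An empty expected value skips the check
// (assumption: the checksum is optional).
func verifySha256(data []byte, expected string) error {
	if expected == "" {
		return nil
	}
	sum := sha256.Sum256(data)
	got := hex.EncodeToString(sum[:]) // always 64 hexadecimal characters
	if !strings.EqualFold(got, expected) {
		return fmt.Errorf("sha256 mismatch: expected %s, got %s", expected, got)
	}
	return nil
}

func main() {
	// Stand-in for the bytes of a downloaded plugin executable.
	pluginBytes := []byte("plugin bytes")
	sum := sha256.Sum256(pluginBytes)
	fmt.Println(verifySha256(pluginBytes, hex.EncodeToString(sum[:])))
	fmt.Println(verifySha256(pluginBytes, strings.Repeat("0", 64)))
}
```

Pinning a checksum means a changed or corrupted download fails loudly instead of being executed.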

## Some words of caution

Depending on which method you use to install the plugin, there are some things to be aware of.
The rollouts controller will not start if it cannot download or find the plugin executable. This means that if you use
an installation method that requires downloading the plugin, and the server hosting the plugin is unavailable while the rollouts
controller pod is restarting after being deleted or is coming up for the first time, the controller will not be able to start until
the server hosting the plugin is available again.

Argo Rollouts downloads the plugin only once, at startup, but if the pod is deleted it will need to download the plugin again on the next startup. Running
Argo Rollouts in HA mode can help a little with this situation because each pod downloads the plugin at startup. So if a single pod gets
deleted during a server outage, the other pods will still be able to take over because they already have a plugin executable available. It is the
responsibility of the Argo Rollouts administrator to choose the plugin installation method considering the risks of each approach.

## List of Available Plugins (alphabetical order)

#### Add Your Plugin Here
* If you have created a plugin, please submit a PR to add it to this list.
#### [rollouts-sample_nginx-trafficrouter-plugin](https://github.com/argoproj-labs/rollouts-sample_nginx-trafficrouter-plugin)
* This is just a sample plugin that can be used as a starting point for creating your own plugin.
It is not meant to be used in production. It is based on the built-in prometheus provider.
155 changes: 155 additions & 0 deletions docs/plugins.md
@@ -0,0 +1,155 @@
# Creating an Argo Rollouts Plugin

## High Level Overview

Argo Rollouts plugins depend on HashiCorp's [go-plugin](https://github.com/hashicorp/go-plugin) library. This library
provides a way for a plugin to be compiled as a standalone executable and then loaded by the rollouts controller at runtime.
This works by having the plugin executable act as an RPC server and the rollouts controller act as a client. The plugin executable
is started by the rollouts controller and runs as a long-lived process that the rollouts controller connects to over a Unix socket.
The communication protocol uses Go's built-in net/rpc library, so plugins have to be written in Go.
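
To illustrate the underlying mechanism, the following Go sketch runs a net/rpc server and client over a Unix socket using only the standard library. It deliberately leaves out go-plugin itself (process launching, handshake, and so on), and the `SamplePlugin` type, its `Greet` method, and the socket path are hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
	"os"
	"path/filepath"
)

// SamplePlugin is a hypothetical stand-in for a plugin implementation.
type SamplePlugin struct{}

// Greet follows the method shape net/rpc requires:
// func (t *T) Method(args A, reply *R) error.
func (p *SamplePlugin) Greet(name string, reply *string) error {
	*reply = "hello, " + name
	return nil
}

// serve plays the plugin side: an RPC server listening on a Unix socket.
func serve(sock string) (net.Listener, error) {
	srv := rpc.NewServer()
	if err := srv.Register(&SamplePlugin{}); err != nil {
		return nil, err
	}
	ln, err := net.Listen("unix", sock)
	if err != nil {
		return nil, err
	}
	go srv.Accept(ln) // serves connections until the listener is closed
	return ln, nil
}

// call plays the controller side: an RPC client dialing the socket.
func call(sock, name string) (string, error) {
	client, err := rpc.Dial("unix", sock)
	if err != nil {
		return "", err
	}
	defer client.Close()
	var reply string
	err = client.Call("SamplePlugin.Greet", name, &reply)
	return reply, err
}

// roundTrip wires both sides together inside one process for demonstration.
func roundTrip(name string) (string, error) {
	dir, err := os.MkdirTemp("", "plugin-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "plugin.sock")

	ln, err := serve(sock)
	if err != nil {
		return "", err
	}
	defer ln.Close()
	return call(sock, name)
}

func main() {
	reply, err := roundTrip("controller")
	fmt.Println(reply, err)
}
```

In a real plugin the server and client live in separate processes; they share a process here only so the sketch can be run on its own.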

## Plugin Repository

In order to get plugins listed in the main Argo Rollouts documentation, we ask that the plugin repository be created under
the [argoproj-labs](https://github.com/argoproj-labs) organization. Please open an issue under argo-rollouts requesting a
repository, on which you will be granted admin access.

There is also a standard naming convention for plugin names. The name is used for configmap registration and is also what the plugin
uses to locate its specific configuration on rollout or analysis resources. The name needs to be in the form of
`<namespace>/<name>`, and both `<namespace>` and `<name>` are checked against a regular expression that matches GitHub's requirements
for `username/org` and `repository name`. This requirement is in place to allow multiple creators of the same plugin
type to coexist, such as `<org1>/nginx` and `<org2>/nginx`. These names could be based on the repo name, such
as `argoproj-labs/rollouts-sample_prometheus-metric-plugin`, but this is not a requirement.
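
As a sketch of what such a name check could look like, the Go snippet below validates the `<namespace>/<name>` form with a regular expression. The pattern is an assumption written for illustration (alphanumerics and hyphens for the namespace; alphanumerics, dots, underscores and hyphens for the name); it is not the expression the controller actually uses, and `validPluginName` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// pluginNameRe is an illustrative pattern only, NOT the controller's regex.
// Namespace: alphanumerics and hyphens, not starting or ending with a hyphen.
// Name: alphanumerics, dots, underscores and hyphens.
var pluginNameRe = regexp.MustCompile(`^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?/[A-Za-z0-9._-]+$`)

// validPluginName reports whether name has the <namespace>/<name> form.
func validPluginName(name string) bool {
	return pluginNameRe.MatchString(name)
}

func main() {
	for _, name := range []string{
		"argoproj-labs/nginx",
		"argoproj-labs/rollouts-sample_prometheus-metric-plugin",
		"nginx",          // missing namespace
		"org/nginx/v2",   // too many segments
		"-org/nginx",     // namespace starts with a hyphen
	} {
		fmt.Printf("%-60s valid=%v\n", name, validPluginName(name))
	}
}
```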

There will also be a standard for naming repositories under argoproj-labs in the form of `rollouts-<tool>-<type>-plugin`,
where `<type>` is, for example, `metric` or `trafficrouter`, and `<tool>` is the software the plugin is for, such as nginx.

## Plugin Name

Now that we have an idea of the plugin naming and repository standards, let's pick a name to use for the rest of this
documentation and call our plugin `argoproj-labs/nginx`.

This name will be used in a few different places. The first is the config map that your plugin users will need to configure,
which looks like this:

```yaml
kind: ConfigMap
metadata:
  name: argo-rollouts-config
data:
  metricProviderPlugins: |-
    - name: "argoproj-labs/metrics"
      location: "file:///tmp/argo-rollouts/metric-plugin"
  trafficRouterPlugins: |-
    - name: "argoproj-labs/nginx"
      location: "file:///tmp/argo-rollouts/traffic-plugin"
```

As you can see, there is a `name:` field under both `metricProviderPlugins` and `trafficRouterPlugins`. This is the first place where your
end users will need to configure the name of the plugin. The second location is either in the rollout object or the analysis
template, as shown in the examples below.

#### AnalysisTemplate Example
```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  metrics:
    - name: success-rate
      ...
      provider:
        plugin:
          argoproj-labs/metrics:
            address: http://prometheus.local
```

#### Traffic Router Example
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: example-plugin-ro
spec:
  strategy:
    canary:
      canaryService: example-plugin-ro-canary-analysis
      stableService: example-plugin-ro-stable-analysis
      trafficRouting:
        plugins:
          argoproj-labs/nginx:
            stableIngress: canary-demo
```

You can see that we use the plugin name under `spec.metrics[].provider.plugin` for the analysis template and `spec.strategy.canary.trafficRouting.plugins`
for traffic routers. As a plugin author, you can then put any configuration you need under `argoproj-labs/nginx`, and you will be able to
look up that config in your plugin via the plugin name key. You will also want to document which configuration options your plugin supports.
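
The lookup by plugin name key could be sketched in Go as follows. This assumes the `plugins` block reaches the plugin as JSON keyed by plugin name; the `nginxConfig` struct, the `loadConfig` helper, and passing the block as a plain string are all hypothetical simplifications for this example rather than the actual Argo Rollouts types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pluginName is the key end users put under trafficRouting.plugins.
const pluginName = "argoproj-labs/nginx"

// nginxConfig is a hypothetical shape for this plugin's own options.
type nginxConfig struct {
	StableIngress string `json:"stableIngress"`
}

// loadConfig is a hypothetical helper: it takes the plugins block as JSON
// (assumed to be keyed by plugin name) and decodes only this plugin's entry.
func loadConfig(pluginsJSON string) (*nginxConfig, error) {
	var plugins map[string]json.RawMessage
	if err := json.Unmarshal([]byte(pluginsJSON), &plugins); err != nil {
		return nil, fmt.Errorf("failed to parse plugins block: %w", err)
	}
	raw, ok := plugins[pluginName]
	if !ok {
		return nil, fmt.Errorf("no configuration found for plugin %q", pluginName)
	}
	var cfg nginxConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, fmt.Errorf("failed to parse configuration for plugin %q: %w", pluginName, err)
	}
	return &cfg, nil
}

func main() {
	cfg, err := loadConfig(`{"argoproj-labs/nginx": {"stableIngress": "canary-demo"}}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("stable ingress:", cfg.StableIngress)
}
```

Keeping each plugin's entry as raw JSON until it is looked up lets every plugin define its own configuration schema without the others needing to know about it.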

## Plugin Interfaces

Argo Rollouts currently supports two plugin systems. As a plugin author, your end goal is to implement one of these interfaces as
a HashiCorp go-plugin. The two interfaces are `MetricProviderPlugin` and `TrafficRouterPlugin`, one for each of the respective plugin types:

```go
type MetricProviderPlugin interface {
	// InitPlugin initializes the metric provider plugin. It gets called once when the plugin is loaded.
	InitPlugin() RpcError
	// Run starts a new external system call for a measurement.
	// Should be idempotent and do nothing if a call has already been started.
	Run(*v1alpha1.AnalysisRun, v1alpha1.Metric) v1alpha1.Measurement
	// Resume checks if the external system call is finished and returns the current measurement.
	Resume(*v1alpha1.AnalysisRun, v1alpha1.Metric, v1alpha1.Measurement) v1alpha1.Measurement
	// Terminate will terminate an in-progress measurement.
	Terminate(*v1alpha1.AnalysisRun, v1alpha1.Metric, v1alpha1.Measurement) v1alpha1.Measurement
	// GarbageCollect is used to garbage collect completed measurements to the specified limit.
	GarbageCollect(*v1alpha1.AnalysisRun, v1alpha1.Metric, int) RpcError
	// Type gets the provider type.
	Type() string
	// GetMetadata returns any additional metadata which providers need to store/display as part
	// of the metric result. For example, Prometheus uses it to store the final resolved queries.
	GetMetadata(metric v1alpha1.Metric) map[string]string
}

type TrafficRouterPlugin interface {
	// InitPlugin initializes the traffic router plugin. It gets called once when the plugin is loaded.
	InitPlugin() RpcError
	// UpdateHash informs a traffic routing reconciler about new canary, stable, and additionalDestination(s) pod hashes.
	UpdateHash(rollout *v1alpha1.Rollout, canaryHash, stableHash string, additionalDestinations []v1alpha1.WeightDestination) RpcError
	// SetWeight sets the canary weight to the desired weight.
	SetWeight(rollout *v1alpha1.Rollout, desiredWeight int32, additionalDestinations []v1alpha1.WeightDestination) RpcError
	// SetHeaderRoute sets the header routing step.
	SetHeaderRoute(rollout *v1alpha1.Rollout, setHeaderRoute *v1alpha1.SetHeaderRoute) RpcError
	// SetMirrorRoute sets up the traffic router to mirror traffic to a service.
	SetMirrorRoute(rollout *v1alpha1.Rollout, setMirrorRoute *v1alpha1.SetMirrorRoute) RpcError
	// VerifyWeight returns true if the canary is at the desired weight and additionalDestinations are at the weights specified.
	// Returns nil if weight verification is not supported or not applicable.
	VerifyWeight(rollout *v1alpha1.Rollout, desiredWeight int32, additionalDestinations []v1alpha1.WeightDestination) (RpcVerified, RpcError)
	// RemoveManagedRoutes removes all routes that are managed by rollouts by looking at spec.strategy.canary.trafficRouting.managedRoutes.
	RemoveManagedRoutes(ro *v1alpha1.Rollout) RpcError
	// Type returns the type of the traffic routing reconciler.
	Type() string
}
```

## Plugin Init Function

Each plugin interface has an `InitPlugin` function. This function is called when the plugin is first started up and is only called
once per startup. The `InitPlugin` function is used to initialize the plugin: it gives you, the plugin author, the ability
to either set up a client for a specific metrics provider or, in the case of a traffic router, construct a client or informer
for the Kubernetes API. One thing to note is that because these calls happen over RPC, the plugin author should
not depend on state being stored in the plugin struct, as it will not be persisted between calls.

## Kubernetes RBAC

The plugin runs as a child process of the rollouts controller and as such it will inherit the same RBAC permissions as the
controller. This means that the service account for the rollouts controller will need the correct permissions for the plugin
to function. This might mean instructing users to create a role and role binding to the standard rollouts service account
for the plugin to use. This will probably affect traffic router plugins more than metrics plugins.

## Sample Plugins

There are two sample plugins within the argo-rollouts repo that you can use as a reference for creating your own plugin.

* [Sample Metrics Plugin](https://github.com/argoproj/argo-rollouts/tree/master/test/cmd/sample-metrics-plugin)
* [Sample Traffic Router Plugin](https://github.com/argoproj/argo-rollouts/tree/master/test/cmd/sample-trafficrouter-plugin)
3 changes: 3 additions & 0 deletions manifests/crds/rollout-crd.yaml
@@ -862,6 +862,9 @@ spec:
required:
- stableIngress
type: object
plugins:
type: object
x-kubernetes-preserve-unknown-fields: true
smi:
properties:
rootService:
3 changes: 3 additions & 0 deletions manifests/install.yaml
@@ -11908,6 +11908,9 @@ spec:
required:
- stableIngress
type: object
plugins:
type: object
x-kubernetes-preserve-unknown-fields: true
smi:
properties:
rootService:
5 changes: 4 additions & 1 deletion metricproviders/metricproviders.go
@@ -94,7 +94,10 @@ func (f *ProviderFactory) NewProvider(logCtx log.Entry, metric v1alpha1.Metric)
 		return skywalking.NewSkyWalkingProvider(client, logCtx), nil
 	case plugin.ProviderType:
 		plugin, err := plugin.NewRpcPlugin(metric)
-		return plugin, err
+		if err != nil {
+			return nil, fmt.Errorf("failed to create plugin: %v", err)
+		}
+		return plugin, nil
 	default:
 		return nil, fmt.Errorf("no valid provider in metric '%s'", metric.Name)
 	}
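
The hunk above adds context to the error with the `%v` verb. As a side note for readers comparing approaches, the Go sketch below shows the behavioral difference between `%v` and `%w` in `fmt.Errorf`: both produce the same message, but only `%w` keeps the original error reachable through `errors.Is`. The `errPluginNotFound` sentinel and the helper functions are hypothetical and are not part of the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errPluginNotFound is a hypothetical sentinel error used only in this sketch.
var errPluginNotFound = errors.New("plugin not found")

// withV adds context using %v: the cause is flattened into the message text.
func withV() error {
	return fmt.Errorf("failed to create plugin: %v", errPluginNotFound)
}

// withW adds context using %w: the cause stays in the error chain.
func withW() error {
	return fmt.Errorf("failed to create plugin: %w", errPluginNotFound)
}

// isNotFound reports whether err has errPluginNotFound in its chain.
func isNotFound(err error) bool {
	return errors.Is(err, errPluginNotFound)
}

func main() {
	fmt.Println(withV().Error() == withW().Error()) // identical message text
	fmt.Println(isNotFound(withV()))                // cause not reachable
	fmt.Println(isNotFound(withW()))                // cause reachable
}
```

Which verb is appropriate depends on whether callers are expected to inspect the underlying error or only log it.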