Skip to content

Commit

Permalink
Merge pull request kubernetes#68 from danehans/plugins_v2
Browse files Browse the repository at this point in the history
Updates dns plugin enhancement
  • Loading branch information
openshift-merge-robot committed Nov 1, 2019
2 parents c3140c1 + b8da6ee commit a3411e6
Showing 1 changed file with 95 additions and 86 deletions.
181 changes: 95 additions & 86 deletions enhancements/dns/plugins.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,24 +19,21 @@ superseded-by:

# Configurable DNS Plugins

This proposal provides cluster operators the ability to configure CoreDNS [plugins](https://coredns.io/plugins/) and
This proposal provides cluster admins the ability to configure CoreDNS [plugins](https://coredns.io/plugins/) and
includes [forward](https://coredns.io/plugins/forward/) as the first plugin implementation.

## Release Signoff Checklist

- [x] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [x] Design details are appropriately documented from clear requirements
- [x] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift/docs]

## Open Questions [optional]

1. Is `operator.openshift.io` the best API group for the proposed `ForwardPlugin` type?
2. How are API changes handled for cluster downgrades?
3. Should IPv6 references for `Nameservers` be removed until OpenShift adds IPv6 support?
4. Is the cluster domain (i.e. cluster.local) an invalid `ForwardPlugin` domain?
5. Should the number of `ForwardPlugins` be restricted due to Gap 4?
1. How are API changes handled for cluster downgrades?
2. Should the number of `Servers` be restricted due to Gap 4?

## Summary

Expand Down Expand Up @@ -71,16 +68,18 @@ metadata:
<SNIP>
```

Once CoreDNS starts and has parsed the configuration, it runs servers. Each server is defined by the zones it serves and
a listening port. In the above configuration, CoreDNS starts one server that manages all zones and listens on port 5353.
Each server has its own plugin chain represented within the server block stanza (i.e. `forward`). This proposal will
generate the ConfigMap with a `forward` plugin configuration based on user-provided values of type `ForwardPlugin`.
Once CoreDNS starts and has parsed the configuration, it runs one or more servers. Each server is defined by the zones
it serves and a listening port. In the above configuration, CoreDNS starts one server that manages all zones and listens
on port 5353. Each server has its own plugin chain represented within the server block stanza (i.e. forward). This
proposal describes an API for (a) configuring additional zones for specific subdomains and (b) specifying custom
upstream nameservers that will be used to resolve names beneath these subdomains.

## Motivation

CoreDNS is responsible for resolving pod and service names for the cluster domain (i.e. `cluster.local`). Otherwise,
CoreDNS proxies the request to a resolver identified by `/etc/resolv.conf` on the corresponding node. Although this provides a consistent and reliable approach for name
resolution, it restricts how cluster operators manage DNS name queries.
CoreDNS proxies the request to a resolver identified by `/etc/resolv.conf` on the corresponding node. Although this
provides a consistent and reliable approach for name resolution, it restricts how cluster operators manage DNS name
queries.

### Goals

Expand All @@ -94,43 +93,44 @@ resolution, it restricts how cluster operators manage DNS name queries.
2. Configure or manage external DNS providers.
3. Provide name query forwarding for other cluster services (i.e. container runtime).
4. Manage what plugins get loaded (i.e. plugin.cfg).
5. Specifications for any CoreDNS plugins other than forward.

## Proposal

`PluginConfig` defines one or more CoreDNS plugins that can be configured. If a plugin is not defined, it will use a
default configuration provided by the cluster-dns-operator. A plugin is represented as a slice when multiple instances
of the plugin can be expressed within the DNS configuration.
`Server` defines the schema for a server that runs per instance of CoreDNS. The `Name` field is required and specifies a
unique name for the `Server`. `Zones` is required and specifies the subdomains the server is authoritative for.

`Server` may contain several CoreDNS plugins in the future. This proposal includes `ForwardPlugin` as the first plugin
implementation. `ForwardPlugin` provides options for configuring the forward plugin configuration. A `Server` listens on
port `5353`, which can not be configured at this time. If no `Server` is specified, the cluster-dns-operator will
generate the ConfigMap referenced in the [Summary](#summary) section.

```go
type PluginConfig struct {
ForwardPlugins []ForwardPlugin `json:"forwardPlugin"`
type Server struct {
Name string `json:"name"`
Zones []string `json:"zones"`
ForwardPlugin ForwardPlugin `json:"forwardPlugin"`
// additional future plugins
}
```

The proposed `ForwardPlugin` type defines a schema for associating one or more `Nameservers` to a DNS subdomain
identified by `Domain`. `Nameservers` are responsible for resolving name queries for `Domain`. `Policy` provides a
mechanism for distinguishing where to forward DNS messages when `Nameservers` consists of more than one nameserver:
`ForwardPlugin` defines a schema for configuring the CoreDNS forward plugin. `Upstreams` is a list of resolvers to
forward name queries for subdomains specified by `Zones`. `Upstreams` are randomized when more than 1 upstream is
specified. Each instance of CoreDNS performs health checking of `Upstreams`. When a healthy upstream returns an error
during the exchange, another resolver is tried from `Upstreams`. Each upstream is represented by an IP address or IP
address and port if the upstream listens on a port other than 53.
```go
type ForwardPlugin struct {
Domain string `json:"domain"`
Nameservers []string `json:"nameservers"`
Policy ForwardPolicy `json:"policy,omitempty"`
Upstreams []string `json:"upstreams"`
}

type ForwardPolicy string

const (
ForwardPolicyRandom ForwardPolicy = "Random"
ForwardPolicyRoundRobin ForwardPolicy = "RoundRobin"
ForwardPolicySequential ForwardPolicy = "Sequential"
)
```

Each instance of CoreDNS performs health checking of `Nameservers`. If `Nameservers` consists of more than one
nameserver, `Policy` specifies the order of forwarder nameserver selection. When a healthy nameserver returns an error
during the exchange, another server is tried from `Nameservers` based on `Policy`. Each server is represented by an IP
address or IP address and port if the server listens on a port other than 53.
`Servers` is added to `DNSSpec` since multiple servers can run per CoreDNS instance.
```go
type DNSSpec struct {
Servers []Server `json:"servers,omitempty"`
}
```

### User Stories

Expand All @@ -141,23 +141,28 @@ name queries for our other internal devices using the DNS servers in our data ce

### Implementation Details/Notes/Constraints

If `ForwardPlugin` consists of more than one `ForwardPlugin`, longest suffix match will be used to determine the
`ForwardPlugin`. For example, if there are two `ForwardPlugins`, one for foo.com and one for a.foo.com, and the query is
for www.a.foo.com, it will be routed to the a.foo.com `ForwardPlugin`.
If `Servers` consists of more than one `Server`, longest suffix match will be used to determine the `Server`. For
example, if there are two `Servers`, one for "foo.com" and another for "a.foo.com", and the name query is for
"www.a.foo.com", it will be routed to the `Server` with `Zone` "a.foo.com".

A maximum of 15 `Upstreams` is allowed per `ForwardPlugin`.

`Name` must comply with the [rfc6335](https://tools.ietf.org/rfc/rfc6335.txt) Service Name Syntax.

A maximum of 15 `Nameservers` is allowed per `ForwardPlugin`.
`Zones` must conform to the definition of a subdomain in [rfc1123](https://tools.ietf.org/html/rfc1123). The cluster
domain (i.e. cluster.local) is an invalid subdomain for `Zones`.

The cluster-dns-operator will generate the `forward` plugin configuration of `configmap/dns-default` based on
`ForwardPlugin` of `dnses.operator.openshift.io` instead of `forward` being statically defined. To achieve this:
The cluster-dns-operator will prepend the configuration of `Servers` to `configmap/dns-default`, instead of the entire
ConfigMap being statically defined. To achieve this:

1. The [default](https://github.com/openshift/cluster-dns-operator/blob/master/assets/dns/configmap.yaml) ConfigMap
asset and associated code from the [manifests pkg](https://github.com/openshift/cluster-dns-operator/blob/master//pkg/manifests/manifests.go#L70:6)
should be removed.

2. The [desiredDNSConfigMap](https://github.com/openshift/cluster-dns-operator/blob/master/pkg/operator/controller/controller_dns_configmap.go)
function must be modified to create a ConfigMap type with a `forward` configuration based on `ForwardPlugin`.
If `ForwardPlugin` is not present or an invalid `ForwardPlugin` is provided, the ConfigMap will contain
`forward . /etc/resolv.conf`. Otherwise, desiredDNSConfigMap will use the provided `ForwardPlugin` to construct the
function must be modified to create a ConfigMap with additional server configuration blocks based on `Server`. If
`Server` is undefined or an invalid, the ConfigMap will only contain the default server. Otherwise,
`desiredDNSConfigMap` will use the provided `Servers` to construct additional server configuration blocks, each with a
forwarding configuration and use `/etc/resolv.conf` as a resolver of last resort. For example:

```yaml
Expand All @@ -166,18 +171,22 @@ kind: DNS
metadata:
name: default
spec:
pluginConfig:
forwardPlugins:
- domain: foo.com
nameServers:
- 1.2.3.4
- 5.6.7.8:5353
policy: RoundRobin
- domain: bar.com
nameServers:
- 4.3.2.1
- 8.7.6.5:5353
policy: Sequential
servers:
- name: foo-server
zones:
- foo.com
forwardPlugin:
upstreams:
- 1.1.1.1
- 2.2.2.2:5353
- name: bar-server
zones:
- bar.com
- example.com
forwardPlugin:
upstreams:
- 3.3.3.3
- 4.4.4.4:5454
```
The above `DNS` will produce the following `ConfigMap`:
Expand All @@ -186,12 +195,10 @@ apiVersion: v1
data:
Corefile: |
foo.com:5353 {
forward . 1.2.3.4 5.6.7.8:5353
policy round_robin
forward . 1.1.1.1 2.2.2.2:5353
}
bar.com:5353 {
forward . 4.3.2.1 8.7.6.5:5353
policy sequential
bar.com:5353 example.com:5353 {
forward . 3.3.3.3 4.4.4.4:5454
}
.:5353 {
errors
Expand Down Expand Up @@ -221,9 +228,8 @@ of the CoreDNS DaemonSet.

#### Validation
`Domain` must conform to the [RFC 1123](https://tools.ietf.org/html/rfc1123#page-13) definition of a subdomain. Each
nameserver in `Nameservers` must be a valid IPv4 or IPv6 address. If the nameserver listens on a port other than 53,
a valid port number must be specified. A colon is used to separate the address and port, `IP:port` for IPv4 or
`[IP]:port` for IPv6.
upstream in `Upstreams` must be a valid IPv4 address. If the upstream listens on a port other than 53,
a valid port number must be specified. A colon is used to separate the address and port, `IP:port` for IPv4.

#### Transport
UDP is used to transport DNS messages and `ForwardPlugin` health checks. Any UDP transport will automatically retry with
Expand All @@ -235,14 +241,14 @@ Nameserver health checking is performed in-band. A health check is performed onl
The check runs in a loop, every 0.5s, for as long as the forwarder reports unhealthy. Once healthy, CoreDNS stops
health checking until the next error. The health checks use a recursive DNS query (. IN NS) to get forwarder health.
Any response that is not a network error (REFUSED, NOTIMPL, SERVFAIL, etc) is taken as a healthy forwarder.
When all `Nameservers` are down CoreDNS assumes health checking as a mechanism has failed and will try to connect to a
random nameserver (which may or may not work).
When all `Upstreams` are down CoreDNS assumes health checking as a mechanism has failed and will try to connect to a
random upstream (which may or may not work).

### Risks and Mitigations

#### Gap 1
Forwarded name queries and nameserver health checks are insecure. This may allow a malicious actor to impersonate an
`ForwardPlugin` nameserver.
Forwarded name queries and upstream health checks are insecure. This may allow a malicious actor to impersonate an
`ForwardPlugin` upstream.

#### Mitigation 1
Add TLS support to secure forwarded name queries. Clearly document this insecurity in product documentation in the
Expand All @@ -258,7 +264,7 @@ Use the OpenShift Enhancement process to solicit feedback from the installer tea

#### Gap 3

Surfacing status for nameservers that fail health checks.
Surfacing status for upstreams that fail health checks.

#### Mitigation 3

Expand All @@ -267,25 +273,26 @@ metrics are exported. Ensure these metrics are surfaced through the OpenShift mo

#### Gap 4

An increased utilization of compute resources on each node due to the potential of running many `ForwardPlugins`, each
with up to 15 nameservers that are actively checked for health.
An increased utilization of compute resources on each node due to the potential of running many `Servers`, each with up
to 15 upstreams being actively checked for health.

#### Mitigation 4

Test utilization using multiple `ForwardPlugins` with multiple `Nameservers`. Add a warning in the documentation that
adding a large number of forwarders may incur a performance penalty or hit memory limits.
Test utilization using multiple `Servers` with multiple `Upstreams`. Include a warning in the documentation that states
"adding a large number of forwarders may incur a performance penalty or hit memory limits". It's also possible for the
operator to increase the amount of resources (i.e. memory) requested based on the number of `Servers`.

## Design Details

### Test Plan

Implement the following end-to-end test in addition to unit tests:

- Create a `DNS` with a `ForwardPlugin` that uses a `Nameserver` to resolve a hostname from `Domain`.
- Start the dns-operator and check the logs for healthcheck failures for the `Nameserver`.
- Create a `DNS` with `Servers` that uses an `Upstream` to resolve a hostname from `Zones`.
- Start the dns-operator and check the logs for healthcheck failures for the `Upstream`.
- Create a pod that performs an nslookup for a hostname in the cluster domain.
- Have the pod perform an nslookup for a hostname in `Domain`.
- If the nslookup succeeds, check if the nslookup server matches the `Nameserver`.
- Have the pod perform an nslookup for a hostname in `Zones`.
- If the nslookup succeeds, check if the nslookup server matches the `Upstream`.
- Have the pod perform an nslookup for a hostname in `/etc/resolve.conf`.

### Graduation Criteria
Expand All @@ -295,20 +302,22 @@ TODO
### Upgrade / Downgrade Strategy

Upgrades will continue to use the existing `forward . /etc/resolv.conf` forwarding configuration. Edit
`dnses.operator.openshift.io` to modify the default forwarding behavior by adding a `ForwardPlugin`. For example:
`dnses.operator.openshift.io` to modify the default forwarding behavior by adding `Servers`. For example:

```yaml
apiVersion: operator.openshift.io/v1
kind: DNS
metadata:
name: default
spec:
pluginConfig:
forwardPlugins:
- domain: foo.com
nameServers:
- 1.2.3.4
- 5.6.7.8
servers:
- name: foo-server
zones:
- foo.com
forwardPlugin:
upstreams:
- 1.1.1.1
- 2.2.2.2:5353
```

### Version Skew Strategy
Expand Down

0 comments on commit a3411e6

Please sign in to comment.