Commit: Refactor InfraCluster contract

fabriziopandini committed Sep 23, 2024
1 parent fef53c1 commit 76c99ab
Showing 23 changed files with 685 additions and 367 deletions.
18 changes: 9 additions & 9 deletions docs/book/src/SUMMARY.md
@@ -47,7 +47,6 @@
- [Diagnostics](./tasks/diagnostics.md)
- [Security Guidelines](./security/index.md)
- [Pod Security Standards](./security/pod-security-standards.md)
- [Infrastructure Provider Security Guidance](./security/infrastructure-provider-security-guidance.md)
- [clusterctl CLI](./clusterctl/overview.md)
- [clusterctl Commands](clusterctl/commands/commands.md)
- [init](clusterctl/commands/init.md)
@@ -64,7 +63,6 @@
- [alpha topology plan](clusterctl/commands/alpha-topology-plan.md)
- [additional commands](clusterctl/commands/additional-commands.md)
- [clusterctl Configuration](clusterctl/configuration.md)
- [clusterctl Provider Contract](clusterctl/provider-contract.md)
- [clusterctl for Developers](clusterctl/developers.md)
- [clusterctl Extensions with Plugins](clusterctl/plugins.md)
- [Developer Guide](./developer/getting-started.md)
@@ -87,7 +85,6 @@
- [Developing E2E tests](developer/core/e2e.md)
- [Tuning controllers](./developer/core/tuning.md)
- [Support multiple instances](./developer/core/support-multiple-instances.md)
- [Multi-tenancy](./developer/core/multi-tenancy.md)
- [Developing providers](./developer/providers/overview.md)
- [Getting started](developer/providers/getting-started/overview.md)
- [Naming](developer/providers/getting-started/naming.md)
@@ -97,12 +94,15 @@
- [Controllers and Reconciliation](developer/providers/getting-started/controllers-and-reconciliation.md)
- [Configure the provider manifest](developer/providers/getting-started/configure-the-deployment.md)
- [Building, Running, Testing](developer/providers/getting-started/building-running-and-testing.md)
- [Provider contracts](./developer/providers/contracts.md)
- [Cluster Infrastructure](./developer/providers/cluster-infrastructure.md)
- [Control Plane](./developer/providers/control-plane.md)
- [Machine Infrastructure](./developer/providers/machine-infrastructure.md)
- [Bootstrap](./developer/providers/bootstrap.md)
- [Version migration](./developer/providers/version-migration.md)
- [Provider contracts](developer/providers/contracts/overview.md)
- [InfraCluster](./developer/providers/contracts/infra-cluster.md)
- [InfraMachine](developer/providers/contracts/infra-machine.md)
- [BootstrapConfig](developer/providers/contracts/bootstrap-config.md)
- [ControlPlane](developer/providers/contracts/control-plane.md)
- [clusterctl](developer/providers/contracts/clusterctl.md)
- [Best practices](./developer/providers/best-practices.md)
- [Security guidelines](./developer/providers/security-guidelines.md)
- [Version migration](developer/providers/migrations/overview.md)
- [v1.6 to v1.7](./developer/providers/migrations/v1.6-to-v1.7.md)
- [v1.7 to v1.8](./developer/providers/migrations/v1.7-to-v1.8.md)
- [v1.8 to v1.9](./developer/providers/migrations/v1.8-to-v1.9.md)
2 changes: 1 addition & 1 deletion docs/book/src/clusterctl/commands/move.md
@@ -24,7 +24,7 @@ clusterctl move --to-kubeconfig="path-to-target-kubeconfig.yaml"
To move the Cluster API objects existing in the current namespace of the source management cluster; if you want
to move the Cluster API objects defined in another namespace, you can use the `--namespace` flag.

The discovery mechanism for determining the objects to be moved is in the [provider contract](../provider-contract.md#move)
The discovery mechanism for determining the objects to be moved is in the [provider contract](../../developer/providers/contracts/clusterctl.md#move)

<aside class="note">

4 changes: 2 additions & 2 deletions docs/book/src/clusterctl/configuration.md
@@ -9,7 +9,7 @@ It can be used to:

## Provider repositories

The `clusterctl` CLI is designed to work with providers implementing the [clusterctl Provider Contract](provider-contract.md).
The `clusterctl` CLI is designed to work with providers implementing the [clusterctl Provider Contract](../developer/providers/contracts/clusterctl.md).

Each provider is expected to define a provider repository, a well-known place where release assets are published.

@@ -39,7 +39,7 @@ providers:
type: "BootstrapProvider"
```
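For context, a complete provider entry in the clusterctl configuration file has the shape sketched below; the provider name, url, and repository path are illustrative placeholders:

```yaml
providers:
  - name: "my-provider"
    url: "https://github.com/my-org/my-provider/releases/latest/bootstrap-components.yaml"
    type: "BootstrapProvider"
```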
See [provider contract](provider-contract.md) for instructions about how to set up a provider repository.
See [provider contract](../developer/providers/contracts/clusterctl.md) for instructions about how to set up a provider repository.
**Note**: It is possible to use the `${HOME}` and `${CLUSTERCTL_REPOSITORY_PATH}` environment variables in `url`.

105 changes: 34 additions & 71 deletions docs/book/src/developer/core/controllers/cluster.md
@@ -1,89 +1,52 @@
# Cluster Controller

![](../../../images/cluster-admission-cluster-controller.png)

The Cluster controller's main responsibilities are:

* Setting an OwnerReference on the infrastructure object referenced in `Cluster.spec.infrastructureRef`.
* Setting an OwnerReference on the control plane object referenced in `Cluster.spec.controlPlaneRef`.
* Cleanup of all owned objects so that nothing is dangling after deletion.
* Keeping the Cluster's status in sync with the infrastructureCluster's status.
* Creating a kubeconfig secret for [workload clusters](../../../reference/glossary.md#workload-cluster).

## Contracts

### Infrastructure Provider

The general expectation of an infrastructure provider is to provision the necessary infrastructure components needed to
run a Kubernetes cluster. As an example, the AWS infrastructure provider, specifically the AWSCluster reconciler, will
provision a VPC, some security groups, an ELB, a bastion instance and some other components all with AWS best practices
baked in. Once that infrastructure is provisioned and ready to be used the AWSMachine reconciler takes over and
provisions EC2 instances that will become a Kubernetes cluster through some bootstrap mechanism.

The cluster controller will set an OwnerReference on the infrastructureCluster. This controller should normally take no action during reconciliation until it sees the OwnerReference.

An infrastructureCluster controller is expected to either supply a controlPlaneEndpoint (via its own `spec.controlPlaneEndpoint` field),
or rely on `spec.controlPlaneEndpoint` in its parent [Cluster](./cluster.md) object.
The Cluster controller is responsible for reconciling the Cluster resource.

If an endpoint is not provided, the implementer should exit reconciliation until it sees `cluster.spec.controlPlaneEndpoint` populated.
In order to allow Cluster provisioning on different types of infrastructure, the Cluster resource references
an InfraCluster object, e.g. AWSCluster, GCPCluster etc.

The Cluster controller bubbles up `spec.controlPlaneEndpoint` and `status.ready` into `status.infrastructureReady` from the infrastructureCluster.
The [InfraCluster resource contract](../../providers/contracts/infra-cluster.md) defines a set of rules a provider is expected to comply with in order to allow
the expected interactions with the Cluster controller.

#### Required `status` fields
Among those rules (see the sketch after this list):
- InfraCluster SHOULD report a [control plane endpoint](../../providers/contracts/infra-cluster.md#infracluster-control-plane-endpoint) for the Cluster
- InfraCluster SHOULD report the available [failure domains](../../providers/contracts/infra-cluster.md#infracluster-failure-domains) where machines should be placed
- InfraCluster MUST report when the Cluster's infrastructure is [fully provisioned](../../providers/contracts/infra-cluster.md#infracluster-initialization-completed)
- InfraCluster SHOULD report [conditions](../../providers/contracts/infra-cluster.md#infracluster-conditions)
- InfraCluster SHOULD report [terminal failures](../../providers/contracts/infra-cluster.md#infracluster-terminal-failures)
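As a reference, a minimal InfraCluster complying with these rules might look like the following sketch; `MyProviderCluster` is a placeholder kind, and the field names follow the contract pages linked above:

```yaml
kind: MyProviderCluster
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
metadata:
  name: my-cluster
spec:
  # reported control plane endpoint for the Cluster
  controlPlaneEndpoint:
    host: example.com
    port: 6443
status:
  # surfaced once the cluster infrastructure is fully provisioned
  ready: true
  # available failure domains where machines should be placed
  failureDomains:
    fd1:
      controlPlane: true
  conditions:
  - type: Ready
    status: "True"
```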

The InfrastructureCluster object **must** have a `status` object.
Similarly, in order to support different solutions for control plane management, the Cluster resource references
a ControlPlane object, e.g. KubeadmControlPlane, EKSControlPlane etc.

The `spec` object **must** have the following fields defined:
The [ControlPlane resource contract](../../providers/contracts/control-plane.md) defines a set of rules a provider is expected to comply with in order to allow
the expected interactions with the Cluster controller.
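A sketch of how a Cluster wires in both references (kinds and names are illustrative):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-cluster
  namespace: default
spec:
  # InfraCluster reference, e.g. AWSCluster, GCPCluster etc.
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: MyProviderCluster
    name: my-cluster
  # ControlPlane reference, e.g. KubeadmControlPlane, EKSControlPlane etc.
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: my-cluster-control-plane
```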

- `controlPlaneEndpoint` - identifies the endpoint used to connect to the target's cluster apiserver.
Considering all the info above, the Cluster controller's main responsibilities are:

The `status` object **must** have the following fields defined:

- `ready` - a boolean field that is true when the infrastructure is ready to be used.

#### Optional `status` fields

The `status` object **may** define several fields that do not affect functionality if missing:

* `failureReason` - is a string that explains why a fatal error has occurred, if possible.
* `failureMessage` - is a string that holds the message contained by the error.
* `failureDomains` - is a `FailureDomains` type indicating the failure domains that machines should be placed in. `FailureDomains`
is a map, defined as `map[string]FailureDomainSpec`. A unique key must be used for each `FailureDomainSpec`.
`FailureDomainSpec` is defined as:
- `controlPlane` (bool): indicates if failure domain is appropriate for running control plane instances.
- `attributes` (`map[string]string`): arbitrary attributes for users to apply to a failure domain.
* Setting an OwnerReference on the infrastructure object referenced in `Cluster.spec.infrastructureRef` (see the sketch below).
* Setting an OwnerReference on the control plane object referenced in `Cluster.spec.controlPlaneRef`.
* Keeping the Cluster's status in sync with the InfraCluster and ControlPlane's status.
* If no ControlPlane object is referenced, creating a kubeconfig secret for [workload clusters](../../../reference/glossary.md#workload-cluster).
* Cleaning up all owned objects so that nothing is dangling after deletion.
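For illustration, after reconciliation the referenced InfraCluster object would carry an OwnerReference back to the Cluster, along the lines of this sketch (kind and uid are placeholders):

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: MyProviderCluster
metadata:
  name: my-cluster
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1beta1
    kind: Cluster
    name: my-cluster
    uid: 00000000-0000-0000-0000-000000000000  # placeholder uid
```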

Note: once any of `failureReason` or `failureMessage` surface on the cluster that is referencing the infrastructureCluster object,
they cannot be restored anymore (it is considered a terminal error; the only way to recover is to delete and recreate the cluster).
![](../../../images/cluster-admission-cluster-controller.png)

Example:
```yaml
kind: MyProviderCluster
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
spec:
controlPlaneEndpoint:
host: example.com
port: 6443
status:
ready: true
```
### Kubeconfig Secrets

### Secrets
In order to create a kubeconfig secret, it is required to have a certificate authority (CA) for the cluster.

If you are using the kubeadm bootstrap provider you do not have to provide any Cluster API secrets. It will generate
all necessary CAs (certificate authorities) for you.
all necessary CAs for you.

However, if you provide a CA for the cluster then Cluster API will be able to generate a kubeconfig secret.
This is useful if you have a custom CA or do not want to use the bootstrap provider's generated self-signed CA.
As an alternative, users can provide a custom CA as described in [Using Custom Certificates](../../../tasks/certs/using-custom-certificates.md).

| Secret name | Field name | Content |
|:---:|:---:|:---:|
|`<cluster-name>-ca`|`tls.crt`|base64 encoded TLS certificate in PEM format|
|`<cluster-name>-ca`|`tls.key`|base64 encoded TLS private key in PEM format|
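A sketch of the CA secret shape, based on the table above; the certificate data is a truncated placeholder, and the `kubernetes.io/tls` type is an assumption derived from the field names:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-cluster-ca   # <cluster-name>-ca
  namespace: default
type: kubernetes.io/tls
data:
  tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0t...  # base64 encoded PEM certificate (truncated)
  tls.key: LS0tLS1CRUdJTi...                        # base64 encoded PEM private key (truncated)
```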

Alternatively, you can entirely bypass Cluster API generating a kubeconfig if you provide a kubeconfig secret
As a last option, it is possible to entirely bypass Cluster API kubeconfig generation by providing a kubeconfig secret
formatted as described below.

| Secret name | Field name | Content |
|:---:|:---:|:---:|
|`<cluster-name>-kubeconfig`|`value`|base64 encoded kubeconfig|
| Secret name | Field name | Content |
|:---------------------------:|:----------:|:-------------------------:|
| `<cluster-name>-kubeconfig` | `value` | base64 encoded kubeconfig |

Notes:
- Renewal of the above certificate should also be taken care of out of band.
- This option does not prevent you from providing a cluster CA, which is also required for other purposes.
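A sketch of the kubeconfig secret shape, per the table above (the value is a truncated placeholder):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-cluster-kubeconfig   # <cluster-name>-kubeconfig
  namespace: default
data:
  value: YXBpVmVyc2lvbjogdjEKa2luZDogQ29uZmln...  # base64 encoded kubeconfig (truncated)
```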
13 changes: 0 additions & 13 deletions docs/book/src/developer/core/multi-tenancy.md

This file was deleted.

20 changes: 9 additions & 11 deletions docs/book/src/developer/core/support-multiple-instances.md
@@ -1,6 +1,6 @@
# Support running multiple instances of the same provider

Up until v1alpha3, the need of supporting [multiple credentials](../../../reference/glossary.md#multi-tenancy) was addressed by running multiple
Up until v1alpha3, the need of supporting [multiple credentials](../../reference/glossary.md#multi-tenancy) was addressed by running multiple
instances of the same provider, each one with its own set of credentials while watching different namespaces.

However, running multiple instances of the same provider proved to be complicated for several reasons:
@@ -19,24 +19,22 @@ However, running multiple instances of the same provider proved to be complicated
Nevertheless, we want to make it possible for users to choose to deploy multiple instances of the same providers,
in case the above limitations/extra complexity are acceptable for them.

## Contract

In order to make it possible for users to deploy multiple instances of the same provider:
In order to make it possible for users to deploy multiple instances of the Cluster API controllers, the following
flags are provided (see the sketch after this list):

- Providers MUST support the `--namespace` flag in their controllers.
- Providers MUST support the `--watch-filter` flag in their controllers.
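A hypothetical sketch of a second provider instance restricted to a single namespace via the flags above; the names and image are illustrative, and the `--watch-filter` semantics assume the label key `cluster.x-k8s.io/watch-filter`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myprovider-controller-manager-tenant-a
  namespace: myprovider-tenant-a-system
spec:
  replicas: 1
  selector:
    matchLabels:
      control-plane: controller-manager
      instance: tenant-a
  template:
    metadata:
      labels:
        control-plane: controller-manager
        instance: tenant-a
    spec:
      containers:
      - name: manager
        image: example.com/myprovider-controller:v1.0.0  # illustrative image
        args:
        - --namespace=tenant-a      # reconcile objects in this namespace only
        - --watch-filter=tenant-a   # reconcile only objects carrying a matching watch-filter label
```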

⚠️ Users selecting this deployment model, please be aware:
<aside class="note warning">

<h1>⚠️ Users selecting this deployment model, please be aware:</h1>

- Support should be considered best-effort.
- Given the increasingly complex task of managing multiple instances of the same controllers,
  the Cluster API community may only provide best-effort support for users who choose this model.
- Cluster API (incl. every provider managed under `kubernetes-sigs`) won't release a specialized components file
supporting the scenario described above; however, users should be able to create such a deployment model from
the `/config` folder.
- Cluster API (incl. every provider managed under `kubernetes-sigs`) testing infrastructure won't run test cases
with multiple instances of the same provider.

In conclusion, given the increasingly complex task of managing multiple instances of the same controllers,
the Cluster API community may only provide best effort support for users that choose this model.

As always, if some members of the community would like to take on the responsibility of managing this model,
please reach out through the usual communication channels, and we'll make sure to guide you along the right path.
</aside>
2 changes: 1 addition & 1 deletion docs/book/src/developer/getting-started.md
@@ -76,7 +76,7 @@ make envsubst
The generated binary can be found at ./hack/tools/bin/envsubst

[envsubst]: https://github.com/drone/envsubst
[provider-contract]: ./../clusterctl/provider-contract.md
[provider-contract]: providers/contracts/clusterctl.md

### Cert-Manager

41 changes: 41 additions & 0 deletions docs/book/src/developer/providers/best-practices.md
@@ -0,0 +1,41 @@
## Implementation best practices

Cluster API doesn't define strict rules about how providers should implement controllers.

However, some best practices are worth noting:

- Infrastructure objects (e.g. load balancers, VMs etc.) generated by the Infra providers SHOULD adopt a naming
  convention that directly links to the Kubernetes resource that originated those objects.
  Please note that in most cases external constraints might impact this decision, e.g.:
  - Differences in naming conventions between Kubernetes CRDs and the target infrastructure
  - The fact that the InfraCluster Kubernetes CRD is namespace-scoped while the target infrastructure might have different approaches
    to grouping resources

- The naming convention above should not be used or advertised as a contract to build on top of. Instead, a more robust mechanism
  MUST always be provided and used for identifying objects, such as tagging or labeling (see the sketch at the end of this page).
  Please note that this is necessary not only to prevent issues in case Cluster API changes its default naming strategies
  for the Kubernetes objects generated by core controllers, but also to handle use cases where users intentionally influence Cluster API naming strategies.

- Cluster API offers a great development environment based on Tilt, which can be easily extended to work with any provider. Use it!
See [Rapid iterative development with Tilt]

- Cluster API defines a set of best practices and standards that, if adopted, could speed up provider development and improve
consistency with core Cluster API. See:
- [Logging]
- [Tuning controllers]

- Cluster API implements a test framework that, if adopted, could help in ensuring the quality of the provider. See:
- [Testing]
- [Developing E2E tests]

- While standard security practices for developing Kubernetes controllers apply, it is important to recognize that,
  given that infrastructure providers deal with cloud credentials and cloud infrastructure, there are additional critical
  security concerns that must be addressed to ensure secure operations. See:
- [Infrastructure Provider Security Guidance]
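A hypothetical example of provider-applied tags on a cloud resource (e.g. a VM), linking it back to the originating Kubernetes resources instead of relying on names; the tag keys shown here are illustrative and actual keys vary per provider:

```yaml
# Tags a provider might apply to a cloud VM so it can always be traced back
# to the Kubernetes resources that originated it (keys are hypothetical).
tags:
  cluster.x-k8s.io/cluster-name: my-cluster
  cluster.x-k8s.io/cluster-namespace: default
  infrastructure.cluster.x-k8s.io/machine: my-cluster-md-0-abc12
```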

[Rapid iterative development with Tilt]: ../core/tilt.md
[Logging]: ../core/logging.md
[Testing]: ../core/testing.md
[Developing E2E tests]: ../core/e2e.md
[Tuning controllers]: ../core/tuning.md
[Infrastructure Provider Security Guidance]: security-guidelines.md