Refactor core developer guide
fabriziopandini committed Sep 20, 2024
1 parent c1c8833 commit e980e08
Showing 33 changed files with 225 additions and 194 deletions.
55 changes: 28 additions & 27 deletions docs/book/src/SUMMARY.md
@@ -67,35 +67,36 @@
- [clusterctl Provider Contract](clusterctl/provider-contract.md)
- [clusterctl for Developers](clusterctl/developers.md)
- [clusterctl Extensions with Plugins](clusterctl/plugins.md)
- [Developer Guide](./developer/guide.md)
- [Repository Layout](./developer/repository-layout.md)
- [Rapid iterative development with Tilt](./developer/tilt.md)
- [Logging](./developer/logging.md)
- [Testing](./developer/testing.md)
- [Developing E2E tests](./developer/e2e.md)
- [Controllers](./developer/architecture/controllers.md)
- [Bootstrap](./developer/architecture/controllers/bootstrap.md)
- [Cluster](./developer/architecture/controllers/cluster.md)
- [Machine](./developer/architecture/controllers/machine.md)
- [MachineSet](./developer/architecture/controllers/machine-set.md)
- [MachineDeployment](./developer/architecture/controllers/machine-deployment.md)
- [MachineHealthCheck](./developer/architecture/controllers/machine-health-check.md)
- [Control Plane](./developer/architecture/controllers/control-plane.md)
- [MachinePool](./developer/architecture/controllers/machine-pool.md)
- [ClusterTopology](./developer/architecture/controllers/cluster-topology.md)
- [ClusterResourceSet](./developer/architecture/controllers/cluster-resource-set.md)
- [Multi-tenancy](./developer/architecture/controllers/multi-tenancy.md)
- [Support multiple instances](./developer/architecture/controllers/support-multiple-instances.md)
- [Tuning controllers](./developer/architecture/controllers/tuning.md)
- [Developer Guide](./developer/getting-started.md)
- [Developing "core" Cluster API](developer/core/overview.md)
- [Rapid iterative development with Tilt](developer/core/tilt.md)
- [Repository Layout](developer/core/repository-layout.md)
- [Controllers](./developer/core/controllers/overview.md)
- [Cluster](./developer/core/controllers/cluster.md)
- [ClusterTopology](./developer/core/controllers/cluster-topology.md)
- [ClusterResourceSet](./developer/core/controllers/cluster-resource-set.md)
- [MachineDeployment](./developer/core/controllers/machine-deployment.md)
- [MachineSet](./developer/core/controllers/machine-set.md)
- [Machine](./developer/core/controllers/machine.md)
- [MachinePool](./developer/core/controllers/machine-pool.md)
- [MachineHealthCheck](./developer/core/controllers/machine-health-check.md)
- [Bootstrap](./developer/core/controllers/bootstrap.md)
- [Control Plane](./developer/core/controllers/control-plane.md)
- [Logging](developer/core/logging.md)
- [Testing](developer/core/testing.md)
- [Developing E2E tests](developer/core/e2e.md)
- [Tuning controllers](./developer/core/tuning.md)
- [Support multiple instances](./developer/core/support-multiple-instances.md)
- [Multi-tenancy](./developer/core/multi-tenancy.md)
- [Developing providers](./developer/providers/overview.md)
- [Getting started](developer/providers/getting-started/overview.md)
- [Naming](developer/providers/getting-started/naming.md)
- [Initialize Repo and API types](developer/providers/getting-started/initialize-repo-and-api-types.md)
- [Implement API types](developer/providers/getting-started/implement-api-types.md)
- [Webhooks](developer/providers/getting-started/webhooks.md)
- [Controllers and Reconciliation](developer/providers/getting-started/controllers-and-reconciliation.md)
- [Configure the provider manifest](developer/providers/getting-started/configure-the-deployment.md)
- [Building, Running, Testing](developer/providers/getting-started/building-running-and-testing.md)
- [Naming](developer/providers/getting-started/naming.md)
- [Initialize Repo and API types](developer/providers/getting-started/initialize-repo-and-api-types.md)
- [Implement API types](developer/providers/getting-started/implement-api-types.md)
- [Webhooks](developer/providers/getting-started/webhooks.md)
- [Controllers and Reconciliation](developer/providers/getting-started/controllers-and-reconciliation.md)
- [Configure the provider manifest](developer/providers/getting-started/configure-the-deployment.md)
- [Building, Running, Testing](developer/providers/getting-started/building-running-and-testing.md)
- [Provider contracts](./developer/providers/contracts.md)
- [Cluster Infrastructure](./developer/providers/cluster-infrastructure.md)
- [Control Plane](./developer/providers/control-plane.md)
2 changes: 1 addition & 1 deletion docs/book/src/clusterctl/provider-contract.md
@@ -254,7 +254,7 @@ While defining the Deployment Spec, the container that executes the controller/r

For controllers only, the manager MUST support a `--namespace` flag for specifying the namespace where the controller
will look for objects to reconcile; however, clusterctl will always install providers watching for all namespaces
(`--namespace=""`); for more details see [support for multiple instances](../developer/architecture/controllers/support-multiple-instances.md)
(`--namespace=""`); for more details see [support for multiple instances](../developer/core/support-multiple-instances.md)
for more context.
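
As a rough illustration of the `--namespace` requirement, the following Go sketch parses such a flag with the standard `flag` package and treats the empty string as "watch all namespaces". The helper names `parseNamespace` and `describeScope` are hypothetical and are not taken from the Cluster API code base; a real manager would typically use the parsed value to restrict what its controllers watch.

```go
package main

import (
	"flag"
	"fmt"
)

// parseNamespace parses a --namespace flag from args.
// An empty value (also the default) means "watch all namespaces".
// Hypothetical helper, for illustration only.
func parseNamespace(args []string) (string, error) {
	fs := flag.NewFlagSet("manager", flag.ContinueOnError)
	ns := fs.String("namespace", "", "Namespace to watch; empty means all namespaces.")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *ns, nil
}

// describeScope renders the watch scope implied by the flag value.
func describeScope(ns string) string {
	if ns == "" {
		return "all namespaces"
	}
	return "namespace " + ns
}

func main() {
	// clusterctl installs providers with --namespace="" (all namespaces).
	ns, err := parseNamespace([]string{"--namespace="})
	if err != nil {
		panic(err)
	}
	fmt.Println("watching", describeScope(ns))
}
```

Keeping the empty string as the "all namespaces" value means the default behaviour matches what clusterctl installs, while a single flag still lets an operator narrow the scope.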

While defining Pods for Deployments, canonical names should be used for images.
20 changes: 0 additions & 20 deletions docs/book/src/developer/architecture/controllers.md

This file was deleted.

@@ -1,6 +1,6 @@
# MachineDeployment

A MachineDeployment orchestrates deployments over a fleet of [MachineSets](./machine-set.md).
A MachineDeployment orchestrates deployments over a fleet of MachineSets.

Its main responsibilities are:
* Adopting matching MachineSets not assigned to a MachineDeployment
@@ -1,6 +1,6 @@
# MachineHealthCheck

A MachineHealthCheck is responsible for remediating unhealthy [Machines](./machine.md).
A MachineHealthCheck is responsible for remediating unhealthy Machines.

Its main responsibilities are:
* Checking the health of Nodes in the [workload clusters] against a list of unhealthy conditions
@@ -1,6 +1,6 @@
# MachineSet

A MachineSet is an abstraction over [Machines](./machine.md).
A MachineSet is an abstraction over Machines.

Its main responsibilities are:
* Adopting unowned Machines that aren't assigned to a MachineSet
18 changes: 18 additions & 0 deletions docs/book/src/developer/core/controllers/overview.md
@@ -0,0 +1,18 @@
# Controllers

This section of the book provides an overview of the "core" controllers in Cluster API.

<aside class="note warning">

<h1>The code is the source of truth!</h1>

While we put great effort into ensuring good documentation for Cluster API, we also recognize that some
parts of the documentation are more prone to missing details or becoming outdated.

Unfortunately, this section is one of those parts, because things in Cluster API change fast and the
complexity of core controllers keeps growing.

Please feel free to open issues or, even better, send PRs with improvements that can make this documentation
even more valuable for the readers that will follow you.

</aside>
@@ -50,10 +50,10 @@ Using the config file it is possible to:

- Define the list of providers to be installed in the management cluster. Most notably,
for each provider it is possible to define:
- One or more versions of the providers manifest (built from the sources, or pulled from a
remote location).
- A list of additional files to be added to the provider repository, to be used e.g.
to provide `cluster-templates.yaml` files.
- One or more versions of the providers manifest (built from the sources, or pulled from a
remote location).
- A list of additional files to be added to the provider repository, to be used e.g.
to provide `cluster-templates.yaml` files.
- Define the list of variables to be used when doing `clusterctl init` or
`clusterctl generate cluster`.
- Define a list of intervals to be used in the test specs for defining timeouts for the
@@ -135,7 +135,7 @@ defined in the [Cluster API test framework] to check if the operation completed

### Naming the test spec

You can categorize the test with a custom label that can be used to filter a category of E2E tests to be run. Currently, the cluster-api codebase has [these labels](./testing.md#running-specific-tests) which are used to run a focused subset of tests.
You can categorize the test with a custom label that can be used to filter a category of E2E tests to be run. Currently, the cluster-api codebase has [these labels](testing.md#running-specific-tests) which are used to run a focused subset of tests.

## Tear down

@@ -189,7 +189,7 @@ The [test E2E package] provides examples of how this can be achieved by implemen
test specs for the most common Cluster API use cases.

<!-- links -->
[Cluster API quick start]: ../user/quick-start.md
[Cluster API quick start]: ../../user/quick-start.md
[Cluster API test framework]: https://pkg.go.dev/sigs.k8s.io/cluster-api/test/framework?tab=doc
[Apply method]: https://pkg.go.dev/sigs.k8s.io/cluster-api/test/framework?tab=doc#Applier
[CAPA E2E tests]: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/scripts/ci-e2e.sh
@@ -16,7 +16,7 @@ In Cluster API we strive to follow three principles while implementing logging:

## Upstream Alignment

Kubernetes defines a set of [logging conventions](https://git.k8s.io/community/contributors/devel/sig-instrumentation/logging.md),
Kubernetes defines a set of [logging conventions](https://git.k8s.io/community/contributors/devel/sig-instrumentation/logging.md),
as well as tools and libraries for logging.

## Continuous improvement
@@ -28,16 +28,16 @@ The foundational items of Cluster API logging are:
- Adding a minimal set of key/value pairs in the logger at the beginning of each reconcile loop, so all the subsequent
log entries will inherit them (see [key value pairs](#keyvalue-pairs)).

Starting from the above foundations, then the long tail of small improvements will consist of following activities:
Starting from the above foundations, then the long tail of small improvements will consist of following activities:

- Improve consistency of additional key/value pairs added by single log entries (see [key value pairs](#keyvalue-pairs)).
- Improve log messages (see [log messages](#log-messages)).
- Improve consistency of log levels (see [log levels](#log-levels)).

## Log Format

Controllers MUST provide support for [structured logging](https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/1602-structured-logging)
and for the [JSON output format](https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/1602-structured-logging#json-output-format);
and for the [JSON output format](https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/1602-structured-logging#json-output-format);
quoting the Kubernetes documentation, these are the key elements of this approach:

- Separate a log message from its arguments.
@@ -61,7 +61,7 @@ beginning of the chain are then inherited by all the subsequent log entries crea
Contextual logging is also embedded in controller runtime; In Cluster API we use contextual logging via controller runtime's
`LoggerFrom(ctx)` and `LoggerInto(ctx, log)` primitives and this ensures that:

- The logger passed to each reconcile call has a unique `reconcileID`, so all the logs being written during a single
- The logger passed to each reconcile call has a unique `reconcileID`, so all the logs being written during a single
reconcile call can be easily identified (note: controller runtime also adds other useful key value pairs by default).
- The logger has a key value pair identifying the objects being reconciled, e.g. a Machine Deployment, so all the logs
impacting this object can be easily identified.
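
The inheritance of key value pairs described above can be sketched in a few lines of Go. To keep the example runnable with the standard library only (Go 1.21 or later), it uses `log/slog` as a stand-in for the logr logger used by controller runtime; `loggerInto` and `loggerFrom` are hypothetical local helpers that only mirror the idea behind `LoggerInto(ctx, log)` and `LoggerFrom(ctx)`, and the `reconcileID` value is set by hand here rather than generated per reconcile call.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"log/slog"
	"strings"
)

// loggerKey is the private context key under which the logger is stored.
type loggerKey struct{}

// loggerInto returns a copy of ctx that carries log (stand-in for LoggerInto).
func loggerInto(ctx context.Context, log *slog.Logger) context.Context {
	return context.WithValue(ctx, loggerKey{}, log)
}

// loggerFrom returns the logger stored in ctx, or the default logger if none is set
// (stand-in for LoggerFrom).
func loggerFrom(ctx context.Context) *slog.Logger {
	if log, ok := ctx.Value(loggerKey{}).(*slog.Logger); ok {
		return log
	}
	return slog.Default()
}

// reconcile adds the pair identifying the reconciled object once, at the start,
// and stores the enriched logger back into the context.
func reconcile(ctx context.Context, namespace, name string) {
	log := loggerFrom(ctx).With("MachineDeployment", namespace+"/"+name)
	ctx = loggerInto(ctx, log)
	log.Info("Reconciling")
	scaleUp(ctx, 3)
}

// scaleUp only knows about the context, yet its log entry inherits every pair
// that was added earlier in the chain.
func scaleUp(ctx context.Context, replicas int) {
	loggerFrom(ctx).Info("Scaling up", "replicas", replicas)
}

// captureReconcileLogs runs one reconcile call and returns the JSON log output.
func captureReconcileLogs() string {
	var buf bytes.Buffer
	base := slog.New(slog.NewJSONHandler(&buf, nil)).With("reconcileID", "example-id")
	reconcile(loggerInto(context.Background(), base), "default", "md-0")
	return buf.String()
}

// countEntriesWith counts how many times fragment occurs in the log output.
func countEntriesWith(logs, fragment string) int {
	return strings.Count(logs, fragment)
}

func main() {
	logs := captureReconcileLogs()
	fmt.Print(logs)
	fmt.Println("entries carrying reconcileID:", countEntriesWith(logs, `"reconcileID"`))
}
```

Both log entries carry the `reconcileID` and `MachineDeployment` pairs even though `scaleUp` never sets them, which is what makes all the logs of one reconcile call easy to filter.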
@@ -85,18 +85,18 @@ one of the above practices is really important for Cluster API developers
- Developers MUST use `klog.KObj` or `klog.KRef` functions when logging key value pairs for Kubernetes objects, thus
ensuring a key value pair representing a Kubernetes object is formatted consistently in all the logs.
- Developers MUST use consistent log keys:
- kinds should be written in upper camel case, e.g. `MachineDeployment`, `MachineSet`
- Note: we cannot use lower camel case for kinds consistently because there is no way to
automatically calculate the correct log key for provider CRDs like `AWSCluster`
- all other keys should use lower camel case, e.g. `resourceVersion`, `oldReplicas` to align to Kubernetes log conventions
- kinds should be written in upper camel case, e.g. `MachineDeployment`, `MachineSet`
- Note: we cannot use lower camel case for kinds consistently because there is no way to
automatically calculate the correct log key for provider CRDs like `AWSCluster`
- all other keys should use lower camel case, e.g. `resourceVersion`, `oldReplicas` to align to Kubernetes log conventions
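
The key naming rules above are simple enough to check mechanically. The sketch below is a hypothetical lint-style helper, not something that exists in Cluster API: `validKindKey` accepts upper camel case keys for kinds and `validFieldKey` accepts lower camel case keys for everything else. It only looks at the first letter and rejects separators such as `_` or `-`; it makes no attempt to decide whether a key really is a kind.

```go
package main

import (
	"fmt"
	"unicode"
)

// isCamelCase reports whether key consists only of letters and digits and
// starts with an upper case letter (upperFirst) or a lower case letter.
func isCamelCase(key string, upperFirst bool) bool {
	for i, r := range key {
		if !unicode.IsLetter(r) && !unicode.IsDigit(r) {
			return false // separators like '_' or '-' are not camel case
		}
		if i == 0 && (!unicode.IsLetter(r) || unicode.IsUpper(r) != upperFirst) {
			return false
		}
	}
	return key != ""
}

// validKindKey checks a log key that names a kind, e.g. MachineDeployment.
func validKindKey(key string) bool { return isCamelCase(key, true) }

// validFieldKey checks any other log key, e.g. resourceVersion.
func validFieldKey(key string) bool { return isCamelCase(key, false) }

func main() {
	for _, k := range []string{"MachineDeployment", "AWSCluster", "machineSet"} {
		fmt.Printf("kind key %q valid: %v\n", k, validKindKey(k))
	}
	for _, k := range []string{"resourceVersion", "oldReplicas", "old_replicas"} {
		fmt.Printf("field key %q valid: %v\n", k, validFieldKey(k))
	}
}
```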

Please note that, in order to ensure logs can be easily searched it is important to ensure consistency for the following
key value pairs (in order of importance):

- Key value pairs identifying the object being reconciled, e.g. a MachineDeployment.
- Key value pairs identifying the hierarchy of objects being reconciled, e.g. the Cluster a MachineDeployment belongs
to.
- Key value pairs identifying side effects on other objects, e.g. while reconciling a MachineDeployment, the controller
- Key value pairs identifying side effects on other objects, e.g. while reconciling a MachineDeployment, the controller
creates a MachineSet.
- Other Key value pairs.

@@ -117,9 +117,9 @@ for log levels; as a small integration on the above guidelines we would like to
- Logs at the lower levels of verbosity (<=3) are meant to document “what happened” by describing how an object status
is being changed by controller/reconcilers across subsequent reconciliations; as a rule of thumb, it is reasonable
to assume that a person reading those logs has a deep knowledge of how the system works, but it should not be required
for those persons to have knowledge of the codebase.
for those persons to have knowledge of the codebase.
- Logs at higher levels of verbosity (>=4) are meant to document “how it happened”, providing insight on thorny parts of
the code; a person reading those logs usually has deep knowledge of the codebase.
the code; a person reading those logs usually has deep knowledge of the codebase.
- Don’t use verbosity higher than 5.

We are using log level 2 as a default verbosity for all core Cluster API
@@ -140,7 +140,7 @@ Our [Tilt](tilt.md) setup offers a batteries-included log suite based on [Promta
We are working to continuously improve this experience, allowing Cluster API developers to use logs and improve them as part of their development process.

For the best experience exploring the logs using Tilt:
1. Set `--logging-format=json`.
1. Set `--logging-format=json`.
2. Set a high log verbosity, e.g. `v=5`.
3. Enable Promtail, Loki, and Grafana under `deploy_observability`.

@@ -168,7 +168,7 @@ extra_args:
- "--v=5"
- "--logging-format=json"
```
The above options can be combined with other settings from our [Tilt](tilt.md) setup. Once Tilt is up and running with these settings users will be able to browse logs using the Grafana Explore UI.
The above options can be combined with other settings from our [Tilt](tilt.md) setup. Once Tilt is up and running with these settings users will be able to browse logs using the Grafana Explore UI.
This will normally be available on `localhost:3001`. To explore logs from Loki, open the Explore interface for the DataSource 'Loki'. [This link](http://localhost:3001/explore?datasource%22:%22Loki%22) should work as a shortcut with the default Tilt settings.

@@ -220,4 +220,3 @@ we encourage providers to adopt and contribute to the guidelines defined in this
It is also worth noting that the foundational elements of the approach described in this document are easy to achieve
by leveraging default Kubernetes tooling for logging.
16 changes: 16 additions & 0 deletions docs/book/src/developer/core/overview.md
@@ -0,0 +1,16 @@
# Developing "core" Cluster API

This section of the book is about developing "core" Cluster API.

With "core" Cluster API we refer to the common set of APIs and controllers that are required to run
any Cluster API provider.

Please note that in the Cluster API code base, alongside the "core" Cluster API components, there
is also a limited number of in-tree providers:

- Kubeadm bootstrap provider (CAPBK)
- Kubeadm control plane provider (KCP)
- Docker infrastructure provider (CAPD) - The Docker provider is not designed for production use and is intended for development & test only.
- In Memory infrastructure provider (CAPIM) - The In Memory provider is not designed for production use and is intended for development & test only.

Please refer to [Developing providers](../providers/overview.md) for documentation about in-tree providers (and out of tree providers too).