Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ImageUpdateAutomation v1beta2 API with refactored controller #647

Merged
merged 10 commits into from
Apr 23, 2024

Conversation

darkowlzz
Copy link
Contributor

@darkowlzz darkowlzz commented Mar 12, 2024

This change refactors image-automation-controller to implement the new unified standards of flux controllers.

See #643 for some overview of the new changes.

Following are some highlights of the changes:

  • Introduce v1beta2 API which introduces new status fields .observedPolicies and .observedSourceRevision required for the new reconciliation model, and a spec field .policySelector for selecting the policies to consider in the update operation.
  • Refactor the ImageUpdateAutomationReconciler to align with other flux controllers in the way we report the status and send notifications.
  • When the checkout and push branch of the source are the same, the controller now returns early when there's no change in the policies and the remote source. This would help avoid full reconciliations when nothing has changed. When the push branch is different from checkout branch or a refspec is provided, full reconciliation will always be performed.
  • The controller/reconciler e2e tests have been refactored to simplify the tests and expand what's tested. Some tests involving the remote git state changes have been moved to the tests in internal/source package tests.
  • Introduce internal/source package which provides all the source management API. These source management operations are tested in detail independent of the reconciler.
  • The commit template values now has a new update ResultV2 (introduced in Introduce ResultV2 for update results #642) which includes the old and new value of the changes made to the objects in files. This is accessible under .Changed. The previous update Result remains available under .Updated but is deprecated. ResultV2 contains all the updates, even if the update wasn't a complete image but part of an image.
  • The policy applying code has been moved to internal/policy package to be maintained and tested independent of the reconciler and can be used reconciler and source manager tests as needed.
  • Use the ResultV2 based commit template description and examples in the v1beta2 API spec docs.
  • Remove the deprecated suspend and readiness reconciliation metrics.

Example ImageUpdateAutomation status with the new fields:

status:
  conditions:
  - lastTransitionTime: "2024-03-18T20:01:57Z"
    message: repository up-to-date
    observedGeneration: 8
    reason: Succeeded
    status: "True"
    type: Ready
  lastAutomationRunTime: "2024-03-18T21:02:30Z"
  lastHandledReconcileAt: "1710791381"
  lastPushCommit: 8084f1bb180ac259c6698cd027064b7dce86a72a
  lastPushTime: "2024-03-18T18:53:04Z"
  observedGeneration: 8
  observedPolicies:
    podinfo-policy:
      name: ghcr.io/stefanprodan/podinfo
      tag: 4.0.6
    myapp1:
      name: ghcr.io/fluxcd/myapp1
      tag: 4.0.0
    myapp2:
      name: ghcr.io/fluxcd/myapp2
      tag: 2.0.0
  observedSourceRevision: main@sha1:8084f1bb180ac259c6698cd027064b7dce86a72a

Example rendered commit template with before and after update values per file and per object:

Automation: default/test-update-auto

- File: foo-deployment.yaml
  - Object: Deployment/default/foo
    Changes:
    - 2.2.2 -> 5.0.3
    
- File: podinfo-deployment.yaml
  - Object: Deployment/default/infopod
    Changes:
    - v1.0 -> 5.0.3
  - Object: Deployment/default/podinfo
    Changes:
    - ghcr.io/stefanprodan/podinfo:4.0.6 -> ghcr.io/stefanprodan/podinfo:5.0.3
    - bar -> ghcr.io/stefanprodan/podinfo
    - 4.0.6 -> 5.0.3

Instructions for testing

To test this, checkout this branch and build the container image with make docker-build IMG=<image-name> TAG=<tag>. Before deploying the container image, update/install the CRDs to v1beta2 API with make install or apply the CRD manifests in config/crd/bases. Create git repository, image repository, policy and update automation object and test.

Fixes #437

@darkowlzz darkowlzz added the enhancement New feature or request label Mar 12, 2024
internal/source/source.go Fixed Show fixed Hide fixed
internal/source/source.go Fixed Show fixed Hide fixed
internal/source/source.go Fixed Show fixed Hide fixed
internal/source/source.go Fixed Show fixed Hide fixed
internal/source/git.go Fixed Show fixed Hide fixed
internal/testutil/util.go Dismissed Show dismissed Hide dismissed
internal/testutil/util.go Dismissed Show dismissed Hide dismissed
internal/source/git.go Dismissed Show dismissed Hide dismissed
@darkowlzz darkowlzz force-pushed the refactor branch 8 times, most recently from 2746a75 to fa09050 Compare March 18, 2024 21:30
@darkowlzz darkowlzz marked this pull request as ready for review March 18, 2024 21:32
@darkowlzz darkowlzz changed the title v1beta2 ImageUpdateAutomation API with refactored controller ImageUpdateAutomation v1beta2 API with refactored controller Mar 18, 2024
@darkowlzz
Copy link
Contributor Author

darkowlzz commented Mar 18, 2024

@tun0 you may be interesting in trying this. It addresses your feature request in fluxcd/flux2#3027 . There's a new commit template data field which provides a new update result that include the old and new values. See the new docs for examples of the template that have old and new values after update.

@tun0
Copy link

tun0 commented Mar 19, 2024

@tun0 you may be interesting in trying this. It addresses your feature request in fluxcd/flux2#3027 . There's a new commit template data field which provides a new update result that include the old and new values. See the new docs for examples of the template that have old and new values after update.

Looks promising! Will add it to my list of things I still need to do 😬

Comment on lines +310 to +329
type TemplateData struct {
AutomationObject struct {
Name, Namespace string
}
Changed update.ResultV2
}

// ResultV2 contains the file changes made during the update. It contains
// details about the exact changes made to the files and the objects in them. It
// has a nested structure file->objects->changes.
type ResultV2 struct {
FileChanges map[string]ObjectChanges
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this actually is:

type TemplateData struct {
	AutomationObject struct {
	  Name, Namespace string
	}
	Updated          update.Result
	Changed          update.ResultV2
}

and

type ResultV2 struct {
	ImageResult Result
	FileChanges map[string]ObjectChanges
}

In order to keep the docs concise and lead users towards using the new type, I have not shown the old Updated field in the TemplateData and the old Result in ResultV2 as embedded.
Their existence ensure that the old templates continue to work, but documenting them would be almost redundant with slight changes. Maybe we can link to the v1beta1 API doc and ask to look there for details about those types.

I think we'll deprecate the old result type eventually. Or we can deprecate them in v1beta2 itself and remove in v1 API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose we deprecate the old result type in v1beta2 and we keep only v2 in the docs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a deprecation note in the spec docs https://github.com/fluxcd/image-automation-controller/blob/refactor/docs/spec/v1beta2/imageupdateautomations.md#message-template .
I think it's difficult to reliably detect its usage and log warnings because the message template may contain .Updated but not necessarily be using the template data. We can use some regex and try to detect it but, I'll wait for some discussion and agreement if we do need to do that.

Comment on lines +456 to +471
If both `.push.refspec` and `.push.branch` are specified, then the reconciler
will push to both the destinations. This is particularly useful for working with
Gerrit servers. For more information about this, please refer to the
[Gerrit](#gerrit) section.

**Note:** If both `.push.refspec` and `.push.branch` are essentially equal to
each other (for e.g.: `.push.refspec: refs/heads/main:refs/heads/main` and
`.push.branch: main`), then the reconciler might fail with an `already
up-to-date` error.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike in v1beta1 docs for refspec, I have not mentioned the refspec push functionality as a "second push operation" because in go-git, the client can be configured to call a single push to push to a branch + any refspecs specified in the push options. I had this implemented in my initial refactored code which worked with the refspec push tests, but I decided to introduce this change in a separate release in the future to avoid adding more new things to this refactor. It'll also require some changes in pkg/git/go-git to accept the options.

I'd also like to revisit this to change .spec.git.push.refspec to refspecs and accept any number of refspecs as the underlying data type is a list of refspecs and I have not seen any apparent reason to limit it just to one refspec.

@darkowlzz
Copy link
Contributor Author

darkowlzz commented Mar 21, 2024

Sharing results of an upgrade test from v1beta1 to v1beta2 API.

Before upgrade, I had the following object:

apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  creationTimestamp: "2024-03-21T16:31:40Z"
  finalizers:
  - finalizers.fluxcd.io
  generation: 1
  name: podinfo-update
  namespace: default
  resourceVersion: "3055784"
  uid: 25d426d7-b7b4-4b4d-8b31-0522aa8d5f6e
spec:
  git:
    checkout:
      ref:
        branch: main
    commit:
      author:
        email: [email protected]
        name: fluxcdbot
  interval: 30m
  sourceRef:
    kind: GitRepository
    name: podinfo
  update:
    strategy: Setters
status:
  conditions:
  - lastTransitionTime: "2024-03-21T16:31:40Z"
    message: no updates made; last commit 62b92bf at 2024-03-21T22:01:42+05:30
    reason: ReconciliationSucceeded
    status: "True"
    type: Ready
  lastAutomationRunTime: "2024-03-21T16:31:42Z"
  lastPushCommit: 62b92bfe37b7dfa97d3870b29db211f2145917fd
  lastPushTime: "2024-03-21T16:31:42Z"
  observedGeneration: 1

After installing the new CRD for v1beta2 API and running the new controller, the object becomes:

apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  creationTimestamp: "2024-03-21T16:31:40Z"
  finalizers:
  - finalizers.fluxcd.io
  generation: 1
  name: podinfo-update
  namespace: default
  resourceVersion: "3055965"
  uid: 25d426d7-b7b4-4b4d-8b31-0522aa8d5f6e
spec:
  git:
    checkout:
      ref:
        branch: main
    commit:
      author:
        email: [email protected]
        name: fluxcdbot
  interval: 30m
  sourceRef:
    kind: GitRepository
    name: podinfo
  update:
    strategy: Setters
status:
  conditions:
  - lastTransitionTime: "2024-03-21T16:33:00Z"
    message: repository up-to-date
    observedGeneration: 1
    reason: Succeeded
    status: "True"
    type: Ready
  lastAutomationRunTime: "2024-03-21T16:33:00Z"
  lastPushCommit: 62b92bfe37b7dfa97d3870b29db211f2145917fd
  lastPushTime: "2024-03-21T16:31:42Z"
  observedGeneration: 1
  observedPolicies:
    podinfo-policy:
      name: ghcr.io/stefanprodan/podinfo
      tag: 4.0.6
    podinfo-test:
      name: ghcr.io/stefanprodan/podinfo
      tag: 4.0.0
    podinfo-test-1:
      name: ghcr.io/stefanprodan/podinfo
      tag: 4.0.6
  observedSourceRevision: main@sha1:62b92bfe37b7dfa97d3870b29db211f2145917fd

Just the status gets updated with new information.

api/v1beta2/imageupdateautomation_types.go Show resolved Hide resolved
api/v1beta2/imageupdateautomation_types.go Show resolved Hide resolved
internal/testutil/util.go Show resolved Hide resolved
internal/testutil/util.go Outdated Show resolved Hide resolved
internal/testutil/util.go Outdated Show resolved Hide resolved
@souleb
Copy link
Member

souleb commented Mar 28, 2024

Tested it manually in a kind cluster with various scenarios:

  • migration from v1beta1 to v1beta1
  • use Result
  • use ResultV2
  • use policySelector
Status:
  Conditions:
    Last Transition Time:     2024-03-28T09:08:22Z
    Message:                  repository up-to-date
    Observed Generation:      2
    Reason:                   Succeeded
    Status:                   True
    Type:                     Ready
  Last Automation Run Time:   2024-03-28T10:08:23Z
  Last Handled Reconcile At:  2024-03-28T10:58:14.243605+01:00
  Last Push Commit:           f5167f60f1d90eb81ed30eabc7b0e562280cb2ef
  Last Push Time:             2024-03-28T10:07:39Z
  Observed Generation:        2
  Observed Policies:
    Backend - Memcached:
      Name:  registry-1.docker.io/bitnamicharts/memcached
      Tag:   7.0.3
    Backend - Redis:
      Name:  registry-1.docker.io/bitnamicharts/redis
      Tag:   19.0.1
    Frontend - Podinfo:
      Name:                  ghcr.io/stefanprodan/charts/podinfo
      Tag:                   6.6.1
  Observed Source Revision:  main@sha1:f5167f60f1d90eb81ed30eabc7b0e562280cb2ef
Events:
  Type    Reason     Age   From                         Message
  ----    ------     ----  ----                         -------
  Normal  Succeeded  11m (x5 over 60m)  image-automation-controller  repository up-to-date
  Normal  Succeeded  97s                image-automation-controller  pushed commit 'be01557' to branch 'main'
Automated image update

Automation name: apps-update/flux-apps

Files:
- backend/base/memcached.yaml
Objects:
- HelmRelease memcached
  Changes:
    - 7.0.2 -> 7.0.3
  Normal  Succeeded  96s  image-automation-controller  pushed commit 'f5167f6' to branch 'main'
Automated image update

Automation name: apps-update/flux-apps

Files:
- frontend/base/podinfo.yaml
Objects:
- HelmRelease podinfo
  Changes:
    - 6.6.0 -> 6.6.1
  Normal  Succeeded  51s (x3 over 32m)  image-automation-controller  no change since last reconciliation

darkowlzz and others added 10 commits April 18, 2024 16:16
- Introduce v1beta2 API with the following changes
  - Removes SetImageUpdateAutomationReadiness() and
    GetStatusConditions().
  - Introduce new status fields in the API ObservedPolicies and
    ObservedSourceRevision.
  - Introduce new status condition reasons for use in the new
    reconciliation model with v1beta2 API.

Signed-off-by: Sunny <[email protected]>
Move all the common test utilities that are needed for testing different
packages into a common testutil package. Modify the test helpers to be
more generic to be reusable.

Signed-off-by: Sunny <[email protected]>
Move the policy applying code to a separate package so that it can be
tested and maintained independent of the other components, and imported
to other packages where needed.

Signed-off-by: Sunny <[email protected]>
Move all the Git source management code into a new package and introduce
abstractions to manage the source as per the needs of image update
automation. A new type, SourceManager, is introduced which configures
and manages the source. It provides methods to perform relevant actions
on the source and also the ability to customize those actions. It also
introduces PushResult which contains the information about the changes
that were pushed. It can be used to gather information about the pushed
commit and get a summary of the operation.

All the source related operations are tested in this package,
independent of the reconciler. The tests from the controller e2e tests
have been rewritten in terms of source manager, making the tests focused
and simpler.

The source change commit operation uses the new ResultV2 update result
which includes the old and new strings that are part of the update. The
previous Result type is still available to use.

Signed-off-by: Sunny <[email protected]>
Since the reconciler is being completely rewritten, remove the old
controller file. A new controller file with the new implemementation
will be added in the following commit.

Signed-off-by: Sunny <[email protected]>
Introduce the rewritten reconciler which uses v1beta2 API and the new
internal/policy and internal/source packages for performing all the
operations. The reconciliation model is written similar to the other
flux controller, with simplified events and logs, and kstatus support.
The reconciliation result computation similar to
image-reflector-controller, using the same reconciler helpers from
pkg/runtime repo.

With the new status fields ObservedPolicies and ObservedSourceRevision,
the reconciler now avoid full sync of the source which involves cloning
the source every time. If the policies and the remote source have not
changed since the last reconciliation, the reconciliation is returned
early without cloning and apply the updates. This is only applicable
when the checkout branch and the push branch are the same. For a
different push branch and refspec, full sync is always performed as
before.

Notifications are now only sent when there's something new to inform
about. If there is no change, an source up-to-date notification is sent.
When there's an update, the details about the pushed commit with any
rendered commit template is send. When there's a failure, the error is
sent in the notification. And when there's an error recovery, a success
notification is sent to rely that the automation has recovered.

Signed-off-by: Sunny <[email protected]>
Rewrite the controller tests to test the new behavior of the reconciler
and also simplify the tests for scenarios that are now being tests in
internal/source package, especially the git operations related tests.
Some of the old controller tests are still kept to ensure those
functionalities continue to work after the controller rewrite, even if
some of them are redundant. They can be removed in the future for the
tests in the respective subpackages, internal/policy and
internal/source.

New tests that focus the status conditions and notifications have been
added for various possible scenarios.

The test helpers have been modified to simplify their usage and some
have been replaced with their equivalent version from the testutils
package.

Signed-off-by: Sunny <[email protected]>
Signed-off-by: Sunny <[email protected]>
Copy link
Member

@souleb souleb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @darkowlzz

Copy link
Member

@stefanprodan stefanprodan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks @darkowlzz 🥇

@stefanprodan stefanprodan merged commit ffcb4d1 into main Apr 23, 2024
7 checks passed
@stefanprodan stefanprodan deleted the refactor branch April 23, 2024 13:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow including the previous image tag in the commit message template
5 participants