Approve and Auto-Apply when no changes #4187

Open
1 task
javajawa opened this issue Jan 29, 2024 · 12 comments
Labels
feature New functionality/enhancement

Comments

@javajawa

javajawa commented Jan 29, 2024

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.

Describe the user story

As an operations/engineering/SRE team, we would like to keep changes made via terraform in a controlled workflow, whilst allowing non-changes to proceed without becoming toil.

Consider a situation with over 50 AWS accounts managed via terraform, stored in a self-hosted gitlab, with renovate and atlantis running. Renovate will create a merge request for each provider upgrade for each account, which in 95+% of cases will generate a set of empty terraform plans.

In this case, an approver will have to come along, approve each request, and apply the change (which makes no write API calls to AWS, but will update the state file with the new provider/module version numbers).

In order to reduce fatigue on the approvers, they should only have to look at the merges where a change to the controlled infrastructure is happening. In the no-op case, the tools should be able to resolve the issue autonomously.

Describe the solution you'd like

Atlantis, upon successfully completing all plans for a merge request, knows whether there are any changes, either between reality and state, or reality and config. If there are no such changes, it can undertake to help move the merge along.

The following are workflows I feel make sense.

  1. Along with the existing setting for Atlantis to automatically merge when the apply is complete, Atlantis could also merge when there is nothing to apply. This leaves open the question of state file updates that would happen on these applies.
  2. Atlantis could add itself as an approver on these merges.
  3. Atlantis adds labels to the merge request indicating different statuses, e.g. atlantis:planned and atlantis:no-changes. A separate process can then consume these and other metadata ("merge is from renovate") and perform the actions as part of a different control loop. This information is technically exposed via job statuses.
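
As a rough illustration of option 3, the separate control loop could reduce to a check like the one below. This is only a sketch: the label names are the ones floated in this proposal (Atlantis does not set them today), and `MergeRequest`, `TRUSTED_AUTHORS` and the `renovate-bot` username are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MergeRequest:
    """Minimal stand-in for the merge request metadata a bot would fetch."""
    author: str
    labels: set[str] = field(default_factory=set)


# Hypothetical values: the label names come from the proposal in this issue,
# and the bot username depends on how Renovate is deployed.
PLANNED = "atlantis:planned"
NO_CHANGES = "atlantis:no-changes"
TRUSTED_AUTHORS = {"renovate-bot"}


def may_auto_approve(mr: MergeRequest) -> bool:
    """Allow approval only for fully planned, no-op merges from a trusted bot."""
    return (
        mr.author in TRUSTED_AUTHORS
        and PLANNED in mr.labels
        and NO_CHANGES in mr.labels
    )
```

Anything that fails the check simply stays in the normal human approval queue.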

The minimum viable outcome is any mechanism by which an approval bot can verify that atlantis has completed up-to-date plans for all modules in a merge request, and that they contain no changes.
This does already exist (see conversation below), but some of these options may result in a better user experience.

Describe the drawbacks of your solution

Automatically starting Atlantis actions other than plan is not the current convention (at least in the Gitlab ecosystem, I am less familiar with how Atlantis integrates with other VCS providers).

Label usage is simple in some systems, but may not be consistent across all officially supported providers.

Describe alternatives you've considered

As highlighted by the label-based workflow, most of the actual behaviour here can be achieved by use of a different component, keeping Atlantis itself more focused. However, determining that a given merge is fully planned and has no changes in a consistent manner is the true problem here.

@javajawa javajawa added the feature New functionality/enhancement label Jan 29, 2024
@nitrocode
Member

nitrocode commented Jan 30, 2024

Without adding additional complexity in atlantis and building the code for each git hosting service (github, bitbucket, etc), we can defer this to said system.

For GitHub (however this can be used with other cicd systems), have you considered

  1. setting atlantis/apply as a required check
  2. For renovate prs
  3. For non renovate prs

Then the workflow would be the following for renovate and human users

  1. renovate bumps dependencies or dev creates a change and opens pr
  2. renovate/gha enables auto merge and approves
  3. atlantis runs plan
  4. If all dirs show no changes, atlantis apply will be green and renovate/gh merges
  5. If any dir shows changes, atlantis apply will be red and renovate won't merge

@javajawa
Author

javajawa commented Jan 30, 2024

That indeed works excellently for GitHub.

GitLab does not have the concept of specific external jobs being expected. (It treats CI at the pipeline level, not as a set of jobs.) This makes the workflow described above less portable, but it is possible by looking for the atlantis/apply job on the current pipeline.

Additionally, the latest version of atlantis does not seem to register the apply check as part of the pipeline until atlantis apply is run directly, even if there are no planned changes.
That leaves text parsing as the only currently available option (that I'm aware of) for detecting empty plans.

@tiagomeireles
Contributor

Additionally, the latest version of atlantis[2] does not seem to register the apply check as part of the pipeline until atlantis apply is run directly

Was this an intentional change? I'm seeing this behavior change, running atlantis v0.27.1 and Gitlab 15.11.

@javajawa
Author

Did some more detailed testing of behaviour for Atlantis v0.27.0 against self-hosted premium gitlab at 15.11.

Atlantis is reporting the atlantis/apply as a complete job in the pipeline if there are no changes.

Therefore, I can in principle write some tooling which checks for this in conjunction with the author of the merge request being renovate bot, adds an approval, and triggers the actual apply (as "no plan changes" != "no state changes").
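
A minimal sketch of that check, assuming the commit statuses have already been fetched as dicts with `name` and `status` keys (the shape GitLab's commit status API returns). The `atlantis/plan` prefix, the `renovate-bot` username and the function name are assumptions for illustration; only `atlantis/apply` is confirmed by the testing above.

```python
def is_noop_renovate_merge(
    author: str,
    statuses: list[dict],
    bot_authors: frozenset = frozenset({"renovate-bot"}),  # assumed username
) -> bool:
    """True when a bot-authored merge has successful plans and a green apply status."""
    # Assumption: plan statuses are named "atlantis/plan" or "atlantis/plan: <project>".
    plans = [s for s in statuses if s["name"].startswith("atlantis/plan")]
    plans_ok = bool(plans) and all(s["status"] == "success" for s in plans)
    # Per the testing above, atlantis/apply is reported complete when nothing changes.
    apply_ok = any(
        s["name"] == "atlantis/apply" and s["status"] == "success" for s in statuses
    )
    return author in bot_authors and plans_ok and apply_ok
```

When this returns `True`, the tooling would add its approval and then trigger the real apply, since the state file may still need updating.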

I will leave this feature request open because I believe the idea of atlantis applying actual labels or approvals is a capability worth continued discussion, but I will attempt to edit the description to better reflect the reality of the problem.

@stasostrovskyi
Contributor

I think it can be a problem to run apply in case of "No Changes", because all the workflows (at least custom ones) depend on the plan file existing. And you don't always want to exit successfully from atlantis/apply if the plan file doesn't exist, because there can be other reasons why the plan doesn't exist.

@javajawa
Author

javajawa commented Jan 31, 2024

Surely Atlantis should be saving a plan file even if there are no changes? Just because there are no API operations to perform doesn't mean there isn't other interaction.

I think we should possibly clarify the meaning of "no changes" here?

A terraform operation has four points of interaction -- the HCL code (configuration), the state file, the binaries being run, and configured infrastructure (reality)

Terraform reports "no changes" when the binaries have no changes to make to reality based on the configuration. I am presuming that Atlantis maintains this contract. However, there are a number of other operations that happen during a terraform apply that may still be relevant:

  • Changes that are already reflected in both configuration and reality may be updated in state ("external changes")
  • Changes in the binaries affect state (version changes in providers)
  • moved{} blocks are evaluated into state

These are all changes that should be applied to correctly sync the four points, but they are not going to affect reality (and thus in the workflow I'm describing, do not need prior human authorization).
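
For tooling that has access to the plan file, one way to test "does this plan touch reality" without parsing text is the machine-readable plan from `terraform show -json plan.tfplan`. The sketch below assumes that JSON has already been loaded into a dict and only inspects `resource_changes`; the function name is made up for this example, and it deliberately treats anything other than `no-op` (including deferred data source reads) as needing a human.

```python
def infrastructure_actions(plan: dict) -> list[tuple[str, list[str]]]:
    """Return (address, actions) for every planned resource change that is not a no-op."""
    pending = []
    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        if actions != ["no-op"]:
            pending.append((rc["address"], actions))
    return pending
```

An empty result means the plan makes no changes to managed resources, even though applying it may still rewrite the state file.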

@stasostrovskyi
Contributor

Terraform doesn't create a plan file if there are no changes, it exits with non-zero error code instead.

@javajawa
Author

javajawa commented Jan 31, 2024

That's...trivially provable as "not true"?

mkdir test && cd test || exit 1
echo 'terraform {}' >main.tf
terraform --version
terraform init
terraform plan -out plan.tfplan
printf "plan result: %d\n" "$?"
base64 plan.tfplan | head -n 1

[Screenshot: 2024-01-31_1064x530]

The exact value and behaviour depends on the options -- see the documentation for -detailed-exitcode for example where you only get an exit code of 0 if there are no planned changes.
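
To make the `-detailed-exitcode` contract concrete: with that flag, `terraform plan` exits 0 for a successful plan with no changes, 1 for an error, and 2 for a successful plan with changes. A small wrapper might look like this; `PlanOutcome`, `classify_plan_exit` and `run_plan` are names invented for this sketch.

```python
import subprocess
from enum import Enum


class PlanOutcome(Enum):
    NO_CHANGES = 0  # plan succeeded, empty diff
    ERROR = 1       # plan failed
    CHANGES = 2     # plan succeeded, non-empty diff


def classify_plan_exit(code: int) -> PlanOutcome:
    """Map a `terraform plan -detailed-exitcode` exit status to its meaning."""
    return PlanOutcome(code)


def run_plan(workdir: str, out: str = "plan.tfplan") -> PlanOutcome:
    """Run the plan in `workdir`, saving the plan file, and classify the result."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", f"-out={out}"],
        cwd=workdir,
    )
    return classify_plan_exit(result.returncode)
```

Any other exit status raises `ValueError`, which seems preferable to silently treating an unexpected code as "no changes".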

But the key point is that "no changes" results in an applyable plan that updates the statefile:

benedict@junco:~/test$ terraform apply plan.tfplan 

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
benedict@junco:~/test$ cat terraform.tfstate 
{
  "version": 4,
  "terraform_version": "1.7.1",
  "serial": 1,
  "lineage": "3110dbfe-bf87-e157-63e5-7840f3dc1440",
  "outputs": {},
  "resources": [],
  "check_results": null
}

Additionally, if I followed the suggestion there from terraform --version, upgraded the terraform binary, and re-planned, I would get a different plan, still with no API changes, and applying that plan would result in a new, different state file.

(Aside: I have however just learnt that terraform only checks for the presence of a .tf file; there's no need for any content. Neat!)

@nitrocode
Member

nitrocode commented Feb 5, 2024

I was pretty confident that it does save a plan file if no changes and it returns a zero status code if it's successful.

If you run atlantis apply, you can also see the terraform output for the no-changes plan. I'm on mobile but there is a pr that we can look at to see the exact code changes that made this possible.

But maybe I'm mistaken. Happy to be corrected.

Also this request is related to

@nitrocode
Member

Here is the PR #3378

I'm unsure what version introduced it since the milestone is missing. Most likely a recent version.

@javajawa
Author

javajawa commented Feb 5, 2024

For the record, it is listed as a change in v0.25.0.

Thanks for the link to the PR (and thus the issue that spawned it). Neither seems to have addressed any concerns around external changes or other things that would cause state file changes but not real-world changes?

Looking at #266, this comment on the nature of the Terraform Core Workflow principles I think underlines the distinction between workflows with plans and those with no plan but corrective state.

We could look at merging this issue into #266, though I feel there is still some difference around the idea of Atlantis taking a more concrete action than adding an external job (especially in non-GitHub environments which don't have the concept of required external jobs).

@javajawa
Author

Update on the gitlab front: as Atlantis uses (well, has to use) commit statuses rather than workflow jobs, there are no associated events to bind a webhook to.

Currently, the best way I can find to do this in a self-hosted environment (without making the changes to Atlantis described above) is to watch for the note being added to the MR when atlantis finishes planning, then look up the commit status of the head of the MR.
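
A sketch of the first half of that approach, assuming a GitLab note (comment) webhook whose payload carries the merge request's `last_commit`, and assuming Atlantis comments as a user called `atlantis`. Both function names and that username are hypothetical. The returned URL targets GitLab's commit statuses endpoint, which the bot would then query with its own token.

```python
from typing import Optional


def head_sha_from_note_event(event: dict, atlantis_user: str = "atlantis") -> Optional[str]:
    """Return the MR head commit SHA if this webhook is an Atlantis note on an MR."""
    if event.get("object_kind") != "note":
        return None
    if event.get("object_attributes", {}).get("noteable_type") != "MergeRequest":
        return None
    # Assumed username; depends on the account Atlantis is configured with.
    if event.get("user", {}).get("username") != atlantis_user:
        return None
    return event["merge_request"]["last_commit"]["id"]


def statuses_url(base_url: str, project_id: int, sha: str) -> str:
    """Build the GitLab API URL listing commit statuses for the given commit."""
    return f"{base_url.rstrip('/')}/api/v4/projects/{project_id}/repository/commits/{sha}/statuses"
```

Notes from other users, or on issues and commits, are ignored so that only Atlantis plan comments trigger a status lookup.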

4 participants