Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a new policy check workflow #1317

Merged

Conversation

msarvar
Copy link
Contributor

@msarvar msarvar commented Dec 16, 2020

This PR adds policy check workflow into atlantis. It uses conftest to execute policies. At the moment you can only define policies locally and configure them in the server side config.

How it works

Policy checks step, if enabled, will always run after plan step. If the policy check fails an approval from a policy owner is required, to approve the policy owner needs to comment atlantis approve_policies in the PR . We decided against overloading the PR approval to also approve policies, for that reason there is a new approve_policies command to approve failing policies. This makes policy approval an explicit action that avoids any ambiguity that might be created by PR approvals.

To define the list of policy owners you will need to update server side config, currently we are only supporting github usernames(example below).

How to use it

  • Enable the policy checks by adding --enable-policy-checks=true flag to your server launch script.

  • Define your policy_sets, owners, and conftest_version in the server config under policies key. Example below:

policies:
  conftest_version: "0.21.0"
  owners:
    - johndoe
  policy_sets:
    - name: policy_name
      path: path/to/policy
      source: local

Note: Policy Checks do not work if you're using TFE as a backend. Policy Checks require plan output file and TFE doesn't have that option.

Notes

CommandRunner was broken down into ApplyCommandRunner, PlanCommandRunner, UnlockCommandRunner and ApprovePoliciesCommandRunner. Now CommandRunner just creates a required object in runtime based on request. There is not PolicyCheckCommandRunner because policy checks are running after plan step so the logic is encapsulated within the PlanCommandRunner.

This PR is not fully ready to be merged, I still need to add e2e tests and update the docs.

@codecov
Copy link

codecov bot commented Dec 16, 2020

Codecov Report

Merging #1317 (6458002) into master (10d7b28) will decrease coverage by 0.61%.
The diff coverage is 66.17%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1317      +/-   ##
==========================================
- Coverage   70.01%   69.40%   -0.62%     
==========================================
  Files          74       93      +19     
  Lines        5586     6291     +705     
==========================================
+ Hits         3911     4366     +455     
- Misses       1312     1543     +231     
- Partials      363      382      +19     
Impacted Files Coverage Δ
cmd/server.go 79.09% <ø> (ø)
server/events/event_parser.go 84.27% <0.00%> (-0.66%) ⬇️
server/events/terraform/terraform_client.go 79.14% <0.00%> (+1.01%) ⬆️
server/events/yaml/raw/global_cfg.go 0.00% <0.00%> (ø)
server/events/yaml/valid/policies.go 0.00% <0.00%> (ø)
server/events/yaml/valid/repo_cfg.go 11.11% <0.00%> (-10.32%) ⬇️
server/user_config.go 100.00% <ø> (ø)
server/events/policy_check_command_runner.go 21.87% <21.87%> (ø)
server/events/project_command_pool_executor.go 23.80% <23.80%> (ø)
server/events/project_command_runner.go 47.53% <25.31%> (-25.81%) ⬇️
... and 50 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 10d7b28...6458002. Read the comment docs.

@msarvar
Copy link
Contributor Author

msarvar commented Dec 16, 2020

Here are two screenshots showing an example workflow. The policy check is a bit contrived, it fails if a null_resource resource is being provisioned.

PR is created with null_resource being provisioned. Policy check fails and PR is updated with comment and PR status.

autoplan_failing_policy_check

Policy owner approves the policy and PR check is green and PR is updated with a comment.

approve_policies

And here is my policies config on the server side yaml.

policies:
  owners:
    - msarvar
  policy_sets:
    - name: no-null-resources
      path: $WORKING_DIR/policies/no-null-resources
      source: local

Policy that is being used

package main

import input as tfplan

deny[reason] {
	num_deletes.null_resource > 0
	reason := "WARNING: Null Resource creation is prohibited."
}

resource_types = {"null_resource"}

resources[resource_type] = all {
	some resource_type
	resource_types[resource_type]
	all := [name |
		name := tfplan.resource_changes[_]
		name.type == resource_type
	]
}

# number of deletions of resources of a given type
num_deletes[resource_type] = num {
	some resource_type
	resource_types[resource_type]
	all := resources[resource_type]
	deletions := [res | res := all[_]; res.change.actions[_] == "create"]
	num := count(deletions)
}

Copy link
Contributor

@nishkrishnan nishkrishnan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 . e2e tests and docs will be the final touch. Looks great!

@jamengual
Copy link
Contributor

Super cool I have to say this is an awesome feature to add.

@gwkunze
Copy link

gwkunze commented Dec 17, 2020

Amazing, I was literally just thinking about this. Some thoughts on policies though, it would be nice if multiple ways of policy-provisioning would be supported, like:

  1. Policies on atlantis server (the current way as I understand it)
  2. Policies from an OCI repository (conftest supports this, should be easy)
  3. Policies from the git repository in question (relative path to the checkout) either policy directory or a bundle

Another couple of thoughts on features I had:

  • Provide more than just the plan, also provide metadata (who opened the PR etc., who approved it (this would require multiple invocations of the policy whenever there's an update on the PR))
  • Allow three possible outcomes to policy evaluation: Outright reject (close the PR), nothing (wait for more updates on PR or an atlantis apply) or auto-approve

I'll try to understand the PR and see if I can contribute something

@nishkrishnan
Copy link
Contributor

Amazing, I was literally just thinking about this. Some thoughts on policies though, it would be nice if multiple ways of policy-provisioning would be supported, like:

  1. Policies on atlantis server (the current way as I understand it)
  2. Policies from an OCI repository (conftest supports this, should be easy)
  3. Policies from the git repository in question (relative path to the checkout) either policy directory or a bundle

Another couple of thoughts on features I had:

  • Provide more than just the plan, also provide metadata (who opened the PR etc., who approved it (this would require multiple invocations of the policy whenever there's an update on the PR))
  • Allow three possible outcomes to policy evaluation: Outright reject (close the PR), nothing (wait for more updates on PR or an atlantis apply) or auto-approve

I'll try to understand the PR and see if I can contribute something

2 and 3 are being planned on and the code seems structured well enough to support those with minimal effort. I think what makes 3 a bit difficult from supporting off the bat is running different policy sets under different conftest versions. With only server side right now its assumed that there is a single version for all policies. This shouldn't be done in the scope of this PR, and this PR is a very solid first step

@netguino
Copy link
Contributor

One of the things I would like to see here, is the ability to not have to depend on an actual file. It seems that the policies require a file path, so if I was to run atlantis stateless-ish, I'd love to be able to define all the policies the same way I can define the server side config by passing a json string.

Otherwise such a great feature! So much potential.

@nishkrishnan
Copy link
Contributor

One of the things I would like to see here, is the ability to not have to depend on an actual file. It seems that the policies require a file path, so if I was to run atlantis stateless-ish, I'd love to be able to define all the policies the same way I can define the server side config by passing a json string.

Otherwise such a great feature! So much potential.

It can be explored in a future PR, I think conftest requires a folder as input

@gwkunze
Copy link

gwkunze commented Dec 17, 2020

It would be more work, but since Atlantis is written in the same language as Open Policy Agent, instead of using conftest, it could embed OPA as a library https://www.openpolicyagent.org/docs/latest/integration/#integrating-with-the-go-api

That would allow more control of where policies come from perhaps

@nishkrishnan
Copy link
Contributor

It would be more work, but since Atlantis is written in the same language as Open Policy Agent, instead of using conftest, it could embed OPA as a library https://www.openpolicyagent.org/docs/latest/integration/#integrating-with-the-go-api

That would allow more control of where policies come from perhaps

policy runner backend should be pluggable, if someone wants to write OPA support they should be able to. Might need to change the interface for that.

Personally, prefer the simplicity of conftest and with newer features simplicity is 🔑

@jamengual
Copy link
Contributor

could it be possible to define teams instead of user for approvers?

@nishkrishnan
Copy link
Contributor

could it be possible to define teams instead of user for approvers?

we don't use teams, so i think supporting both would be better. @msarvar might actually be worthwhile to change

owners:
   - <user1>
   - <user2>

to

owners:
    users:
        - <user1>
        - <user2>

to allow us to add support for teams in the future without having to rev the major version

@msarvar
Copy link
Contributor Author

msarvar commented Dec 17, 2020

Amazing, I was literally just thinking about this. Some thoughts on policies though, it would be nice if multiple ways of policy-provisioning would be supported, like:

  1. Policies on atlantis server (the current way as I understand it)
  2. Policies from an OCI repository (conftest supports this, should be easy)
  3. Policies from the git repository in question (relative path to the checkout) either policy directory or a bundle

As @nishkrishnan mentioned we are planning on adding different ways to define your policies. Probably bundles are the next obvious and easy choice to implement.

  • Provide more than just the plan, also provide metadata (who opened the PR etc., who approved it (this would require multiple invocations of the policy whenever there's an update on the PR))

Can you expand on what would be the use case for this? Our main thinking was to keep policy enforcement scoped around your infrastructure changes.

  • Allow three possible outcomes to policy evaluation: Outright reject (close the PR), nothing (wait for more updates on PR or an atlantis apply) or auto-approve

This is something we considered to add but decided to keep things simple and only support pass/fail/approve for now. I think adding some extra functionality like auto-approve/auto-apply might make things interesting and improve devX.

@gwkunze
Copy link

gwkunze commented Jan 7, 2021

  • Provide more than just the plan, also provide metadata (who opened the PR etc., who approved it (this would require multiple invocations of the policy whenever there's an update on the PR))

Can you expand on what would be the use case for this? Our main thinking was to keep policy enforcement scoped around your infrastructure changes.

This sort of integrates with how I envisioned the auto-approval I also suggested would work, I think I can best illustrate this with some example flows.

Example 1:

We have some Terraform stack which represents cloud resources for development environment (maybe some instances, storage etc.) we invoke a module for every developer so everyone gets the same resources unless specific overrides are made.
We could have a policy that says that if the PR to make an override is made by user X and that user is in the group ADMIN it allows it. At the same time you could have a policy saying non-ADMIN users are only allowed to edit the resources of their own stack within certain limits.

In this case the metadata used is the attached username of the person who opened the PR and all the groups and memberships of the repository system in question, whether that is github, gitlab, bitbucket etc. This could also resolve the issue @jamengual raised

Example 2:

You use Kops with the Terraform target to generate a terraform module.
You have some form of cronjob that runs periodically to refresh the terraform module (new, patched machine images for example, or automated minor/patch version upgrade of Kubernetes). The module is automatically pushed to git and a PR is opened to use the latest version by a bot user. A policy runs that checks the plan and determines the changes fall within expectations of a routine update and auto-approves them. In the case it falls out of the expected routine changes, it can still run once at least two users in the team KubeOps have approved the PR.

Here the flow of approval itself you could say is part of the metadata, if you were to have the input JSON look something like the following psuedo-json all kinds of custom flows could be made possible:

{
  "plan": {...} // The JSON-ified terraform plan
  "repository_owner": "runatlantis",
  "repository_name": "atlantis",
  "pull-request": {
    "author": "msarvar",
    "approval": ["jamengual", "nishkrishnan"]
    "comments": [
      {
        "author": "gwkunze",
        "comment": "Amazing, I was literally just thinking about this...."
      }
    ]
  },
  "teams": {
    "ADMIN": ["some-admin-person", "another-admin-person"],
    "kubeops": [...]
  }
}

You could even use this to (potentially) get rid of certain built-in commands like atlantis approve_policies as you could make that part of the flow of the OPA evaluation by inspecting the comments.

@nishkrishnan nishkrishnan added the waiting-on-response Waiting for a response from the user label Jan 28, 2021
@msarvar msarvar force-pushed the adding-conftest-as-policy-check-step branch from 38208f5 to 98386ba Compare February 1, 2021 21:43
Nish Krishnan and others added 2 commits February 9, 2021 13:48
* [ORCA-666] Ensure failing policy checks don't discard the locks.

* Fix tests

* fix.
@nishkrishnan nishkrishnan added waiting-on-review Waiting for a review from a maintainer and removed waiting-on-response Waiting for a response from the user labels Feb 9, 2021
// can happen once at the beginning
envVars = append(envVars, os.Environ()...)

// honestly not entirely sure why we're using sh -c but it's used
Copy link
Member

@lkysow lkysow Feb 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's because folks set extra_args and other things that may require shell interpolation

@msarvar msarvar force-pushed the adding-conftest-as-policy-check-step branch from 89f1f74 to d3d1a6c Compare February 10, 2021 03:07
@msarvar msarvar force-pushed the adding-conftest-as-policy-check-step branch from 6cae32f to 38e05e7 Compare February 10, 2021 21:45
@utamas
Copy link

utamas commented May 24, 2021

Hi @msarvar,

I'm trying to get conftest working but I'm getting an error:

Error: running test: load: loading policies: load: 1 error occurred during loading: stat /terraform/policies/null_resource_warning: no such file or directory

The documentation does not covert if the path is relative or what env variables are recognized.

Could you please help me?

@utamas
Copy link

utamas commented May 25, 2021

@stark-tech25
Copy link

stark-tech25 commented Feb 23, 2022

@nishkrishnan
Does Atlantis support gitHub groups for approving polices instead of individual users? If not, can we add this feature? This would help provide entire group to have the ability to approve polices instead of each user.

@jamengual
Copy link
Contributor

if you want to propose a feature you can create an issue or commit code on a PR, contributions are welcome.
I know Atlantis supports apply command filtering by groups, I do not know if policy-checks can do but you could potentially use pre-workflow hooks with a script to check that

@stark-tech25
Copy link

Created a ticket for the group feature - #2116

@mikecutalo mikecutalo deleted the adding-conftest-as-policy-check-step branch June 8, 2023 18:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting-on-review Waiting for a review from a maintainer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants