v0.22.2+: atlantis/post_workflow_hook errors after apply when using --automerge flag due to deleted directory #3031

Open
Yasmine92 opened this issue Jan 23, 2023 · 19 comments
Labels
bug Something isn't working collab hook never-stale regression Bug introduced in a new version

Comments

@Yasmine92

Yasmine92 commented Jan 23, 2023

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.

Overview of the Issue

When using the --automerge flag, the post-workflow hooks are executed after the PR is merged, by which point the PR directory and the branch have already been deleted. This results in errors in the logs about not being able to read the current working directory or fetch from origin.

Reproduction Steps

  • Enable automerge by adding the --automerge flag
  • Add a post-workflow hook
  • Open a pull request and run atlantis apply

Logs

{"level":"info","ts":"2023-01-24T09:58:33.477Z","caller":"events/events_controller.go:542","msg":"parsed comment as command=\"apply\" verbose=false dir=\"\" workspace=\"\" project=\"\" flags=\"\"","json":{"gh-request-id":"X-Github-Delivery=a9221ee0-9bcd-11ed-9d8e-31769df0a119"}}
{"level":"info","ts":"2023-01-24T09:58:37.115Z","caller":"terraform/terraform_client.go:317","msg":"Cannot determine which version to use from terraform configuration, detected 2 possibilities.","json":{"repo":"example/sandbox-project","pull":"19"}}
{"level":"info","ts":"2023-01-24T09:58:37.116Z","caller":"terraform/terraform_client.go:317","msg":"Cannot determine which version to use from terraform configuration, detected 2 possibilities.","json":{"repo":"example/sandbox-project","pull":"19"}}
{"level":"info","ts":"2023-01-24T09:58:37.460Z","caller":"runtime/apply_step_runner.go:39","msg":"starting apply","json":{"repo":"example/sandbox-project","pull":"19"}}
{"level":"info","ts":"2023-01-24T09:58:41.988Z","caller":"models/shell_command_runner.go:156","msg":"successfully ran \"/usr/local/bin/terraform apply -input=false \\\"/atlantis-data/repos/example/sandbox-project/19/dev/terraform/dev-dev.tfplan\\\"\" in \"/atlantis-data/repos/example/sandbox-project/19/dev/terraform\"","json":{"repo":"example/sandbox-project","pull":"19"}}
{"level":"info","ts":"2023-01-24T09:58:41.989Z","caller":"runtime/apply_step_runner.go:58","msg":"apply successful, deleting planfile","json":{"repo":"example/sandbox-project","pull":"19"}}
{"level":"info","ts":"2023-01-24T09:58:42.363Z","caller":"events/instrumented_project_command_runner.go:82","msg":"apply success. output available at: https://github.com/example/sandbox-project/pull/19","json":{"repo":"example/sandbox-project","pull":"19"}}
{"level":"info","ts":"2023-01-24T09:58:45.244Z","caller":"events/automerger.go:32","msg":"automerging pull request","json":{"repo":"example/sandbox-project","pull":"19"}}
{"level":"info","ts":"2023-01-24T09:58:47.986Z","caller":"events/instrumented_pull_closed_executor.go:45","msg":"Initiating cleanup of pull data.","json":{"repository":"example/sandbox-project","pull-num":"19"}}
{"level":"warn","ts":"2023-01-24T09:58:48.154Z","caller":"events/working_dir.go:168","msg":"getting remote update failed: Fetching origin\nerror: cannot open '.git/FETCH_HEAD': No such file or directory\nerror: could not fetch origin\nFetching head\nfatal: Unable to read current working directory: No such file or directory\nerror: could not fetch head\nfatal: Unable to read current working directory: No such file or directory\n","json":{"repo":"example/sandbox-project","pull":"19"},"stacktrace":"github.com/runatlantis/atlantis/server/events.(*FileWorkspace).warnDiverged\n\tgithub.com/runatlantis/atlantis/server/events/working_dir.go:168\ngithub.com/runatlantis/atlantis/server/events.(*FileWorkspace).Clone\n\tgithub.com/runatlantis/atlantis/server/events/working_dir.go:117\ngithub.com/runatlantis/atlantis/server/events.(*DefaultPostWorkflowHooksCommandRunner).RunPostHooks\n\tgithub.com/runatlantis/atlantis/server/events/post_workflow_hooks_command_runner.go:69\ngithub.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunCommentCommand\n\tgithub.com/runatlantis/atlantis/server/events/command_runner.go:298"}
{"level":"info","ts":"2023-01-24T09:58:49.011Z","caller":"events/events_controller.go:470","msg":"deleted locks and workspace for repo example/sandbox-project, pull 19","json":{"gh-request-id":"X-Github-Delivery=b183cfc0-9bcd-11ed-9de7-c11f64f1d767"}}
{"level":"error","ts":"2023-01-24T09:58:49.101Z","caller":"events/command_runner.go:301","msg":"Error running post-workflow hooks chdir /atlantis-data/repos/example/sandbox-project/19/default: no such file or directory: running \"ls -l /etc/atlantis/repos.yaml\" in \"/atlantis-data/repos/example/sandbox-project/19/default\": \n.","json":{"repo":"example/sandbox-project","pull":"19"},"stacktrace":"github.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunCommentCommand\n\tgithub.com/runatlantis/atlantis/server/events/command_runner.go:301"}

Environment details

  • Atlantis version: 0.22.3
  • Atlantis flags: --enable-diff-markdown-format --automerge --checkout-strategy=merge --hide-prev-plan-comments --enable-policy-checks --gh-allow-mergeable-bypass-apply --quiet-policy-checks

Post-workflow hook configuration from the server config:

post_workflow_hooks:
   - run: ls -l /etc/atlantis/repos.yaml

Repo atlantis.yaml file:

---
# atlantis.yaml
version: 3
parallel_plan: true
projects:
- name: dev
  dir: terraform
  autoplan:
    when_modified: ["*.tf*"]
  workflow: dev
  workspace: dev
workflows:
  dev:
    plan:
      steps:
      - init:
      - plan:
          extra_args: ["-var-file", "environments/dev.tfvars"]

Additional Context

@Yasmine92 Yasmine92 added the bug Something isn't working label Jan 23, 2023
@nitrocode
Member

Could you include your yaml configuration such as your post workflow hook?

Looks like this is the main error

Error running post-workflow hooks chdir /atlantis-data/repos///7/default: no such file or directory: running "rm -rf /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM"

@jamengual
Contributor

jamengual commented Jan 23, 2023 via email

@Yasmine92
Author

/atlantis-data/repos///7/default:

no, I just tried to hide the repo and organisation name and it resulted in that 🤦. I'll update the issue with a different log, using a simple post workflow hook

@Yasmine92
Author

Yasmine92 commented Jan 24, 2023

@jamengual and @nitrocode I updated the issue with another simpler example (different and simpler post workflow), and the log from the apply time.

@nitrocode
Member

@Yasmine92 can you ssh to the pod and try to access the directory it's complaining about? Can you cd to it?

cd /atlantis-data/repos/example/sandbox-project/19/default

@Yasmine92
Author

> @Yasmine92 can you ssh to the pod and try to access the directory it's complaining about? Can you cd to it?
>
> cd /atlantis-data/repos/example/sandbox-project/19/default

Well from my understanding, the directory for the PR is cleaned up after the automerge at this step:

{"level":"info","ts":"2023-01-24T09:58:45.244Z","caller":"events/automerger.go:32","msg":"automerging pull request","json":{"repo":"example/sandbox-project","pull":"19"}}
{"level":"info","ts":"2023-01-24T09:58:47.986Z","caller":"events/instrumented_pull_closed_executor.go:45","msg":"Initiating cleanup of pull data.","json":{"repository":"example/sandbox-project","pull-num":"19"}}

so it's normal that the directory is already gone after merge:

$ ls /atlantis-data/repos/example/sandbox-project/19           
ls: /atlantis-data/repos/example/sandbox-project/19: No such file or directory

@Yasmine92
Author

Yasmine92 commented Jan 24, 2023

But the thing is that events/working_dir.go is called before running the post-workflow hook, so it always tries to cd to the PR's directory (which is already gone once the PR is merged) before executing the post-workflow action, even when that action has nothing to do with that directory, like the ls -l /etc/atlantis/repos.yaml example I showed.
So I think it would be better to change the logic of the code to execute the post-workflow hooks before automerging.
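
The failure mode described above can be reproduced outside Atlantis. The following is a minimal sketch, not Atlantis code: runHook, isMissingDir, and demo are hypothetical names, and a Unix-like system with sh on the PATH is assumed. It shows that Go's os/exec refuses to start a command whose working directory no longer exists, even when the command itself never touches that directory, which would explain the "chdir ...: no such file or directory" error in the log.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"os/exec"
)

// runHook runs a shell command with dir as its working directory.
// Hypothetical helper for this sketch; not Atlantis code.
func runHook(dir, command string) error {
	cmd := exec.Command("sh", "-c", command)
	cmd.Dir = dir
	return cmd.Run()
}

// isMissingDir reports whether err wraps fs.ErrNotExist.
func isMissingDir(err error) bool {
	return errors.Is(err, fs.ErrNotExist)
}

// demo creates a stand-in for the pull directory, deletes it the way the
// cleanup after automerge would, then tries to run a hook from it.
func demo() error {
	dir, err := os.MkdirTemp("", "pull-19-")
	if err != nil {
		return fmt.Errorf("creating temp dir: %w", err)
	}
	if err := os.RemoveAll(dir); err != nil {
		return fmt.Errorf("removing temp dir: %w", err)
	}
	// The command never touches dir, mirroring the ls example in the issue.
	return runHook(dir, "ls -l /")
}

func main() {
	err := demo()
	fmt.Println("hook error:", err)
	fmt.Println("caused by missing directory:", isMissingDir(err))
}
```

The command is never started: the error comes from the attempt to switch into the deleted directory, so any hook would fail the same way regardless of what it runs.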

@nitrocode
Member

@Fabianoshz for your thoughts here.

@Yasmine92 please feel free to test a change locally and propose it. What's odd is that I'm using the latest version, most of the above flags, and not experiencing this. I wonder what condition would make the cleanup happen prior to the post workflow run?

@Yasmine92
Author

> @Fabianoshz for your thoughts here.
>
> @Yasmine92 please feel free to test a change locally and propose it. What's odd is that I'm using the latest version, most of the above flags, and not experiencing this. I wonder what condition would make the cleanup happen prior to the post workflow run?

@nitrocode are you using both --automerge and a post-workflow hook? How does it look in the logs when you do an apply?
Sure, I'll give it a try :)

@Fabianoshz
Contributor

@Yasmine92 can you check if you can access the PR directory?

  • This should work: ls /atlantis-data/repos/example/sandbox-project/19/ - This is the PR directory.
  • This should not: ls atlantis-data/repos/example/sandbox-project/19/default

Taking from memory I believe the project directory is deleted right after the apply, while the PR directory lives until we receive a merged event. Again, taking from memory I might be wrong.

@Yasmine92
Author

> @Yasmine92 can you check if you can access the PR directory?
>
>   • This should work: ls /atlantis-data/repos/example/sandbox-project/19/ - This is the PR directory.
>   • This should not: ls atlantis-data/repos/example/sandbox-project/19/default
>
> Taking from memory I believe the project directory is deleted right after the apply, while the PR directory lives until we receive a merged event. Again, taking from memory I might be wrong.

Thanks @Fabianoshz. The PR directory /atlantis-data/repos/example/sandbox-project/19/ is deleted after apply because, with the --automerge flag, the merge event arrives automatically right after apply. That's the exact bug I'm pointing out: using --automerge combined with a post-workflow hook causes the PR's directory to be deleted before the post-workflow hook is executed.

@nitrocode nitrocode added the hook label Jan 25, 2023
@nitrocode nitrocode changed the title Post-workflow-hooks execution errors when using --automerge flag atlantis/post_workflow_hook errors when using --automerge flag Jan 25, 2023
@nitrocode nitrocode changed the title atlantis/post_workflow_hook errors when using --automerge flag atlantis/post_workflow_hook errors after apply when using --automerge flag Jan 25, 2023
@bob-rohan

Seeing this race condition also. Thanks for the notes. So the directory is deleted on receipt of the merge event; if that event is received before the post-workflow hook runs, then the directory from which the post-workflow command would be run no longer exists. Is this a correct summary?

@jamengual
Contributor

jamengual commented Jan 26, 2023 via email

@nitrocode
Member

nitrocode commented Jan 27, 2023

I'm seeing this too on 0.22.3.

I haven't checked previous versions yet. Has anyone gone back to see whether a recent version introduced this? Has anyone tested with 0.22.2, 0.22.1, or 0.22.0? I'd go as far back as 0.21.0 to see if it's a regression and go up from there.

this issue may be related to

@cilindrox
Contributor

@nitrocode I made the jump from v0.19.8 (working) to v0.22.2 (merge error) and currently v0.22.3, which also has the merge error.

@nitrocode nitrocode added the regression Bug introduced in a new version label Jan 27, 2023
@nitrocode
Member

nitrocode commented Jan 27, 2023

Ah, thank you, so this is definitely a regression. If anyone gets a chance, please test earlier versions to see if we can pinpoint the PR that introduced this breaking change.

Cc @Fabianoshz in case you or others can spot the issue without checking individual versions

v0.19.8...v0.22.2

@nitrocode nitrocode changed the title atlantis/post_workflow_hook errors after apply when using --automerge flag v0.22.2+: atlantis/post_workflow_hook errors after apply when using --automerge flag Jan 27, 2023
@nitrocode nitrocode changed the title v0.22.2+: atlantis/post_workflow_hook errors after apply when using --automerge flag v0.22.2+: atlantis/post_workflow_hook errors after apply when using --automerge flag due to deleted directory Jan 30, 2023
@nitrocode
Member

nitrocode commented Jan 30, 2023

@Fabianoshz is it possible that the post-workflow hooks never used to run from the PR directory until a recent PR? If so, then we'd just have to run the delete after the post-workflow run completes.

This deletes the dir

if err := p.WorkingDir.Delete(repo, pull); err != nil {

This is the function call that deletes the dir

func (p *PullClosedExecutor) CleanUpPull(repo models.Repo, pull models.PullRequest) error {

func (e *InstrumentedPullClosedExecutor) CleanUpPull(repo models.Repo, pull models.PullRequest) error {

if err := e.PullCleaner.CleanUpPull(baseRepo, pull); err != nil {

func (e *VCSEventsController) handlePullRequestEvent(logger logging.SimpleLogging, baseRepo models.Repo, headRepo models.Repo, pull models.PullRequest, user models.User, eventType models.PullRequestEventType) HTTPResponse {

autoPlanRunner := buildCommentCommandRunner(c, command.Plan)
autoPlanRunner.Run(ctx, nil)
err = c.PostWorkflowHooksCommandRunner.RunPostHooks(ctx, nil)
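
One way to picture the "run the delete after the post workflow run completes" idea is a per-pull lock shared by the command path and the cleanup path. The sketch below is only an illustration under stated assumptions: pullLocks, runCommand, cleanUpPull, and demo are hypothetical names, and this is not how Atlantis implements its own locking. The command path holds the lock until the hook has finished, so a cleanup triggered by the merge event has to wait.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// pullLocks hands out one mutex per pull request.
// Hypothetical helper for this sketch; not an Atlantis type.
type pullLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func (p *pullLocks) get(key string) *sync.Mutex {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.locks == nil {
		p.locks = make(map[string]*sync.Mutex)
	}
	if _, ok := p.locks[key]; !ok {
		p.locks[key] = &sync.Mutex{}
	}
	return p.locks[key]
}

// runCommand stands in for the comment-command path. It holds the pull's
// lock across the merge and the hook, and reports whether the hook still
// found the working directory. onMerge simulates the pull-closed webhook.
func runCommand(locks *pullLocks, key, dir string, onMerge func()) bool {
	l := locks.get(key)
	l.Lock()
	defer l.Unlock()

	onMerge() // automerge fires; cleanup may start in another goroutine

	_, err := os.Stat(dir) // the hook would chdir into dir here
	return err == nil
}

// cleanUpPull stands in for the pull-closed path. It waits for the same
// lock, so it cannot delete dir while a command is still using it.
func cleanUpPull(locks *pullLocks, key, dir string) error {
	l := locks.get(key)
	l.Lock()
	defer l.Unlock()
	return os.RemoveAll(dir)
}

// demo runs one command with a cleanup racing against it.
func demo() (hookSawDir, dirGone bool) {
	dir, err := os.MkdirTemp("", "pull-19-")
	if err != nil {
		panic(err)
	}
	locks := &pullLocks{}
	key := "example/sandbox-project#19"

	var wg sync.WaitGroup
	hookSawDir = runCommand(locks, key, dir, func() {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = cleanUpPull(locks, key, dir)
		}()
	})

	wg.Wait() // cleanup proceeds once runCommand releases the lock
	_, err = os.Stat(dir)
	return hookSawDir, os.IsNotExist(err)
}

func main() {
	hookSawDir, dirGone := demo()
	fmt.Println("hook saw directory:", hookSawDir)
	fmt.Println("directory removed afterwards:", dirGone)
}
```

Both values should come out true: the hook sees the directory because the cleanup is blocked until the command path releases the lock, and the directory is removed once it does.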

@weeezes

weeezes commented Mar 9, 2023

I'm not sure if this is what's causing issues for others here, but for some reason a perfectly well functioning pre-workflow-hook started to fail for me yesterday. Turns out the issue was that my 🪄 magical diff script 🪄 left things in an unwanted state, which then got cleaned up, possibly by the logic here: https://github.dev/runatlantis/atlantis/blob/ba7b67a42cf8105fbbbe4a1d003e06cca58fc2a0/server/events/working_dir.go#L97-L98 After I set the pre-workflow-hook to do git checkout $HEAD_COMMIT at the end, things started to work again.

@arohter

arohter commented Sep 19, 2024

Still an issue in v0.29.0 afaik. Our workaround is to move the command out of post_workflow_hooks and into a run: step in the workflow's steps.


8 participants