Atlantis 0.17.2 not always creating 'default' working dir #1714

Open
srlightbody opened this issue Jul 21, 2021 · 19 comments
Labels
bug (Something isn't working), help wanted (Good feature for contributors), never-stale, regression (Bug introduced in a new version)

Comments

@srlightbody

We've noticed some odd behavior after upgrading from 0.16.1 to 0.17.2. The behavior is:

  • User creates a PR in GitHub
  • Atlantis creates the repo folder and PR number folder in /home/atlantis/.atlantis/repos
  • Atlantis does not create the default directory nor clone into it
  • Atlantis attempts to check if the default workspace exists and fails with an error

Here's some debug level log output showing the issue:
2021-07-21 13:24:48.491 MDT{caller: events/events_controller.go:417, json: {…}, level: info, msg: parsed comment as command="plan" verbose=false dir="" workspace="company-daily" project="" flags="", ts: 2021-07-21T19:24:48.490Z}
2021-07-21 13:24:48.491 MDT{caller: events/events_controller.go:439, json: {…}, level: debug, msg: executing command, ts: 2021-07-21T19:24:48.490Z}
2021-07-21 13:24:48.491 MDT{caller: server/middleware.go:37, json: {…}, level: debug, msg: POST /events – respond HTTP 200, ts: 2021-07-21T19:24:48.490Z}
2021-07-21 13:24:48.818 MDT{caller: server/server.go:749, json: {…}, level: info, msg: Apply Lock: {false 0001-01-01 00:00:00 +0000 UTC }, ts: 2021-07-21T19:24:48.818Z}
2021-07-21 13:24:48.885 MDT{caller: server/server.go:749, json: {…}, level: info, msg: Apply Lock: {false 0001-01-01 00:00:00 +0000 UTC }, ts: 2021-07-21T19:24:48.883Z}
2021-07-21 13:24:49.245 MDT{caller: events/project_command_builder.go:287, json: {…}, level: debug, msg: building plan command, ts: 2021-07-21T19:24:49.244Z}
2021-07-21 13:24:49.245 MDT{caller: events/project_command_builder.go:294, json: {…}, level: debug, msg: cloning repository, ts: 2021-07-21T19:24:49.244Z}
2021-07-21 13:24:49.245 MDT{caller: events/working_dir.go:202, json: {…}, level: info, msg: creating dir "/home/atlantis/.atlantis/repos/company/atlantis-foo/218/company-daily", ts: 2021-07-21T19:24:49.244Z}
2021-07-21 13:24:49.884 MDT{caller: events/working_dir.go:268, json: {…}, level: debug, msg: ran: git clone --branch 5625048_daily_staging --depth=1 --single-branch https://companyatlantis:<redacted>@github.com/company/atlantis-foo.git /home/atlantis/.atlantis/repos/company/atlantis-foo/218/company-daily.
Output: Cloning into '/h…
2021-07-21 13:24:49.886 MDT{caller: server/server.go:749, json: {…}, level: info, msg: Apply Lock: {false 0001-01-01 00:00:00 +0000 UTC }, ts: 2021-07-21T19:24:49.886Z}
2021-07-21 13:24:50.226 MDT{caller: events/pull_updater.go:14, json: {…}, level: error, msg: checking if workspace exists: stat /home/atlantis/.atlantis/repos/company/atlantis-foo/218/default: no such file or directory, stacktrace: github.com/runatlantis/atlantis/server/events.(*PullUpdater).updatePull /home/circleci/proje…

The full log for that last line is:
{ "caller": "events/pull_updater.go:14", "json": { ... }, "msg": "checking if workspace exists: stat /home/atlantis/.atlantis/repos/companymaps/atlantis-foo/218/default: no such file or directory", "stacktrace": "github.com/runatlantis/atlantis/server/events.(*PullUpdater).updatePull /home/circleci/project/server/events/pull_updater.go:14 github.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).run /home/circleci/project/server/events/plan_command_runner.go:162 github.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).Run /home/circleci/project/server/events/plan_command_runner.go:223 github.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunCommentCommand /home/circleci/project/server/events/command_runner.go:212", "ts": "2021-07-21T19:24:50.225Z", "level": "error" }

The issue is intermittent, i.e. I can close out a PR that has had the issue, open a new one with the same commits in it, and the new one will work just fine. Rolling Atlantis back to 0.16.1 completely resolves the issue.

I've spent some time today digging around, and I think it may be related to the change introduced in #1620 in some way: it seems like Atlantis is attempting to use the default directory without it ever being initialized. We do use a custom workflow for our planning step that adds a simplified output comment for users, and an atlantis.yaml with file-specific autoplan triggers, but if the problem is an interaction with those, I have not figured it out yet.
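
The mismatch described in this report can be illustrated outside Atlantis. The Go sketch below is not Atlantis code: the helpers workspaceDir, checkDefaultAfterClone and reproduceInTempDir are hypothetical names, and a plain directory creation stands in for the git clone. It creates only the named workspace directory under the layout seen in the logs and then stats default, which fails in the same way when the workspace is not default.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strconv"
)

// workspaceDir mirrors the directory layout visible in the logs:
// <data dir>/repos/<owner>/<repo>/<pull number>/<workspace>.
// Hypothetical helper, not an Atlantis function.
func workspaceDir(dataDir, repoFullName string, pullNum int, workspace string) string {
	return filepath.Join(dataDir, "repos", repoFullName, strconv.Itoa(pullNum), workspace)
}

// checkDefaultAfterClone creates only the named workspace directory
// (standing in for the git clone) and then stats the "default" directory.
func checkDefaultAfterClone(dataDir, workspace string) error {
	cloned := workspaceDir(dataDir, "company/atlantis-foo", 218, workspace)
	if err := os.MkdirAll(cloned, 0o700); err != nil {
		return err
	}
	if _, err := os.Stat(workspaceDir(dataDir, "company/atlantis-foo", 218, "default")); err != nil {
		return fmt.Errorf("checking if workspace exists: %w", err)
	}
	return nil
}

// reproduceInTempDir runs the check in a throwaway directory and reports
// whether it failed because "default" does not exist.
func reproduceInTempDir(workspace string) (bool, error) {
	dataDir, err := os.MkdirTemp("", "atlantis-sketch-")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dataDir)
	err = checkDefaultAfterClone(dataDir, workspace)
	return errors.Is(err, fs.ErrNotExist), err
}

func main() {
	missing, err := reproduceInTempDir("company-daily")
	fmt.Println("default missing:", missing)
	fmt.Println("error:", err)
}
```

Passing "default" as the workspace makes the same check succeed, which is consistent with the error only showing up for custom workspaces.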

@msarvar
Contributor

msarvar commented Jul 23, 2021

@srlightbody Are you triggering atlantis plan through a GitHub comment?
I think this might happen when autoplan is not triggered because there are no changes in the code, and no pre_workflow_hook is present. If either autoplan or a pre_workflow_hook runs, it will create the default folder. If neither runs and you trigger the plan with a PR comment (i.e. atlantis plan -w <workspace-name>), this error will happen. Is that the case?
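
The hypothesis in this comment can be written down as a small predicate. This is only a sketch of the condition as stated here, not of Atlantis internals: pullState, defaultDirCreated and commentPlanWouldFail are hypothetical names, and the rule that planning the default workspace itself is unaffected is an assumption added for illustration.

```go
package main

import "fmt"

// pullState records which of the two steps named in the comment ran for a PR.
// Hypothetical type, used only to express the hypothesis.
type pullState struct {
	AutoplanRan        bool
	PreWorkflowHookRan bool
}

// defaultDirCreated: per the hypothesis, either step creates the default folder.
func defaultDirCreated(s pullState) bool {
	return s.AutoplanRan || s.PreWorkflowHookRan
}

// commentPlanWouldFail reports whether a comment-triggered plan for the given
// workspace would hit the missing default directory. Treating the "default"
// workspace as safe is an assumption, not something stated in the thread.
func commentPlanWouldFail(s pullState, workspace string) bool {
	return workspace != "default" && !defaultDirCreated(s)
}

func main() {
	fmt.Println(commentPlanWouldFail(pullState{}, "company-daily"))
	fmt.Println(commentPlanWouldFail(pullState{PreWorkflowHookRan: true}, "company-daily"))
}
```

Under this model a no-op hook is enough to avoid the error, which matches the workaround reported further down the thread.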

@msarvar added the waiting-on-response (Waiting for a response from the user) label Jul 23, 2021
@srlightbody
Author

I've done some more digging, and I think there were two distinct issues going on, which made this extra confusing. A bunch of our webhooks were failing with a 301 after the upgrade: the hostname in the URL we were using as the hook target ended in a ., i.e. https://atlantis.endpoint./events. For some reason that started causing a 301. I've since rolled out a change that fixes the hooks, and am going to retry the upgrade to 0.17.2 today so I can do more thorough testing.
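
A trailing dot on the hostname is easy to miss when a hook URL is assembled from a fully qualified DNS name. As a minimal sketch, assuming you want to normalize the hook target before registering it, the Go function below strips that dot using the standard net/url package. stripTrailingDot is a hypothetical helper, and this does not explain why the redirect started after the upgrade.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strings"
)

// stripTrailingDot removes a trailing "." from the hostname of raw,
// keeping the scheme, port, path and query unchanged.
func stripTrailingDot(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	host := strings.TrimSuffix(u.Hostname(), ".")
	port := u.Port()
	switch {
	case port != "":
		// JoinHostPort re-adds brackets around IPv6 literals when needed.
		host = net.JoinHostPort(host, port)
	case strings.Contains(host, ":"):
		// IPv6 literal without a port: Hostname() dropped the brackets.
		host = "[" + host + "]"
	}
	u.Host = host
	return u.String(), nil
}

func main() {
	fixed, err := stripTrailingDot("https://atlantis.endpoint./events")
	if err != nil {
		panic(err)
	}
	fmt.Println(fixed)
}
```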

That being said, when the issue was occurring it was with autoplans being prompted by an atlantis.yaml in the repo. We trigger autoplans based on changed files, and select a workspace as part of that. The default workspace is unused.

@msarvar added the bug and help wanted labels and removed the waiting-on-response label Jul 27, 2021
@msarvar
Contributor

msarvar commented Jul 27, 2021

@srlightbody This is definitely a bug and needs to be fixed. I think one potential workaround could be adding a no-op pre-workflow hook. Can you try adding the following to the config:

pre_workflow_hooks:
   - run: echo "do nothing"

Let me know if that mitigates the issue for the time being.

@askmike1

I get this same error under the same conditions. We are updating from 0.16.1 -> 0.17.2. Autoplans are disabled and we currently do not have a pre_workflow_hook. As a workaround, I was able to get past this by adding the following to my repos.yaml:

  pre_workflow_hooks:
    - run: echo "workaround"

@emulanob

Hi there!

I'm facing a similar situation. Upgrading from v0.16.1 to v0.17.2 or any later version makes all my plans fail with that same error:

"checking if workspace exists: stat /home/atlantis/.atlantis/repos/${repo-name}/terraform/${pull-request-id}/default: no such file or directory"

Important context:

  • Running on bitbucket cloud
  • Plan/apply triggered via comments only
  • Using workspaces

Example command in a comment:

atlantis plan -d path/to/changes -w foo

Atlantis logs for above command:

{"level":"info","ts":"2022-08-17T12:21:15.342Z","caller":"events/events_controller.go:417","msg":"parsed comment as command=\"plan\" verbose=false dir=\"path/to/changes\" workspace=\"foo\" project=\"\" flags=\"\"","json":{}}
{"level":"info","ts":"2022-08-17T12:21:15.825Z","caller":"events/working_dir.go:202","msg":"creating dir \"/home/atlantis/.atlantis/repos/my-org/my-repo/3843/foo\"","json":{"repo":"my-org/my-repo","pull":"3843"}}
{"level":"error","ts":"2022-08-17T12:21:18.698Z","caller":"events/pull_updater.go:14","msg":"checking if workspace exists: stat /home/atlantis/.atlantis/repos/my-org/my-repo/3843/default: no such file or directory","json":{"repo":"my-org/my-repo","pull":"3843"},"stacktrace":"github.com/runatlantis/atlantis/server/events.(*PullUpdater).updatePull\n\t/home/runner/work/atlantis/atlantis/server/events/pull_updater.go:14\ngithub.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).run\n\t/home/runner/work/atlantis/atlantis/server/events/plan_command_runner.go:162\ngithub.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).Run\n\t/home/runner/work/atlantis/atlantis/server/events/plan_command_runner.go:223\ngithub.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunCommentCommand\n\t/home/runner/work/atlantis/atlantis/server/events/command_runner.go:212"}

Additional comments

  • Notice how Atlantis first creates a directory using the workspace name ("msg":"creating dir \"/home/atlantis/.atlantis/repos/my-org/my-repo/3843/foo\") and then looks for a directory named default ("msg":"checking if workspace exists: stat /home/atlantis/.atlantis/repos/my-org/my-repo/3843/default: no such file or directory").
  • Running version v0.17.1 works fine - will stick to it until the issue is fixed.
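
The pattern pointed out in these comments (a directory created for one workspace, then a stat on a sibling named default) can be detected mechanically in the JSON logs. The Go sketch below is a hypothetical diagnostic helper, not part of Atlantis: findMismatch and the two regular expressions are assumptions based only on the message formats shown in the log excerpt above, and the sample lines are shortened to the level and msg fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path"
	"regexp"
)

// Message formats copied from the log excerpt; other versions may differ.
var (
	createdRe = regexp.MustCompile(`^creating dir "(.+)"$`)
	statRe    = regexp.MustCompile(`^checking if workspace exists: stat (.+): no such file or directory$`)
)

// mismatch holds the workspace that was created and the one that was checked.
type mismatch struct {
	Created string
	Checked string
}

// findMismatch scans JSON log lines and reports the first case where the
// stat'd directory is a sibling of the created one but has a different name.
func findMismatch(lines []string) (mismatch, bool) {
	var created string
	for _, line := range lines {
		var entry struct {
			Msg string `json:"msg"`
		}
		if err := json.Unmarshal([]byte(line), &entry); err != nil {
			continue // skip lines that are not JSON
		}
		if m := createdRe.FindStringSubmatch(entry.Msg); m != nil {
			created = m[1]
			continue
		}
		if m := statRe.FindStringSubmatch(entry.Msg); m != nil && created != "" {
			if path.Dir(m[1]) == path.Dir(created) && path.Base(m[1]) != path.Base(created) {
				return mismatch{Created: path.Base(created), Checked: path.Base(m[1])}, true
			}
		}
	}
	return mismatch{}, false
}

func main() {
	lines := []string{
		`{"level":"info","msg":"creating dir \"/home/atlantis/.atlantis/repos/my-org/my-repo/3843/foo\""}`,
		`{"level":"error","msg":"checking if workspace exists: stat /home/atlantis/.atlantis/repos/my-org/my-repo/3843/default: no such file or directory"}`,
	}
	if m, ok := findMismatch(lines); ok {
		fmt.Printf("created %q but checked %q\n", m.Created, m.Checked)
	}
}
```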

@jamengual
Contributor

Is this still an issue with v0.19.8?

@jamengual added the waiting-on-response label Aug 26, 2022
@emulanob

Hi @jamengual. Yes, I started by upgrading from v0.16.1 to v0.19.8, which failed, and then downgraded until I reached a version that worked.

@jamengual removed the waiting-on-response label Aug 26, 2022
@j0rzsh

j0rzsh commented Sep 1, 2022

I'm working with the latest version, which is currently v0.19.9-pre.2022082, and the same error is happening.
Git: bitbucket cloud
Using workspaces

The pre_workflow_hooks workaround mentioned earlier in this thread makes it work.

@sujeets-toast

sujeets-toast commented Nov 21, 2022

I've been using Atlantis for a year and recently encountered an error with version 0.18.2.0.
[Three screenshots attached: 2022-11-21 at 11:27:43 AM, 11:35:15 AM and 11:39:35 AM]

@nitrocode
Member

It's possible the atlantis pod ran out of space?

Please also try with the latest version 0.20.1.

@sujeets-toast

> It's possible the atlantis pod ran out of space?
>
> Please also try with the latest version 0.20.1.

Thanks for your reply. I created a new repository with the same name as the one it is currently using. It's working for me. Due to time constraints, I will upgrade the Atlantis image later because it will necessitate a significant amount of testing for us. 

@tekumara

tekumara commented Mar 5, 2023

This happened to me when trying to run atlantis plan via comment on an empty PR.
I pushed a commit with a trivial change and the atlantis plan via comment worked.

@hskrtich

My org has run into this same issue a number of times. It seems to randomly resolve itself at some point. We also use custom workspaces. This is still happening with the latest version of Atlantis (v0.23.3).

@bml1g12

bml1g12 commented May 23, 2023

Same issue here on the latest version. It occurs on all new PRs until one runs atlantis plan, e.g. one cannot run atlantis plan -p project_name without running atlantis plan first.

@nitrocode added the regression label May 23, 2023
@inkel
Contributor

inkel commented Jun 2, 2023

This is also happening for us; we were using 0.19.9 and recently upgraded to 0.24.2.

@Jonathanboliveira

Any updates for this issue?
We are having the same problem here in the organization, when updating from v0.17.0 to v0.24.3

@kelvingl

Hello!
Any updates for this issue?
We are having the same problem here in the organization, on v0.25.0

@jamengual
Contributor

We are documenting the Locks flow, which includes part of the cloning process too. After that, we will try to figure out a way to make this more stable (#3345).

@carmennavarreteh

carmennavarreteh commented Jan 4, 2024

Hi! In my organisation we are also facing this, and we are using the 0.25.0 version.
We have some reproducible cases:

  • You have an empty PR, run a manual plan and the atlantis unlock command, and then run atlantis plan again; the system outputs checking if workspace exists: stat /home/atlantis/.atlantis/repos. The only way to fix this is to push a commit to the PR and plan again.

We have other cases, but they have been difficult to reproduce.
