Export TF_PLUGIN_CACHE_MAY_BREAK_DEPENDENCY_LOCK_FILE=true in CI
Starting from Terraform v1.4, launching Terraform providers in the
acceptance tests has been failing more frequently with a `text file busy`
error.

```
--- FAIL: TestAccMultiStateMigratorApplySimple (1.07s)
    multi_state_migrator_test.go:123: failed to run terraform init: failed to run command (exited 1): terraform init -input=false -no-color
        stdout:

        Initializing the backend...

        Successfully configured the backend "s3"! Terraform will automatically
        use this backend unless the backend configuration changes.

        Initializing provider plugins...
        - Finding latest version of hashicorp/null...
        - Installing hashicorp/null v3.2.1...

        stderr:

        Error: Failed to install provider

        Error while installing hashicorp/null v3.2.1: open
        /tmp/plugin-cache/registry.terraform.io/hashicorp/null/3.2.1/linux_amd64/terraform-provider-null_v3.2.1_x5:
        text file busy
```

After some investigation, I found that Go's `os/exec.Cmd.Run()` does not
wait for grandchild processes to complete. From the point of view of
tfmigrate, the terraform command is the child process, and the provider
is the grandchild process.

golang/go#23019

If I understand correctly, this is not a Terraform issue and could
theoretically occur in versions older than v1.4 as well; the changes in
v1.4 may have upset the balance of execution timing and made the test
very flaky. I experimented with inserting some sleeps but could not get
the test to stabilize. After trying various things, I found that the
test became stable when I enabled the
`TF_PLUGIN_CACHE_MAY_BREAK_DEPENDENCY_LOCK_FILE` flag, which was
introduced in v1.4. This flag is an escape hatch that reverts the global
cache behavior change in v1.4 to the v1.3 equivalent.
hashicorp/terraform#32726
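
This commit sets the flag in `docker-compose.yml`. For a test harness that launches terraform directly from Go, a roughly equivalent sketch might look like the following. The helper name `pluginCacheEnv` is hypothetical and is not part of tfmigrate; only the two environment variable names and the cache path come from this commit.

```go
package main

import (
	"fmt"
	"os"
)

// pluginCacheEnv returns a copy of base with the plugin cache directory
// and the v1.4 escape-hatch flag appended. (Hypothetical helper.)
func pluginCacheEnv(base []string, cacheDir string) []string {
	env := make([]string, 0, len(base)+2)
	env = append(env, base...)
	env = append(env,
		"TF_PLUGIN_CACHE_DIR="+cacheDir,
		"TF_PLUGIN_CACHE_MAY_BREAK_DEPENDENCY_LOCK_FILE=true",
	)
	return env
}

func main() {
	env := pluginCacheEnv(os.Environ(), "/tmp/plugin-cache")
	// The slice could be assigned to the Env field of an exec.Cmd
	// before running `terraform init`.
	for _, kv := range env[len(env)-2:] {
		fmt.Println(kv)
	}
}
```

Appending works here because, when `exec.Cmd.Env` contains duplicate keys, only the last value for each key is used.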

This behavior change has already been addressed in the previous commit
using a local file system mirror, so it is unclear why enabling this flag
helps. I have no other reasonable solution for now, so please let me
know if you find a better one.
minamijoyo committed Mar 9, 2023
1 parent e369af2 commit e34b939
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions docker-compose.yml
@@ -14,6 +14,7 @@ services:
# Use the same filesystem to avoid a checksum mismatch error
# or a file busy error caused by asynchronous IO.
TF_PLUGIN_CACHE_DIR: "/tmp/plugin-cache"
TF_PLUGIN_CACHE_MAY_BREAK_DEPENDENCY_LOCK_FILE: "true"
# From observation, although we don’t have complete confidence in the root cause,
# it appears that localstack sometimes misses API requests when run in parallel.
TF_CLI_ARGS_apply: "--parallelism=1"