
Dependency cycle errors with modules #1637

Closed
ketzacoatl opened this issue Apr 22, 2015 · 19 comments

@ketzacoatl
Contributor

I have a build plan which is failing to apply, citing a cycle/dependency error.

With v0.4.2, I would get the cycle error from terraform plan. I built Terraform from master as of Sunday (this topic originally started in #1475), and with that build the plan now succeeds:

/src/terraform-consul-aws-asg/test  ᐅ   tfp           
Refreshing Terraform state prior to plan...
...

+ module.cleader-inbound-sg
    1 resource(s)
+ module.cleaders
    9 resource(s)
+ module.cleaders.cleader-sg
    1 resource(s)
+ module.cminion-inbound-sg
    1 resource(s)
+ module.cminions-a
    7 resource(s)
+ module.cminions-a.cminion-sg
    1 resource(s)

...but errors out when applying the plan:

/src/terraform-consul-aws-asg/test  ᐅ   tfa           
Error applying plan:

1 error(s) occurred:

* Cycle: aws_subnet.c (destroy), aws_subnet.c, module.cleader-sg (expanded), aws_elb.main (destroy), aws_subnet.a (destroy), aws_subnet.a

The code is currently in a private repo, but I have requested @phinze have access. See here and here. The high-level bit is replicated here:

module "cleaders" {
    source = "../leaders"
    max_nodes = 5
    min_nodes = 3
    desired_capacity = 7
    key_name = "${var.key_name}"
    key_file = "${var.key_file}"
    ssh_pubkey = "${var.ssh_pubkey}"
    access_key = "${var.access_key}"
    secret_key = "${var.secret_key}"
    region = "${var.region}"
    consul_secret_key = "${var.consul_secret_key}"
    cidr_prefix_a = "${var.cidr_prefix_leader_a}"
    cidr_prefix_c = "${var.cidr_prefix_leader_c}"
    vpc_id = "${var.vpc_id}"
    route_table_id = "${var.route_table_id}"
    inbound_security_group = "${module.cleader-inbound-sg.id}"
}


module "cminions-a" {
    source = "../minions"
    max_nodes = 5
    min_nodes = 3
    desired_capacity = 7
    key_name = "${var.key_name}"
    key_file = "${var.key_file}"
    ssh_pubkey = "${var.ssh_pubkey}"
    access_key = "${var.access_key}"
    secret_key = "${var.secret_key}"
    region = "${var.region}"
    consul_secret_key = "${var.consul_secret_key}"
    cidr_minions_a = "${var.cidr_minions_a}"
    cidr_minions_c = "${var.cidr_minions_c}"
    vpc_id = "${var.vpc_id}"
    route_table_id = "${var.route_table_id}"
    leader_dns = "${module.cleaders.leader_dns}"
}

# allow minion to leader
module "cleader-inbound-sg" {
    source = "../leader_sg"
    name = "inbound-${var.name}"
    vpc_id = "${var.vpc_id}"
    region = "${var.region}"
    access_key = "${var.access_key}"
    secret_key = "${var.secret_key}"
    cidr_blocks = "${var.cidr_minions_a},${var.cidr_minions_c}"
}

# and allow leader to minion
module "cminion-inbound-sg" {
    source = "../agent_sg"
    name = "inbound-${var.name}"
    vpc_id = "${var.vpc_id}"
    region = "${var.region}"
    access_key = "${var.access_key}"
    secret_key = "${var.secret_key}"
    cidr_blocks = "${module.cleaders.subnet-a-cidr_block},${module.cleaders.subnet-a-cidr_block}"
}

I include this reference because, unless I am mistaken, it confirms there should be no cyclical dependency issue/error here. The cminions-a and cminion-inbound-sg modules depend on the cleaders module, which in turn depends on cleader-inbound-sg. This last module is simple and only uses variables (see the end of the code snippet above). Here is the core of the leader sg module:

# Security group to cover Consul Leaders
resource "aws_security_group" "main" {
    name = "consul-leaders-${var.name}-${var.region}"
    vpc_id = "${var.vpc_id}"
    description = "Allow TCP and UDP ports to consul leaders in ${var.name}"
    tags {
        Name = "consul-leader-${var.name}-${var.region}"
    }
    # Server RPC, used by servers to handle incoming requests from other agents. TCP only.
    ingress {
        from_port = 8300
        to_port = 8300
        protocol = "tcp"
        cidr_blocks = ["${split(",", var.cidr_blocks)}"]
    }
    ...
}

Nothing depends on cminion-inbound-sg or cminions-a. When trying to apply the plan (with Terraform built from master this past Sunday), the apply fails:

ᐅ       tfa
Error applying plan:

1 error(s) occurred:

* Cycle: aws_subnet.a (destroy), aws_subnet.a, aws_subnet.c (destroy), aws_subnet.c, module.cleader-sg (expanded), aws_elb.main (destroy)

If you have 0.4.2 available, terraform plan should fail in the same way.

@johnrengelman
Contributor

I've been seeing this as well. Additionally, if I do a terraform plan -out=plan.out and then try to graph that plan with terraform graph plan.out, the graph command errors out with a cycle.

@ketzacoatl
Contributor Author

Interesting. @johnrengelman, could you share any of the TF code that errors out? Maybe @mitchellh, @phinze or others can determine if this is the same issue or different.

@johnrengelman
Contributor

I'll try and replicate it today.

@johnrengelman
Contributor

provider "aws" {
  region = "us-east-1"
}

provider "consul" {
  address = "consul:80"
  datacenter = "internal-east"
}

resource "consul_keys" "es_ami" {
  key {
    name = "ami_id"
    path = "amis/elasticsearch/1.4.4-11/us-east-1"
  }
}

resource "aws_launch_configuration" "es_conf" {
  lifecycle {
    create_before_destroy = true
  }
  image_id = "${consul_keys.es_ami.var.ami_id}"
  instance_type = "m3.medium"
  key_name = "peoplenet-load"
  root_block_device {
    volume_type = "gp2"
    volume_size = 8
  }

  associate_public_ip_address = true
}

resource "aws_autoscaling_group" "es_cluster" {
  availability_zones = ["us-east-1b"]
  name = "elasticsearch-cluster"
  max_size = 3
  min_size = 2
  desired_capacity = 3
  launch_configuration = "${aws_launch_configuration.es_conf.id}"
  health_check_type = "EC2"
  tag {
    key = "Name"
    value = "elastic-search"
    propagate_at_launch = true
  }
}

This will fail to graph after outputting the plan.

test git:(master) ✗ terraform plan -out plan.out
Refreshing Terraform state prior to plan...


The Terraform execution plan has been generated and is shown below.
Resources are shown in alphabetical order for quick scanning. Green resources
will be created (or destroyed and then created if an existing resource
exists), yellow resources are being changed in-place, and red resources
will be destroyed.

Your plan was also saved to the path below. Call the "apply" subcommand
with this plan file and Terraform will exactly execute this execution
plan.

Path: plan.out

+ aws_autoscaling_group.es_cluster
    availability_zones.#:               "" => "1"
    availability_zones.1305112097:      "" => "us-east-1b"
    default_cooldown:                   "" => "<computed>"
    desired_capacity:                   "" => "3"
    force_delete:                       "" => "<computed>"
    health_check_grace_period:          "" => "<computed>"
    health_check_type:                  "" => "EC2"
    launch_configuration:               "" => "${aws_launch_configuration.es_conf.id}"
    max_size:                           "" => "3"
    min_size:                           "" => "2"
    name:                               "" => "elasticsearch-cluster"
    tag.#:                              "" => "1"
    tag.1615826712.key:                 "" => "Name"
    tag.1615826712.propagate_at_launch: "" => "1"
    tag.1615826712.value:               "" => "elastic-search"
    termination_policies.#:             "" => "<computed>"
    vpc_zone_identifier.#:              "" => "<computed>"

+ aws_launch_configuration.es_conf
    associate_public_ip_address:               "" => "1"
    ebs_block_device.#:                        "" => "<computed>"
    ebs_optimized:                             "" => "<computed>"
    image_id:                                  "" => "${consul_keys.es_ami.var.ami_id}"
    instance_type:                             "" => "m3.medium"
    key_name:                                  "" => "peoplenet-load"
    name:                                      "" => "<computed>"
    root_block_device.#:                       "" => "1"
    root_block_device.0.delete_on_termination: "" => "1"
    root_block_device.0.iops:                  "" => "<computed>"
    root_block_device.0.volume_size:           "" => "8"
    root_block_device.0.volume_type:           "" => "gp2"

+ consul_keys.es_ami
    datacenter:             "" => "<computed>"
    key.#:                  "" => "1"
    key.3999287073.default: "" => ""
    key.3999287073.delete:  "" => "0"
    key.3999287073.name:    "" => "ami_id"
    key.3999287073.path:    "" => "amis/elasticsearch/1.4.4-11/us-east-1"
    key.3999287073.value:   "" => "<computed>"
    var.#:                  "" => "<computed>"


 test git:(master) ✗ terraform graph plan.out
Error creating graph: 1 error(s) occurred:

* Cycle: aws_launch_configuration.es_conf, aws_autoscaling_group.es_cluster, aws_launch_configuration.es_conf (destroy), consul_keys.es_ami (destroy), consul_keys.es_ami

However, if I add create_before_destroy = true to the consul_keys resource, it works.
I know this is a different issue, but I would expect that if a plan can be generated, then I could graph it, so I can see the cycles.
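
For reference, a minimal sketch of that change, assuming nothing beyond the consul_keys resource already shown above:

resource "consul_keys" "es_ami" {
  # assumption: mirror the lifecycle setting already used on the launch configuration
  lifecycle {
    create_before_destroy = true
  }
  key {
    name = "ami_id"
    path = "amis/elasticsearch/1.4.4-11/us-east-1"
  }
}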

@ketzacoatl
Contributor Author

@johnrengelman, your message is cut off at the end. FWIW, @phinze noted we should remove create_before_destroy and test - see here for details.

@johnrengelman
Contributor

@ketzacoatl yeah, I accidentally submitted it too early. Message updated.

@phinze
Contributor

phinze commented Apr 23, 2015

I would expect that if a plan can be generated, then I could graph it, so I can see the cycles.

Filed this officially as #1651 - stay tuned on that thread for a PR with my recent work here once I finish it up.

@ketzacoatl
Contributor Author

great, that should help us all figure out what is what here. I am in over my head atm on it and will need to wait for further guidance from the wizards of Terraform.

@phinze
Contributor

phinze commented Apr 25, 2015

Working with @ketzacoatl, I was able to determine that this is another expression of the same Terraform bug we've been kicking around across several issues: Terraform needs to change the way it treats modules in the execution graph so that modules are only a UX-layer abstraction. The execution graph needs to be flattened out to a single layer to prevent these situations from being problematic.

Terraform generates "destroy nodes" for each of the resources, placing them at the point in the graph where the destroy would logically happen for each node. But TF can't do this for a module, because the module has an unknown number of resources inside it.

The cycle happens as Terraform tries to figure out how to perform its operations in the proper order, assuming a worst-case destroy-then-create. If you think of the module as just a security group, you'd want the order to be:

  • Destroy ELB
  • Destroy SG
  • Destroy Subnets
  • Create Subnets
  • Create SG
  • Create ELB

But the problem is that, from this view, the graph only has "handle module" as an operation, giving us:

  • Destroy ELB
  • Handle SG Module
  • Destroy Subnets
  • Create Subnets
  • Handle SG Module
  • Create ELB

And hence the cycle.

To work around the issue in the meantime, I'd recommend structuring your config to avoid cross-module references. Instead of directly accessing an output of one module from inside another, set it up as an input parameter and wire everything together at the top level.

Tagging this as a core bug and leaving it open for now - I may try and consolidate the "stuff that will be fixed when we flatten the module graph" issues eventually. 👌

@ketzacoatl
Contributor Author

Wow, that is intense. It is also confusing to me that Terraform defaults to a destroy-then-create action. Even in a worst-case scenario, it is not always what the user would want, and it is not always necessary (given the circumstances imposed by the current state of the environment and the provider's APIs).

That aside: @phinze, I am confused about what you mean by cross-module references. If I understand your suggestion, I should use an output from one module as an input to another - is that right?

If so, I believe I am doing that with:

    cidr_blocks = "${module.cleaders.subnet-a-cidr_block},${module.cleaders.subnet-a-cidr_block}"

and

    inbound_security_group = "${module.cleader-inbound-sg.id}"

as examples from the code I pasted in this issue. Forgive me if I am not understanding correctly.

I do not intend to reference one module from inside another, except through the variable/output interfaces - do you see a place in my code where I am not doing this? I read the description of the problem/solution as removing the namespacing that modules provide - I don't think that is the way to go, but maybe I do not understand what you are describing.

@mitchellh, maybe you have a few spare cycles you can send our way to confirm we are on the right track?

@phinze
Contributor

phinze commented Apr 25, 2015

It is also confusing to me that Terraform defaults to a destroy-then-create action. Even in a worst-case scenario.. it is not always what the user would want, and it is not always necessary

It's true that in many cases create_before_destroy would be a preferable default, but unfortunately a lot of the resources Terraform manages would generate errors (name conflicts, etc) if a clone of them were to appear, so we have to assume the worst case scenario by default as we structure the graph.

@phinze, I am confused what you mean by cross-module references. If I understand your suggestion, I should use an output from one module as an input to another, is that right?

Sorry about the unclear explanation here - in your example the references causing the cycle are from the ELB and the LC in the "leaders" module to the id output of the "leader-sg" module.

Here's a better way of describing my suggested workaround: avoid direct references from a resource block to a module. I think the easiest way to do this is by flattening out the module declarations at the top level. So in your case I'd try pulling the "leader-sg" module declaration out to "main.tf" and piping in the SG id as a variable to the "leaders" module instead of nesting them.

It's definitely not ideal, but it should help you avoid cycles until we get the module graph fixed up in core.
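
A rough sketch of that restructuring, with assumed names (leader_security_group is a hypothetical new variable on the "leaders" module, and the SG module's inputs are wired from top-level variables; adjust to the real config):

# main.tf (top level): declare the SG module next to the others
module "cleader-sg" {
    source = "../leader_sg"
    name = "${var.name}"
    vpc_id = "${var.vpc_id}"
    region = "${var.region}"
    access_key = "${var.access_key}"
    secret_key = "${var.secret_key}"
    cidr_blocks = "${var.cidr_prefix_leader_a},${var.cidr_prefix_leader_c}"
}

module "cleaders" {
    source = "../leaders"
    # ...existing parameters as before...

    # hypothetical variable that replaces the nested module.cleader-sg
    # reference inside the leaders module
    leader_security_group = "${module.cleader-sg.id}"
}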

@ketzacoatl
Contributor Author

Ok, I understand now. I will shuffle around the SG modules so they are at the top level with the other modules.

Regarding create_before_destroy, I guess my confusion stems from the way Terraform will delete resources to recreate them, when I could have made the same changes through the AWS console without removing the resource. To work around TF wanting to rm instances on me, I end up making updates in the AWS console, updating TF's resource definition, and then having TF sync its snapshot of the state.

EDIT: Thank you @phinze for ensuring I understand what is going on and how to work around the limitation for now!

@ketzacoatl
Contributor Author

Ah, now I remember why I did this to begin with. I have two modules, leaders and minions, each of which creates the resources needed to spin up a cluster running Consul as either a cluster of consul leaders or followers (which I refer to as minions). In each module we create a group of subnets for that cluster. We then need a default security group for the cluster, and that security group needs to allow traffic from these subnets. Thus I cannot move the SG module to the top level: it needs the CIDR block from the subnets created in the leader module, and the leader module would need the SG ID as input if I were to move it, so we end up with a circular dependency by that route.

For now, I think my workaround is to put the security group resources directly in the leader/minion modules, rather than using a separate module for the SG (roughly as sketched below). I will figure it out and report back. I guess it makes sense to close this issue if Terraform will be addressing multiple issues like this with structural changes in core.
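
For concreteness, the inlined version would look roughly like the sketch below inside the leaders module. The aws_subnet.a / aws_subnet.c names come from the cycle output above; the rest of the attributes are assumed:

# inside ../leaders: the security group sits next to the subnets it references,
# so there is no resource-to-module edge left for Terraform to cycle on
resource "aws_security_group" "leader_inbound" {
    name = "consul-leaders-inbound-${var.name}-${var.region}"
    vpc_id = "${var.vpc_id}"
    description = "Allow consul traffic to leaders in ${var.name}"
    ingress {
        from_port = 8300
        to_port = 8300
        protocol = "tcp"
        cidr_blocks = ["${aws_subnet.a.cidr_block}", "${aws_subnet.c.cidr_block}"]
    }
}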

@ketzacoatl
Contributor Author

I've been able to work around the issue here by not calling a module from within a module - all of my modules now contain only resources, no nested modules. This is less than ideal, but workable. @phinze, do you have plans for consolidating one or more of these issues?

@knuckolls
Contributor

+1 to perhaps making a meta issue - there are a fair number of module "bugs / feature requests" floating about.

mikkoc pushed a commit to mikkoc/tsuru-terraform-example that referenced this issue Apr 30, 2015
We were getting some cycle dependencies error when applying the config without existing .tfstate.
We double checked and we have no cycles at all.
For reference, see:
  * [What is the dependency cycle in this? #1475](hashicorp/terraform#1475)
  * [Dependency cycle errors with modules #1637](hashicorp/terraform#1637)
mikkoc pushed a commit to alphagov/paas-alpha-tsuru-terraform that referenced this issue May 1, 2015
We were getting some cycle dependencies error when applying the config without existing .tfstate.
We double checked and we have no cycles at all.
For reference, see:
  * [What is the dependency cycle in this? #1475](hashicorp/terraform#1475)
  * [Dependency cycle errors with modules #1637](hashicorp/terraform#1637)
@mitchellh
Contributor

Fixed by #1781

@lordnynex

@mitchellh FYI, I'm experiencing this in 0.7.1 & 0.7.2. The code to reproduce is very similar to this issue: basically a top-level module that does some CIDR math and generates an output map using null_data_source (admittedly a hack on my part).

@JakimLi

JakimLi commented Jan 9, 2017

I am still having this issue in 0.8.2. Modules are all at the top level, using input parameters as suggested by @phinze. Here is the main.tf that wires all the modules together:

provider "aws" {
  region = "ap-southeast-2"
}

module "my_elb" {
  source = "../modules/elb"
  subnets = ["subnet-481d083f", "subnet-303cd454"]
  security_groups = ["sg-e8ac308c"]
}

module "my_lc" {
  source = "../modules/lc"
  subnets = ["subnet-481d083f", "subnet-303cd454"]
  security_groups = ["sg-e8ac308c"]
  snapshot_id = "snap-00d5e8ef70d1b3e24"
}

module "my_asg" {
  source = "../modules/asg"
  subnets = ["subnet-481d083f", "subnet-303cd454"]
  my_asg_name = "my_asg_${module.my_lc.my_lc_name}"
  my_lc_id = "${module.my_lc.my_lc_id}"
  my_elb_name = "${module.my_elb.my_elb_name}"
}

With this, I can do the initial apply and destroy, but when I try to change the lc, it gives a cycle error.

@ghost

ghost commented Apr 5, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 5, 2020