GCP Peering does not work #3034

Closed
dgarstang opened this issue Feb 12, 2019 · 7 comments · Fixed by GoogleCloudPlatform/magic-modules#2938

@dgarstang

dgarstang commented Feb 12, 2019

Terraform Version

Terraform v0.11.11
+ provider.google v1.20.0

Affected Resource(s)

google_compute_network_peering

Terraform Configuration Files

// Peer the infra vpc with the dev vpc.
resource "google_compute_network_peering" "infra_dev" {
  name         = "infra-dev"
  network      = "${module.infra_network.network_self_link}"
  peer_network = "${module.dev_network.network_self_link}"
}

// Peer the dev vpc with the infra vpc.
resource "google_compute_network_peering" "dev_infra" {
  name         = "dev-infra"
  network      = "${module.dev_network.network_self_link}"
  peer_network = "${module.infra_network.network_self_link}"
  depends_on   = ["google_compute_network_peering.infra_dev"]
}

Debug Output

Error: Error applying plan:

1 error(s) occurred:

* google_compute_network_peering.dev_infra: 1 error(s) occurred:

* google_compute_network_peering.dev_infra: Error adding network peering: googleapi: Error 400: There is a route operation in progress on the local or peer network. Try again later., badRequest

Expected Behavior

The infra and the dev vpc should be peered.

Actual Behavior

See error. This basically means that peering is broken. The depends_on has no effect. The use of depends_on in dev_infra should mean that it WAITS until the first peering operation completes thereby fulfilling the GCP API requirement of one peering operation at a time.

Steps to Reproduce

  1. terraform apply

@emilymye
Contributor

Hi @dgarstang - is this a different issue than #3026?

@dgarstang
Author

I will close #3026. I think this ticket describes the situation more clearly.

@emilymye
Contributor

Hmm, I can't seem to recreate this issue with the following config:

// Peer the infra vpc with the dev vpc.
resource "google_compute_network_peering" "infra_dev" {
  name         = "infra-dev"
  network      = "${google_compute_network.infra_network.self_link}"
  peer_network = "${google_compute_network.dev_network.self_link}"
}

// Peer the dev vpc with the infra vpc.
resource "google_compute_network_peering" "dev_infra" {
  name         = "dev-infra"
  network      = "${google_compute_network.dev_network.self_link}"
  peer_network = "${google_compute_network.infra_network.self_link}"
  depends_on   = ["google_compute_network_peering.infra_dev"]
}

resource "google_compute_network" "infra_network" {
  name                    = "prodfoobar"
  auto_create_subnetworks = "false"
}

resource "google_compute_network" "dev_network" {
  name                    = "devfoobar"
  auto_create_subnetworks = "false"
}

Would you mind running with full debug logging enabled, i.e. TF_LOG="DEBUG"?

@JackDavidson

JackDavidson commented Mar 16, 2019

I have been working around this with a null resource:

resource "google_compute_network_peering" "to" {
  name         = "to"
  network      = "network-2"
  peer_network = "network-1"
}

resource "google_compute_network_peering" "from" {
  name         = "from"
  network      = "network-1"
  peer_network = "network-2"

  // only one operation at a time for network peering, so we need an explicit serialization
  depends_on = ["null_resource.force_networks_in_order"]
}

resource "null_resource" "force_networks_in_order" {
  provisioner "local-exec" {
    command = "echo ${google_compute_network_peering.to.id}"
  }
}

@syl20bnr

syl20bnr commented May 31, 2019

@emilymye this is because you need more networks and peerings to reproduce the race condition.

We have a lot of different networks in GCP using a shared VPC. Each service lives in its own separate network, and we need to peer the networks to allow communication between relevant services.

We hit the race condition every single time. Without a dependency hack using input/output, it would take about 20 iterations of plan/apply to create all the peerings from scratch.

The Terraform team seems to want Terraform to stay dumb about parallelism (dumb in a good way) and to let each provider take care of provider-specific implementation details like the parallelism issue we have here, i.e. "In GCP it is not possible to peer a network with several other networks at the same time".

Our solution is to reproduce the dependency graph of the peerings using module inputs/outputs:

  1. We have a module that peers two networks in both directions, and we use depends_on to create the two peerings in sequence:
# Terraform module: gcp/google/vpc_network/network_peering
# Peer a network with another.

# Note: a network cannot be peered to multiple networks simultaneously.
# We have to create the peering sequentially thus you'll notice some hacks
# to be able to do so with Terraform 0.12

resource "google_compute_network_peering" "network" {
  name         = "${var.network_name}-${var.peered_network_name}"
  network      = "${var.network_link}"
  peer_network = "${var.peered_network_link}"
}

resource "google_compute_network_peering" "peered_network" {
  depends_on = ["google_compute_network_peering.network"]

  name         = "${var.peered_network_name}-${var.network_name}"
  network      = "${var.peered_network_link}"
  peer_network = "${var.network_link}"
}
  2. This module declares its own network link inputs as outputs:
# Outputs for gcp/google/vpc_network/network_peering module

# Modules dependency hack as of Terraform 0.12
# We use the network variable to define a chain of dependencies between the
# different calls of this module.
# Note that the values seem to be reversed but this is expected as we use the
# google_compute_network_peering.peered_network resource which is the last one
# to be created.
# Inspired from:
# https://github.com/hashicorp/terraform/issues/1178#issuecomment-207369534
output "network_link"        { value = "${google_compute_network_peering.peered_network.peer_network}" }
output "peered_network_link" { value = "${google_compute_network_peering.peered_network.network}" }
  3. The module callers can then reproduce the dependency graph as follows (note that we have another module that actually creates each network; those modules are named <name>_network):
module "peering_A_B" {
  source = "../../vpc_network/network_peering"

  network_name        = module.A_network.project_name
  network_link        = module.A_network.network_link
  peered_network_name = module.B_network.project_name
  peered_network_link = module.B_network.network_link
}

module "peering_B_C" {
  source = "../../vpc_network/network_peering"

  network_name        = module.B_network.project_name
  network_link        = module.peering_A_B.peered_network_link
  peered_network_name = module.C_network.project_name
  peered_network_link = module.C_network.network_link
}

module "peering_A_C" {
  source = "../../vpc_network/network_peering"

  network_name        = module.A_network.project_name
  network_link        = module.peering_A_B.network_link
  peered_network_name = module.C_network.project_name
  peered_network_link = module.peering_B_C.peered_network_link
}

The example above reproduces the following graph:

A -> B -> C
|__________^

The above solution effectively peers in sequence: A-B, then B-C, then A-C. Without it, Terraform attempts all 3 peerings at the same time, so two of them fail and it takes 3 apply iterations to complete. On the first apply, the B-C and A-C peerings fail because A-B is being peered. On the second, A-C fails because B-C is being peered. On the third, A-C is created.

So it would be ultra-supra-mega cool if the Google provider could handle this for us. One possible way would be for the provider to allow only one peering resource to run at any given time. It would be slower, but it would work in one pass and we could use count in our peering resources. That would save a lot of management burden: the example above is simple, but in a real use case the graph becomes much harder to maintain.

@bruceharrison1984

bruceharrison1984 commented Aug 5, 2019

An easy way to reproduce this is to use google_compute_network_peering with a count of networks. Setting up a hub and spoke network with counts causes this error every single time.

variable "organization_id" {
  description = "The organization where the projects and folders should be created"
  type        = "string"
}

variable "billing_account_id" {
  description = "The ID of the billing account resources should be created under (XXXXXX-XXXXX-XXXXXX)"
  type        = "string"
}

variable "labels" {
  description = "Map of labels that will be applied to all resources that have labels"
  type        = "map"
}

variable "number_of_spokes" {
  description = "How many VPCs should be created and peered with the hub"
  type        = "string"
  default     = 4
}

resource "google_project" "compute_project" {
  name                = "compute-project"
  project_id          = "project-${random_id.compute_project.hex}"
  org_id              = "${var.organization_id}"
  billing_account     = "${var.billing_account_id}"
  labels              = "${var.labels}"
  auto_create_network = false
}

resource "random_id" "compute_project" {
  byte_length = 4
}

resource "google_compute_network" "hub_network" {
  name                            = "hub-network"
  project                         = "${google_project.compute_project.id}"
  auto_create_subnetworks         = false
  delete_default_routes_on_create = true
}

resource "google_compute_subnetwork" "hub_subnetwork" {
  provider         = "google-beta"
  name             = "hub-subnetwork"
  project          = "${google_project.compute_project.id}"
  ip_cidr_range    = "10.1.1.0/24"
  region           = "us-central1"
  network          = "${google_compute_network.hub_network.self_link}"
  enable_flow_logs = true
  log_config {
    aggregation_interval = "INTERVAL_10_MIN"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}

resource "google_compute_firewall" "ingress" {
  provider = "google-beta"

  name           = "hub-firewall"
  network        = "${google_compute_network.hub_network.name}"
  project        = "${google_project.compute_project.id}"
  enable_logging = true

  allow {
    protocol = "tcp"
    ports = [
      "80",  //http
      "443", //https
      "22"   //ssh
    ]
  }
}

resource "google_compute_route" "internet" {
  name    = "hub-network"
  project = "${google_project.compute_project.id}"

  dest_range       = "0.0.0.0/0"
  network          = "${google_compute_network.hub_network.name}"
  next_hop_gateway = "default-internet-gateway"
  priority         = 1
}

resource "google_compute_network" "vpc_network" {
  count = "${var.number_of_spokes}"

  name                            = "spoke-network-${count.index}"
  project                         = "${google_project.compute_project.id}"
  auto_create_subnetworks         = false
  delete_default_routes_on_create = true

  depends_on = ["google_compute_subnetwork.hub_subnetwork"]
}

resource "random_id" "vpc_network" {
  count       = "${var.number_of_spokes}"
  byte_length = 4
}

resource "google_compute_subnetwork" "vpc_subnetwork" {
  count = length(google_compute_network.vpc_network)

  provider         = "google-beta"
  name             = "spoke-subnetwork-${count.index}"
  project          = "${google_project.compute_project.id}"
  ip_cidr_range    = "${cidrsubnet("10.1.1.0/16", 8, count.index + 2)}"
  region           = "us-central1"
  network          = "${element(google_compute_network.vpc_network.*.self_link, count.index)}"
  enable_flow_logs = true
  log_config {
    aggregation_interval = "INTERVAL_10_MIN"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}

resource "google_compute_network_peering" "hub_to_peer" {
  count = length(google_compute_network.vpc_network)

  name         = "hub-to-peer-${count.index}"
  network      = "${google_compute_network.hub_network.self_link}"
  peer_network = "${element(google_compute_network.vpc_network.*.self_link, count.index)}"

  depends_on = ["google_compute_subnetwork.vpc_subnetwork", "google_compute_subnetwork.hub_subnetwork"]
}

resource "google_compute_network_peering" "peer_to_hub" {
  count = length(google_compute_network.vpc_network)

  name         = "peer-to-hub-${count.index}"
  network      = "${element(google_compute_network.vpc_network.*.self_link, count.index)}"
  peer_network = "${google_compute_network.hub_network.self_link}"

  depends_on = ["google_compute_subnetwork.vpc_subnetwork", "google_compute_subnetwork.hub_subnetwork"]
}

I also agree that it would be really nice if this worked. Creating a wrapper resource just to fulfill this is pretty painful.

I also don't think anyone has yet mentioned the easiest workaround, which is using:
terraform apply -parallelism=1
That nicely sidesteps the issue, at the expense of deployment time increasing.

@ghost

ghost commented Feb 8, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Feb 8, 2020
7 participants