DNS records disappear after a few hours #1177

2opremio · 2015-07-17T10:59:04Z

If I leave the example from the (still unfinished) ECS guide running for a few hours, the DNS records disappear and can no longer be resolved.

The setup consists of 3 hosts with 3 containers each; the containers register the same DNS names on every host:

ecs-agent (Amazon's ECS Agent container)
dataproducer
httpserver

Here are the logs and status of one of the hosts and here's what I do to confirm the problem:

[ec2-user@ip-172-31-33-176 ~]$ DOCKER_HOST=unix:///var/run/weave.sock docker run busybox ping ecs-agent
ping: bad address 'ecs-agent'
[ec2-user@ip-172-31-33-176 ~]$ DOCKER_HOST=unix:///var/run/weave.sock docker run busybox ping -c 1 ecs-agent.weave.local
ping: bad address 'ecs-agent.weave.local'
[ec2-user@ip-172-31-33-176 ~]$ DOCKER_HOST=unix:///var/run/weave.sock docker run busybox ping -c 1 dataproducer
ping: bad address 'dataproducer'
[ec2-user@ip-172-31-33-176 ~]$ DOCKER_HOST=unix:///var/run/weave.sock docker run busybox ping -c 1 dataproducer.weave.local
ping: bad address 'dataproducer.weave.local'
[ec2-user@ip-172-31-33-176 ~]$ DOCKER_HOST=unix:///var/run/weave.sock docker run busybox ping -c 1 httpserver
ping: bad address 'httpserver'
[ec2-user@ip-172-31-33-176 ~]$ DOCKER_HOST=unix:///var/run/weave.sock docker run busybox ping -c 1 httpserver.weave.local
ping: bad address 'httpserver.weave.local'

The commands above work for at least one hour after spawning the instances.
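
The manual check above could be scripted so it can be re-run at intervals while waiting for the records to vanish. This is only a sketch under assumptions: the helper names (`candidate_names`, `ping_command`, `is_resolution_failure`, `check_all`) are made up for illustration, and it assumes the weave proxy socket path and busybox image from the transcript.

```python
import os
import subprocess

WEAVE_SOCK = "unix:///var/run/weave.sock"  # proxy socket used in the commands above
SERVICES = ["ecs-agent", "dataproducer", "httpserver"]


def candidate_names(services, domain="weave.local"):
    """Return each service as a bare name and as a fully qualified name."""
    names = []
    for service in services:
        names.extend([service, f"{service}.{domain}"])
    return names


def ping_command(name):
    """Build the same busybox ping invocation used in the report."""
    return ["docker", "run", "busybox", "ping", "-c", "1", name]


def is_resolution_failure(output):
    """busybox ping reports an unresolvable name as "bad address"."""
    return "bad address" in output


def check_all(services, timeout=60):
    """Run one ping per name through the weave proxy; map name -> resolved?"""
    env = dict(os.environ, DOCKER_HOST=WEAVE_SOCK)  # route docker via the proxy
    results = {}
    for name in candidate_names(services):
        proc = subprocess.run(
            ping_command(name),
            env=env,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        results[name] = not is_resolution_failure(proc.stdout + proc.stderr)
    return results
```

Calling `check_all(SERVICES)` on one of the hosts would return a dictionary such as `{"ecs-agent": False, ...}` once the records are gone. Note that it only looks for the "bad address" message, so a name that resolves but does not answer the ping still counts as resolved.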

I am using weave/master with some extra proxy fixes on top, which I need for ecs-agent to work with weave.

2opremio · 2015-07-17T19:20:37Z

This issue can now be reproduced automatically by running setup.sh (from the ECS guide), although you still need to wait for the records to disappear.

tomwilkie · 2015-07-20T10:08:23Z

I've had master running over the weekend (not using the ECS setup, using the 210 test), and the entries were still there this morning. I'm going to add an HTTP handler to retrieve entries (including tombstones) and then try to reproduce with the ECS scripts.

tomwilkie · 2015-07-20T11:19:59Z

Okay, I've spun up a new cluster based on the 1177-dns-disappear branch, with the new HTTP handler:

[ec2-user@ip-172-31-13-65 ~]$ curl -s -H "Accept:application/json" http://$(docker inspect --format='{{.NetworkSettings.IPAddress}}' weave):6784/name | jq .
[
  {
    "ContainerID": "39ae439b08c353a0d7df69791b58cf634b4c6f5048cfe5c0dbb62b9d19ebaac7",
    "Origin": "36:2f:21:46:97:0c",
    "Addr": 170655745,
    "Hostname": "dataproducer.weave.local.",
    "Version": 0,
    "Tombstone": 0
  },
  {
    "ContainerID": "59e5cc1d9678ddf9a09bc4bb7f3a70ac52eea09d4f2b0717714514a632e531f2",
    "Origin": "6a:74:05:31:eb:80",
    "Addr": 169869314,
    "Hostname": "dataproducer.weave.local.",
    "Version": 0,
    "Tombstone": 0
  },
  {
    "ContainerID": "2df1a026ff2dbe1ca0d2ec4bd329ef3596c5c2d09bb88d8d6c61b1f074814337",
    "Origin": "d6:fd:65:1a:99:f9",
    "Addr": 170393601,
    "Hostname": "dataproducer.weave.local.",
    "Version": 0,
    "Tombstone": 0
  },
  {
    "ContainerID": "2ad60daf1b2c1b1c14fee4bd1434ad454d4670beb615393ce4e3453998cdb0e9",
    "Origin": "36:2f:21:46:97:0c",
    "Addr": 170655744,
    "Hostname": "ecs-agent.weave.local.",
    "Version": 0,
    "Tombstone": 0
  },
  {
    "ContainerID": "6bc8734bf93625e8c3aceeb5300b9a5cb3f60cadce1a69ae549915ed547023ee",
    "Origin": "6a:74:05:31:eb:80",
    "Addr": 169869313,
    "Hostname": "ecs-agent.weave.local.",
    "Version": 0,
    "Tombstone": 0
  },
  {
    "ContainerID": "fa8999e791a519f9a8312d5878d555188d824eb7851454e8197d4c9f13d70263",
    "Origin": "d6:fd:65:1a:99:f9",
    "Addr": 170393600,
    "Hostname": "ecs-agent.weave.local.",
    "Version": 0,
    "Tombstone": 0
  },
  {
    "ContainerID": "fa28f96435df3df106cc0eddfa4599b97b3f6ee75e5fe3f850afa748796e0b5d",
    "Origin": "36:2f:21:46:97:0c",
    "Addr": 170655746,
    "Hostname": "httpserver.weave.local.",
    "Version": 0,
    "Tombstone": 0
  },
  {
    "ContainerID": "8741bc17abd466f2edb1ff09cae87cbdd9d39de4c1e85c0971f1ee5161cd638e",
    "Origin": "6a:74:05:31:eb:80",
    "Addr": 169869315,
    "Hostname": "httpserver.weave.local.",
    "Version": 0,
    "Tombstone": 0
  },
  {
    "ContainerID": "7ea2bc9e2c60e2d519d11a91bb68217f5844954e11704b619b1ed54c7a469d4d",
    "Origin": "d6:fd:65:1a:99:f9",
    "Addr": 170393602,
    "Hostname": "httpserver.weave.local.",
    "Version": 0,
    "Tombstone": 0
  }
]

I will now leave this for a day.
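
The `Addr` values in the output above are plain integers. Assuming they are IPv4 addresses packed into 32 bits in network byte order, and that `Tombstone` is 0 for live entries (both are assumptions read off this output, not confirmed by the thread), a small sketch like the following turns the handler's JSON into something easier to compare between runs. The function names are hypothetical.

```python
import ipaddress
import json
from collections import defaultdict


def addr_to_ip(addr):
    """Interpret a 32-bit integer as a dotted-quad IPv4 address (assumed encoding)."""
    return str(ipaddress.IPv4Address(addr))


def live_addresses(entries):
    """Group the addresses of entries with Tombstone == 0 by hostname."""
    by_name = defaultdict(list)
    for entry in entries:
        if entry["Tombstone"] == 0:  # assumed: 0 means "not deleted"
            by_name[entry["Hostname"]].append(addr_to_ip(entry["Addr"]))
    return dict(by_name)


# Two entries copied from the output above, trimmed to the fields used here.
sample = json.loads("""
[
  {"Origin": "36:2f:21:46:97:0c", "Addr": 170655745,
   "Hostname": "dataproducer.weave.local.", "Version": 0, "Tombstone": 0},
  {"Origin": "6a:74:05:31:eb:80", "Addr": 169869314,
   "Hostname": "dataproducer.weave.local.", "Version": 0, "Tombstone": 0}
]
""")
```

Under that assumption, `addr_to_ip(170655745)` gives `10.44.0.1` and `addr_to_ip(169869314)` gives `10.32.0.2`, so each `Origin` peer appears to hand out addresses from its own block.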

Don't periodically delete all non-tombstone entries. Fixes #1177.
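
The commit title suggests that a periodic cleanup pass was removing live entries along with expired tombstones. The sketch below is a hypothetical illustration of how that kind of bug can arise. It is not weave's actual code: the `Entry` type, the function names, and the idea that `Tombstone` holds a deletion timestamp are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    hostname: str
    tombstone: int = 0  # assumed: 0 = live, otherwise deletion time in seconds


def purge_buggy(entries, now, ttl):
    """Drop every entry whose tombstone is older than ttl.

    A live entry has tombstone == 0, so its "age" is simply `now`,
    which exceeds ttl and causes live entries to be dropped as well.
    """
    return [e for e in entries if now - e.tombstone <= ttl]


def purge_fixed(entries, now, ttl):
    """Keep live entries unconditionally; expire only old tombstones."""
    return [e for e in entries if e.tombstone == 0 or now - e.tombstone <= ttl]
```

With `now=1100` and `ttl=600`, a live entry survives `purge_fixed` but not `purge_buggy`, while a tombstone written at time 10 is dropped by both.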

2opremio added the bug label Jul 17, 2015

2opremio assigned tomwilkie Jul 17, 2015

2opremio added the [component/dns] label Jul 17, 2015

rade added this to the current milestone Jul 17, 2015

tomwilkie mentioned this issue Jul 20, 2015

Don't periodically delete all non-tombstone entries. #1195

Merged

rade closed this as completed in #1195 Jul 20, 2015

rade added a commit that referenced this issue Jul 20, 2015

Merge pull request #1195 from weaveworks/1177-dns-disappear

f1b5a00

Don't periodically delete all non-tombstone entries. Fixes #1177.

rade modified the milestones: current, 1.1.0 Jul 21, 2015
