
Docker DNS does not resolve container address by name on user-defined network in Windows #500

Closed
minherz opened this issue Feb 15, 2017 · 30 comments

Comments

@minherz

minherz commented Feb 15, 2017

Expected behavior

A command like ping db, where db is the name of another container on the same network, should resolve the name to the container's IP and succeed.

Actual behavior

The ping fails with "Ping request could not find host db. Please check the name and try again."

Information

  • Diagnostic ID: F9F388E9-7AB0-4CE8-BD59-045CF28124A7/2017-02-15_16-45-57
  • Docker host: Version 1.13.1-beta42 (10069)
  • Windows version: Windows 10 x64 Enterprise Version 1607 (build 14393.693)
  • docker version command:
Client:
 Version:      1.13.1
 API version:  1.26
 Go version:   go1.7.5
 Git commit:   092cba3
 Built:        Wed Feb  8 08:47:51 2017
 OS/Arch:      windows/amd64

Server:
 Version:      1.13.1
 API version:  1.26 (minimum version 1.24)
 Go version:   go1.7.5
 Git commit:   092cba3
 Built:        Wed Feb  8 08:47:51 2017
 OS/Arch:      windows/amd64
 Experimental: true

Steps to reproduce the behavior

The following can be reproduced by executing docker run commands as well.

  1. Create a user-defined network for nat driver (the default nat network mask is 172.25.128.0/20):
docker network create -d nat -o com.docker.network.windowsshim.networkname=prototype_net --subnet=172.25.132.0/24 --gateway=172.25.132.1 prototype_net
  2. Create a docker-compose.yml file with the following content:
version: '2.1'
services:
  db:
    image: microsoft/nanoserver
    command: powershell /c sleep 3600
    networks:
      - isolated

  web:
    image: microsoft/nanoserver
    command: ping db
    depends_on:
      - db
    networks:
      - isolated

networks:
  isolated:
    external:
      name: prototype_net
  3. Run the following command from the folder where the file is stored:
docker-compose up

The output will be something like

Creating temp_db_1
Creating temp_web_1
Attaching to temp_db_1, temp_web_1
web_1  | Ping request could not find host db. Please check the name and try again.
temp_web_1 exited with code 1

Using the docker run command with --network-alias in addition to --name has the same effect (see the sketch below).
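For reference, a rough docker-py sketch of running the two containers by name on the user-defined network (container, image, and network names follow the repro steps above; this is illustrative only, not the exact commands used, and aliases added with --network-alias behave the same way):

import docker

client = docker.from_env()

# "db" container, kept alive so its name can be resolved by peers on the network
db = client.containers.run("microsoft/nanoserver",
                           command="powershell /c sleep 3600",
                           name="db",
                           network="prototype_net",
                           detach=True)

# "web" container that tries to resolve db by name on the same network
web = client.containers.run("microsoft/nanoserver",
                            command="ping db",
                            name="web",
                            network="prototype_net",
                            detach=True)
web.wait()
print(web.logs().decode())   # expected: ping replies; observed here: "could not find host db"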

@rn
Contributor

rn commented Feb 15, 2017

@friism do you know if name resolution with compose names is supposed to work with Windows containers?

@minherz
Author

minherz commented Feb 15, 2017

Regarding the added area labels: I used docker-compose so that I would not have to post two separate docker run commands. The effect is the same either way.

@friism

friism commented Feb 15, 2017

Yeah, it works fine: https://github.com/docker/labs/blob/master/windows/windows-containers/MultiContainerApp.md

@minherz can you try without creating a custom network, and just relying on the existing NAT network?

Or if you're on Windows 10 insider build, you can try the new overlay networking support: https://blogs.technet.microsoft.com/virtualization/2017/02/09/overlay-network-driver-with-support-for-docker-swarm-mode-now-available-to-windows-insiders-on-windows-10/

@minherz
Author

minherz commented Feb 15, 2017

I use Windows 10, but I am not sure whether it is an insider or official build.
docker-compose works on the default network. The functionality also works when using the CLI.
Isn't it supposed to work for a user-defined nat network as well?

@friism

friism commented Feb 15, 2017

I haven't personally tested divvying up the NAT network, but the docs are here: https://docs.microsoft.com/en-us/virtualization/windowscontainers/manage-containers/container-networking#multiple-nat-networks

My point is that this is way simpler using the overlay driver (available in Windows 10 insider builds): https://docs.microsoft.com/en-us/virtualization/windowscontainers/manage-containers/swarm-mode#creating-an-overlay-network

@minherz minherz changed the title Docker DNS does not resolve container address by name on Windows Docker DNS does not resolve container address by name on user-defined network in Windows Feb 15, 2017
@minherz
Author

minherz commented Feb 15, 2017

This is what I did (see step 1 in the "Steps to reproduce the behavior").
The question remains why it does not resolve the name.
I have updated the title of the issue to reflect that DNS fails only for user-defined networks.
The documentation does not mention any additional tweaks needed for containers to be resolvable by name when connected to a user-defined network.

@friism

friism commented Feb 15, 2017

pinging @msabansal

@msabansal

@minherz This should have worked; not sure why it is not working. Trying it out.

@msabansal

@minherz Can you please let me know what happens when you run docker run -it --net=prototype_net --name=db microsoft/nanoserver cmd

and then try to ping db from another container.

@minherz
Author

minherz commented Feb 16, 2017

@msabansal,
Today, after restarting the computer running Docker, every scenario works as expected. I tried both the docker-compose scenario and running two containers from base images, and both worked. I am not aware of any update to Docker or the Windows OS that happened during this time.

@minherz minherz closed this as completed Feb 16, 2017
@rn
Contributor

rn commented Feb 16, 2017

@minherz thanks for getting back and glad it is working for you. It's weird indeed.

@jonashackt

jonashackt commented Mar 27, 2017

Ok guys, it's been a week or two now. I had the described problem (no, or only intermittent, DNS alias resolution for Docker Windows containers) without a user-defined network, just with the standard nat network.

After so much time reading through all the docs (gotchas and caveats etc.) and trying almost everything, the only thing that helped me was the link to https://github.com/docker/labs/blob/master/windows/windows-containers/MultiContainerApp.md with one small sentence:

Temporary workaround for Windows DNS client weirdness

And the following lines in my Dockerfile:

SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop';"]
RUN set-itemproperty -path 'HKLM:\SYSTEM\CurrentControlSet\Services\Dnscache\Parameters' -Name ServerPriorityTimeLimit -Value 0 -Type DWord

It would have been really nice to have this one in the docs! Hope to save some other people a lot of time! Thanks for the link @friism anyway

@friism

friism commented Mar 27, 2017

@jonashackt sorry this caused so much grief - I thought this was supposed to have been fixed in Windows

@msabansal

@friism There was an issue with the update package, and we fixed it in an upcoming Windows patch. I thought we had it documented somewhere.

@darkshade9

My workaround for now involves changing the DNS servers during docker build:

RUN powershell Set-DnsClientServerAddress -InterfaceIndex <interface index number> -ServerAddresses <dns addr 1>, <dns addr 2>

@dyvniy

dyvniy commented Jun 8, 2017

This helps me too.
To complete the answer: the InterfaceIndex differs for each container.
You can get it with this command:
netsh interface ipv4 show interfaces

@minherz
Author

minherz commented Jun 8, 2017

@darkshade9, @dyvniy how do you automate it? You cannot set it manually for each docker build command unless the whole DevOps process is done manually. At least, we cannot...

@darkshade9

darkshade9 commented Jun 8, 2017

I have a Powershell script that I add via the Dockerfile

ADD DNSfix.ps1 "C:\DNSfix.ps1"
RUN powershell "C:\DNSfix.ps1"

that executes the following:

$action = New-ScheduledTaskAction -Execute 'Powershell.exe' -Argument '-NoProfile -WindowStyle Hidden -command "& {Set-DnsClientServerAddress -InterfaceAlias vEthernet* -ServerAddresses <IP.DNS.1>, <IP.DNS.2>}"'
$trigger = New-ScheduledTaskTrigger -AtStartup
Register-ScheduledTask -Action $action -Trigger $trigger -TaskName "FixLocalDNS" -Description "Changes local DNS settings to address issue in Docker" -RunLevel Highest -User System

@minherz
Author

minherz commented Jun 8, 2017

And <IP.DNS.1>, <IP.DNS.2> are derived from what? You assume that you know the Kubernetes DNS address? I think its default value is something like 10.0.0.10, but I am not 100% sure. It might change in custom network topologies as well.

@darkshade9

It's whatever your DNS servers are; they can be hard-set or pulled from whatever service is running. I'm not running Kubernetes because we had some issues integrating it with our Windows containers; we're running Portainer at the moment. We are using DNS servers hosted on our AD controllers.

@minherz
Author

minherz commented Jun 8, 2017

My DNS servers "do not know" about Kubernetes services. The problem was that it was impossible to resolve Kubernetes services by name. It happens because some NICs in the container do not have the Kubernetes DNS server configured.

@dyvniy

dyvniy commented Jun 8, 2017

@minherz I have written a program in Python 3 with docker-py.
I update the DNS during the container creation operation with a special command (a rough sketch of the idea follows).
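A minimal illustrative sketch of that approach (not necessarily dyvniy's actual program), assuming docker-py is installed; the image name and the DNS addresses are placeholders to replace with your own:

# Create a container with explicit DNS servers via docker-py,
# instead of relying on the DNS settings Docker would otherwise inject.
import docker

client = docker.from_env()
container = client.containers.create(
    "microsoft/nanoserver",            # placeholder image
    command="ping db",
    network="prototype_net",           # user-defined network from earlier in the thread
    dns=["10.0.0.10", "8.8.8.8"],      # placeholders: use your real DNS server addresses
)
container.start()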

@minherz
Author

minherz commented Jun 19, 2017

My deployments are done using Kubernetes, which operates (similarly to Docker swarm) at the service level of abstraction. Kubernetes injects its own DNS server into each container's NIC, allowing services to be resolved by name. I had some issues with it when deploying in Azure.
Does anyone know if the problems with resolving containers by name, when running multiple containers on a single Docker host, still exist?

@blakeja

blakeja commented Apr 14, 2018

I ran into something similar with CE 18 running the official redis:latest image as an LCOW container, with Docker for Windows running in Windows container mode. Basically, I pull the image and run it with all the defaults; I just add "-h redis". The container starts just fine, I can ping by IP, and Redis is functional using the IP. But if I try to resolve it by hostname (redis or whatever), it does not resolve.

I am still digging into this, but so far I have not been able to resolve the issue.

@putz612

putz612 commented Apr 18, 2018

I am hitting this with CE 18 with LCOW as well. I have three containers in a compose file, two Windows and one Linux. From one of the Windows containers I am unable to resolve either the Windows hostname or the Linux hostname. Looking at the above solutions, the changes are already in there:

workflow_1  | HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Dnscache\Parameters
workflow_1  |     ServerPriorityTimeLimit    REG_DWORD    0x0
workflow_1  |     ServiceDll    REG_EXPAND_SZ    %SystemRoot%\System32\dnsrslvr.dll
workflow_1  |     ServiceDllUnloadOnStop    REG_DWORD    0x1

Output of docker version:

Client:
 Version:       18.04.0-ce
 API version:   1.37
 Go version:    go1.9.4
 Git commit:    3d479c0
 Built: Tue Apr 10 18:13:15 2018
 OS/Arch:       windows/amd64
 Experimental:  false
 Orchestrator:  swarm

Server:
 Engine:
  Version:      18.04.0-ce
  API version:  1.37 (minimum version 1.24)
  Go version:   go1.9.4
  Git commit:   3d479c0
  Built:        Tue Apr 10 18:30:47 2018
  OS/Arch:      windows/amd64
  Experimental: true

@Dressee

Dressee commented May 24, 2018

I am hitting this exact issue with docker 17.06.02-ee-11, build 06fc007.
Randomly, or when doing docker-compose down / up, my Windows containers cannot reach each other by hostname.

The ServerPriorityTimeLimit is set to 0

@richardgavel

richardgavel commented Dec 12, 2018

I am also hitting this issue; it occurs randomly. Sometimes it works the next time the containers/network are stood up; sometimes I need to restart the Docker service. The service restart always seems to fix it. I also tried an nslookup against the Docker DNS and could not resolve, and I don't think ServerPriorityTimeLimit would matter because no caching occurs with an nslookup call.

Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 56
Server Version: 18.09.0
Storage Driver: windowsfilter
 Windows:
Logging Driver: json-file
Plugins:
 Volume: local
 Network: ics l2bridge l2tunnel nat null overlay transparent
 Log: awslogs etwlogs fluentd gelf json-file local logentries splunk syslog
Swarm: inactive
Default Isolation: process
Kernel Version: 10.0 17134 (17134.1.amd64fre.rs4_release.180410-1804)
Operating System: Windows Server Datacenter Version 1803 (OS Build 17134.228)
OSType: windows
Architecture: x86_64
CPUs: 4
Total Memory: 32GiB
Name: ci-bambi-01002
ID: HQJP:2EHV:3IRG:5CFG:KLKZ:XSRX:HHRS:6XLT:YGDP:VKFF:7KUX:D3AW
Docker Root Dir: C:\ProgramData\docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 artifacts.cddev.tv
 127.0.0.0/8
Live Restore Enabled: false

@Iristyle

Iristyle commented May 2, 2019

Also seeing problems with flaky DNS resolutions against the Docker DNS resolver.

Given a 3-container cluster started by compose, attaching a new container to the existing network intermittently fails to resolve the DNS names of the other containers:

docker run --rm -i -t --network s_default --dns-search local --name foo1.local --hostname foo1.local --entrypoint '/bin/sh' puppet/puppet-agent-alpine

Note the sporadic resolutions

/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25 s_puppet_1.s_default
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25 s_puppet_1.s_default
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25 s_puppet_1.s_default
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25 s_puppet_1.s_default
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25 s_puppet_1.s_default
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'puppet.local': Name does not resolve
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'puppet.local': Name does not resolve
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'puppet.local': Name does not resolve
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25 s_puppet_1.s_default
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'puppet.local': Name does not resolve
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25 s_puppet_1.s_default
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25 s_puppet_1.s_default
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25 s_puppet_1.s_default
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'puppet.local': Name does not resolve
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'puppet.local': Name does not resolve
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'puppet.local': Name does not resolve
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'puppet.local': Name does not resolve
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'puppet.local': Name does not resolve
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      puppet.local
Address 1: 172.17.212.25
/ # nslookup puppet.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'puppet.local': Name does not resolve

docker info

Client:
 Debug Mode: false
 Plugins:
  app: Docker Application (Docker Inc., v0.8.0-beta2)
  buildx: Build with BuildKit (Docker Inc., v0.2.0-6-g509c4b6-tp)

Server:
 Containers: 3
  Running: 3
  Paused: 0
  Stopped: 0
 Images: 135
 Server Version: master-dockerproject-2019-04-28
 Storage Driver: windowsfilter (windows) lcow (linux)
  Windows:
  LCOW:
 Logging Driver: json-file
 Plugins:
  Volume: local
  Network: ics l2bridge l2tunnel nat null overlay transparent
  Log: awslogs etwlogs fluentd gcplogs gelf json-file local logentries splunk syslog
 Swarm: inactive
 Default Isolation: hyperv
 Kernel Version: 10.0 17763 (17763.1.amd64fre.rs5_release.180914-1434)
 Operating System: Windows 10 Enterprise Version 1809 (OS Build 17763.437)
 OSType: windows
 Architecture: x86_64
 CPUs: 2
 Total Memory: 16GiB
 Name: ci-lcow-prod-1
 ID: 0ac02c9d-aaba-42f4-8749-5a64af3068d8
 Docker Root Dir: C:\ProgramData\docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: true
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

@Iristyle

Iristyle commented May 3, 2019

The intermittent issue I mentioned above appears to be related to a combination of LCOW and Alpine - I've filed issues at moby/libnetwork#2371 and microsoft/opengcs#303 because I'm not sure where my particular problem occurs.

@docker-robott
Collaborator

Closed issues are locked after 30 days of inactivity.
This helps our team focus on active issues.

If you have found a problem that seems similar to this, please open a new issue.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle locked

@docker docker locked and limited conversation to collaborators Jun 20, 2020