Apparent DNS failure in Docker image alpine:3.8, nslookup: can't resolve '(null)' #476
Comments
The BusyBox In your example The line |
Thanks @bboreham for the note, could this failure to resolve the DNS server cause a host-name resolution failure in a Java program? I might be asking the wrong questions, but I guess I'm grasping at straws, trying to explain why in my environment & tests the Java image openjdk:8-jre-alpine (derived from alpine:3.8 image) fails but the Java image openjdk:8-jre-slim (derived from debian) works just fine. |
It’s not really a failure at all; it’s just a program printing something out that isn’t helpful or interesting. What Alpine nslookup does has no bearing on what a Java program does. |
I ran nslookup to check whether the running container is able to resolve a name to an IP address. I hear you saying, nslookup is not a reliable indicator. Please suggest a better way. |
I'm not saying it is not a reliable indicator, I'm saying the line where it prints Check the return code from |
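As a sketch of checking resolution by result rather than by reading printed text, the snippet below asks the system resolver directly through Python's `socket.getaddrinfo` and reports success or failure. The helper name `resolve` is made up for illustration and does not appear anywhere in this thread.

```python
import socket


def resolve(name):
    """Return the sorted unique addresses for name, or [] if the lookup fails.

    Hypothetical helper: it relies on the exception raised by the system
    resolver instead of parsing nslookup's printed output.
    """
    try:
        infos = socket.getaddrinfo(name, None)
    except (socket.gaierror, UnicodeError):
        # gaierror is raised when the resolver cannot resolve the name.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("some-host")` inside the container and testing whether the list is empty gives a yes/no answer that does not depend on how a particular `nslookup` build formats its messages.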
I have not yet figured out the problem. Please see below for the shortest possible Java debugging material; I hope this will help other people. file ResolveHostName.java
file Dockerfile-alpine
file build.sh
|
Please, include this fix in all alpine versions kubernetes/kubernetes#56903 (comment) |
@jcperezamin - I think that's a different issue. Also, I think that'd just be hacking in a workaround for the underlying issue, which is a linux kernel bug to be fixed in v5.0 - torvalds/linux@4e35c1c . Also, that change would break some IPv6 support. |
and the relevant BusyBox code (e.g. Alpine 3.8 uses http://busybox.net/downloads/busybox-1.28.4.tar.bz2, networking/nslookup.c, function nslookup_main) and the strace debugging logs showed that the program first runs a DNS query for the given NS server. Whether that query is 'forward' or 'reverse' depends on the second (NS) parameter you gave: if the NS parameter is a name server IP, it does a reverse (PTR) query. You got that message because the global NS server was empty at initialization time.
I have been running most of my applications (java/go/nodejs/c/c++/python) on my customized Alpine Docker image for 1+ years. Thanks, |
- There appear to be many reported issues against Alpine DNS. This is an attempt to work around the ones we're experiencing.

- In local testing (specifically under LCOW), DNS resolution under Alpine seems to be very problematic. `nslookup` may repeatedly fail to perform a DNS resolution against another container name like `puppet.local`. Lookup failures will resemble something like:

    / # nslookup puppet.local
    nslookup: can't resolve '(null)': Name does not resolve
    nslookup: can't resolve 'puppet.local': Name does not resolve

  Even successes have problems with the DNS server:

    / # nslookup puppet.local
    nslookup: can't resolve '(null)': Name does not resolve
    Name: puppet.local
    Address 1: 172.17.212.25

- Supposedly the "can't resolve '(null)'" part is innocuous, but it's unclear if that is the case. More info at:
  nicolaka/netshoot#6
  gliderlabs/docker-alpine#476

- It seems that just having the `bind-tools` package installed will increase the reliability, but after running dig once against the given host, intermittent DNS resolution problems seem to go away:

    / # nslookup puppet.local
    Server: 172.17.208.1
    Address: 172.17.208.1#53
    Non-authoritative answer:
    Name: puppet.local
    Address: 172.17.212.25

  So the script is changed to query for the postgres hostname.

- We don't use curl here because we're mostly interested in making sure a host with a given name *should* exist. There are scenarios where host / dig will succeed, but later checks with curl may not, and we want to differentiate those failure modes as much as possible:
  https://serverfault.com/questions/335359/how-is-it-possible-that-i-can-do-a-host-lookup-but-not-a-curl
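The "query the hostname until it resolves" idea in the commit message above can be sketched as a small retry loop. This is not the script from that commit: it swaps `dig` for Python's `socket.getaddrinfo`, and the name `wait_for_host` and its parameters are hypothetical.

```python
import socket
import time


def wait_for_host(name, attempts=5, delay=1.0,
                  lookup=socket.getaddrinfo, sleep=time.sleep):
    """Retry a name lookup; return the attempt number that succeeded, or None.

    Only checks that the name resolves. It deliberately does not try to
    connect, so "name does not exist" stays distinct from "service is down".
    `lookup` and `sleep` are injectable so the loop can be tested offline.
    """
    for attempt in range(1, attempts + 1):
        try:
            lookup(name, None)
            return attempt
        except OSError:  # socket.gaierror is a subclass of OSError
            if attempt < attempts:
                sleep(delay)
    return None
```

A caller could run `wait_for_host("puppet.local", attempts=10, delay=2.0)` before starting the dependent service and abort if it returns `None`.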
I have a similar issue in our Kubernetes cluster when I try to get the IP of the Kubernetes DNS server. I'm using this Docker image:
|
Fixed with the config below in the k8s deploy spec:

    template:
      metadata:
        labels:
          app: activityreservation
      spec:
        dnsConfig:
          options:
            - name: ndots
              value: "1"

Refer to https://github.com/WeihanLi/ActivityReservation/blob/d3e4de902af70ad1c85618db8a481f7fbfe1a964/k8s/reservation-deployment.yaml for details |
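To illustrate why lowering `ndots` can matter, here is a rough model of how a stub resolver orders its queries, following the rule described in resolv.conf(5): a name with fewer dots than `ndots` is tried against the search list first, otherwise it is tried as an absolute name first. The function name and the search domains are assumptions for illustration, and real resolvers differ in detail (musl, for example, may not behave exactly like glibc here).

```python
def candidate_names(name, search, ndots=1):
    """Return the fully qualified names a resolver would try, in order.

    Simplified model of the resolv.conf(5) `ndots` rule; not an exact
    reproduction of any particular libc.
    """
    if name.endswith("."):
        # A trailing dot marks the name as already absolute.
        return [name]
    absolute = name + "."
    expanded = [f"{name}.{domain.rstrip('.')}." for domain in search]
    if name.count(".") >= ndots:
        return [absolute] + expanded
    return expanded + [absolute]
```

With a Kubernetes-style search list and `ndots=5`, a lookup of `example.com` goes through every search domain before the plain name is tried; with `ndots=1` the plain name is tried first.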
@xbmono if you are complaining about "can't resolve '(null)'" please read my explanation at #476 (comment). It is nothing to worry about. If something else, I suggest you open a new issue. |
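For anyone who still has to judge a lookup from BusyBox `nslookup` text, a filter along these lines ignores the `'(null)'` line and looks only at the queried name. It is a sketch that assumes the BusyBox output format quoted in this thread; `lookup_succeeded` is a hypothetical helper, and it is not meant for the `bind-tools` output format.

```python
def lookup_succeeded(output):
    """Decide from BusyBox nslookup text whether the queried name resolved.

    The "can't resolve '(null)'" line is skipped because it refers to the
    unspecified server argument, not to the name being looked up.
    """
    seen_name = False
    for raw in output.splitlines():
        line = raw.strip()
        if "can't resolve '(null)'" in line:
            continue
        if line.startswith("nslookup: can't resolve"):
            return False
        if line.startswith("Name:"):
            seen_name = True
        elif seen_name and line.startswith("Address"):
            return True
    return False
```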
Having the same issue with Alpine 3.9 (not Kubernetes case). Is there any workaround for this? |
I have the same problem with alpine:3.10 |
Same for me.. |
same here node:alpine brings this problem |
any workaround with docker here? |
This was very useful. There is apparently a different nslookup implementation available in busybox which can be enabled with BTW, the official docker image has moved to https://github.com/alpinelinux/docker-alpine. Since this was a config option in upstream alpine, it would have been good if it had been reported upstream to |
The small nslookup does not work with musl, so let's enable the musl compatible variant. ref: gliderlabs/docker-alpine#476
The small nslookup does not work with musl, so let's enable the musl compatible variant. ref: gliderlabs/docker-alpine#476 (cherry picked from commit cfb652d)
This should be fixed in alpinelinux/aports@e5c984f and will be available next release ( |
This is a bit of an arse, but I have spotted since switching to alpine as the base image the healthchecks no longer work. If I `docker exec` onto the instance I can replicate the wget calls fine. But the healthchecks report the same thing in both

```json
{
  "Start": "2020-02-27T13:02:17.0507153Z",
  "End": "2020-02-27T13:02:17.2414162Z",
  "ExitCode": 1,
  "Output": "wget: bad address '|| exit 1'\n"
},
```

I believe the problem is a known issue with the DNS in Alpine images. What I'm struggling with right now is trying to find a clean example of what I need to do to resolve it

- https://medium.com/@xavier.priour/docker-alpine-dns-issue-bad-address-84594d128d9f
- https://forums.docker.com/t/resolved-service-name-resolution-broken-on-alpine-and-docker-1-11-1-cs1/19307
- gliderlabs/docker-alpine#476
- https://unix.stackexchange.com/questions/441664/alpine-linux-sometimes-dns-is-not-resolved
- docker/for-linux#755
- https://stackoverflow.com/questions/57202039/resolve-conf-cant-be-changed-docker-alpine

I have also tried playing with and removing the rails user (in case it was a permissions issue) and carrying out a `apk upgrade -U -a` as part of the build to ensure everything in the image is the latest and greatest but still no joy. So as I never actually see these when I'm googling for examples, and I know the apps are currently working, I'm removing them for now. I would like to bring them back in later though once I've got a bit more time to look into the problem.
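The quoted output, `wget: bad address '|| exit 1'`, hints at a possibility other than DNS that may be worth ruling out: the text `|| exit 1` reaching `wget` as a literal argument, which is what happens when a command is run without a shell. This is only one reading of the error and is not confirmed anywhere in this thread. The sketch below shows the general effect with Python's `subprocess`; the URL and helper names are made up for illustration.

```python
import shlex
import subprocess
import sys

# Tiny child program that prints the arguments it actually received.
PRINT_ARGS = "import sys; print(sys.argv[1:])"


def run_exec_form(args):
    """Run without a shell: every list element reaches the program verbatim."""
    cmd = [sys.executable, "-c", PRINT_ARGS, *args]
    return subprocess.run(cmd, capture_output=True, text=True).stdout.strip()


def run_shell_form(command_tail):
    """Run through a shell: operators such as `||` are interpreted by the shell."""
    line = f"{shlex.quote(sys.executable)} -c {shlex.quote(PRINT_ARGS)} {command_tail}"
    return subprocess.run(line, shell=True, capture_output=True, text=True).stdout.strip()
```

`run_exec_form(["http://localhost/", "|| exit 1"])` shows the child receiving `|| exit 1` as a second argument, while `run_shell_form("http://localhost/ || exit 1")` shows it receiving only the URL.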
A temporary workaround which worked for me: try to create the docker container in host mode:
|
Seeing odd DNS behavior in Docker image alpine:3.8. I'm baffled that nslookup complains yet finds the IP address. In comparison, ping works perfectly. See below. This little test runs a docker container to resolve the name of the host VM. If I can get that working the next test will be to have the docker container resolve the name of other running containers.
Traced this back from behavior in an OpenJDK image, in which Java cannot resolve host names. I'd really prefer to use an Alpine version of a Java/JRE image, since it's half the size of a non-Alpine (debian) Java/JRE image, but this network glitch is kind of a killer.
So far I've run this test under a plain Ubuntu VM running docker 17.05.0-ce and under Kubernetes running docker version 18.09.1. Same behavior in both. I know there are many external variables that might affect this, so it might not be an Alpine issue at all, although issue #255 sure seems to be related.
Would someone possibly take a minute to explain please? Thanks in advance.