dnsdist: Downstreams stay down after reboot #13837

Closed
rgacogne opened this issue Feb 27, 2024 Discussed in #13811 · 0 comments · Fixed by #13834

Discussed in #13811

Originally posted by zeha February 18, 2024
I have dnsdist in a setup which some people would probably call "strange". dnsdist sits on my router VM to provide recursive DNS for my LAN, and I want to use these downstreams: nextdns (mostly for blocking trackers), a local pdns-recursor (fallback), and a local dnsmasq (for LAN domains).

This setup generally works, but after a reboot, the nextdns servers are marked down and never make it back into the "up" state. Here is a boot log excerpt:

Feb 18 16:41:42 las-sh01 systemd[1]: Starting dnsdist.service - DNS Loadbalancer...
Feb 18 16:41:42 las-sh01 dnsdist[494]: Passing a plain-text password via the 'password' parameter to 'setWebserverConfig()' is not advised, please consider generating a hashed one using 'hashPassword()' instead.
Feb 18 16:41:42 las-sh01 dnsdist[494]: Passing a plain-text API key via the 'apiKey' parameter to 'setWebserverConfig()' is not advised, please consider generating a hashed one using 'hashPassword()' instead.
Feb 18 16:41:42 las-sh01 dnsdist[494]: Configuration '/etc/dnsdist/dnsdist.conf' OK!
Feb 18 16:41:42 las-sh01 dnsdist[543]: dnsdist 1.9.0.38.master.g22c29777a comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it according to the terms of the GPL version 2
Feb 18 16:41:42 las-sh01 dnsdist[543]: Passing a plain-text password via the 'password' parameter to 'setWebserverConfig()' is not advised, please consider generating a hashed one using 'hashPassword()' instead.
Feb 18 16:41:42 las-sh01 dnsdist[543]: Passing a plain-text API key via the 'apiKey' parameter to 'setWebserverConfig()' is not advised, please consider generating a hashed one using 'hashPassword()' instead.
Feb 18 16:41:42 las-sh01 dnsdist[543]: Added downstream server 127.0.0.1:5300
Feb 18 16:41:42 las-sh01 dnsdist[543]: Added downstream server 127.0.0.1:5301
Feb 18 16:41:42 las-sh01 dnsdist[543]: Error connecting to new server with address [2a07:a8c0::3f:2ba1]:53: connecting socket to [2a07:a8c0::3f:2ba1]:53: Network is unreachable
Feb 18 16:41:42 las-sh01 dnsdist[543]: Added downstream server [2a07:a8c0::3f:2ba1]:53
Feb 18 16:41:42 las-sh01 dnsdist[543]: Error connecting to new server with address [2a07:a8c1::3f:2ba1]:53: connecting socket to [2a07:a8c1::3f:2ba1]:53: Network is unreachable
Feb 18 16:41:42 las-sh01 dnsdist[543]: Added downstream server [2a07:a8c1::3f:2ba1]:53
Feb 18 16:41:42 las-sh01 dnsdist[543]: Error connecting to new server with address 45.90.28.62:53: connecting socket to 45.90.28.62:53: Network is unreachable
Feb 18 16:41:42 las-sh01 dnsdist[543]: Added downstream server 45.90.28.62:53
Feb 18 16:41:42 las-sh01 dnsdist[543]: Error connecting to new server with address 45.90.30.62:53: connecting socket to 45.90.30.62:53: Network is unreachable
Feb 18 16:41:42 las-sh01 dnsdist[543]: Added downstream server 45.90.30.62:53
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised send buffer to 212992 for local address '172.16.172.4:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised receive buffer to 212992 for local address '172.16.172.4:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Listening on 172.16.172.4:53
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised send buffer to 212992 for local address '172.16.172.1:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised receive buffer to 212992 for local address '172.16.172.1:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Listening on 172.16.172.1:53
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised send buffer to 212992 for local address '127.0.0.1:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised receive buffer to 212992 for local address '127.0.0.1:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Listening on 127.0.0.1:53
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised send buffer to 212992 for local address '172.16.20.1:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised receive buffer to 212992 for local address '172.16.20.1:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Listening on 172.16.20.1:53
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised send buffer to 212992 for local address '[fe80::20d:b9ff:fe41:41b0]:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised receive buffer to 212992 for local address '[fe80::20d:b9ff:fe41:41b0]:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Listening on [fe80::20d:b9ff:fe41:41b0]:53
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised send buffer to 212992 for local address '[2a02:1748:fad4:5021::1]:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised receive buffer to 212992 for local address '[2a02:1748:fad4:5021::1]:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Listening on [2a02:1748:fad4:5021::1]:53
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised send buffer to 212992 for local address '[2a02:1748:fad4:5022::1]:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Raised receive buffer to 212992 for local address '[2a02:1748:fad4:5022::1]:53'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Listening on [2a02:1748:fad4:5022::1]:53
Feb 18 16:41:42 las-sh01 dnsdist[543]: ACL allowing queries from: 10.0.0.0/8, 100.64.0.0/10, 127.0.0.0/8, 169.254.0.0/16, 172.16.0.0/12, 192.168.0.0/16, ::1/128, 2a02:1748:fad4:5021::/64, 2a02:1748:fad4:5022::/64, fc00::/7, fe80::/10
Feb 18 16:41:42 las-sh01 dnsdist[543]: Console ACL allowing connections from: 127.0.0.0/8, ::1/128
Feb 18 16:41:42 las-sh01 dnsdist[543]: Error checking the health of backend [2a07:a8c0::3f:2ba1]:53: connecting to [2a07:a8c0::3f:2ba1]:53: Network is unreachable
Feb 18 16:41:42 las-sh01 dnsdist[543]: Marking downstream [2a07:a8c0::3f:2ba1]:53 as 'down'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Error checking the health of backend [2a07:a8c1::3f:2ba1]:53: connecting to [2a07:a8c1::3f:2ba1]:53: Network is unreachable
Feb 18 16:41:42 las-sh01 dnsdist[543]: Marking downstream [2a07:a8c1::3f:2ba1]:53 as 'down'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Error checking the health of backend 45.90.28.62:53: connecting to 45.90.28.62:53: Network is unreachable
Feb 18 16:41:42 las-sh01 dnsdist[543]: Marking downstream 45.90.28.62:53 as 'down'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Error checking the health of backend 45.90.30.62:53: connecting to 45.90.30.62:53: Network is unreachable
Feb 18 16:41:42 las-sh01 dnsdist[543]: Marking downstream 45.90.30.62:53 as 'down'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Marking downstream 127.0.0.1:5301 as 'up'
Feb 18 16:41:42 las-sh01 dnsdist[543]: Accepting control connections on 127.0.0.1:5199
Feb 18 16:41:42 las-sh01 dnsdist[543]: Webserver launched on 172.16.172.1:8083
Feb 18 16:41:42 las-sh01 systemd[1]: Started dnsdist.service - DNS Loadbalancer.
Feb 18 16:41:42 las-sh01 dnsdist[543]: Marking downstream 127.0.0.1:5300 as 'up'

For the servers, I have this config:

setVerboseHealthChecks(true)
-- pdns-recursor for direct resolving
newServer({address="127.0.0.1:5300", pool=""})
-- dnsmasq for lan domains
newServer({address="127.0.0.1:5301", pool="dnsmasq", checkName='las-sh01.home.arpa.', healthCheckMode='lazy', lazyHealthCheckMode='TimeoutOnly'})

newServer({address="[2a07:a8c0::3f:2ba1]", pool="nextdns", healthCheckMode='lazy', lazyHealthCheckFailedInterval=30, maxCheckFailures=3, lazyHealthCheckThreshold=30, lazyHealthCheckSampleSize=100, lazyHealthCheckMode='TimeoutOnly'})
newServer({address="[2a07:a8c1::3f:2ba1]", pool="nextdns", healthCheckMode='lazy', lazyHealthCheckFailedInterval=30, maxCheckFailures=3, lazyHealthCheckThreshold=30, lazyHealthCheckSampleSize=100, lazyHealthCheckMode='TimeoutOnly'})
newServer({address="45.90.28.62", pool="nextdns", healthCheckMode='lazy', lazyHealthCheckFailedInterval=30, maxCheckFailures=3, lazyHealthCheckThreshold=30, lazyHealthCheckSampleSize=100, lazyHealthCheckMode='TimeoutOnly'})
newServer({address="45.90.30.62", pool="nextdns", healthCheckMode='lazy', lazyHealthCheckFailedInterval=30, maxCheckFailures=3, lazyHealthCheckThreshold=30, lazyHealthCheckSampleSize=100, lazyHealthCheckMode='TimeoutOnly'})
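
For readability, the four identical nextdns declarations above can also be written as a loop. This is only a sketch with the same addresses and option values as the lines above; `nextdnsAddresses` is a made-up variable name, and the comments reflect my reading of the dnsdist documentation for these parameters rather than anything confirmed in this thread.

```lua
-- Sketch only: same options as the four newServer() calls above.
-- `nextdnsAddresses` is a hypothetical name, not part of dnsdist.
local nextdnsAddresses = {
  "[2a07:a8c0::3f:2ba1]",
  "[2a07:a8c1::3f:2ba1]",
  "45.90.28.62",
  "45.90.30.62",
}

for _, addr in ipairs(nextdnsAddresses) do
  newServer({
    address = addr,
    pool = "nextdns",
    -- 'lazy': only send active health checks once regular traffic
    -- shows enough failures, instead of probing all the time
    healthCheckMode = 'lazy',
    -- only timeouts count as failures in the sample
    lazyHealthCheckMode = 'TimeoutOnly',
    -- look at (at most) the last 100 queries ...
    lazyHealthCheckSampleSize = 100,
    -- ... and start active checks once 30% of them failed
    lazyHealthCheckThreshold = 30,
    -- seconds between active checks while the backend is failing
    lazyHealthCheckFailedInterval = 30,
    -- failed checks tolerated before the backend is marked down
    maxCheckFailures = 3,
  })
end
```

Behaviour should be the same as with the explicit lines; it just keeps the health-check options in one place.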

Note that my actions have this (excerpt):

addAction(PoolAvailableRule("nextdns"), PoolAction("nextdns"))
addAction(AllRule(), PoolAction(""))
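
As far as I understand `PoolAvailableRule`, it only matches while the pool has at least one server that is up, so with all nextdns servers marked down, every query falls through to the second rule and goes to the default pool. A quick way to confirm the state each backend is in is the console; this is a minimal sketch, assuming the console is reachable via `dnsdist -c`:

```lua
-- Console sketch: run inside `dnsdist -c`, not from dnsdist.conf.
-- Prints one row per backend, including its pools and current state.
showServers()
```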

Manually restarting dnsdist later makes the nextdns upstreams work.

Obviously, during boot the WAN interfaces are still down, so the upstreams are unreachable. I was hoping the lazy health check would recover after some time, but that does not seem to happen. Any ideas?


16:25 ch@las-sh01:~ % uname -a
Linux las-sh01 6.1.0-18-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01) x86_64 GNU/Linux
16:26 ch@las-sh01:~ % lsb_release -a
No LSB modules are available.
Distributor ID:	Debian
Description:	Debian GNU/Linux 12 (bookworm)
Release:	12
Codename:	bookworm
16:26 ch@las-sh01:~ % dpkg -l dnsdist
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version                                   Architecture Description
+++-==============-=========================================-============-=================================
ii  dnsdist        1.9.0+master.38.g22c29777a-1pdns.bookworm amd64        DNS loadbalancer