Linux DNS stopped working
The newest Linux/Ubuntu 22.04+ DNS resolution stack is a fucking spaghetti of resolvconf, systemd-resolved, NetworkManager and 34724 other cryptic shit. In the good old days you would add a nameserver to /etc/resolv.conf and be done with it. Now you can’t even edit the file since it’s a symlink to ../run/resolvconf/resolv.conf with nameserver 127.0.0.53, which is a DNS server run locally on your machine by systemd-resolved.
If you see something like this, read on:
$ ping www.google.com
ping: www.google.com: Name or service not known
$ host -v www.google.com
Trying "www.google.com"
;; connection timed out; no servers could be reached
$ host -v www.google.com 127.0.0.53
Trying "www.google.com"
;; connection timed out; no servers could be reached
$ host -v www.google.com 8.8.8.8
Trying "www.google.com"
;; now works
How DNS resolution works in Linux as of 2024
As a rule of thumb, most programs (ping, ssh, firefox) use the library calls getaddrinfo or gethostbyname provided by glibc.
glibc then employs the GNU Name Service Switch (NSS) to perform a lookup. NSS is configured via a file called /etc/nsswitch.conf, which determines the lookup priority and the policy for performing different lookups.
The important line from that file is this one:
hosts: files mdns4_minimal [NOTFOUND=return] dns mymachines
See man page for nsswitch.conf. The rule says:
- files: Consult a file for known host names, which is /etc/hosts.
- If nothing is found: try mdns4_minimal, which calls nss-mdns to perform a .local mDNS lookup (more on this later).
- [NOTFOUND=return] says: if mdns4_minimal was not able to resolve a .local domain, stop the lookup: no point resolving .local in a DNS server since DNS servers don’t contain such records. However, if mdns4_minimal is not able to resolve www.google.com, this rule does not apply and the next source is tried.
- dns: ask a DNS server mentioned in /etc/resolv.conf. Nowadays it’s 127.0.0.53:53, which goes to the local systemd-resolved.
Some programs are designed to perform DNS-server requests only: host, dig and resolvectl query. They skip the NSS configuration and go directly to the DNS server, which these days is systemd-resolved. They usually fail to resolve .local names, since DNS is not responsible for holding .local records. That’s why host can fail to resolve machines on your LAN while ping succeeds. But more on this later.
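To see which sources NSS will consult on a given machine, the hosts line can be pulled out of /etc/nsswitch.conf. The following is only a minimal sketch: nss_hosts_sources is a hypothetical helper name, not a standard tool, and it assumes the file has a single hosts: line.

```shell
# Print the lookup sources from the "hosts:" line, one per line, in the
# order glibc tries them. nss_hosts_sources is a made-up helper name; the
# file argument defaults to the system-wide NSS configuration.
nss_hosts_sources() {
    awk '$1 == "hosts:" { for (i = 2; i <= NF; i++) print $i }' \
        "${1:-/etc/nsswitch.conf}"
}
```

Running nss_hosts_sources with no arguments reads the real configuration. To exercise the NSS path from the command line, getent hosts www.google.com walks the same lookup chain as ping, whereas host www.google.com talks to the DNS server only.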
systemd-resolved
Since /etc/resolv.conf isn’t powerful enough to handle all sorts of crazy scenarios like VPNs, different network interfaces having different DNS servers, and VMs, a different solution was designed: systemd-resolved.
systemd-resolved handles DNS resolution in modern Linux distros. It exposes a local (your-machine-only) DNS server on 127.0.0.53:53; /etc/resolv.conf then contains nameserver 127.0.0.53, which tells all Linux commands to resolve DNS via systemd-resolved.
To see the systemd-resolved DNS routing scheme, run the following:
$ resolvectl status
Global
Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub
Link 2 (enp2s0f0)
Current Scopes: none
Protocols: -DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Link 5 (wlp3s0)
Current Scopes: DNS
Protocols: +DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 192.168.0.1
DNS Servers: 192.168.0.1
Link 6 (virbr0)
Current Scopes: none
Protocols: -DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Link 7 (docker0)
Current Scopes: none
Since DNS is NOT responsible for .local resolution, generally you want -mDNS and -LLMNR (those services disabled). They’re disabled by default.
However, systemd-resolved should resolve all other DNS names. You can test this with the following commands:
$ host www.google.com
$ host www.google.com 8.8.8.8
$ dig www.google.com
$ resolvectl query www.google.com
Additional DNS servers (e.g. local dnsmasq)
You can force systemd-resolved to use additional DNS servers you need (e.g. a local dnsmasq server). Edit /etc/systemd/resolved.conf and set DNS=127.0.0.1 (or another IP; see the resolved.conf man page for more info).
Then restart systemd-resolved and check that your DNS server is now in effect:
$ sudo systemctl restart systemd-resolved
$ resolvectl status
systemd-resolved will now try your custom DNS server first, but it will fall back to whatever it used before, since the additional DNS servers come from your link (via dhclient or NetworkManager). This is exactly what we want: first try my custom DNS, but then fall back to a DNS server that can resolve everything, so that my network resolution continues to work correctly.
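Instead of editing /etc/systemd/resolved.conf in place, the same DNS= setting can live in a drop-in file under /etc/systemd/resolved.conf.d/, which systemd-resolved also reads. A minimal sketch, assuming root privileges; write_dns_dropin and the file name 10-custom-dns.conf are hypothetical names chosen for illustration:

```shell
# Write a drop-in that sets a global DNS server for systemd-resolved.
# usage: write_dns_dropin <dns-ip> [drop-in-dir]
# The function name and the file name 10-custom-dns.conf are made up; the
# directory defaults to the standard resolved.conf drop-in location.
write_dns_dropin() {
    dns_ip="$1"
    dir="${2:-/etc/systemd/resolved.conf.d}"
    mkdir -p "$dir" || return 1
    printf '[Resolve]\nDNS=%s\n' "$dns_ip" > "$dir/10-custom-dns.conf"
}
```

For example, run write_dns_dropin 127.0.0.1 as root, then restart systemd-resolved and check resolvectl status. Deleting the drop-in file and restarting the service undoes the change without touching the main config file.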
Troubleshooting
Print the DNS servers for all links via resolvectl status. Make sure you can ping the DNS servers by their IP address (that is, that they are actually reachable from your client machine).
You can also view the systemd-resolved logs and status; they may reveal something of interest:
$ journalctl -u systemd-resolved.service
$ systemctl status systemd-resolved.service
Alternatively, try restarting the systemd-resolved service:
$ sudo systemctl restart systemd-resolved.service
$ resolvectl status
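The reachability check can be scripted. The sketch below pulls the addresses out of the DNS Servers: lines of resolvectl status; dns_servers_from_status is a hypothetical helper name, and it assumes all servers of a link sit on that one line (wrapped continuation lines are not handled).

```shell
# Read `resolvectl status` output on stdin and print each address listed on
# a "DNS Servers:" line once, in order of first appearance.
# "Current DNS Server:" lines do not match (no trailing "s");
# "Fallback DNS Servers:" lines, if present, do match.
dns_servers_from_status() {
    awk '/DNS Servers:/ {
        sub(/^.*DNS Servers:[ \t]*/, "")
        n = split($0, servers, /[ \t]+/)
        for (i = 1; i <= n; i++)
            if (servers[i] != "" && !seen[servers[i]]++)
                print servers[i]
    }'
}

# Usage (commented out so that sourcing this snippet has no side effects):
# resolvectl status | dns_servers_from_status | while read -r ip; do
#     ping -c 1 -W 2 "$ip" >/dev/null && echo "$ip reachable" || echo "$ip UNREACHABLE"
# done
```

A server that shows up as unreachable here will also time out when systemd-resolved forwards queries to it, which matches the symptoms at the top of this post. Keep in mind that some DNS servers drop ICMP, so a failed ping is a hint rather than proof.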
WireGuard + Ubuntu 22.10
If you run wg-quick up mywg and suddenly DNS stops working (while it worked on Ubuntu 22.04), the reason is that resolvectl is now configured a bit differently when the WireGuard conf file contains a DNS entry.
- On Ubuntu 22.04, any WireGuard DNS entry would go into the Global section of resolvectl status, and your WireGuard link’s Current Scopes would read none.
- On Ubuntu 22.10, the WireGuard DNS entry goes into your WireGuard link, sets the Current Scopes to DNS and adds DNS Domain: ~. which breaks your DNS, presumably because ~. is a catch-all routing domain that steers all lookups to the WireGuard link’s DNS server, and resolution fails if that server cannot be reached.
Workaround: comment out the DNS entry in your WireGuard conf file.
Local/mDNS/LLMNR
The .local top-level domain (TLD) is reserved for use on your LAN. Every machine has its own hostname and, if configured properly, is accessible on your LAN under the name hostname.local.
There are two competing standards, mDNS (multicast DNS) and LLMNR; both work on the same principle of machines listening for multicast queries. mDNS is primarily used by Linux and Apple, LLMNR primarily by Windows; I’ll focus on mDNS.
mDNS performs a lookup by sending a multicast query to UDP 224.0.0.251:5353, which delivers it to all mDNS-capable machines on your LAN. On Linux, such machines run avahi-daemon, which listens on UDP port 5353 and replies with the machine’s IP address if the lookup targets that particular machine.
mDNS Resolvers
In modern Linux desktops, you have multiple mDNS resolvers:
- nss-mdns plugs into glibc as a GNU Name Service Switch (NSS) module and resolves mDNS via avahi-daemon (see StackExchange). You can test this resolver via avahi-resolve --name foo.local. All Linux programs, including ping and ssh, use this one.
- host, dig and resolvectl query use systemd-resolved to resolve mDNS.
Since ping uses avahi-daemon, it’s possible that ping is able to resolve foo.local while host and resolvectl query can’t, since they go through systemd-resolved.
Remember that even though systemd-resolved is technically capable of performing mDNS resolution, it really shouldn’t. Having mDNS turned off in systemd-resolved is the right configuration.
Enabling nss-mdns
Enabling this one is more important than enabling mDNS in systemd-resolved, since all Linux programs except DNS clients such as host and dig use this method.
See nss-mdns: Activation. To activate this NSS module, edit /etc/nsswitch.conf and make sure mdns4 or mdns4_minimal is included in the hosts: line. Also make sure libnss-mdns is installed: sudo apt install libnss-mdns.
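A quick way to verify the nsswitch.conf part is a small check like the one below. This is only a sketch: hosts_line_has_mdns is a hypothetical helper name, and it simply looks for any source on the hosts: line whose name starts with mdns.

```shell
# Succeed (exit status 0) if the "hosts:" line lists an mdns* source such as
# mdns4 or mdns4_minimal; fail (exit status 1) otherwise.
# hosts_line_has_mdns is a made-up helper name; the file argument defaults
# to the system-wide NSS configuration.
hosts_line_has_mdns() {
    awk '$1 == "hosts:" { for (i = 2; i <= NF; i++) if ($i ~ /^mdns/) found = 1 }
         END { exit !found }' "${1:-/etc/nsswitch.conf}"
}
```

For example, hosts_line_has_mdns && echo "mdns source present" prints the message only when the source is configured. On Debian and Ubuntu, dpkg -s libnss-mdns reports whether the package itself is installed.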
Enabling mDNS in systemd-resolved (not recommended)
By default, resolvectl query foo.local, host and dig will fail to resolve the local server. That’s the recommended setup, but your use case may require mDNS in systemd-resolved. First, run
$ resolvectl status
and check whether mDNS is enabled. If not, see ArchLinux Wiki on enabling mDNS:
- sudo vim /etc/systemd/resolved.conf and enable MulticastDNS in [Resolve].
- Restart systemd-resolved: sudo systemctl restart systemd-resolved.service
mDNS from a virtual machine
The multicast query may not traverse more complex network setups, and VM networks pretty much fall into that category. One solution is to run the VM in bridged networking mode, but beware: this exposes your VM to all machines on your LAN. For example:
- Linux-on-Linux KVM with virtio and NAT: the guest can only resolve the host but not any other machine on the network.
  - Switching to bridged mode solved this. See Virt Manager: Bridge for more info.
- Linux-on-Mac UTM with virtio-net-pci and Shared Network: the same situation, the guest can only resolve the host but not any other machine on the network. See UTM Network docs for more details.
  - Switching to bridged mode solved this: it exposed the VM as if running on the LAN along with the host MacBook, and mDNS lookups started to pick up other machines on the LAN. The VM name was also resolvable from other LAN machines and was even pingable.
Another way could be NAT traversal but I have no idea whether mDNS can be configured that way.
Further Down The Rabbit Hole
I haven’t touched VPNs and other more complex stuff; please see the gnome.org blogpost on understanding DNS for more details.