I've recently changed DNS provider for this blog, and that forced me to look into how DNS works a bit closer. I did manage a DNS server for a couple years circa 2006, but I have to say I'd forgotten most of it. In case I forget it again, I'm recording my notes here.

## DNS zone

A DNS "zone" is a set of records. There are various types of records; in general, a record is essentially a line of text of the form:

domain ttl class type data


The class is usually IN (standing for "internet"). There are many types; some common ones are A, MX, NS, and TXT. A records indicate the IP address of the server corresponding to the listed domain; MX records indicate the address (usually as a domain name) of the mail server to send emails @ that domain to; NS records indicate the name servers for the listed domain; and TXT records are free-form text. The TTL (time to live) is a number of seconds during which the entry can be cached.

The DNS zone for a given domain can include entries for subdomains, which is why the domain is part of the record. For example, these two records are part of the one zone for cuddly-octo-palm-tree.com:

cuddly-octo-palm-tree.com.        30    IN    A    52.204.159.248
www.cuddly-octo-palm-tree.com.    30    IN    A    52.204.159.248


There are two other common types of records: SOA records declare the "start of authority" for a domain (more on that below), and CNAME records are essentially redirects: their data field is a domain name, and they assert that in order to resolve the IP address of the domain on the left, you should resolve the A record for the domain on the right.

It is bad form for the data field of a CNAME entry to point to a domain name that itself ultimately results in a CNAME record; good DNS practice would limit the number of such redirections to a single one. Also note that DNS requests ask for specific records, so it's the responsibility of the client to know to ask for a CNAME record if the request for the A record fails.

## Caching server

A DNS "name server" is a system that serves DNS records, i.e. an IP address that responds to DNS queries. Queries come in the form of domain + type, i.e. "please give me the A record for example.com". If the server has that in its own cache, it checks when it got it and whether that's more seconds ago than the TTL. If it is not, it means the entry is still valid, so it serves it.

If the server does not have a fresh-enough response, it needs to fetch it. There may be multiple levels of caching here; many DNS servers have another DNS server configured as a fallback. But, ultimately, we'll reach a point where we need to query the authority on that domain. This is where the hierarchical nature of DNS comes in.

## Authoritative server

DNS segments read from right to left: images.example.com is a subdomain of example.com which is a subdomain of com. If subdomains are thought of as being under their domain, it makes sense that the last segment is called a top-level domain. There is a finite, well-known list of top-level domains (TLDs), administered (the list) by ICANN.

Imagine you are a DNS server trying to resolve the IP address (i.e. the A record) for images.example.com. Who do you ask? There are a handful of well-known root servers, so that's where you should start. They, in essence, are considered to have the authority over all public DNS.

Servers can, through NS records, delegate authority for subdomains to other servers. The root servers delegate to the TLDs, which delegate to individual DNS servers, which can in turn delegate further, and so on. The system assumes you start by querying the root servers and work your way down, and it's the responsibility of the server to tell you when you can stop because it thinks it's the authority for the domain you're querying.

This may be a bit surprising so let me reiterate: a server being authoritative over a domain is something the server itself asserts. "Parent" servers redirect requests, but do not specifically assert that the servers they send to have authority, just that they might know more.

In practice, though, you do expect that, if you follow the redirections from the root servers down, you'll end up at a server that asserts its authority, and, conversely, that if a server asserts authority over a (public) DNS domain, it is a server you can reach by working your way down from the root servers. In both cases, possibly after multiple levels of redirection.

## DNS requests

Let's walk through a hypothetical example of trying to resolve the A record for images.example.com, and let us assume that the zone for images.example.com is stored in ns2.example.com, while the zone for example.com is stored in ns1.example.com.

DNS responses are not just a line with a single record. They are structured messages with multiple sections, among which QUESTION (DNS goes over UDP, not TCP, so it's useful to have the question repeated), ANSWER, and AUTHORITY. A DNS response also contains a number of flags, of which the most important for our discussion here is the aa flag, which is set when the server sending the response considers itself in a position to give an authoritative answer.

The first request our hypothetical resolver would send would be to one of the root servers (assuming we're in the context of resolving public DNS), and it would ask for the entirety of the query (i.e. the resolver does not explicitly deconstruct the domain itself, and instead lets servers do that).

The root server would send back a response in which the ANSWER section is empty, because it does not consider itself the authority on the images.example.com domain. That response also does not have the aa flag set. In the AUTHORITY section, the response will have a list of servers the root server thinks may be able to answer, in this case a list of DNS servers that the root server believes may know more about the com domain. If we're lucky (this will be the case for the root servers, but may not always be true as we move down the chain) the response will also contain an ADDITIONAL section with the A (and AAAA, which are like A but for IPv6 addresses) records for the domains listed in AUTHORITY, saving us a recursive-trip to resolve them.

The next two steps would be similar: we send the same request for an A record to one of the servers suggested by the response from the root server, which will similarly send back a non-authoritative response that contains ns1.example.com in its AUTHORITY section, its IP address in the form of an A record in the ADDITIONAL section, and an empty ANSWER section. The response from ns1.example.com will, again, be similar, and send us to ns2.example.com.

Finally, the response from ns2.example.com would generally not contain either an AUTHORITY or an ADDITIONAL section, but just an ANSWER section with the actual records requested. Importantly, that response would have the aa flag set, indicating that the resolver can cache it (for the duration of the TTL).

This lets us look at NS records in a slightly different way depending on where they appear. When they appear in a parent zone, they are delegation records, indicating that maybe you should look over there for your answer. When they appear in the zone itself, they are authority records, which clients are expected to cache. The DNS server serving the response knows it's the authority because of the SOA record it has.

Every DNS zone should have an SOA record.1 Among other uses, this record will be returned in the AUTHORITY section of a DNS response that does not yield any result (i.e. when asking for a record that does not exist). For example, for this blog, the zone is defined on cuddly-octo-palm-tree.com with no subdomain (i.e. no NS entry). Requesting a non-existent domain below that yields:

$dig this.does-not-exist.cuddly-octo-palm-tree.com @ns-582.awsdns-08.net. +norec ; <<>> DiG 9.10.6 <<>> this.does-not-exist.cuddly-octo-palm-tree.com @ns-582.awsdns-08.net. +norec ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 43820 ;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;this.does-not-exist.cuddly-octo-palm-tree.com. IN A ;; AUTHORITY SECTION: cuddly-octo-palm-tree.com. 900 IN SOA ns-582.awsdns-08.net. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400 ;; Query time: 59 msec ;; SERVER: 205.251.194.70#53(205.251.194.70) ;; WHEN: Sun Oct 17 20:44:42 CEST 2021 ;; MSG SIZE rcvd: 155$


\$ dig this.does-not-exist.cuddly-octo-palm-tree.com @a.root-servers.net +norec


and follow the responses. The +norec flag tells dig not to recurse itself, so you can have the joy of doing it manually.

## Registrars

TLDs are often owned by entities with no desire to interact directly with DNS users, or to host a DNS server for their TLD. I imagine the details may vary from TLD to TLD, but, in general, people who want to register a domain name (registrants) under a TLD need to go through a registrar that has an existing relationship with the corresponding TLD owner.

Registrars make sure the very first step in DNS resolution works. The details are not super important here, but the registrar can be thought of as managing the DNS zone of the TLD, with the special case that that zone only contains NS entries.

That is, of course, not all a registrar does: the bulk of their work is to keep track of who owns each domain name, following the rules of ICANN as well as the additional constraints imposed by the TLD (e.g. the eu TLD mandates that domain holders be EU residents).

Registering a domain name is generally not very useful unless you also have a server hosting your zone. Most "DNS providers" will serve both roles and do the "tell your registrar about your zone" step for you, but it is definitely possible for the registrar and the DNS name server host to be separate entities. In that case, the registrant needs to tell the registrar the names of the DNS servers that host the zone, essentially replicating the NS record for the domain.

## hosts file

End-user devices are generally configured to look for DNS entries in two steps. First, there is a hosts file somewhere (/etc/hosts on Unix systems, C:\Windows\System32\drivers\etc\hosts on Windows) that takes absolute priority. This is a simple text file where each line is an entry with two space-separated fields, where the first field is an IP address and the second field is a domain name. Any request for a domain name present in the hosts file will resolve to the IP address on the same line.

This may or may not be how "end-of-caching-chain" DNS servers resolve TLDs.

If an address is not in the hosts file (which should be the vast majority of them), then the device will send a DNS query to its configured DNS server(s). For most consumer devices those are set automatically to their ISP's DNS servers through the DHCP protocol, but that can be overridden. In either case, the DNS servers are designated by their IP address, as specifying a name for them would be a bit recursive. In a bad way.

The hosts file is very useful for temporary testing, but in general it shouldn't be relied upon too much as it gives you a different view of the internet from everyone else's.

## curl DNS override

I recently discovered that the curl command has a CLI flag that lets you override DNS resolution. This is a great alternative to a temporary change to the hosts file as it does not require root access and there is no risk to forget to set things back.

The flag is --connect-to domain:port:IP:port, and it will resolve domain to IP (and map port to port, though I generally set those to the same value, as you might have guessed) without going through any DNS resolution.

## Conclusion

That's it: my working knowledge of how DNS works. It's a bit fuzzy in some places, but it's been good enough to let me "make things work" so far.

If any of the above is blatantly wrong, please let me know.

1. Every DNS provider I've ever used filled that record for me, so I've never had a reason to dig very deeply into its internal structure.