
You Learned DNS Once. Then You Forgot Everything Except “It Propagates.”

DNS is one of those technologies that every developer encounters, few truly understand, and most troubleshoot by waiting 48 hours and hoping. The “just wait for propagation” advice is usually wrong — DNS changes do not propagate like a wave. They expire from caches at different times based on TTL values. That distinction matters when you are debugging why half your users see the old site and half see the new one.

This is not a networking textbook chapter. This is the practical DNS knowledge that backend developers actually need when they are setting up domains, debugging connectivity issues, or trying to figure out why their email goes to spam.

How DNS Resolution Actually Works

When a user types api.example.com into their browser, here is what actually happens:

  1. Browser cache: The browser checks its own DNS cache first. Chrome caches DNS entries for 60 seconds by default.
  2. OS resolver: If not cached, the OS resolver checks its cache (often managed by systemd-resolved on Linux, mDNSResponder on macOS).
  3. Recursive resolver: If not cached locally, the query goes to a recursive resolver (your ISP, or 8.8.8.8, or 1.1.1.1). This is the server that does the actual work.
  4. Root nameservers: The recursive resolver asks a root nameserver “who handles .com?”
  5. TLD nameservers: The root responds with the .com TLD nameservers. The recursive resolver asks them “who handles example.com?”
  6. Authoritative nameservers: The TLD nameservers respond with the authoritative nameservers for example.com. The recursive resolver asks them “what is the address of api.example.com?”
  7. Answer: The authoritative nameserver returns the IP address, and the recursive resolver caches it for the duration specified by the TTL.

# Watch the full resolution chain with dig +trace
$ dig +trace api.example.com

; <<>> DiG 9.18.18 <<>> +trace api.example.com
;; global options: +cmd
.                 86400   IN  NS  a.root-servers.net.
.                 86400   IN  NS  b.root-servers.net.
;; Received 239 bytes from 127.0.0.53#53

com.              172800  IN  NS  a.gtld-servers.net.
com.              172800  IN  NS  b.gtld-servers.net.
;; Received 836 bytes from 198.41.0.4#53(a.root-servers.net)

example.com.      172800  IN  NS  ns1.example.com.
example.com.      172800  IN  NS  ns2.example.com.
;; Received 286 bytes from 192.5.6.30#53(a.gtld-servers.net)

api.example.com.  300     IN  A   93.184.216.34
;; Received 62 bytes from 93.184.216.34#53(ns1.example.com)
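
The referral chain above can be modeled without touching the network. The following is a minimal sketch, not a real resolver: the `SERVERS` table, its server labels, and the `resolve` helper are all hypothetical, and the data simply mirrors the trace output shown above.

```python
# Toy delegation data. Each "server" either refers the resolver onward
# or answers. Names and addresses are illustrative, not live lookups.
SERVERS = {
    "root": {"referrals": {"com.": "tld-com"}},
    "tld-com": {"referrals": {"example.com.": "auth-example"}},
    "auth-example": {"answers": {"api.example.com.": ("93.184.216.34", 300)}},
}


def resolve(name, start="root"):
    """Follow referrals from the root until a server answers.

    Returns (address, ttl, path), where path lists the servers asked.
    """
    server, path = start, []
    while True:
        path.append(server)
        zone = SERVERS[server]
        if name in zone.get("answers", {}):
            address, ttl = zone["answers"][name]
            return address, ttl, path
        # Pick the most specific delegation that is a suffix of the name.
        matches = [
            z for z in zone.get("referrals", {})
            if name == z or name.endswith("." + z)
        ]
        if not matches:
            raise LookupError(f"{server} has no data for {name}")
        server = zone["referrals"][max(matches, key=len)]


address, ttl, path = resolve("api.example.com.")
print(address, ttl, " -> ".join(path))
```

A real recursive resolver also caches every referral it sees, which is why most queries never reach the root servers at all.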

Record Types That Actually Matter

A and AAAA Records

A records map a hostname to an IPv4 address. AAAA records map to IPv6. These are the most fundamental DNS records:

# A record: hostname -> IPv4
api.example.com.    300    IN    A    93.184.216.34

# AAAA record: hostname -> IPv6
api.example.com.    300    IN    AAAA    2606:2800:220:1:248:1893:25c8:1946

# You can have multiple A records for the same hostname (round-robin DNS)
api.example.com.    300    IN    A    93.184.216.34
api.example.com.    300    IN    A    93.184.216.35
# Resolvers typically rotate the order of the answers, so clients that take the first address get spread across them
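
To see why multiple A records spread traffic, here is a small sketch that assumes the server rotates the record order by one on every query and that each client connects to the first address it receives. Real servers and clients vary, and `rotating_answers` is a hypothetical helper, not part of any DNS library.

```python
from collections import deque


def rotating_answers(addresses):
    """Yield the record set with its order rotated by one on each query."""
    ring = deque(addresses)
    while True:
        yield list(ring)
        ring.rotate(-1)  # move the first address to the end


answers = rotating_answers(["93.184.216.34", "93.184.216.35"])
first_choices = [next(answers)[0] for _ in range(4)]
print(first_choices)
# ['93.184.216.34', '93.184.216.35', '93.184.216.34', '93.184.216.35']
```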

CNAME Records

CNAME (Canonical Name) records create an alias from one hostname to another. They are the most misunderstood record type:

# CNAME: "www" is an alias for "example.com"
www.example.com.    3600    IN    CNAME    example.com.

# Common use: pointing to a CDN or load balancer
app.example.com.    300     IN    CNAME    d123456.cloudfront.net.

# CRITICAL RULE: A CNAME record cannot coexist with any other
# record type at the same name. This means you CANNOT put a
# CNAME on your root domain (the zone apex, example.com) because
# the apex always has SOA and NS records.

# This is INVALID:
# example.com.    300    IN    CNAME    myapp.herokuapp.com.  ← WRONG
# example.com.    300    IN    SOA      ...                    ← conflicts

# Solution: Use ALIAS/ANAME records (provider-specific) or 
# Cloudflare's CNAME flattening
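
The coexistence rule is easy to check mechanically before you push a zone change. A minimal sketch, assuming records are available as `(name, type)` pairs; the `cname_conflicts` helper is hypothetical, and it ignores the DNSSEC record types that are allowed next to a CNAME.

```python
from collections import defaultdict


def cname_conflicts(records):
    """Return names that hold a CNAME alongside any other record type.

    `records` is an iterable of (name, rtype) pairs.
    """
    types_by_name = defaultdict(set)
    for name, rtype in records:
        types_by_name[name.lower()].add(rtype.upper())
    return sorted(
        name
        for name, types in types_by_name.items()
        if "CNAME" in types and len(types) > 1
    )


zone = [
    ("example.com.", "SOA"),
    ("example.com.", "NS"),
    ("example.com.", "CNAME"),      # invalid: the apex already has SOA and NS
    ("www.example.com.", "CNAME"),  # fine: nothing else lives at this name
]
print(cname_conflicts(zone))  # ['example.com.']
```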

MX Records

MX (Mail Exchange) records tell other mail servers where to deliver email for your domain:

# MX records with priority (lower = preferred)
example.com.    3600    IN    MX    10    mail1.example.com.
example.com.    3600    IN    MX    20    mail2.example.com.
example.com.    3600    IN    MX    30    mail-backup.example.com.

# If you use Google Workspace:
example.com.    3600    IN    MX    1     aspmx.l.google.com.
example.com.    3600    IN    MX    5     alt1.aspmx.l.google.com.
example.com.    3600    IN    MX    5     alt2.aspmx.l.google.com.
example.com.    3600    IN    MX    10    alt3.aspmx.l.google.com.
example.com.    3600    IN    MX    10    alt4.aspmx.l.google.com.
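
A sending mail server tries the lowest preference value first and picks among equal values in no particular order. The sketch below shows that ordering under those assumptions; `delivery_order` is a hypothetical helper, and real mail servers add retries and timeouts on top.

```python
import random


def delivery_order(mx_records, rng=random):
    """Order MX targets by preference, shuffling hosts that tie.

    `mx_records` is a list of (preference, hostname) pairs.
    """
    by_pref = {}
    for pref, host in mx_records:
        by_pref.setdefault(pref, []).append(host)
    ordered = []
    for pref in sorted(by_pref):
        hosts = by_pref[pref][:]
        rng.shuffle(hosts)  # equal preference: order is not significant
        ordered.extend(hosts)
    return ordered


print(delivery_order([
    (20, "mail2.example.com."),
    (10, "mail1.example.com."),
    (30, "mail-backup.example.com."),
]))
```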

TXT Records

TXT records store arbitrary text and are used for domain verification, email authentication, and various proofs of domain ownership:

# SPF record: which servers can send email for your domain
example.com.    3600    IN    TXT    "v=spf1 include:_spf.google.com ~all"

# DKIM record: public key for email signing
google._domainkey.example.com.    3600    IN    TXT    "v=DKIM1; k=rsa; p=MIIBIjANBg..."

# DMARC record: email authentication policy
_dmarc.example.com.    3600    IN    TXT    "v=DMARC1; p=reject; rua=mailto:dmarc@example.com"

# Domain verification (Google, Microsoft, etc.)
example.com.    3600    IN    TXT    "google-site-verification=abc123..."

# Let's Encrypt DNS-01 challenge
_acme-challenge.example.com.    300    IN    TXT    "gfj9Xq...Rg85nM"
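
When you are checking these records in a script, it helps to split them into their parts. A minimal sketch, assuming well-formed input: `parse_tags` handles the semicolon-separated `key=value` style used by DKIM and DMARC, and `spf_terms` splits an SPF record on whitespace. Both helpers are hypothetical and skip the edge cases a full parser would cover.

```python
def parse_tags(txt_value):
    """Split a 'k=v; k=v' TXT payload (DKIM/DMARC style) into a dict."""
    tags = {}
    for part in txt_value.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        tags[key.strip()] = value.strip()
    return tags


def spf_terms(txt_value):
    """Split an SPF record into its version tag and the list of terms."""
    version, *terms = txt_value.split()
    return version, terms


dmarc = parse_tags("v=DMARC1; p=reject; rua=mailto:dmarc@example.com")
print(dmarc["p"])  # reject
print(spf_terms("v=spf1 include:_spf.google.com ~all"))
```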

SRV Records

SRV records specify the host and port for specific services. They are used by protocols like SIP, XMPP, and LDAP, and increasingly by modern service discovery systems:

# Format: _service._protocol.domain.  TTL  IN  SRV  priority weight port target
_sip._tcp.example.com.    3600    IN    SRV    10 60 5060 sipserver.example.com.
_xmpp._tcp.example.com.   3600    IN    SRV    10 0  5222 xmpp.example.com.
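
The four SRV fields are easiest to handle once they are parsed into a typed structure. The sketch below assumes the record data arrives as the space-separated string shown above; `Srv`, `parse_srv_rdata`, and `preferred_targets` are hypothetical names, and the weighted choice among equal-priority targets is left out.

```python
from typing import NamedTuple


class Srv(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_rdata(rdata):
    """Parse 'priority weight port target' into an Srv tuple."""
    priority, weight, port, target = rdata.split()
    return Srv(int(priority), int(weight), int(port), target)


def preferred_targets(records):
    """Return the records that share the lowest (most preferred) priority."""
    best = min(r.priority for r in records)
    return [r for r in records if r.priority == best]


sip = parse_srv_rdata("10 60 5060 sipserver.example.com.")
print(f"{sip.target}:{sip.port}")
```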

TTL: The Most Important Number You Keep Ignoring

TTL (Time To Live) is the number of seconds a DNS record can be cached by resolvers. It is the single most important operational parameter in DNS, and getting it wrong is the source of most DNS-related headaches.

# Common TTL values and when to use them
# 300  (5 min)  - Records you change frequently or need to failover quickly
# 3600 (1 hour) - Standard records that change occasionally
# 86400 (1 day) - Records that rarely change (NS records, MX records)

# THE GOLDEN RULE FOR MIGRATIONS:
# 1. BEFORE the change: lower TTL to 300 (wait for the old TTL to expire)
# 2. Make the change
# 3. Verify everything works
# 4. AFTER the change: raise TTL back to 3600+

# Example timeline for migrating to a new server:
# Day 0:  Change TTL from 3600 to 300
# Day 1:  (The old 3600 TTL expires within one hour; waiting a full
#         day is extra margin for resolvers that hold records longer)
# Day 2:  Change the A record to the new IP
# Day 2+5min: Most users see the new IP (300s TTL)
# Day 3:  Verify everything, raise TTL back to 3600
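
The waiting periods in a migration are plain arithmetic on the TTL. A minimal sketch, assuming resolvers honor TTLs exactly (some do not); the two helper functions and the example dates are hypothetical.

```python
from datetime import datetime, timedelta


def earliest_safe_change(ttl_lowered_at, old_ttl_seconds):
    """Time by which every well-behaved cache has dropped the long-TTL copy."""
    return ttl_lowered_at + timedelta(seconds=old_ttl_seconds)


def latest_stale_answer(record_changed_at, current_ttl_seconds):
    """Latest time a well-behaved cache may still return the old address."""
    return record_changed_at + timedelta(seconds=current_ttl_seconds)


lowered = datetime(2024, 1, 1, 9, 0)
change_at = earliest_safe_change(lowered, old_ttl_seconds=3600)
print("Safe to change the A record from:", change_at)
print("Old IP gone from caches by:", latest_stale_answer(change_at, 300))
```

With an old TTL of 3600, the minimum wait after lowering the TTL works out to one hour. Anything beyond that is safety margin.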

Debugging DNS Like a Professional

Essential Commands

# Basic query
$ dig api.example.com
# Shows: the A record, its remaining TTL, and which server answered the query

# Query a specific record type
$ dig api.example.com AAAA
$ dig example.com MX
$ dig example.com TXT
$ dig _dmarc.example.com TXT

# Query a specific nameserver directly (bypass cache)
$ dig @ns1.example.com api.example.com

# Query a public resolver (check what the world sees)
$ dig @8.8.8.8 api.example.com
$ dig @1.1.1.1 api.example.com

# Short output (just the answer)
$ dig +short api.example.com
93.184.216.34

# Full trace (see the entire resolution chain)
$ dig +trace api.example.com

# Check if a specific resolver has stale data
$ dig @8.8.8.8 +short api.example.com
$ dig @1.1.1.1 +short api.example.com
# If these differ, one resolver is probably still serving a cached copy of the old value

# Reverse DNS lookup
$ dig -x 93.184.216.34

# Check the SOA record (useful for debugging zone issues)
$ dig example.com SOA

Common Debugging Scenarios

# Scenario: "I changed my DNS record but nothing happened"
# Step 1: Check what the authoritative server returns
$ dig @ns1.your-provider.com your-domain.com A
# If this shows the OLD value, your change did not save properly.
# If this shows the NEW value, it is a caching issue.

# Step 2: Check the TTL the old record was served with
# If the old record had a TTL of 86400 (24 hours), resolvers that
# cached it before your change can keep returning the old value
# for up to 24 hours.

# Scenario: "Email is not working after domain migration"
# Check MX records
$ dig example.com MX +short
10 mail.example.com.
# Then check that the MX target resolves
$ dig mail.example.com A +short
93.184.216.34
# Then check SPF
$ dig example.com TXT | grep spf
# Then check DMARC
$ dig _dmarc.example.com TXT
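
The first scenario above boils down to comparing three values: what you expect, what the authoritative server returns, and what a public resolver returns. A sketch of that decision, assuming you have already collected the answers (for example with `dig +short`); `diagnose` is a hypothetical helper and the addresses are made up.

```python
def diagnose(expected, authoritative_answer, resolver_answer):
    """Classify an 'I changed the record but nothing happened' report."""
    if authoritative_answer != expected:
        return "change not saved at the authoritative server"
    if resolver_answer != expected:
        return "resolver is serving a cached copy; wait out the old TTL"
    return "resolver already has the new value"


print(diagnose(
    expected="203.0.113.10",
    authoritative_answer="203.0.113.10",
    resolver_answer="93.184.216.34",
))
```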

DNS for Modern Infrastructure

Internal DNS and Service Discovery

In Kubernetes and Docker environments, internal DNS handles service discovery:

# Kubernetes DNS format:
# <service-name>.<namespace>.svc.cluster.local

# From inside a pod, you can resolve:
$ dig payment-service.production.svc.cluster.local
# Returns the ClusterIP of the payment-service

# Headless services return individual pod IPs:
$ dig payment-service-headless.production.svc.cluster.local
# Returns A records for each pod backing the service

# CoreDNS is the default DNS server in Kubernetes
# Its configuration lives in a ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
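
Service names follow a fixed pattern, so they can be built rather than typed by hand. The sketch below assumes the default `cluster.local` cluster domain and the search list a typical pod gets in its `/etc/resolv.conf`; both helper functions are hypothetical, and your cluster may be configured differently.

```python
def service_fqdn(service, namespace="default", cluster_domain="cluster.local"):
    """Build the fully qualified in-cluster name for a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def search_candidates(short_name, namespace, cluster_domain="cluster.local"):
    """Names a pod's resolver would try for a short name, in search-path order."""
    suffixes = [
        f"{namespace}.svc.{cluster_domain}",
        f"svc.{cluster_domain}",
        cluster_domain,
    ]
    return [f"{short_name}.{suffix}" for suffix in suffixes]


print(service_fqdn("payment-service", "production"))
# payment-service.production.svc.cluster.local
for candidate in search_candidates("payment-service", "production"):
    print(candidate)
```

The search list is why a pod in the `production` namespace can reach the service as plain `payment-service`: the first candidate already matches.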

DNS-Based Load Balancing

DNS can distribute traffic across multiple servers, but it has significant limitations compared to a real load balancer:

# Round-robin DNS: multiple A records
api.example.com.    30    IN    A    10.0.1.1
api.example.com.    30    IN    A    10.0.1.2
api.example.com.    30    IN    A    10.0.1.3

# Limitations:
# - No health checking (returns dead servers)
# - Clients may cache and stick to one IP
# - No session affinity
# - No weighted distribution (without provider support)

# Better: Use DNS for geographic routing, load balancers for server selection
# Route53 geolocation routing, Cloudflare load balancing, etc.
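
The health-check limitation is easy to quantify. This sketch assumes strict rotation and clients that always connect to the first address they receive. `failed_first_attempts` is a hypothetical helper and the numbers are a simulation, not measurements.

```python
from itertools import cycle, islice


def failed_first_attempts(addresses, dead, queries):
    """Count queries whose first-choice address is a dead server."""
    rotation = cycle(addresses)
    return sum(1 for ip in islice(rotation, queries) if ip in dead)


pool = ["10.0.1.1", "10.0.1.2", "10.0.1.3"]
failures = failed_first_attempts(pool, dead={"10.0.1.2"}, queries=300)
print(f"{failures} of 300 first attempts hit the dead server")
```

With one of three servers down, a third of first attempts fail until someone removes the record and the TTL runs out.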

DNS over HTTPS (DoH) and DNS over TLS (DoT)

Traditional DNS queries are sent in plaintext, usually over UDP, which means anyone on the network path can see which domains you resolve. DoH and DoT encrypt DNS queries in transit:

# Query DNS over HTTPS with curl
$ curl -s -H 'accept: application/dns-json' \
  'https://1.1.1.1/dns-query?name=example.com&type=A' | jq .

{
  "Status": 0,
  "Answer": [
    {
      "name": "example.com",
      "type": 1,
      "TTL": 3600,
      "data": "93.184.216.34"
    }
  ]
}
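
A JSON response like the one above is straightforward to consume in a script. A minimal sketch that parses the response text rather than making a network call; `a_records` is a hypothetical helper, and the field names are taken from the sample output shown above.

```python
import json

A_RECORD_TYPE = 1  # numeric DNS type code for an A record


def a_records(doh_json_text):
    """Extract (address, ttl) pairs for A records from a DoH JSON response."""
    body = json.loads(doh_json_text)
    if body.get("Status") != 0:
        raise LookupError(f"DNS query failed with status {body.get('Status')}")
    return [
        (answer["data"], answer["TTL"])
        for answer in body.get("Answer", [])
        if answer["type"] == A_RECORD_TYPE
    ]


sample = """
{
  "Status": 0,
  "Answer": [
    {"name": "example.com", "type": 1, "TTL": 3600, "data": "93.184.216.34"}
  ]
}
"""
print(a_records(sample))  # [('93.184.216.34', 3600)]
```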

The DNS Checklist for Every New Project

  1. Set reasonable TTLs: 300-600 for records that might change (A, AAAA), 3600+ for stable records (MX, NS).
  2. Configure email authentication: SPF, DKIM, and DMARC records. Without these, major mailbox providers are far more likely to reject your email or file it as spam.
  3. Use at least two nameservers: Preferably from different providers for redundancy.
  4. Add CAA records: Specify which certificate authorities can issue certificates for your domain.
  5. Set up monitoring: Monitor DNS resolution from multiple geographic locations.
  6. Document your DNS records: Keep a source-of-truth document (or use Infrastructure as Code) for all DNS records.
  7. Lower TTLs before migrations: Always reduce TTL at least one full TTL period before making changes.

DNS is not sexy infrastructure. Nobody writes blog posts about their beautiful DNS configuration. But DNS failures are some of the most visible and confusing outages you will ever encounter, because they manifest differently for different users based on caching, geography, and resolver behavior. Understanding how DNS actually works — not just how to add an A record in your provider’s dashboard — is the difference between resolving an issue in five minutes and spending three hours convinced your server is broken when it is actually a stale CNAME.

By Michael Sun

Founder and Editor-in-Chief of NovVista. Software engineer with hands-on experience in cloud infrastructure, full-stack development, and DevOps. Writes about AI tools, developer workflows, server architecture, and the practical side of technology. Based in China.
