DNS resolution inside clusters

Master the phonebook of the cluster. Learn how CoreDNS handles service discovery, how short names simplify cross-namespace communication, and how to keep DNS fast in very large clusters.

DNS Resolution: The Phonebook of the Kubernetes Cluster

In a world where Pods and Services are constantly being created and destroyed, you can't rely on IP addresses. We've said this many times. But what is the magic that allows a Next.js container to find a FastAPI service just by typing its name?

The answer is CoreDNS.

CoreDNS is a flexible and extensible DNS server that has been the default for Kubernetes since version 1.13. It is the "phonebook" of the cluster. Every time you create a Service, CoreDNS automatically serves a DNS record for it, and every Pod in your cluster is pre-configured to ask CoreDNS whenever it needs to look up another service.

In this lesson, we will master the CoreDNS Architecture, understand the structure of Service FQDNs (Fully Qualified Domain Names), and learn how to troubleshoot and optimize DNS for high-performance AI applications.


1. The Anatomy of a Kubernetes DNS Name

In Kubernetes, every service gets a very specific, hierarchical name. It follows this pattern: <service-name>.<namespace>.svc.cluster.local

  • service-name: The name you gave the service in its YAML (e.g., ai-api).
  • namespace: The namespace it lives in (e.g., production).
  • svc: Indicates that this is a Service record.
  • cluster.local: The default cluster domain (the "Last Name" of your cluster).
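The naming pattern above can be sketched as a small helper. This is only an illustration: the function name service_fqdn is hypothetical, and cluster.local is assumed as the cluster domain, which an administrator may have changed.

```python
def service_fqdn(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build the fully qualified DNS name of a Service from its parts."""
    for label in (service, namespace):
        # Service and namespace names are single DNS labels, so no dots allowed.
        if not label or "." in label:
            raise ValueError(f"invalid DNS label: {label!r}")
    return f"{service}.{namespace}.svc.{cluster_domain}"


print(service_fqdn("ai-api", "production"))
# ai-api.production.svc.cluster.local
```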

Short-hand Discovery:

You don't always have to type the whole thing.

  • Inside the same namespace: You can just use ai-api.
  • Across namespaces: You can use ai-api.production.

2. How the Pod Knows (The resolv.conf)

When a Pod starts up, the Kubelet injects a file into it: /etc/resolv.conf.

This file tells the Linux operating system where to go when it needs to resolve a name. It usually points to the internal IP of the CoreDNS Service.

The "Search" Path:

The magic of "Short names" comes from the search path.

search production.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10

When your application resolves ai-api, the OS resolver appends each search domain in turn, starting with ai-api.production.svc.cluster.local. If it finds a record there, it returns the IP. This is why you don't have to worry about long domain names during development.
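To make the search path concrete, here is a minimal sketch of the order in which a typical Linux resolver tries names. It assumes the ndots:5 option that Kubernetes normally writes into a Pod's resolv.conf; the function name candidate_names is hypothetical, and real resolvers have more edge cases than this model covers.

```python
from typing import List


def candidate_names(name: str, search: List[str], ndots: int = 5) -> List[str]:
    """Return the names a resolver would try, in order, for `name`."""
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: skip the search path.
        return [name]
    expanded = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [name] + expanded
    # Too few dots: walk the search list first, then try the bare name.
    return expanded + [name]


search = ["production.svc.cluster.local", "svc.cluster.local", "cluster.local"]
for candidate in candidate_names("ai-api", search):
    print(candidate)
```

Under these assumptions the first candidate for ai-api is ai-api.production.svc.cluster.local, and the first candidate for ai-api.production that can match is ai-api.production.svc.cluster.local via the second search domain, which is why both short forms work.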


3. The CoreDNS Engine: How it Operates

CoreDNS runs as a Deployment in the kube-system namespace. It is a very lightweight container that uses a Corefile for its configuration.

How it stays updated:

CoreDNS has a plugin called kubernetes. This plugin is constantly watching the API Server.

  • New Service Created: CoreDNS sees the event and immediately adds a new entry to its internal memory map.
  • Pod IPs Change: CoreDNS updates the records for Headless Services (Services without a cluster IP, often used for databases, whose DNS name resolves directly to the Pod IPs).
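The watch-and-update behavior described above can be modeled with a toy in-memory record store. This is not CoreDNS's actual implementation: the RecordStore class and its methods are hypothetical, and only the event type names (ADDED, MODIFIED, DELETED) mirror those used by Kubernetes watch events.

```python
from typing import Dict, List


class RecordStore:
    """Toy in-memory map from Service DNS name to IP addresses."""

    def __init__(self, cluster_domain: str = "cluster.local") -> None:
        self.cluster_domain = cluster_domain
        self.records: Dict[str, List[str]] = {}

    def fqdn(self, name: str, namespace: str) -> str:
        return f"{name}.{namespace}.svc.{self.cluster_domain}"

    def apply(self, event_type: str, name: str, namespace: str, ips: List[str]) -> None:
        """Update the map in response to one watch event."""
        key = self.fqdn(name, namespace)
        if event_type in ("ADDED", "MODIFIED"):
            self.records[key] = list(ips)
        elif event_type == "DELETED":
            self.records.pop(key, None)
        else:
            raise ValueError(f"unknown event type: {event_type}")

    def lookup(self, fqdn: str) -> List[str]:
        return self.records.get(fqdn, [])


store = RecordStore()
# Regular Service: one stable cluster IP.
store.apply("ADDED", "ai-api", "production", ["10.96.12.34"])
# Headless Service: the record holds the Pod IPs, so it changes when Pods do.
store.apply("ADDED", "db", "production", ["10.244.1.7", "10.244.2.9"])
store.apply("MODIFIED", "db", "production", ["10.244.1.7", "10.244.3.4"])
print(store.lookup("db.production.svc.cluster.local"))
# ['10.244.1.7', '10.244.3.4']
```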

4. Visualizing the DNS Lookup Process

sequenceDiagram
    participant App as FastAPI Pod
    participant OS as OS Resolver
    participant DNS as CoreDNS Pod
    participant API as K8s API Server
    
    API->>DNS: "New Service 'db' created at 10.0.0.5"
    App->>OS: "Where is 'db'?"
    OS->>OS: Apply search path: 'db.default.svc...'
    OS->>DNS: Query 'db.default.svc.cluster.local'
    DNS->>OS: "It's at 10.0.0.5"
    OS->>App: "Found it: 10.0.0.5"

5. External DNS and Custom Forwarding

What if your pod needs to talk to a service outside the cluster, like an internal On-Premise Database or a private Active Directory?

You can configure CoreDNS to "Forward" specific domains to an external DNS server.

The Corefile Customization:

ai-company.internal:53 {
    forward . 192.168.1.50
}

With this rule, any request for database.ai-company.internal will be sent to your corporate DNS server instead of trying to find it inside the cluster.
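CoreDNS routes each query to the server block whose zone is the most specific match for the queried name. The sketch below models that choice only; the function pick_zone and the zones dictionary are hypothetical stand-ins for a Corefile, and the values are just descriptive labels.

```python
from typing import Dict, Optional


def pick_zone(qname: str, zones: Dict[str, str]) -> Optional[str]:
    """Return the most specific configured zone that contains qname."""
    labels = qname.rstrip(".").lower().split(".")
    # Try the longest suffix first: a.b.c -> "a.b.c", then "b.c", then "c".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in zones:
            return candidate
    # Fall back to the catch-all root zone if one is configured.
    return "." if "." in zones else None


zones = {
    "cluster.local": "kubernetes plugin",
    "ai-company.internal": "forward to 192.168.1.50",
    ".": "forward to the node's upstream resolvers",
}
print(pick_zone("database.ai-company.internal", zones))  # ai-company.internal
print(pick_zone("example.com", zones))                   # .
```

Matching on whole labels rather than raw string suffixes means a name such as notai-company.internal is not mistaken for a member of the ai-company.internal zone.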


6. Practical Example: Debugging DNS Failures

DNS is notoriously difficult to debug. If your app is saying "Could not resolve host," follow this checklist:

# 1. Start a temporary network-tools pod (this opens a shell inside the cluster)
kubectl run test-dns --image=infoblox/dnstools -it --rm --restart=Never

# 2. Inside the pod: check the resolv.conf
cat /etc/resolv.conf

# 3. Inside the pod: query CoreDNS using the full service name
nslookup ai-api.production.svc.cluster.local

# 4. From your workstation: check the CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns

The most common cause is a missing namespace. If your pod is in dev and the service is in production, the short name ai-api will not resolve from that pod; you must include the namespace in your call (ai-api.production or the full FQDN).
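The same check can be made from application code. The sketch below uses Python's standard socket.getaddrinfo, which goes through the OS resolver and therefore honors the Pod's resolv.conf; the helper name resolve and the script name in the usage note are hypothetical.

```python
import socket
import sys
from typing import List


def resolve(hostname: str) -> List[str]:
    """Return the sorted, de-duplicated IPs the OS resolver gives for hostname."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        # The same class of failure an app reports as "Could not resolve host".
        print(f"FAILED  {hostname}: {exc}")
        return []
    ips = sorted({info[4][0] for info in infos})
    print(f"OK      {hostname}: {', '.join(ips)}")
    return ips


if __name__ == "__main__":
    # Pass the names to test on the command line.
    for host in sys.argv[1:]:
        resolve(host)
```

Saved as, say, check_dns.py and run inside a pod with both ai-api and ai-api.production.svc.cluster.local as arguments, it gives a quick way to compare the short name against the full name from that pod's namespace.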


7. AI Implementation: High-Throughput DNS Tuning

In an AI cluster with thousands of pods (e.g., for batch processing thousands of documents through an LLM), DNS can become a bottleneck. If every pod makes 100 DNS queries per second, the central CoreDNS pods can become saturated, and lookups start to time out or fail.

The "NodeLocal DNSCache" Pattern:

For high-scale AI, we use NodeLocal DNSCache.

  • Instead of all pods calling the central CoreDNS service, we run a tiny DNS cache on Every Worker Node.
  • The Pod calls its local node cache.
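  • The node cache only calls CoreDNS if it doesn't already know the answer.

Result: Lookups for cached names are answered on the node itself, so DNS latency can drop from around 10ms to under 1ms, and you greatly reduce the risk of "DNS Storms" that overwhelm CoreDNS and destabilize your cluster.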
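The caching idea can be illustrated with a toy TTL cache placed in front of an upstream resolver. This is a conceptual model only, not how NodeLocal DNSCache is implemented; the NodeCache class, the fake_coredns function, the 30-second TTL, and the IP address are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple


class NodeCache:
    """Toy TTL cache that sits between pods and an upstream resolver."""

    def __init__(
        self,
        upstream: Callable[[str], List[str]],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.upstream = upstream
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries: Dict[str, Tuple[float, List[str]]] = {}
        self.upstream_calls = 0

    def resolve(self, name: str) -> List[str]:
        now = self.clock()
        cached = self.entries.get(name)
        if cached is not None and cached[0] > now:
            return cached[1]  # Hit: answered locally, no upstream traffic.
        self.upstream_calls += 1  # Miss or expired entry: ask upstream.
        ips = self.upstream(name)
        self.entries[name] = (now + self.ttl, ips)
        return ips


def fake_coredns(name: str) -> List[str]:
    """Stand-in for the central CoreDNS service."""
    return ["10.96.12.34"]


cache = NodeCache(fake_coredns)
for _ in range(100):
    cache.resolve("ai-api.production.svc.cluster.local")
print(cache.upstream_calls)  # 1
```

In this model, one hundred lookups of the same name produce a single upstream query, which is the effect that takes load off the central CoreDNS pods.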

8. Summary and Key Takeaways

  • FQDN: Understand the name.namespace.svc.cluster.local pattern.
  • CoreDNS: The flexible, API-driven engine that powers discovery.
  • resolv.conf: The file that connects your application to the DNS service.
  • Namespaces: Short DNS names resolve relative to the caller's namespace; include the namespace to reach a Service elsewhere.
  • Optimization: Use NodeLocal DNSCache for massive AI workloads.

Congratulations!

You have completed Module 5: Networking in Kubernetes. You have mastered the "Vascular System" of your cluster. You can now build complex, secure, and fast communication paths for your entire microservice empire.

Next Stop: In Module 6: Storage and Volumes, we will go deeper into stateful data, dynamic provisioning, and disaster recovery for your databases and AI models.

