
Cluster networking basics
Decode the nervous system of Kubernetes. Learn how Pods talk to each other, how Services find Pods, and the role of the CNI in a distributed network.
Cluster Networking Basics: The Nervous System of Kubernetes
Networking is often considered the most complex part of Kubernetes. In a traditional server environment, you have a physical machine with a static IP address. In Kubernetes, you have thousands of Pods spread across dozens of Nodes, with IP addresses that change constantly as Pods are created and destroyed.
How can a React frontend reliably find a FastAPI backend in this chaos? How do pods on two different servers talk to each other without knowing the physical server's IP?
To answer these questions, we must understand the Kubernetes Networking Model and the Container Network Interface (CNI). In this lesson, we will decode the "Nervous System" of the cluster. We will move away from the "Magic" and understand the actual routing tables and virtual bridges that make communication possible.
1. The Four Ground Rules of K8s Networking
Kubernetes has a very strict philosophy regarding how networking should work. These are the "Rules of the House":
- Direct Pod-to-Pod Communication: Every Pod in the cluster has its own IP address. Any Pod can talk to any other Pod without using Network Address Translation (NAT).
- Node-to-Pod Communication: Every Node can talk to every Pod on any other Node without NAT.
- Consistent IP Identity: The IP address a Pod sees for itself (its internal IP) is the same IP that other Pods see for it. There is no "inner vs. outer" IP confusion.
- Isolation via Policy: By default, everything can talk to everything. If you want to block traffic, you must explicitly apply a NetworkPolicy (which we will cover in Module 5); a minimal sketch follows below.
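To make the fourth rule concrete, here is a minimal sketch of a NetworkPolicy, assuming hypothetical labels app: ai-api on the backend Pods and app: frontend on the Pods allowed to reach them. Once this policy selects the backend Pods, any ingress traffic that does not match the rule is dropped.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-only
spec:
  podSelector:
    matchLabels:
      app: ai-api            # The Pods being protected (hypothetical label)
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend  # Only Pods with this label may connect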
2. Pod-to-Pod Communication: The Overlay Network
Imagine you have two nodes: Node A and Node B. Pod 1 is on Node A, and Pod 2 is on Node B.
How does a packet from Pod 1 reach Pod 2?
The Virtual Bridge
Each node has a virtual bridge (often named cbr0 or cni0, depending on the CNI). When a Pod is created, it is given a virtual Ethernet (veth) pair: one end lives inside the Pod's network namespace, and the other end is attached to the host's bridge. This is what gives the Pod connectivity.
The CNI (Container Network Interface)
Since Kubernetes supports many different cloud providers and data centers, it doesn't handle the actual networking logic itself. Instead, it uses a CNI Plugin.
- AWS VPC CNI: (The standard for AWS EKS) It gives every Pod a "real," routable IP address from your AWS VPC. This is fast because it uses AWS's native VPC networking with no extra encapsulation.
- Calico / Flannel / Cilium: These typically run as "overlay" networks, wrapping your Pod's traffic in a VXLAN (or similar) tunnel to move it between servers. Calico can also skip the tunnel and route Pod IPs natively by advertising them over BGP.
Visualizing the Packet Journey
sequenceDiagram
    participant P1 as Pod 1 (Node A)
    participant B1 as Bridge (Node A)
    participant CNI as CNI / Overlay
    participant B2 as Bridge (Node B)
    participant P2 as Pod 2 (Node B)
    P1->>B1: Send request to 10.244.1.2
    B1->>CNI: "I don't know where this IP is!"
    CNI->>CNI: Encapsulate packet (Tunnel to Node B)
    CNI->>B2: Decapsulate packet
    B2->>P2: Deliver to target Pod
3. The Service: Providing a Name to the Cluster
We established that Pod IPs cannot be relied on directly, because they change every time a Pod is recreated. To solve this, Kubernetes uses the Service object.
The Virtual IP (VIP)
A Service is essentially an internal load balancer with a stable virtual IP (the ClusterIP). When you create a Service, it is given a name (e.g., ai-backend-service). Kubernetes handles the internal DNS so that your app can just call http://ai-backend-service (or the fully qualified form, ai-backend-service.<namespace>.svc.cluster.local), and the name keeps working no matter how often the Pods behind it are replaced.
Types of Services:
- ClusterIP (Default): Exposes the service on a cluster-internal IP. Reachable only from inside the cluster. Perfect for your FastAPI backend that only your frontend should talk to.
- NodePort: Exposes the service on each Node's IP at a static port (in the 30000-32767 range by default). Reachable from outside the cluster, but generally not recommended for production.
- LoadBalancer: Exposes the service externally using a cloud provider's load balancer (e.g., an AWS ELB). This gives you a public DNS name (see the sketch below).
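To illustrate, here is a minimal sketch of a LoadBalancer Service; the name ai-api-public and the port numbers are assumptions for this example. On AWS EKS, applying it would provision an external load balancer in front of the selected Pods.
apiVersion: v1
kind: Service
metadata:
  name: ai-api-public        # Hypothetical name for this example
spec:
  type: LoadBalancer         # Cloud provider provisions an external load balancer
  selector:
    app: ai-api              # Assumed label on the backend Pods
  ports:
    - protocol: TCP
      port: 80               # Port exposed by the load balancer
      targetPort: 8000       # Port the container listens on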
4. DNS Resolution: CoreDNS
Kubernetes runs a specialized service called CoreDNS. It is the "Phonebook" of the cluster.
When your Python code runs requests.get("http://my-service"), the following happens:
- The Pod's /etc/resolv.conf points to the IP of the CoreDNS Service.
- CoreDNS looks up "my-service" in its records (which are synchronized with the API Server).
- It returns the ClusterIP of the Service.
- kube-proxy (Lesson 3) has already programmed rules into the Linux kernel (iptables or IPVS) that intercept traffic to that ClusterIP and redirect it to a healthy Pod. You can watch the DNS step yourself with the debug Pod sketched below.
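A minimal sketch of such a debug Pod, assuming a Service named my-service exists in the same namespace:
apiVersion: v1
kind: Pod
metadata:
  name: dns-test
spec:
  restartPolicy: Never
  containers:
    - name: lookup
      image: busybox:1.36
      # Ask the cluster DNS (from /etc/resolv.conf) for the Service name, then exit
      command: ["nslookup", "my-service"]
Run kubectl logs dns-test after it completes; the output should show the Service's ClusterIP as answered by CoreDNS.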
5. Practical Example: Cross-Service Communication in Python
Let's look at how a Next.js frontend talks to a FastAPI backend using K8s DNS.
The Frontend Call (Next.js)
Because this call runs server-side inside the cluster (for example, in a Next.js API route or server component), it can use the Service name directly; a user's browser outside the cluster could not resolve that name.
// Server-side API call (e.g., in a Next.js route handler)
const response = await fetch('http://ai-api-service/summarize', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ text: 'Kubernetes is complex but cool.' }),
});
The Kubernetes Service YAML
apiVersion: v1
kind: Service
metadata:
  name: ai-api-service
spec:
  selector:
    app: ai-api          # Match labels on our FastAPI pods
  ports:
    - protocol: TCP
      port: 80           # The port Next.js calls
      targetPort: 8000   # The port FastAPI is listening on
  type: ClusterIP
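For completeness, here is a minimal sketch of a Deployment whose Pod template carries the app: ai-api label that the Service selector above matches. The image name is a placeholder assumption.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ai-api                                    # Must match the Service selector
  template:
    metadata:
      labels:
        app: ai-api
    spec:
      containers:
        - name: api
          image: registry.example.com/ai-api:latest  # Placeholder image
          ports:
            - containerPort: 8000                    # FastAPI listens here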
6. The Future of K8s Networking: Service Mesh
As your application grows to hundreds of services (e.g., a "Customer Service," a "Billing Service," a "Recommendation Service"), managing standard K8s services becomes difficult. You might want:
- Mutual TLS: Encrypting all traffic between pods automatically.
- Retries/Timeouts: If a call to the "AI model" fails, try again 3 times.
- Canary Deployment: Send 10% of users to Version 2 and 90% to Version 1.
This is where a Service Mesh (like Istio or Linkerd) comes in. It sits on top of the Kubernetes networking layer to provide these "Layer 7" (Application Level) features. We will touch on this in Module 12.
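As a taste of what a mesh adds, here is a minimal sketch of the canary split described above, expressed as an Istio VirtualService. It assumes Istio is installed and that a DestinationRule already defines the v1 and v2 subsets for ai-api-service.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: ai-api-canary
spec:
  hosts:
    - ai-api-service          # The Kubernetes Service whose traffic is being split
  http:
    - route:
        - destination:
            host: ai-api-service
            subset: v1        # Defined in an assumed DestinationRule
          weight: 90          # 90% of users stay on Version 1
        - destination:
            host: ai-api-service
            subset: v2
          weight: 10          # 10% of users try Version 2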
7. AI Implementation: Low-Latency CNI Optimization
For high-performance AI tasks—like streaming a real-time response from AWS Bedrock via a WebSocket—networking latency is your enemy.
Why AWS VPC CNI is Better for AI
If you use a standard "Overlay" network, your packets are wrapped in a tunnel. This adds a small amount of latency and CPU overhead to every hop. With the AWS VPC CNI, each Pod receives a routable IP address allocated from the Elastic Network Interfaces (ENIs) attached to the node, so no tunnelling is needed.
- Benefit: Your AI agent talks to AWS Bedrock as if it were running directly on the host network. No encapsulation, no tunnels, just raw VPC performance.
8. Summary and Key Takeaways
- IP-per-Pod: Every pod is a first-class citizen on the network.
- No NAT: Direct communication is the standard.
- CNI Plugin: The underlying engine (e.g., AWS VPC CNI, Calico).
- Services: The stable, load-balanced gateways that solve the "Mortal Pod IP" problem.
- CoreDNS: The internal phonebook providing human-readable names to applications.
In the final lesson of this module, we will explore how we combine all these resources into logical groups using Namespaces and Resource Isolation.