
Services: ClusterIP, NodePort, and LoadBalancer
Master the gateway to your applications. Understand how Kubernetes provides stable endpoints, handles load balancing, and integrates with cloud-native traffic managers.
Services: The Gateways to Your Resilience
In a world before Kubernetes, if you wanted to connect two servers, you’d give one a static IP address and hardcode it into the other. This worked fine when each application lived on a fixed physical machine, but it is a recipe for disaster in a cloud-native environment. In Kubernetes, Pods are mortal. They are born, they die, and they are reborn with different IP addresses on different nodes.
If you have a Next.js frontend trying to talk to an AI inference service, you cannot use IP addresses. You need a permanent, stable "Gateway" that never changes, even if the pods behind it are constantly being swapped out. This is the Service.
In this lesson, we will master the Service Object. We will look at how it uses Selectors to find Pods, how to choose between ClusterIP, NodePort, and LoadBalancer, and how to integrate your cluster with high-performance traffic managers like AWS Elastic Load Balancing (ELB).
1. Why Do We Need Services? (The "Mortal Pod" Problem)
Imagine you are running a FastAPI backend.
- K8s creates Pod-A with IP 10.244.1.10.
- Your frontend starts sending requests to that IP.
- Pod-A crashes. The Deployment recreates it as Pod-B. Pod-B has IP 10.244.1.15.
- Failure: Your frontend is still sending traffic to .10 and getting "Connection Refused."
A Service solves this by providing a Stable DNS Name and a Stable Virtual IP.
2. How Services Work: Labels and Selectors
The magic of the Service lies in its Selector. It doesn't "own" pods; it just "looks" for them based on their labels.
apiVersion: v1
kind: Service
metadata:
  name: ai-api-service
spec:
  selector:
    app: ai-engine # Find any pod with "app: ai-engine"
  ports:
    - protocol: TCP
      port: 80 # The port the SERVICE exposes
      targetPort: 8000 # The port the POD is listening on
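The selector match described above is essentially a subset test: every key/value pair in the selector must appear in the pod's labels, and extra labels on the pod are ignored. A minimal Python sketch of that rule follows. The function name `selector_matches` and the pod names are hypothetical and not part of any Kubernetes client library, and the special case of a Service with no selector is not modeled.

```python
def selector_matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """Return True when every selector pair is present in the pod's labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical pods and their labels, for illustration only.
pods = {
    "ai-engine-7d9f": {"app": "ai-engine", "tier": "backend"},
    "frontend-5c2a": {"app": "frontend"},
}

selector = {"app": "ai-engine"}
matched = [name for name, labels in pods.items() if selector_matches(selector, labels)]
print(matched)  # ['ai-engine-7d9f']
```

Because the Service only looks at labels, relabeling a pod is enough to add it to or remove it from the Service without restarting anything.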
The Endpoints Object
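Behind the scenes, Kubernetes maintains an EndpointSlice object for the Service (older clusters and tools use the legacy Endpoints object). It lists the IPs of all pods that match the selector and are currently in a "Ready" state. It is not truly hidden: you can inspect it with kubectl get endpointslices.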
- If a pod fails its Readiness Probe (Lesson 1), it is removed from the Endpoint list.
- The Service automatically stops sending traffic to it. This is how K8s handles "Self-Healing" at the network level.
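To make the filtering concrete, here is a small simulation of how the endpoint list could be derived from pod state. It is a toy model, not the actual controller logic: the `Pod` class, the `ready_endpoints` function, and the third pod's IP are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    ip: str
    labels: dict
    ready: bool  # outcome of the readiness probe


def ready_endpoints(pods, selector, target_port):
    """List "ip:port" for pods that match the selector and are Ready."""
    return [
        f"{pod.ip}:{target_port}"
        for pod in pods
        if pod.ready and all(pod.labels.get(k) == v for k, v in selector.items())
    ]


# Illustrative pod data; pod-b is currently failing its readiness probe.
pods = [
    Pod("pod-a", "10.244.1.10", {"app": "ai-engine"}, ready=True),
    Pod("pod-b", "10.244.1.15", {"app": "ai-engine"}, ready=False),
    Pod("pod-c", "10.244.2.7", {"app": "frontend"}, ready=True),
]
print(ready_endpoints(pods, {"app": "ai-engine"}, 8000))  # ['10.244.1.10:8000']
```

Once pod-b passes its probe again, it reappears in the list on the next evaluation, which mirrors the behaviour described in the bullets above.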
3. The Three Types of Services
Choosing the right service type is critical for both security and functionality.
A. ClusterIP (The Internal Secure Gateway)
This is the default. It provides a stable IP address that is only reachable from inside the cluster.
- Use Case: Your database, your internal Redis cache, or your AI backend that should only be accessed by your frontend app.
- Benefit: Security. No one from the outside world can even "see" this service.
B. NodePort (The "Front Porch" Access)
Exposes the service on the same static port on every Node's IP address.
- Range: 30000 - 32767 by default.
- Use Case: Good for testing, or if you are running K8s without a cloud-native load balancer.
- Caveat: It is not very secure, because it opens a port on every node, and it is difficult to manage at scale.
C. LoadBalancer (The Modern Cloud Standard)
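When you create a Service of type LoadBalancer on a cloud like AWS, Kubernetes calls the cloud provider's API to provision a "Real" load balancer (on AWS, a Classic Load Balancer or an NLB, depending on the controller and annotations in use; ALBs are normally provisioned through an Ingress instead).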
- Use Case: Your public-facing Next.js website.
- Benefit: You get a professional DNS name (e.g., a72...us-east-1.elb.amazonaws.com) that handles thousands of concurrent users.
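The three types differ mainly in the `type` field and in whether a `nodePort` is allowed. The following sketch builds a Service manifest as a plain Python dict and checks those rules before emitting JSON. The helper name `build_service` and the port 30080 are assumptions for illustration, and only the three types covered in this lesson are handled.

```python
import json

# Default NodePort range; a cluster administrator can configure a different one.
NODE_PORT_RANGE = range(30000, 32768)
# Only the three types discussed in this lesson.
SERVICE_TYPES = {"ClusterIP", "NodePort", "LoadBalancer"}


def build_service(name, selector, port, target_port,
                  service_type="ClusterIP", node_port=None):
    """Build a single-port Service manifest as a plain dict."""
    if service_type not in SERVICE_TYPES:
        raise ValueError(f"unsupported service type: {service_type}")
    port_spec = {"protocol": "TCP", "port": port, "targetPort": target_port}
    if node_port is not None:
        if service_type == "ClusterIP":
            raise ValueError("nodePort is not valid for ClusterIP services")
        if node_port not in NODE_PORT_RANGE:
            raise ValueError(f"nodePort {node_port} is outside 30000-32767")
        port_spec["nodePort"] = node_port
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"type": service_type, "selector": selector, "ports": [port_spec]},
    }


manifest = build_service("ai-api-service", {"app": "ai-engine"}, 80, 8000,
                         service_type="NodePort", node_port=30080)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl apply -f, since kubectl accepts JSON manifests as well as YAML.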
4. Visualizing the Traffic Entry
graph TD
    User["Internet User"] --> ELB["AWS Load Balancer (Type: LoadBalancer)"]
    subgraph "Kubernetes Cluster"
        Svc["K8s Service (IP: 10.0.0.5)"]
        Svc --> Pod1["Pod A (Healthy)"]
        Svc --> Pod2["Pod B (Healthy)"]
        Svc --> PodX["Pod C (UNHEALTHY - Removed)"]
    end
    ELB --> Svc
    style PodX fill:#eee,stroke:#999,stroke-dasharray: 5 5
    style Svc fill:#f96,stroke:#333
5. DNS and Service Discovery
Kubernetes clusters typically run a DNS add-on called CoreDNS. It maps Service names to their ClusterIPs.
When your Python code makes a call:
import requests

# You don't use IPs! You use the service name.
response = requests.get("http://ai-api-service/v1/generate", timeout=10)
response.raise_for_status()  # surface HTTP errors instead of silently ignoring them
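The name ai-api-service is not an IP, so the pod's DNS resolver looks it up. Because the name is unqualified, the resolver searches the pod's own namespace first, effectively querying ai-api-service.<namespace>.svc.cluster.local (cluster.local is the default cluster domain). CoreDNS answers with the Service's ClusterIP, 10.0.0.5 in this example, and the packet is routed there. To reach a Service in a different namespace, use ai-api-service.<other-namespace>. This is "Zero-Configuration" service discovery.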
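The naming scheme can be sketched in a few lines of Python. The helper names `service_fqdn` and `resolve_cluster_ip` and the namespace `ml` are hypothetical, and the sketch assumes the default cluster domain `cluster.local`. The resolver function only returns an address when run from inside a pod.

```python
import socket


def service_fqdn(name, namespace="default", cluster_domain="cluster.local"):
    """Build the fully qualified DNS name of a Service."""
    return f"{name}.{namespace}.svc.{cluster_domain}"


def resolve_cluster_ip(hostname):
    """Resolve a Service name to an IP; returns None if the lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None  # e.g. not running inside the cluster, or no such Service


print(service_fqdn("ai-api-service"))                  # ai-api-service.default.svc.cluster.local
print(service_fqdn("ai-api-service", namespace="ml"))  # ai-api-service.ml.svc.cluster.local
```

Using the fully qualified name is slightly more verbose, but it behaves the same regardless of which namespace the calling pod runs in.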
6. Practical Example: Multi-Port Services
Sometimes an app needs multiple ports. For example, your AI container has one port for the FastAPI REST API (8000) and another for Prometheus metrics (9090).
apiVersion: v1
kind: Service
metadata:
  name: dual-port-service
spec:
  selector:
    app: ai-engine
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 8000
    - name: metrics
      protocol: TCP
      port: 9090
      targetPort: 9090
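Note that both ports in the manifest above carry a name: when a Service exposes more than one port, Kubernetes requires every port to be named so that they are unambiguous. The sketch below checks that rule on a list of port entries and looks up a port by name. The helpers `validate_ports` and `port_for` are illustrative, and the `/metrics` path is the usual Prometheus convention rather than something defined by the Service.

```python
def validate_ports(ports):
    """Check that a multi-port Service names every port, with unique names."""
    if len(ports) > 1:
        names = [p.get("name") for p in ports]
        if any(not n for n in names):
            raise ValueError("every port must be named when there are several")
        if len(set(names)) != len(names):
            raise ValueError("port names must be unique")
    return True


def port_for(ports, name):
    """Return the Service port number exposed under the given name."""
    for p in ports:
        if p.get("name") == name:
            return p["port"]
    raise KeyError(name)


ports = [
    {"name": "http", "protocol": "TCP", "port": 80, "targetPort": 8000},
    {"name": "metrics", "protocol": "TCP", "port": 9090, "targetPort": 9090},
]
validate_ports(ports)
print(f"http://dual-port-service:{port_for(ports, 'metrics')}/metrics")
# http://dual-port-service:9090/metrics
```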
7. AI Implementation: Session Affinity for Agents
In AI Applications, you might be using WebSockets or long-polling to stream an AI response from AWS Bedrock.
If the client is connected to Pod A, you want them to stay with Pod A for the duration of that conversation. If a reconnect or the next long-poll request lands on Pod B mid-conversation, any state held in Pod A's memory is lost and the conversation might break.
The Solution: Session Affinity
You can tell the Service to remember the client's IP and keep sending them to the same pod.
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800 # Keep them on the same pod for 3 hours
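The effect of this setting can be illustrated with a toy model in Python. This is a simplified sketch, not how kube-proxy is implemented: the class `ClientIPAffinity` is hypothetical, it falls back to round-robin for new clients, and it assumes the timer is refreshed on every request. The client IPs are documentation-range addresses.

```python
import itertools
import time


class ClientIPAffinity:
    """Toy model of sessionAffinity: ClientIP with a timeout."""

    def __init__(self, backends, timeout_seconds=10800, clock=time.monotonic):
        self._next_backend = itertools.cycle(backends)  # round-robin fallback
        self._timeout = timeout_seconds
        self._clock = clock
        self._sessions = {}  # client IP -> (backend, last_seen)

    def pick(self, client_ip):
        now = self._clock()
        entry = self._sessions.get(client_ip)
        if entry and now - entry[1] < self._timeout:
            backend = entry[0]  # sticky: reuse the remembered pod
        else:
            backend = next(self._next_backend)  # new or expired session
        self._sessions[client_ip] = (backend, now)
        return backend


lb = ClientIPAffinity(["pod-a", "pod-b"])
print([lb.pick("203.0.113.7") for _ in range(3)])  # ['pod-a', 'pod-a', 'pod-a']
print(lb.pick("198.51.100.9"))                     # pod-b
```

One consequence worth keeping in mind: affinity is keyed on the client IP, so many users behind the same NAT or proxy will all be pinned to the same pod.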
8. Summary and Key Takeaways
- Abstraction: Services hide the complexity of mortal pod IPs from the developer.
- Selectors: Use labels to find healthy pods dynamically.
- Service Types:
  - ClusterIP: Internal only (secure).
  - NodePort: Direct node access (testing).
  - LoadBalancer: External cloud integration (production).
- DNS: Use service names, not IPs, for all inter-app communication.
- Stickiness: Use sessionAffinity for stateful AI conversations.
In the next lesson, we will see how we handle the "Variables" and "Secrets" of these applications using ConfigMaps and Secrets.