Services: ClusterIP, NodePort, and LoadBalancer

Master the gateway to your applications. Understand how Kubernetes provides stable endpoints, handles load balancing, and integrates with cloud-native traffic managers.

Services: The Gateways to Your Resilience

In a world before Kubernetes, if you wanted to connect two servers, you’d give one a static IP address and hardcode it into the other. This worked fine for a single physical machine, but it is a recipe for disaster in a cloud-native environment. In Kubernetes, Pods are mortal. They are born, they die, and they are reborn with different IP addresses on different servers.

If you have a Next.js frontend trying to talk to an AI inference service, you cannot rely on Pod IP addresses. You need a permanent, stable "Gateway" that never changes, even if the pods behind it are constantly being swapped out. This is the Service.

In this lesson, we will master the Service Object. We will look at how it uses Selectors to find Pods, how to choose between ClusterIP, NodePort, and LoadBalancer, and how to integrate your cluster with high-performance traffic managers like AWS Elastic Load Balancing (ELB).


1. Why Do We Need Services? (The "Mortal Pod" Problem)

Imagine you are running a FastAPI backend.

  1. K8s creates Pod-A with IP 10.244.1.10.
  2. Your frontend starts sending requests to that IP.
  3. Pod-A crashes. The Deployment recreates it as Pod-B.
  4. Pod-B has IP 10.244.1.15.
  5. Failure: Your frontend is still sending traffic to .10, so its requests time out or are refused.

A Service solves this by providing a Stable DNS Name and a Stable Virtual IP.
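The failure sequence above can be reproduced in a small simulation. This is only a sketch: the `Cluster` class, its methods, and the service name `backend` are hypothetical stand-ins for the control plane, not Kubernetes APIs.

```python
class Cluster:
    """Toy model: tracks which pod IPs are alive and which IP a name points to."""

    def __init__(self):
        self.live_ips = set()
        self.service_records = {}  # service name -> current pod IP

    def start_pod(self, service_name, ip):
        self.live_ips.add(ip)
        self.service_records[service_name] = ip  # the stable name follows the pod

    def kill_pod(self, ip):
        self.live_ips.discard(ip)

    def connect_by_ip(self, ip):
        if ip not in self.live_ips:
            raise ConnectionError(f"no pod is listening on {ip}")
        return f"connected to {ip}"

    def connect_by_name(self, service_name):
        return self.connect_by_ip(self.service_records[service_name])


def demo():
    cluster = Cluster()
    cluster.start_pod("backend", "10.244.1.10")  # Pod-A
    hardcoded_ip = "10.244.1.10"                 # what the frontend remembered

    cluster.kill_pod("10.244.1.10")              # Pod-A crashes
    cluster.start_pod("backend", "10.244.1.15")  # Pod-B replaces it

    try:
        by_ip = cluster.connect_by_ip(hardcoded_ip)
    except ConnectionError as exc:
        by_ip = f"failed: {exc}"
    by_name = cluster.connect_by_name("backend")
    return by_ip, by_name


if __name__ == "__main__":
    print(demo())
```

The hardcoded IP fails after the pod is replaced, while the lookup by name keeps working because the record is updated whenever a pod starts.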


2. How Services Work: Labels and Selectors

The magic of the Service lies in its Selector. It doesn't "own" pods; it just "looks" for them based on their labels.

apiVersion: v1
kind: Service
metadata:
  name: ai-api-service
spec:
  selector:
    app: ai-engine # Find any pod with "app: ai-engine"
  ports:
    - protocol: TCP
      port: 80 # The port the SERVICE exposes
      targetPort: 8000 # The port the POD is listening on
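To make the selector rule concrete, here is a minimal Python sketch of equality-based matching: a pod is selected when every key/value pair in the selector appears in its labels. The pod names and the extra `tier` label are made up for illustration, and the sketch assumes a non-empty selector.

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """True when every key/value pair in the selector is present on the pod."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Hypothetical pods; extra labels on a pod do not prevent a match.
pods = [
    {"name": "ai-engine-1", "labels": {"app": "ai-engine", "tier": "backend"}},
    {"name": "ai-engine-2", "labels": {"app": "ai-engine"}},
    {"name": "web-1", "labels": {"app": "frontend"}},
]

selector = {"app": "ai-engine"}
selected = [pod["name"] for pod in pods if selector_matches(selector, pod["labels"])]
print(selected)
```

Because the Service only matches on labels, any pod that carries `app: ai-engine` is picked up, regardless of which Deployment created it.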

The Endpoints Object

Behind the scenes, Kubernetes creates EndpointSlice objects for the Service (and a legacy Endpoints object for backward compatibility). They record the IPs of the pods that match the selector, along with whether each one is currently in a "Ready" state.

  • If a pod fails its Readiness Probe (Lesson 1), it is marked not ready and dropped from the set of routable endpoints.
  • The Service automatically stops sending traffic to it. This is how K8s handles "Self-Healing" at the network level.
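The readiness behavior described above can be sketched in a few lines of Python. The `Pod` dataclass and `routable_ips` function are hypothetical names for illustration; the real bookkeeping is done by the Kubernetes control plane.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    ip: str
    ready: bool  # result of the pod's most recent readiness probe


def routable_ips(pods):
    """Return the IPs that should receive traffic: Ready pods only."""
    return [pod.ip for pod in pods if pod.ready]


pods = [
    Pod("pod-a", "10.244.1.10", ready=True),
    Pod("pod-b", "10.244.1.15", ready=True),
]
before = routable_ips(pods)

pods[1].ready = False  # pod-b starts failing its readiness probe
after = routable_ips(pods)

print(before, after)
```

The pod itself is not deleted when it becomes unready; it simply stops receiving traffic until its probe passes again.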

3. The Three Types of Services

Choosing the right service type is critical for both security and functionality.

A. ClusterIP (The Internal Secure Gateway)

This is the default. It provides a stable IP address that is only reachable from inside the cluster.

  • Use Case: Your database, your internal Redis cache, or your AI backend that should only be accessed by your frontend app.
  • Benefit: Security. No one from the outside world can even "see" this service.

B. NodePort (The "Front Porch" Access)

Exposes the service on a static port on the IP address of every Node.

  • Range: 30000 - 32767 by default.
  • Use Case: Good for testing, or if you are running K8s without a cloud-native load balancer.
  • Caveat: It opens the same port on every node, which widens the attack surface and is difficult to manage at scale.

C. LoadBalancer (The Modern Cloud Standard)

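When you create a service of type LoadBalancer on a cloud like AWS, the cluster's cloud controller asks the cloud provider to provision a "Real" load balancer (e.g., a Classic Load Balancer or an NLB). An ALB is normally provisioned through an Ingress resource rather than a Service.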

  • Use Case: Your public-facing Next.js website.
  • Benefit: You get a professional DNS name (e.g., a72...us-east-1.elb.amazonaws.com) that handles thousands of concurrent users.
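The three types differ only in a couple of fields of the manifest. The sketch below builds a Service manifest as a Python dict and rejects a nodePort outside the default range. The `build_service` helper is a hypothetical name, not part of any Kubernetes library, and it assumes the default 30000 - 32767 range has not been changed by a cluster administrator.

```python
import json

NODE_PORT_RANGE = range(30000, 32768)  # default range, upper bound inclusive
SERVICE_TYPES = {"ClusterIP", "NodePort", "LoadBalancer"}


def build_service(name, selector, port, target_port,
                  service_type="ClusterIP", node_port=None):
    """Return a Service manifest as a dict for one of the three types."""
    if service_type not in SERVICE_TYPES:
        raise ValueError(f"unsupported service type: {service_type}")

    port_entry = {"protocol": "TCP", "port": port, "targetPort": target_port}
    if node_port is not None:
        if service_type == "ClusterIP":
            raise ValueError("nodePort does not apply to ClusterIP services")
        if node_port not in NODE_PORT_RANGE:
            raise ValueError(f"nodePort {node_port} is outside 30000-32767")
        port_entry["nodePort"] = node_port

    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"type": service_type, "selector": selector, "ports": [port_entry]},
    }


if __name__ == "__main__":
    manifest = build_service("ai-api-service", {"app": "ai-engine"}, 80, 8000,
                             service_type="NodePort", node_port=30080)
    print(json.dumps(manifest, indent=2))
```

The printed JSON has the same structure as the YAML manifests in this lesson, so the only change between an internal and an external Service is the `type` field (plus an optional `nodePort`).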

4. Visualizing the Traffic Entry

graph TD
    User["Internet User"] --> ELB["AWS Load Balancer (Type: LoadBalancer)"]
    
    subgraph "Kubernetes Cluster"
        ELB --> Svc["K8s Service (IP: 10.0.0.5)"]
        
        Svc --> Pod1["Pod A (Healthy)"]
        Svc --> Pod2["Pod B (Healthy)"]
        Svc --> PodX["Pod C (UNHEALTHY - Removed)"]
    end
    
    style PodX fill:#eee,stroke:#999,stroke-dasharray: 5 5
    style Svc fill:#f96,stroke:#333

5. DNS and Service Discovery

Most Kubernetes clusters run CoreDNS as the cluster DNS server. It maps service names to their ClusterIPs.

When your Python code makes a call:

import requests
# You don't use IPs! You use the service name.
response = requests.get("http://ai-api-service/v1/generate")

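The cluster's DNS resolves ai-api-service to the Service in the caller's own namespace and returns its ClusterIP (10.0.0.5 in this example), and the packet is routed from there. To reach a Service in a different namespace, use the longer form <service>.<namespace>.svc.cluster.local (assuming the default cluster.local domain). This is "Zero-Configuration" service discovery.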
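As a rough illustration, the helper below lists the DNS names that point at the same Service, from the short form to the fully qualified one. The function name is hypothetical, and both the `default` namespace and the `cluster.local` domain are assumptions about the cluster.

```python
def service_dns_names(service, namespace, cluster_domain="cluster.local"):
    """Names for one Service, from shortest to fully qualified."""
    return [
        service,                                        # same namespace only
        f"{service}.{namespace}",                       # works across namespaces
        f"{service}.{namespace}.svc",
        f"{service}.{namespace}.svc.{cluster_domain}",  # fully qualified
    ]


if __name__ == "__main__":
    for name in service_dns_names("ai-api-service", "default"):
        print(f"http://{name}/v1/generate")
```

The short name is convenient inside one namespace; the fully qualified name is the unambiguous choice when the caller and the Service live in different namespaces.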


6. Practical Example: Multi-Port Services

Sometimes an app needs multiple ports. For example, your AI container has one port for the FastAPI REST API (8000) and another for Prometheus metrics (9090).

apiVersion: v1
kind: Service
metadata:
  name: dual-port-service
spec:
  selector:
    app: ai-engine
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 8000
    - name: metrics
      protocol: TCP
      port: 9090
      targetPort: 9090
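Kubernetes requires every port to be named once a Service defines more than one. The sketch below mirrors the ports of the manifest above as a Python structure and shows how a name maps to the port a client calls and to the port the pod listens on. The helper names and the `/metrics` path are illustrative assumptions.

```python
# The ports section of dual-port-service, mirrored as Python data.
SERVICE_PORTS = [
    {"name": "http", "protocol": "TCP", "port": 80, "targetPort": 8000},
    {"name": "metrics", "protocol": "TCP", "port": 9090, "targetPort": 9090},
]


def find_port(ports, port_name):
    """Return the port entry with the given name, or raise KeyError."""
    for entry in ports:
        if entry["name"] == port_name:
            return entry
    raise KeyError(f"no port named {port_name!r}")


def service_url(service_name, ports, port_name, path="/"):
    """URL a client inside the cluster would call (uses the Service port)."""
    return f"http://{service_name}:{find_port(ports, port_name)['port']}{path}"


def pod_port(ports, port_name):
    """Port the traffic finally arrives on inside the pod (targetPort)."""
    return find_port(ports, port_name)["targetPort"]


if __name__ == "__main__":
    print(service_url("dual-port-service", SERVICE_PORTS, "http", "/v1/generate"))
    print(pod_port(SERVICE_PORTS, "http"))
```

Clients only ever see the Service port (80 or 9090); the translation to the container's port happens inside the cluster.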

7. AI Implementation: Session Affinity for Agents

In AI Applications, you might be using WebSockets or long-polling to stream an AI response from AWS Bedrock.

If the client is connected to Pod A, you want them to stay on Pod A for the duration of that conversation. An open connection stays on one pod, but every new connection (a reconnect or the next long-poll request) can land on Pod B, and any conversation state held in Pod A's memory is lost mid-stream.

The Solution: Session Affinity

You can tell the Service to remember the client's IP and keep sending them to the same pod.

spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800 # Keep them on the same pod for 3 hours
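The idea behind ClientIP affinity can be sketched as a small routing table with a timeout. This is a simplified model, not how kube-proxy is implemented: the `AffinityRouter` class is hypothetical, it picks new backends round-robin for predictability, and it assumes each request from a client refreshes that client's timeout.

```python
import itertools


class AffinityRouter:
    """Sends each client IP to the same backend until its entry expires."""

    def __init__(self, backends, timeout_seconds=10800):
        self._next_backend = itertools.cycle(backends)
        self._timeout = timeout_seconds
        self._sticky = {}  # client IP -> (backend, time of last request)

    def route(self, client_ip, now):
        entry = self._sticky.get(client_ip)
        if entry is not None and now - entry[1] <= self._timeout:
            backend = entry[0]                  # still sticky: reuse the pod
        else:
            backend = next(self._next_backend)  # new or expired: pick a pod
        self._sticky[client_ip] = (backend, now)
        return backend


if __name__ == "__main__":
    router = AffinityRouter(["pod-a", "pod-b", "pod-c"])
    print(router.route("203.0.113.7", now=0))       # first request from client 1
    print(router.route("198.51.100.9", now=1))      # a different client
    print(router.route("203.0.113.7", now=600))     # client 1 returns in time
    print(router.route("203.0.113.7", now=20000))   # client 1 after the timeout
```

Note that affinity is keyed on the source IP only, so many users behind one shared NAT or proxy address would all be sent to the same pod.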

8. Summary and Key Takeaways

  • Abstraction: Services hide the complexity of mortal pod IPs from the developer.
  • Selectors: Use labels to find healthy pods dynamically.
  • Service Types:
    • ClusterIP: Internal only (secure).
    • NodePort: Direct node access (testing).
    • LoadBalancer: External cloud integration (production).
  • DNS: Use service names, not IPs, for all inter-app communication.
  • Stickiness: Use sessionAffinity for stateful AI conversations.

In the next lesson, we will see how we handle the "Variables" and "Secrets" of these applications using ConfigMaps and Secrets.

