Module 10 Lesson 4: Policy and Guardrails

Governance at scale. Implementing global rules that restrict agent behavior regardless of the prompt.

Policy and Guardrails: Global Governance

If one developer builds an agent that can delete databases and another builds one that can't, you have a Security Mismatch: how safe the system is depends on who wrote each agent. In an ADK environment, we move safety logic out of individual agents and into Global Policies.

1. What is a Policy?

A policy is a "Hard Constraint" that is applied to every agent loop automatically. It acts as a wrapper around the LLM's input and output.

Example Policies:

  • PII Filter: Scan all outputs for social security numbers or emails.
  • Budget Policy: If an agent has spent more than $5 today, kill the process.
  • Tone Policy: If the sentiment of an output is "Angry," block it.
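To make the wrapper idea concrete, here is a minimal sketch of the PII Filter and Budget Policy applied around a model call. It is not an ADK API. The names `with_policies`, `redact_pii`, and `BudgetExceeded` are hypothetical, `call_llm` is assumed to return a `(text, cost_in_usd)` pair, and the two regular expressions are illustrative rather than complete PII detectors.

```python
import re
from typing import Callable, Tuple

# Illustrative patterns only; real PII detection needs much broader coverage.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")


class BudgetExceeded(RuntimeError):
    """Raised when the agent has spent more than its limit."""


def redact_pii(text: str) -> str:
    """PII Filter: mask SSN-like and email-like substrings."""
    text = SSN_RE.sub("[REDACTED SSN]", text)
    return EMAIL_RE.sub("[REDACTED EMAIL]", text)


def with_policies(
    call_llm: Callable[[str], Tuple[str, float]],  # hypothetical: returns (text, cost in USD)
    daily_limit_usd: float = 5.0,
) -> Callable[[str], str]:
    """Wrap a model call so both policies run on every invocation."""
    spent = 0.0  # this sketch never resets the counter at midnight

    def guarded(prompt: str) -> str:
        nonlocal spent
        # Budget Policy: stop once spending has gone past the limit.
        if spent > daily_limit_usd:
            raise BudgetExceeded(f"spent ${spent:.2f} of ${daily_limit_usd:.2f}")
        reply, cost = call_llm(prompt)
        spent += cost
        # PII Filter: always applied to the output, whatever the prompt said.
        return redact_pii(reply)

    return guarded
```

Because the checks live in the wrapper, every agent that calls the model through `guarded` gets the same rules without its author having to remember them.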

2. Hard Guardrails vs. Soft Prompts

  • Soft Prompt: "Please don't be mean." The LLM can still be mean if a user pushes it hard enough.
  • Hard Guardrail: A Python script that checks the response against a word list and blocks it if a forbidden word is found. The LLM has no say in the outcome.

In ADK, we use Hard Guardrails.
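A hard guardrail of the word-list kind can be only a few lines long. The sketch below is one possible version, not a prescribed implementation: `FORBIDDEN_WORDS`, `BLOCK_MESSAGE`, and `enforce_word_list` are hypothetical names, and the word list is a placeholder.

```python
import re

# Placeholder list; a real deployment would load this from shared configuration.
FORBIDDEN_WORDS = {"idiot", "stupid"}
BLOCK_MESSAGE = "Response blocked by policy."


def enforce_word_list(response: str) -> str:
    """Return the response unchanged, or a fixed message if it contains a forbidden word."""
    # Compare whole lowercase words, so "stupidity" does not match "stupid".
    words = set(re.findall(r"[a-z']+", response.lower()))
    if words & FORBIDDEN_WORDS:
        return BLOCK_MESSAGE
    return response
```

The check runs after generation, so no amount of prompting can change its verdict. The trade-off is that a plain word list is blunt: it misses misspellings and can block harmless text.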


3. Visualizing the Policy Wrap

graph TD
    Input[User Message] --> P1[Policy: Sanitizer]
    P1 --> Brain[Agent Reasoning]
    Brain --> P2[Policy: Output Filter]
    P2 --> Result[Clean Response]
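The flow in the diagram can be sketched as plain function composition: input policies run first, then the agent, then output policies. All names here (`wrap_agent`, `sanitize_input`, `mask_long_numbers`) are hypothetical, and the two policies are simple stand-ins for real ones.

```python
import re
from typing import Callable, Sequence

Policy = Callable[[str], str]


def sanitize_input(message: str) -> str:
    """Input policy: drop non-printable characters (including newlines) and trim."""
    return "".join(ch for ch in message if ch.isprintable()).strip()


def mask_long_numbers(response: str) -> str:
    """Output policy: mask digit runs of nine or more, e.g. account-like numbers."""
    return re.sub(r"\d{9,}", "[NUMBER REMOVED]", response)


def wrap_agent(
    agent: Callable[[str], str],
    pre: Sequence[Policy],
    post: Sequence[Policy],
) -> Callable[[str], str]:
    """Build a callable that runs pre policies, the agent, then post policies."""

    def run(message: str) -> str:
        for policy in pre:
            message = policy(message)
        response = agent(message)
        for policy in post:
            response = policy(response)
        return response

    return run
```

Keeping the policies in ordered lists means a new rule can be added in one place and picked up by every wrapped agent.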

4. Pre-built Guardrail Libraries

You don't need to write every filter from scratch.

  • NeMo Guardrails (NVIDIA): A powerful engine for defining "Rails" in a specialized language (Colang).
  • Guardrails AI: A library for validating that LLM output conforms to a declared structure (for example, a JSON schema) and passes custom validators.
  • Llama Guard: A specialized model from Meta that acts as a "Moderator" for other models.

5. Implementation Strategy: The "Middleware" Node

In LangGraph (Module 6), you can implement a Policy as a "Guard Node" that runs between every turn and records its verdict in the graph state.

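from langchain_core.messages import AIMessage
from langgraph.graph import MessagesState

class State(MessagesState):  # assumed shape; the State from Module 6 may differ
    status: str  # "ok" or "blocked"

def global_policy_check(state: State) -> dict:
    content = state["messages"][-1].content  # text of the most recent message
    # 1. Check for destructive SQL keywords (a naive substring match)
    if "DROP TABLE" in content.upper():
        return {"messages": [AIMessage(content="Error: Policy Violation.")], "status": "blocked"}
    return {"status": "ok"}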
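For the guard to actually stop a turn, something has to read the `status` it writes. One way to sketch that is a small routing function. The node names `"agent"` and `"halt"` are hypothetical. In LangGraph, a function like this is typically passed to `add_conditional_edges` on the guard node, but it is shown here as plain Python so it runs on its own.

```python
def route_after_guard(state: dict) -> str:
    """Pick the next node from the guard's verdict.

    Fails closed: anything other than an explicit "ok" halts the turn.
    """
    return "agent" if state.get("status") == "ok" else "halt"
```

Routing on an explicit "ok" rather than on "blocked" is a deliberate choice in this sketch: if the guard crashes or is skipped, the missing status stops the agent instead of letting the turn through.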

Key Takeaways

  • Policies ensure consistent safety and compliance across all company agents.
  • Hard Guardrails are code-based checks that the LLM cannot bypass.
  • Output filtering is critical for preventing the leakage of private data (PII).
  • ADK allows you to centrally manage these policies for all agents at once.
