Policy and Guardrails: Global Governance

If one developer builds an agent that can delete databases and another builds one that can't, you have a Security Mismatch: safety depends on who wrote the agent rather than on a shared standard. In an ADK environment, we move safety logic out of individual agents and into Global Policies.

1. What is a Policy?

A policy is a "Hard Constraint" that is applied to every agent loop automatically. It acts as a wrapper around the LLM's input and output.

Example Policies:

PII Filter: Scan all outputs for Social Security numbers or email addresses.
Budget Policy: If an agent has spent more than $5 today, kill the process.
Tone Policy: If the sentiment of an output is "Angry," block it.
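The first two example policies can be sketched in a few lines of plain Python. This is a minimal illustration, not an ADK API: the names `redact_pii`, `BudgetPolicy`, and `BudgetExceeded` are hypothetical, the regular expressions are deliberately simple, and the $5 limit is taken from the example above.

```python
import re
from dataclasses import dataclass

# Deliberately simple patterns; real PII detection needs broader coverage.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def redact_pii(text: str) -> str:
    """Replace anything that looks like a US SSN or an email address."""
    text = SSN_RE.sub("[REDACTED SSN]", text)
    return EMAIL_RE.sub("[REDACTED EMAIL]", text)


class BudgetExceeded(RuntimeError):
    """Raised when an agent goes over its daily spend limit."""


@dataclass
class BudgetPolicy:
    daily_limit_usd: float = 5.00  # limit from the example policy above
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> None:
        """Record a cost and stop the run once the limit is exceeded."""
        self.spent_usd += cost_usd
        if self.spent_usd > self.daily_limit_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f}, limit ${self.daily_limit_usd:.2f}"
            )
```

In this sketch the caller would invoke `charge` after every model or tool call, so the exception ends the agent loop as soon as the limit is crossed. Resetting `spent_usd` each day is left out.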

2. Hard Guardrails vs. Soft Prompts

Soft Prompt: "Please don't be mean." (The LLM can still be mean if it's pushed.)
Hard Guardrail: A Python script that checks the response against a word list and blocks it if a forbidden word is found. (The LLM has no choice.)

In ADK, we use Hard Guardrails.
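A word-list guardrail of the kind described above might look like the following sketch. The word list, the replacement message, and the function name `enforce_word_list` are placeholders chosen for illustration.

```python
import re

FORBIDDEN_WORDS = {"idiot", "stupid"}  # hypothetical block list
BLOCKED_MESSAGE = "This response was blocked by policy."


def enforce_word_list(llm_output: str) -> str:
    """Return the output unchanged, or a fixed message if it contains a forbidden word."""
    # Compare whole lowercase words so "IDIOT." matches but "stupidity" does not.
    words = set(re.findall(r"[a-z']+", llm_output.lower()))
    if words & FORBIDDEN_WORDS:
        return BLOCKED_MESSAGE
    return llm_output
```

The check runs after the model has produced its text, so no prompt can talk its way past it. The trade-off is that a fixed list only catches the exact words it contains.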

3. Visualizing the Policy Wrap

graph TD
    Input[User Message] --> P1[Policy: Sanitizer]
    P1 --> Brain[Agent Reasoning]
    Brain --> P2[Policy: Output Filter]
    P2 --> Result[Clean Response]
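The flow in the diagram can be approximated as a function wrapper: input policies run first, then the agent, then output policies. This is a sketch under simple assumptions (the agent is any callable from string to string, and every name below is hypothetical), not how ADK implements it.

```python
from typing import Callable, Sequence

Policy = Callable[[str], str]  # takes text, returns (possibly modified) text


class PolicyViolation(Exception):
    """Raised by a policy to stop the turn."""


def sanitize_input(message: str) -> str:
    """Input policy: drop non-printable characters and cap the length."""
    cleaned = "".join(ch for ch in message if ch.isprintable() or ch in "\n\t")
    return cleaned[:4000]


def filter_output(response: str) -> str:
    """Output policy: refuse to return anything that looks like a private key."""
    if "BEGIN PRIVATE KEY" in response:
        raise PolicyViolation("output contained a private key")
    return response


def with_policies(
    agent: Callable[[str], str],
    input_policies: Sequence[Policy],
    output_policies: Sequence[Policy],
) -> Callable[[str], str]:
    """Return a callable that runs the agent inside the policy wrap."""

    def guarded(message: str) -> str:
        for policy in input_policies:
            message = policy(message)
        response = agent(message)
        for policy in output_policies:
            response = policy(response)
        return response

    return guarded


def echo_agent(message: str) -> str:
    """Stand-in for the real agent reasoning step."""
    return f"echo: {message}"


guarded_agent = with_policies(echo_agent, [sanitize_input], [filter_output])
print(guarded_agent("hi\x00 there"))  # prints "echo: hi there"
```

Because the wrapper takes the agent as an argument, the same policy lists can be applied to every agent without touching agent code.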

4. Pre-built Guardrail Libraries

You don't need to write every filter from scratch.

NeMo Guardrails (NVIDIA): A powerful engine for defining "Rails" in a specialized language (Colang).
Guardrails AI: A library for validating that LLM output follows a specific XML/JSON schema.
Llama Guard: A specialized model from Meta that acts as a "Moderator" for other models.

5. Implementation Strategy: The "Middleware" Node

In LangGraph (Module 6), you can implement a Policy as a "Guard Node" that runs between turns, so every message passes through it before the graph continues.

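from typing import Annotated, TypedDict
from langchain_core.messages import AIMessage
from langgraph.graph.message import add_messages

class State(TypedDict):  # assumed shape of the Module 6 state
    messages: Annotated[list, add_messages]  # reducer appends instead of overwriting
    status: str

def global_policy_check(state: State) -> dict:
    content = state["messages"][-1].content
    # Naive keyword check for destructive SQL; illustrative, not a real injection defense
    if "DROP TABLE" in content.upper():
        return {"messages": [AIMessage(content="Error: Policy Violation.")], "status": "blocked"}
    return {"status": "ok"}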
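The guard node only records a `status`; something still has to act on it. One way to do that, sketched here without any LangGraph imports, is a small routing function that reads the status and names the next step. In LangGraph, a function like this is typically registered with `add_conditional_edges`. The labels `"halt"` and `"agent"` are hypothetical and would need to match the node names in your own graph.

```python
from typing import Literal


def route_after_policy(state: dict) -> Literal["halt", "agent"]:
    """Pick the next step based on the status written by the guard node."""
    if state.get("status") == "blocked":
        return "halt"  # stop the turn; the violation message is already in state
    return "agent"  # policy passed, continue to the agent node
```

Keeping the decision in one function means the "blocked" path is identical for every agent that shares the graph.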

Key Takeaways

Policies ensure consistent safety and compliance across all company agents.
Hard Guardrails are code-based checks that the LLM cannot bypass.
Output filtering is critical for preventing the leakage of private data (PII).
ADK allows you to centrally manage these policies for all agents at once.

Module 10 Lesson 4: Policy and Guardrails