Module 10 Lesson 4: Policy and Guardrails
Governance at scale. Implementing global rules that restrict agent behavior regardless of the prompt.
Policy and Guardrails: Global Governance
If one developer builds an agent that can delete databases, and another developer builds an agent that can't, you have a Security Mismatch: the level of protection depends on who wrote the agent. In an ADK environment, we move safety logic out of individual agents and into Global Policies.
1. What is a Policy?
A policy is a "Hard Constraint" that is applied to every agent loop automatically. It acts as a wrapper around the LLM's input and output.
Example Policies:
- PII Filter: Scan all outputs for social security numbers or emails.
- Budget Policy: If an agent has spent more than $5 today, kill the process.
- Tone Policy: If the sentiment of an output is "Angry," block it.
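The PII Filter and Budget Policy above can be sketched in a few lines of plain Python. This is a minimal illustration, not an ADK API: the names `redact_pii`, `enforce_budget`, and `BudgetExceeded` are hypothetical, and the regular expressions are deliberately simple and would miss many real-world formats.

```python
import re

# Simplified patterns for illustration; real PII detection needs broader coverage.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")

DAILY_BUDGET_USD = 5.00  # the $5 limit from the Budget Policy example


class BudgetExceeded(Exception):
    """Raised when an agent's spend for the day exceeds the limit."""


def redact_pii(text: str) -> str:
    """Replace SSN-like and email-like substrings with placeholders."""
    text = SSN_PATTERN.sub("[REDACTED SSN]", text)
    return EMAIL_PATTERN.sub("[REDACTED EMAIL]", text)


def enforce_budget(spent_today_usd: float) -> None:
    """Stop the run once spend has gone past the daily limit."""
    if spent_today_usd > DAILY_BUDGET_USD:
        raise BudgetExceeded(
            f"Spent ${spent_today_usd:.2f}, limit is ${DAILY_BUDGET_USD:.2f}"
        )
```

This sketch redacts rather than blocks, so the rest of the response still reaches the user. Raising an exception for the budget case mirrors "kill the process": the caller decides how to shut the agent down.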
2. Hard Guardrails vs. Soft Prompts
- Soft Prompt: "Please don't be mean." (The LLM can still be mean if a user pushes it hard enough.)
- Hard Guardrail: A Python script that checks the response against a word list and blocks it if a forbidden word is found. (The LLM has no choice.)
In ADK, we use Hard Guardrails.
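A hard guardrail of the word-list kind described above might look like the following. It is a sketch under stated assumptions: `FORBIDDEN_WORDS`, `BLOCKED_MESSAGE`, and `apply_word_guardrail` are hypothetical names, and the two-word list is a placeholder for a real, centrally maintained list.

```python
import re

# Hypothetical word list; a real deployment would load this from shared config.
FORBIDDEN_WORDS = {"idiot", "stupid"}

BLOCKED_MESSAGE = "This response was blocked by policy."


def apply_word_guardrail(response: str) -> str:
    """Return the response unchanged, or a fixed message if it contains a forbidden word."""
    # Tokenize on letters so punctuation and casing cannot hide a match.
    words = set(re.findall(r"[a-z']+", response.lower()))
    if words & FORBIDDEN_WORDS:
        return BLOCKED_MESSAGE
    return response
```

The check runs on the model's output after generation, so no prompt can talk the model out of it. The trade-off is that exact word matching is blunt: it misses paraphrases and misspellings.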
3. Visualizing the Policy Wrap
graph TD
    Input[User Message] --> P1[Policy: Sanitizer]
    P1 --> Brain[Agent Reasoning]
    Brain --> P2[Policy: Output Filter]
    P2 --> Result[Clean Response]
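The same flow can be expressed as a function wrapper: input policies run first, then the agent, then output policies. The sketch below assumes each policy is a plain function from string to string. `wrap_with_policies`, `strip_control_chars`, `mask_digits`, and `echo_agent` are hypothetical stand-ins, not part of any framework.

```python
from typing import Callable

Policy = Callable[[str], str]


def wrap_with_policies(
    agent: Callable[[str], str],
    input_policies: list[Policy],
    output_policies: list[Policy],
) -> Callable[[str], str]:
    """Return a version of `agent` with policies applied before and after it."""

    def guarded(message: str) -> str:
        for policy in input_policies:  # P1: sanitize the user message
            message = policy(message)
        response = agent(message)  # agent reasoning
        for policy in output_policies:  # P2: filter the output
            response = policy(response)
        return response

    return guarded


def strip_control_chars(text: str) -> str:
    """Toy sanitizer: drop non-printable characters except newlines and tabs."""
    return "".join(ch for ch in text if ch.isprintable() or ch in "\n\t")


def mask_digits(text: str) -> str:
    """Toy output filter: replace every digit with '#'."""
    return "".join("#" if ch.isdigit() else ch for ch in text)


def echo_agent(message: str) -> str:
    """Stand-in for a real agent call."""
    return f"You said: {message}"


guarded_agent = wrap_with_policies(echo_agent, [strip_control_chars], [mask_digits])
```

Because the wrapper takes the agent as an argument, the same policy lists can be applied to every agent in one place instead of being reimplemented per agent.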
4. Pre-built Guardrail Libraries
You don't need to write every filter from scratch.
- NeMo Guardrails (NVIDIA): A powerful engine for defining "Rails" in a specialized language (Colang).
- Guardrails AI: A library for validating that LLM output follows a specific XML/JSON schema.
- Llama Guard: A specialized model from Meta that acts as a "Moderator" for other models.
5. Implementation Strategy: The "Middleware" Node
In LangGraph (Module 6), you can implement a Policy as a "Guard Node" that runs between every turn.
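from typing import TypedDict

class State(TypedDict, total=False):
    messages: list  # conversation history; the last item is the newest message
    status: str     # "ok" or "blocked", set by the guard node

def global_policy_check(state: State) -> dict:
    content = state["messages"][-1].content
    # 1. Check for SQL injection keywords (a naive, case-insensitive substring match)
    if "DROP TABLE" in content.upper():
        return {"messages": ["Error: Policy Violation."], "status": "blocked"}
    return {"status": "ok"}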
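The "status" field returned by the guard node is what lets the graph decide where to go next. The sketch below simulates that decision in plain Python so it runs without LangGraph installed. `guard`, `route_after_guard`, and `run_turn` are hypothetical names, and `SimpleNamespace` stands in for a real message object. In LangGraph, a routing function like this would typically be passed to `add_conditional_edges`; that wiring is omitted here.

```python
from types import SimpleNamespace


def guard(state: dict) -> dict:
    # Same check as global_policy_check above, repeated so this sketch runs alone.
    content = state["messages"][-1].content
    if "DROP TABLE" in content.upper():
        return {"status": "blocked"}
    return {"status": "ok"}


def route_after_guard(state: dict) -> str:
    """Pick the next node name from the guard's verdict."""
    return "end" if state.get("status") == "blocked" else "agent"


def run_turn(text: str) -> str:
    """Run the guard on one message and report where the turn would go next."""
    state = {"messages": [SimpleNamespace(content=text)]}
    state.update(guard(state))  # merge the node's partial update into the state
    return route_after_guard(state)
```

A blocked turn routes straight to the end of the graph, so the agent never sees the offending message. Any other message continues to the agent node as usual.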
Key Takeaways
- Policies ensure consistent safety and compliance across all company agents.
- Hard Guardrails are code-based checks that the LLM cannot bypass.
- Output filtering is critical for preventing the leakage of private data (PII).
- ADK allows you to centrally manage these policies for all agents at once.