
Action Verification: Precision over Persistence
Learn how to reduce agentic retry loops through early verification. Master the 'Self-Correction' techniques that save thousands of tokens on failed attempts.
One of the costliest agent behaviors is "Blind Persistence." An agent calls a tool, the tool returns a vague error, and the agent just tries again with the same parameters.
Action Verification is the practice of adding a "Quality Gate" after every tool call. Instead of merely "Observing" the result, the agent is forced to Verify that the result is useful. This prevents the agent from moving to Step 2 with Step 1's "Garbage" data, saving 100% of the tokens that would have been wasted on the doomed Step 2.
In this lesson, you will learn how to build "Validation Nodes" and how to use small models to "Veto" an agent's progress when the incoming data is junk.
1. The Cost of "Garbage In, Garbage Out"
Scenario:
- Agent Step 1: Search for "Sales 2024".
- Result: "Empty result."
- Agent (Blind): "Now I will analyze the Sales for 2024. [Writes 500 words of analysis based on nothing]."
- Result: 500 wasted tokens.
Scenario with Verification:
- Agent Step 1: Search for "Sales 2024".
- Verify Node: "Result is empty. Do not proceed to analysis. Alert user."
- Result: 0 wasted analysis tokens.
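A minimal sketch of the difference in code (search and analyze are hypothetical stand-ins for a real tool call and an expensive LLM turn):
def search(query: str) -> str:
    # Hypothetical search tool; returns "" when nothing matches
    return ""  # simulates the empty "Sales 2024" result

def analyze(data: str) -> str:
    # Hypothetical expensive LLM call (~500 tokens per turn)
    return f"Analysis of: {data}"

result = search("Sales 2024")
# The Verify Node: a cheap, deterministic gate before the expensive step
if not result.strip():
    print("ALERT: Empty result. Do not proceed to analysis.")  # 0 analysis tokens spent
else:
    print(analyze(result))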
2. The Specialist "Verifier" (Supervisor Pattern)
As we learned in Module 9.2, splitting responsibilities across specialist agents pays off. A Verifier Agent is a specialist with no tools: its only job is to look at a Search Result and say "Yes/No."
graph LR
A[Worker Agent] -->|Tool Result| V[Verifier Agent]
V -->|Pass| B[Next Step]
V -->|Fail| C[Re-try or Halt]
style V fill:#f96,stroke-dasharray: 5 5
By using a Cheaper Model (GPT-4o mini) for the "Veto," you save the expensive model (GPT-4o) from wasting tokens on bad data.
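A sketch of this routing in Python, where cheap_llm is a hypothetical call to the small veto model and the hand-off to the worker is left abstract:
def cheap_llm(prompt: str) -> str:
    # Hypothetical call to the small model (e.g. GPT-4o mini)
    return "FAIL"  # stub; a real implementation calls the model's API

def verifier(tool_result: str) -> bool:
    # The Verifier Agent: no tools, one job -- judge the result
    verdict = cheap_llm(
        f"Tool result:\n{tool_result}\nIs this usable? Answer PASS or FAIL."
    )
    return "PASS" in verdict.upper()

def route(tool_result: str) -> str:
    # Mirrors the graph: Pass -> next step, Fail -> re-try or halt
    if verifier(tool_result):
        return "NEXT_STEP"    # the expensive GPT-4o worker continues
    return "RETRY_OR_HALT"    # stop before burning worker tokens

print(route("Empty result."))  # -> RETRY_OR_HALT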
3. Implementation: The Validation Gate (Python)
Python Code: Explicit Action Check
def verify_search_quality(result: str, query: str) -> bool:
    """
    Check if the search result actually contains
    the answer to the query.
    """
    prompt = (
        f"Query: {query}. Result: {result}. "
        "Does this result answer the query? Answer: [YES/NO]"
    )
    # We use a very fast, cheap model for the 'Yes/No' check
    decision = call_cheap_model(prompt)
    return "YES" in decision.upper()

# In the Agent Loop
res = search_tool(q)
if not verify_search_quality(res, q):
    # Stop the agent before it starts 'Thinking' about junk data
    return "FAILED: Search returned no signal."
4. Setting "Confidence Thresholds"
For numerical data, your Python code should verify the format.
If an agent is supposed to find a price, and the tool returns a string like "Price not found," a simple Python regex check can "Veto" the turn before any LLM tokens are even used.
Rule: Always use Deterministic Code (Python) for verification before falling back to Probabilistic Code (LLM).
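For example, a deterministic price gate costs zero LLM tokens (the regex below is illustrative; adapt it to your currency formats):
import re

# Matches price-shaped strings like "$1,299.99" or "1299.99 USD"
PRICE_RE = re.compile(r"(\$\s?\d[\d,]*(?:\.\d{2})?|\d[\d,]*(?:\.\d{2})?\s?USD)")

def has_price(tool_result: str) -> bool:
    # Zero-token veto: reject the turn if nothing price-shaped exists
    return bool(PRICE_RE.search(tool_result))

assert has_price("The laptop costs $1,299.99 today.")
assert not has_price("Price not found.")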
5. Token Savings: Recursive Correction
If an agent makes a mistake in Step 1, it will likely repeat that mistake in Step 2. Verification acts as a Circuit Breaker.
Efficiency ROI: One verification check costs ~50 tokens. One failed "Analysis" turn costs ~1,000 tokens. One "Veto" pays for 20 verification checks.
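The break-even arithmetic, as a quick sanity check (using the rough token estimates above):
VERIFY_COST = 50           # tokens per verification check
FAILED_TURN_COST = 1_000   # tokens per doomed analysis turn

# One veto avoids one failed turn, so one veto funds:
print(FAILED_TURN_COST // VERIFY_COST)  # 20 checks

# Verification is net positive once more than 1 in 20 results is junk:
failure_rate = 0.10  # assume 10% of tool calls return garbage
print(failure_rate * FAILED_TURN_COST - VERIFY_COST)  # 50.0 tokens saved per turn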
6. Summary and Key Takeaways
- Verify Early: Don't let an agent reason about empty or bad data.
- Cheaper Verifiers: Use small models for the "Quality Gate" check.
- Deterministic First: Use Python logic (Regex, Length Check) to verify results before using AI tokens.
- Veto is a Success: Stopping early on bad data is a win for the budget.
In the next lesson, Human-in-the-Loop for Token Savings, we look at how to use the "Ultimate Intelligence" to save "Artificial Intelligence" tokens.
Exercise: The Junk Filter
- Run a search for a non-existent topic (e.g. "The 3rd moon of Mars").
- Capture the messy search result.
- Write a Python function using an LLM that reads that mess and says "FAIL" if the moon isn't specifically named.
- Verify: Integrate this into an agent loop.
- Analyze: Did the agent stop, or did it try to "Guess" the moon's name?
- (If it guessed, you need to harden your Verification Prompt).