Output Parsing: The Structured Shield

A text response like "The price is $150" is hard for a computer to validate. A JSON response like {"price": 150, "currency": "USD"} is easy to validate. In agentic systems, we use Structured Output to prevent the LLM from drifting into creative (but incorrect) prose.

1. Using Pydantic for Validation

Pydantic is a Python library that enforces data types. In an agentic flow, we can force the model to fill in a Pydantic object.

from pydantic import BaseModel, Field

class StockReport(BaseModel):
    symbol: str = Field(description="The ticker symbol")
    price: float = Field(description="The current price")
    is_bullish: bool = Field(description="Is the sentiment positive?")

If the LLM tries to put "banana" in the price field, Pydantic raises a ValidationError when the output is parsed, so the bad data is rejected before it reaches your database.
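To make the failure mode concrete, here is a minimal sketch of parsing a raw model reply against the StockReport class. It assumes Pydantic v2 (where `model_validate_json` is available); the `parse_report` helper and the "ACME" sample data are hypothetical names introduced only for illustration.

```python
from pydantic import BaseModel, Field, ValidationError


class StockReport(BaseModel):
    symbol: str = Field(description="The ticker symbol")
    price: float = Field(description="The current price")
    is_bullish: bool = Field(description="Is the sentiment positive?")


def parse_report(raw_json: str):
    """Return (report, None) on success or (None, error_text) on failure."""
    try:
        # Parses the JSON string and validates every field in one step.
        return StockReport.model_validate_json(raw_json), None
    except ValidationError as exc:
        # The error text names the offending field, which is useful for logs.
        return None, str(exc)


# A well-formed reply: the integer 150 is accepted for the float field.
good, good_err = parse_report('{"symbol": "ACME", "price": 150, "is_bullish": true}')

# A malformed reply: "banana" cannot be parsed as a number.
bad, bad_err = parse_report('{"symbol": "ACME", "price": "banana", "is_bullish": true}')
```

Returning the error text instead of letting the exception propagate is one possible design; it keeps the calling code free to decide whether to retry, log, or abort.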

2. Pydantic -> Prompt Conversion

You don't have to manually write the prompt for JSON. Frameworks like LangChain can convert your Pydantic class into a schema the LLM understands.

The automatically generated prompt looks roughly like this: "Return the output in the following JSON format: { 'symbol': 'string', 'price': 'number' ... }"
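The conversion can be sketched without any framework. The example below assumes Pydantic v2, whose `model_json_schema()` method returns a JSON Schema dictionary for a model. The `build_format_instructions` helper and its wording are hypothetical; this is not LangChain's actual implementation, only an illustration of the same idea.

```python
import json

from pydantic import BaseModel, Field


class StockReport(BaseModel):
    symbol: str = Field(description="The ticker symbol")
    price: float = Field(description="The current price")
    is_bullish: bool = Field(description="Is the sentiment positive?")


def build_format_instructions(model_cls) -> str:
    """Turn a Pydantic model into format instructions for a prompt."""
    # model_json_schema() includes field names, types, and descriptions.
    schema = json.dumps(model_cls.model_json_schema(), indent=2)
    return (
        "Return a single JSON object that conforms to this JSON Schema, "
        "with no extra text:\n" + schema
    )


instructions = build_format_instructions(StockReport)
```

The resulting string can be appended to the task prompt, so the schema and the validation code are always derived from the same class.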

3. Visualizing the Validation Pipe

graph LR
    User[Query] --> Brain[LLM Brain]
    Brain --> Raw[Raw String JSON]
    Raw --> Parser[Pydantic Parser]
    Parser -- Success --> App[Application logic]
    Parser -- Fail --> Feedback[Send error back to LLM]
    Feedback --> Brain

4. The "Correction" Loop

When the parser fails (e.g., the JSON is missing a comma), you don't crash. Instead, you send the Python error message back to the LLM.

System: "Your JSON was invalid: Expecting ',' delimiter: line 3 column 5. Please fix it and return the corrected JSON."

Models like GPT-4 are generally good at fixing their own syntax errors when they are given the error message.
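The loop can be sketched as follows, assuming Pydantic v2 and a hypothetical `call_llm` function that takes a prompt string and returns the model's raw text. Here it is replaced by a stub (`fake_llm`) so the example runs without a real model; the retry limit of three attempts is an arbitrary choice.

```python
from typing import Callable

from pydantic import BaseModel, ValidationError


class StockReport(BaseModel):
    symbol: str
    price: float
    is_bullish: bool


def parse_with_retries(
    call_llm: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> StockReport:
    """Ask the model, validate the reply, and feed errors back on failure."""
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = call_llm(current_prompt)
        try:
            return StockReport.model_validate_json(raw)
        except ValidationError as exc:
            # Include the bad reply and the validation error in the next prompt.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was:\n{raw}\n\n"
                f"It failed validation with:\n{exc}\n\n"
                "Please fix it and return only the corrected JSON."
            )
    raise RuntimeError(f"No valid output after {max_attempts} attempts")


# Stub standing in for a real model: first reply is missing a comma.
_replies = iter([
    '{"symbol": "ACME" "price": 150, "is_bullish": true}',
    '{"symbol": "ACME", "price": 150, "is_bullish": true}',
])
prompts_seen = []


def fake_llm(prompt: str) -> str:
    prompts_seen.append(prompt)
    return next(_replies)


report = parse_with_retries(fake_llm, "Report on ACME as JSON.")
```

Capping the number of attempts matters: without a limit, a model that keeps producing invalid output would loop forever and keep consuming tokens.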

5. Why Validating "Types" Prevents Hallucinations

Hallucinations often happen when a model gets "wordy." By forcing it into a strict JSON schema, you take away its ability to wander. It has to find a number for the price field. If it can't find one, it is more likely to return a null or fail validation, both of which your code can detect, than to hide the failure in a long sentence.
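One caveat worth illustrating: with `price: float` declared as required, a null value is itself a validation error. If you want the model to be able to say "not found" explicitly, one option is to make the field optional. The sketch below assumes Pydantic v2; the `has_price` helper and the revised field description are hypothetical additions, not part of the original class.

```python
from typing import Optional

from pydantic import BaseModel, Field


class StockReport(BaseModel):
    symbol: str = Field(description="The ticker symbol")
    # Optional lets the model answer null instead of inventing a number.
    price: Optional[float] = Field(
        default=None,
        description="The current price, or null if it was not found",
    )
    is_bullish: bool = Field(description="Is the sentiment positive?")


def has_price(report: StockReport) -> bool:
    """True when the model actually supplied a price."""
    return report.price is not None


missing = StockReport.model_validate_json(
    '{"symbol": "ACME", "price": null, "is_bullish": false}'
)
found = StockReport.model_validate_json(
    '{"symbol": "ACME", "price": 150, "is_bullish": true}'
)
```

The application can then branch on `has_price(report)` and handle the missing case deliberately, rather than discovering it later as a bad value.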

Key Takeaways

Structured Output makes AI reliable enough for software integration.
Pydantic is a de facto standard in Python for defining these structures.
Automated Correction Loops can fix most syntax errors without human intervention.
Restricting a model's format is a powerful method for restricting its hallucinations.

Module 12 Lesson 3: Output Parsing and Validation