
Handling Model Outputs: Streaming, JSON, and Safety
Master the art of receiving data from Gemini. Learn about streaming responses for UX, forcing JSON mode for code, and handling safety interruptions.
Handling Model Outputs
Getting the prompt right is half the battle. The other half is handling what Gemini sends back. In production, you can't just print(response.text). You need to handle Streaming, Structured Data, and Safety Blocks.
1. Streaming Responses
When you generate a long essay, it might take 10 seconds to finish. If the user stares at a spinner for 10 seconds, they will leave. Streaming lets you show words as they are generated (like a typewriter effect).
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")  # example model name

# Standard (blocking): waits until the response is 100% done
# response = model.generate_content("Write a long story")
# print(response.text)

# Streaming: chunks arrive as the model generates them
response = model.generate_content("Write a long story", stream=True)
for chunk in response:
    print(chunk.text, end="", flush=True)

# In a web app, you would push chunk.text to the frontend via WebSocket or SSE
Why Stream?
- Perceived Latency: The user feels the app is fast because text appears in 0.5s, even if the total job takes 10s.
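To make the WebSocket/SSE comment above concrete, here is a minimal sketch that relays chunks to a browser using Server-Sent Events. Flask, the /story route, and the event_stream helper are illustrative choices for this lesson, not part of the Gemini SDK:

from flask import Flask, Response

app = Flask(__name__)

@app.route("/story")
def story():
    def event_stream():
        # Re-use the streaming call from above; 'model' is defined earlier
        response = model.generate_content("Write a long story", stream=True)
        for chunk in response:
            # One SSE frame per chunk (real code should escape newlines in chunk.text)
            yield f"data: {chunk.text}\n\n"
    return Response(event_stream(), mimetype="text/event-stream")

The browser can then consume this endpoint with a standard EventSource and append each frame to the page as it arrives.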
2. JSON Mode (Structured Output)
If you are building an app, you usually don't want a paragraph; you want JSON you can parse. Gemini lets you force the response into JSON.
import typing_extensions as typing

# Define the schema you want back
class Recipe(typing.TypedDict):
    name: str
    ingredients: list[str]

# Pass 'response_mime_type' and 'response_schema'
response = model.generate_content(
    "Give me a cookie recipe",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",
        response_schema=Recipe,
    ),
)
print(response.text)
# Output: {"name": "Cookies", "ingredients": ["Flour", "Sugar", ...]}
This is critical for reliability. It prevents the model from saying "Sure! Here is your JSON: ...", which breaks your parser.
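Because the output is guaranteed to be valid JSON, you can hand it straight to a parser with no string cleanup. A minimal sketch (the 'recipe' variable name is just for illustration):

import json

recipe = json.loads(response.text)  # no regex or prefix-stripping needed
print(recipe["name"], recipe["ingredients"])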
3. Handling Safety Blocks
Sometimes, Gemini refuses to answer.
- Scenario: User asks "How to make poison?"
- Result: response.text might be empty, or raise an error when you access it.
You must check for Finish Reasons.
# Check whether the prompt itself was blocked
if response.prompt_feedback.block_reason:
    print(f"Blocked due to: {response.prompt_feedback.block_reason}")

# Accessing .text on a filtered response raises a ValueError
try:
    print(response.text)
except ValueError:
    print("No text returned (likely filtered).")
Common Finish Reasons:
- STOP: Natural end of generation.
- SAFETY: Blocked by safety filter.
- MAX_TOKENS: Cut off because it hit the token limit.
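You can read the finish reason off the first candidate. A short sketch (assumes the response returned at least one candidate):

# Inspect why generation stopped
finish_reason = response.candidates[0].finish_reason
if finish_reason.name == "SAFETY":
    print("Filtered by the safety system.")
elif finish_reason.name == "MAX_TOKENS":
    print("Truncated; consider raising max_output_tokens.")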
Summary
- Stream long responses for better UX.
- Use JSON Mode for building robust applications.
- Catch Safety Blocks gracefully so your app doesn't crash on controversial inputs.
Module 2 Complete! You now understand the inner workings of the Model. In Module 3, we move to the tools: Google AI Studio Basics.