The Heartbeat of the Agent: Streaming and Progress
Master the technical implementation of Server-Sent Events (SSE) and progress tracking. Learn how to bridge the gap between long-running agentic tasks and real-time React UIs.

Streaming Responses and Progress States

As we've discussed, agentic steps can take time. If a user waits 15 seconds for a "Thinking" spinner to finish, they will likely refresh the page or assume the system is broken. Streaming is the technical solution to this "Latency Gap."

In this lesson, we will learn how to stream both Tokens (text) and Objects (state updates) from your Python LangGraph backend to your React frontend.


1. Technical Implementation: Server-Sent Events (SSE)

For most AI agents, SSE is a better fit than WebSockets: the data only needs to flow one way (server to client), it runs over plain HTTP with less overhead, and the protocol has reconnection built in when a connection drops.

The SSE Flow

  1. User sends a POST request.
  2. Server returns a 200 OK with Content-Type: text/event-stream.
  3. Server keeps the connection open.
  4. Every time a node in the LangGraph finishes, the server sends a "chunk" of data.
# FastAPI example
from fastapi import FastAPI, Request
from sse_starlette.sse import EventSourceResponse

app = FastAPI()

@app.post("/chat")
async def chat_stream(request: Request):
    body = await request.json()
    # agent_execute is an async generator that yields chunks
    # from the LangGraph '.stream()' / '.astream()' method
    generator = agent_execute(body["query"])
    return EventSourceResponse(generator)
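
On the client, note that the browser's built-in EventSource API only supports GET requests. Because the endpoint above is a POST, a common workaround is to read the response body as a stream with fetch. A minimal sketch, assuming each SSE event arrives as a single data: line (a production client would use a dedicated SSE parsing library):

// TypeScript sketch: consuming the POST /chat stream with fetch.
// Assumes each event is one "data: <payload>" line, separated by blank lines.
async function streamChat(query: string, onChunk: (data: string) => void): Promise<void> {
    const response = await fetch("/chat", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ query }),
    });

    const reader = response.body!.getReader();
    const decoder = new TextDecoder();
    let buffer = "";

    while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        buffer += decoder.decode(value, { stream: true });

        // SSE events are separated by a blank line.
        const events = buffer.split("\n\n");
        buffer = events.pop() ?? ""; // keep any partial event for the next read
        for (const event of events) {
            const dataLine = event.split("\n").find((line) => line.startsWith("data: "));
            if (dataLine) onChunk(dataLine.slice("data: ".length));
        }
    }
}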

2. Streaming Tokens vs. Streaming Steps

You must distinguish between these two types of data:

Token Streaming (The Answer)

The raw words generated by the LLM.

  • UX: Words "type" out on the screen.
  • Goal: Immediate feedback.

Step Streaming (The Reasoning)

The metadata about what the agent is doing.

  • UX: A checklist that updates. "Searching...", "Analyzing...", "Found result!".
  • Goal: Transparency and trust.
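
One simple way to keep these two streams apart on the wire is to tag every SSE payload with a type field. A sketch of the two event shapes; the field names (type, content, node, status) are assumptions, not a fixed protocol:

// Illustrative event shapes for the two kinds of chunks.
type TokenEvent = {
    type: "token";
    content: string;             // the next fragment of the answer
};

type StepEvent = {
    type: "step";
    node: string;                // e.g. "search", "analyze"
    status: "start" | "end";     // which side of the node we are on
};

type StreamEvent = TokenEvent | StepEvent;

The client can then branch on the type tag: token events feed the typing animation, step events update the checklist.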

3. The "Thought Stream" Pattern

Users tend to feel an agent is "Smarter" when they can see it "Thinking" instead of staring at a blank spinner.

  • React Component: A collapsible "Thoughts" section.
  • Content: The thought variable from your agent's state.
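
A minimal sketch of such a component, assuming the accumulated thoughts are passed in as an array of strings:

// Collapsible "Thoughts" section (sketch); the props and markup are assumptions.
import { useState } from "react";

export function ThoughtStream({ thoughts }: { thoughts: string[] }) {
    const [open, setOpen] = useState(false);

    if (thoughts.length === 0) return null;

    return (
        <div className="thought-stream">
            <button onClick={() => setOpen((o) => !o)}>
                {open ? "Hide" : "Show"} thoughts ({thoughts.length})
            </button>
            {open && (
                <ul>
                    {thoughts.map((thought, i) => (
                        <li key={i}>{thought}</li>
                    ))}
                </ul>
            )}
        </div>
    );
}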

4. Handling State in React

When 10 "Chunks" of data arrive via SSE, how do you handle them in React?

The "Cumulative" Approach

For text, you append the new string to the previous string.

// Append the incoming token chunk to the last message in the list
setMessages(prev => {
    const lastMessage = prev[prev.length - 1];
    return [
        ...prev.slice(0, -1),
        { ...lastMessage, content: lastMessage.content + chunk },
    ];
});

The "Snap-to-State" Approach

For tool checklists or progress bars, the server sends a discrete event as each node starts and finishes, and the UI snaps to the latest state.

  • Event: NODE_START: search -> Set React state to "Searching..."
  • Event: NODE_END: search -> Set React state to "Search Complete" + show results.
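
A sketch of handling these lifecycle events, reusing the step event shape assumed in section 2:

// Map node lifecycle events to a per-node status the UI can render directly.
type NodeStatus = Record<string, "running" | "done">;

function reduceNodeEvent(prev: NodeStatus, event: { node: string; status: "start" | "end" }): NodeStatus {
    return { ...prev, [event.node]: event.status === "start" ? "running" : "done" };
}

// In a component: setNodeStatus(prev => reduceNodeEvent(prev, event));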

5. Progress Bars for Non-Linear Tasks

If an agent is auditing 50 files, a "Total Progress" bar is essential.

  • The State in LangGraph should store processed_files and total_files.
  • Send this ratio to the UI every 5 seconds.
  • Result: "Auditing Files: [|||||| ] 60%"
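
A sketch of turning that ratio into the bar above, assuming the server forwards the processed_files and total_files counters in each progress event:

// Render a text progress bar from the progress counters (field names are assumptions).
function renderProgress(processed: number, total: number): string {
    const ratio = total > 0 ? processed / total : 0;
    const filled = Math.round(ratio * 10);
    const bar = "|".repeat(filled).padEnd(10, " ");
    return `Auditing Files: [${bar}] ${Math.round(ratio * 100)}%`;
}

// renderProgress(30, 50) -> "Auditing Files: [||||||    ] 60%"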

6. Dealing with Errors in Streams

If an SSE stream breaks halfway:

  1. React must realize the connection closed without a "FINISH" tag.
  2. React should display a "Connection Lost. Reconnecting..." message.
  3. The Checkpointer (Module 6.3) allows the agent to resume from its last saved step instead of starting the whole run over.
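
A sketch of detecting an incomplete stream on the client, reusing the streamChat helper from section 1 and assuming each data payload is JSON and that the server marks a successful run with a final event of type "finish" (a convention, not part of the SSE spec):

// Retry a stream that closed before the final "finish" event arrived.
async function streamWithRetry(query: string, onEvent: (event: any) => void): Promise<void> {
    let finished = false;
    try {
        await streamChat(query, (data) => {
            const event = JSON.parse(data);
            if (event.type === "finish") finished = true;
            onEvent(event);
        });
    } catch {
        // Network error: fall through to the retry below.
    }
    if (!finished) {
        showReconnectBanner();                              // "Connection Lost. Reconnecting..."
        setTimeout(() => streamWithRetry(query, onEvent), 2000);
        // The backend checkpointer lets the resumed run pick up where it stopped.
    }
}

declare function showReconnectBanner(): void;               // placeholder UI hook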

Summary and Mental Model

Think of Streaming like a Documentary.

  • A standard API is a Still Photo of the final result.
  • A Streaming API is the Filming Process.

The user wants to see the "Behind the scenes" because it proves the work is being done.


Exercise: Stream design

  1. Hierarchy: An agent is researching a topic. It will take 30 seconds.
    • Design a stream that shows 3 Progress Messages and 1 token-by-token summary at the end.
  2. Technical: Why is JSON.parse harder to use on streamed data than on a complete response string?
    • (Hint: What happens if a JSON chunk only contains half of a key-value pair?)
  3. State Management: If two different agents are streaming data to the same UI, how do you prevent their text from getting "Tangled" in the chat window?
    • (Hint: Use a unique request_id or agent_id on every SSE event.)

Ready to build trust? Next lesson: User Trust and Transparency.
