Module 19 Lesson 2: Persistence and Recovery
AI with Memory. How AgentCore preserves state across days or weeks, allowing workflows to survive crashes and wait for human input.
Persistence: AI that doesn't forget
With a standard stateless API call, if the connection drops mid-request, the work in progress is lost. In AgentCore, every time a node finishes, the system saves a Checkpoint of the current State. This allows the workflow to resume from the last completed node, even days later.
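To make the idea concrete, here is a minimal sketch of "save a checkpoint after every node" in plain Python. It is not AgentCore's actual API: the helper names (`save_checkpoint`, `load_checkpoint`, `run_workflow`), the JSON file format, and the two demo nodes are all assumptions made for illustration.

```python
import json
import os
import tempfile

def save_checkpoint(path, next_node, state):
    """Persist the state plus the index of the next node to run."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"next_node": next_node, "state": state}, f)
    # Rename into place so a crash never leaves a half-written checkpoint.
    os.replace(tmp_path, path)

def load_checkpoint(path):
    """Return the last checkpoint, or a fresh one if none exists yet."""
    if not os.path.exists(path):
        return {"next_node": 0, "state": {}}
    with open(path) as f:
        return json.load(f)

def run_workflow(nodes, path):
    """Run nodes in order, writing a checkpoint after each one finishes."""
    checkpoint = load_checkpoint(path)
    state = checkpoint["state"]
    for i in range(checkpoint["next_node"], len(nodes)):
        state = nodes[i](state)
        save_checkpoint(path, i + 1, state)
    return state

# Two hypothetical nodes, used only for this demo.
def collect_info(state):
    return {**state, "name": "Ada"}

def verify_id(state):
    return {**state, "verified": True}

checkpoint_path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
final_state = run_workflow([collect_info, verify_id], checkpoint_path)
print(load_checkpoint(checkpoint_path))
# {'next_node': 2, 'state': {'name': 'Ada', 'verified': True}}
```

Because the checkpoint records which node runs next, calling `run_workflow` again with the same path skips every node that already finished.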
1. The "State" Object
The State is a shared JSON object that moves between nodes.
- Node 1: Writes user_age: 35.
- Node 2: Reads user_age, calculates risk, and writes risk: LOW.
- Node 3: Reads risk to perform the final action.
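The three nodes above can be sketched as ordinary Python functions that each take the shared state and return an updated copy. This is an illustration only, not AgentCore code: the function names, the age threshold used to calculate risk, and the `action` field are assumptions.

```python
import json

def node_1(state):
    # Writes user_age into the shared state.
    return {**state, "user_age": 35}

def node_2(state):
    # Reads user_age, calculates risk, and writes it back.
    # The threshold of 60 is an arbitrary rule chosen for this example.
    risk = "LOW" if state["user_age"] < 60 else "HIGH"
    return {**state, "risk": risk}

def node_3(state):
    # Reads risk to decide the final action.
    action = "auto_approve" if state["risk"] == "LOW" else "manual_review"
    return {**state, "action": action}

state = {}
for node in (node_1, node_2, node_3):
    state = node(state)

print(json.dumps(state))
# {"user_age": 35, "risk": "LOW", "action": "auto_approve"}
```

Each node only touches the keys it needs, so the state stays a plain JSON-serializable object that is easy to save as a checkpoint.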
2. Long-running Tasks
Because AgentCore is persistent, it can "Wait" for external events.
- An agent can start a background job (such as a large data migration), pause, and wait for a "Finished" notification from a Lambda function before moving to the next node.
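The pause-and-wait pattern can be sketched as follows. This is a simplified model under stated assumptions, not the AgentCore API: the `WAIT` signal, the `run` and `resume` functions, and the shape of the notification event are all hypothetical. The checkpoint is passed around as a JSON string to stand in for a record stored in a database.

```python
import json

WAIT = "WAIT"  # hypothetical signal a node returns to pause the workflow

def start_migration(state):
    # Kick off the background job (simulated here) and ask to pause.
    return {**state, "job": "started"}, WAIT

def finalize(state):
    # Runs only after the external notification has been merged in.
    return {**state, "done": state["job"] == "finished"}, None

NODES = [start_migration, finalize]

def run(checkpoint):
    """Run from the checkpoint until the workflow ends or a node asks to wait."""
    cp = json.loads(checkpoint)
    state, i = cp["state"], cp["next_node"]
    status = "completed"
    while i < len(NODES):
        state, signal = NODES[i](state)
        i += 1
        if signal == WAIT:
            status = "waiting"
            break
    return json.dumps({"state": state, "next_node": i, "status": status})

def resume(checkpoint, event):
    """Merge the external event into the saved state and continue."""
    cp = json.loads(checkpoint)
    cp["state"].update(event)
    return run(json.dumps(cp))

paused = run(json.dumps({"state": {}, "next_node": 0}))
print(json.loads(paused)["status"])    # waiting

# Hours or days later, the "Finished" notification arrives.
finished = resume(paused, {"job": "finished"})
print(json.loads(finished)["status"])  # completed
```

Nothing has to stay in memory between `run` and `resume`; the serialized checkpoint is all the workflow needs to continue.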
3. Visualizing State Checkpoints
```mermaid
graph TD
    N1[Node 1: Collect Info] --> S[[CHECKPOINT: DB Saved]]
    S --> N2[Node 2: Verify ID]
    N2 --> S2[[CHECKPOINT: DB Saved]]
    Crash[SYSTEM CRASH!] -.- S2
    S2 --> Recovery[Resume at Node 3]
```
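The crash in the diagram can be simulated in a few lines. In this sketch a plain dictionary stands in for the checkpoint database, and the third node is rigged to fail once. The node names, the `db` dictionary, and the `run` function are assumptions for illustration and do not reflect how AgentCore implements recovery internally.

```python
db = {}     # stands in for a durable checkpoint store
calls = []  # records which nodes actually ran

def collect_info(state):
    calls.append("collect_info")
    return {**state, "name": "Ada"}

def verify_id(state):
    calls.append("verify_id")
    return {**state, "verified": True}

crash_once = {"armed": True}

def final_action(state):
    # Fails on the first attempt to simulate the system crash.
    if crash_once["armed"]:
        crash_once["armed"] = False
        raise RuntimeError("SYSTEM CRASH!")
    calls.append("final_action")
    return {**state, "done": True}

NODES = [collect_info, verify_id, final_action]

def run(workflow_id):
    """Start or resume a workflow from its last saved checkpoint."""
    cp = db.get(workflow_id, {"next_node": 0, "state": {}})
    state = cp["state"]
    for i in range(cp["next_node"], len(NODES)):
        state = NODES[i](state)
        db[workflow_id] = {"next_node": i + 1, "state": state}
    return state

try:
    run("wf-1")
except RuntimeError:
    pass  # the process "dies" after the Node 2 checkpoint was saved

result = run("wf-1")  # resumes at Node 3
print(calls)
# ['collect_info', 'verify_id', 'final_action']
```

Node 1 and Node 2 each appear only once in `calls`: the second run picks up at Node 3 instead of repeating finished work.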
💡 Guidance for Learners
Persistence is what makes AgentCore a good fit for long-running workflows such as supply chain and project management AI. These tasks aren't just "Questions"; they are "Journeys" that involve multiple people and systems over a long period.
Summary
- Checkpoints save the state of the workflow after every node.
- Workflows can be Paused and Resumed based on external events.
- Failure Recovery is built into the architecture.
- State Objects are the "Shared Memory" for the entire graph.