Module 19 Lesson 2: Persistence and Recovery

AI with Memory. How AgentCore preserves state across days or weeks, allowing workflows to survive crashes and wait for human input.

Persistence: AI that doesn't forget

With a stateless API call, a dropped connection in the middle of a request loses the work in progress. In AgentCore, every time a node finishes, the system saves a Checkpoint (a snapshot of the State). This allows the workflow to resume from the last completed node, even days later.

1. The "State" Object

The State is a shared JSON object that moves between nodes.

  • Node 1: Writes user_age: 35.
  • Node 2: Reads user_age, calculates risk, and writes risk: LOW.
  • Node 3: Reads risk to perform the final action.
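The three steps above can be sketched in plain Python. This is a minimal illustration of the shared-state pattern, not the AgentCore API: the function names, the `run_graph` helper, the age threshold of 60, and the `action` values are all assumptions made up for the example.

```python
from typing import Any, Callable, Dict, List

# The shared state is a plain dict that could be serialized to JSON.
State = Dict[str, Any]


def collect_info(state: State) -> State:
    # Node 1: writes user_age into the shared state.
    return {**state, "user_age": 35}


def assess_risk(state: State) -> State:
    # Node 2: reads user_age and writes risk.
    # The threshold of 60 is a hypothetical rule for illustration only.
    risk = "LOW" if state["user_age"] < 60 else "HIGH"
    return {**state, "risk": risk}


def final_action(state: State) -> State:
    # Node 3: reads risk to decide the final action (hypothetical values).
    action = "auto_approve" if state["risk"] == "LOW" else "manual_review"
    return {**state, "action": action}


def run_graph(nodes: List[Callable[[State], State]], state: State) -> State:
    # Pass the state through each node in order; each node returns a new dict.
    for node in nodes:
        state = node(state)
    return state


final_state = run_graph([collect_info, assess_risk, final_action], {})
print(final_state)
# {'user_age': 35, 'risk': 'LOW', 'action': 'auto_approve'}
```

Each node returns a new dict rather than mutating its input, so every intermediate state is a self-contained value that can be saved as a checkpoint.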

2. Long-running Tasks

Because AgentCore persists state, a workflow can wait for external events without keeping a process running.

  • An agent can start a background job (such as a large data migration), pause, and wait for a "Finished" notification from a Lambda function before moving to the next node.
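The pause-and-wait pattern can be sketched as follows. This is an illustration under stated assumptions, not AgentCore code: an in-memory dict stands in for the persistent store, and the names `start_workflow`, `on_finished`, `job_id`, and the event shape `{"result": ...}` are hypothetical. In a real deployment the `on_finished` handler would be triggered by the external notification.

```python
import json
from typing import Any, Dict

State = Dict[str, Any]


def start_migration(state: State) -> State:
    # Kick off the background job (simulated) and mark the workflow as waiting.
    return {**state, "job_id": "job-123", "status": "WAITING", "next_node": "report"}


def report(state: State) -> State:
    # Runs only after the external "Finished" event has been merged into the state.
    summary = f"{state['job_id']} finished: {state['job_result']}"
    return {**state, "status": "DONE", "summary": summary}


NODES = {"start_migration": start_migration, "report": report}


def start_workflow(store: Dict[str, str], workflow_id: str) -> None:
    # Run the first node, persist the state as JSON, then stop.
    state = NODES["start_migration"]({"status": "RUNNING"})
    store[workflow_id] = json.dumps(state)


def on_finished(store: Dict[str, str], workflow_id: str, event: Dict[str, Any]) -> State:
    # Called when the notification arrives, possibly days later.
    state = json.loads(store[workflow_id])
    if state["status"] != "WAITING":
        raise RuntimeError("workflow is not waiting for an event")
    state["job_result"] = event["result"]
    state = NODES[state["next_node"]](state)
    store[workflow_id] = json.dumps(state)
    return state


store: Dict[str, str] = {}
start_workflow(store, "wf-1")
# ...no process needs to stay alive while the migration runs...
final = on_finished(store, "wf-1", {"result": "ok"})
print(final["summary"])  # job-123 finished: ok
```

Because the saved state records which node comes next, the handler that resumes the workflow needs nothing except the workflow ID and the event payload.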

3. Visualizing State Checkpoints

graph TD
    N1[Node 1: Collect Info] --> S[[CHECKPOINT: DB Saved]]
    S --> N2[Node 2: Verify ID]
    N2 --> S2[[CHECKPOINT: DB Saved]]
    
    Crash[SYSTEM CRASH!] -.- S2
    S2 --> Recovery[Resume at Node 3]
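
The recovery path in the diagram can be simulated with a small checkpointing runner. This is a sketch, assuming a local JSON file in place of the database and a hypothetical `crash_before` parameter to force a failure; none of these names come from AgentCore.

```python
import json
import os
import tempfile
from typing import Any, Dict, List, Optional, Tuple

State = Dict[str, Any]


def collect_info(state: State) -> State:
    return {**state, "name": "Ada"}


def verify_id(state: State) -> State:
    return {**state, "id_verified": True}


def finalize(state: State) -> State:
    return {**state, "done": True}


NODES = [collect_info, verify_id, finalize]


def save_checkpoint(path: str, next_index: int, state: State) -> None:
    # Write to a temp file, then rename, so a crash never leaves a half-written checkpoint.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_index": next_index, "state": state}, f)
    os.replace(tmp, path)


def load_checkpoint(path: str) -> Tuple[int, State]:
    # With no checkpoint on disk, start from the first node with empty state.
    if not os.path.exists(path):
        return 0, {}
    with open(path) as f:
        checkpoint = json.load(f)
    return checkpoint["next_index"], checkpoint["state"]


def run(path: str, crash_before: Optional[int] = None) -> Tuple[State, List[str]]:
    index, state = load_checkpoint(path)
    executed: List[str] = []
    for i in range(index, len(NODES)):
        if i == crash_before:
            raise RuntimeError("simulated crash")
        state = NODES[i](state)
        executed.append(NODES[i].__name__)
        save_checkpoint(path, i + 1, state)  # checkpoint after every node
    return state, executed


path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
try:
    run(path, crash_before=2)  # crash after Node 2's checkpoint, before Node 3
except RuntimeError:
    pass

state, executed = run(path)  # second run resumes from the saved checkpoint
print(executed)  # ['finalize']
```

Only the third node runs on the second attempt: the first two are skipped because their results are already in the checkpoint.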

💡 Guidance for Learners

Persistence is what makes AgentCore a good fit for supply chain and project management workflows. These tasks aren't single "Questions"; they are "Journeys" that involve multiple people and systems over a long period, so the workflow has to keep its state between steps.


Summary

  • Checkpoints save the state of the workflow after every node.
  • Workflows can be Paused and Resumed based on external events.
  • Failure Recovery is built into the architecture.
  • State Objects are the "Shared Memory" for the entire graph.
