Reasoning, Planning, and Acting: The Core Loop

To accomplish non-trivial tasks, an agent must do more than just "calculate" an answer. It must exhibit three behaviors: Reasoning (the "Why"), Planning (the "How"), and Acting (the "Do").

1. What is Reasoning?

Reasoning is the model's ability to evaluate its internal state and external surroundings to determine if it is on the right path.

In agentic AI, we often use Chain of Thought (CoT) prompting. Having the model write out its "Thoughts" before it commits to an answer tends to improve its accuracy on multi-step problems. This is the "Internal Monologue" of the agent.
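A minimal sketch of what this looks like in practice, assuming a plain-text completion interface. The helper names `build_cot_prompt` and `split_reasoning`, and the "Thoughts:" / "Answer:" labels, are hypothetical choices for illustration and not part of any library.

```python
from typing import Tuple


def build_cot_prompt(question: str) -> str:
    """Ask the model to write its reasoning before the answer (illustrative format)."""
    return (
        f"Question: {question}\n"
        "Write your reasoning step by step after 'Thoughts:', "
        "then give the result after 'Answer:'.\n"
        "Thoughts:"
    )


def split_reasoning(output: str) -> Tuple[str, str]:
    """Separate the written-out thoughts from the final answer in a model reply."""
    thoughts, _, answer = output.partition("Answer:")
    return thoughts.strip(), answer.strip()


# Example with a hand-written reply standing in for real model output.
thoughts, answer = split_reasoning("17 + 25 = 42.\nAnswer: 42")
```

Keeping the thoughts separate from the answer lets the calling program log the monologue while only passing the answer downstream.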

2. What is Planning?

Planning is the decomposition of a massive goal into sub-tasks. There are two main styles:

Up-front Planning: The agent lists every step it will take (say, all 10 of them) before starting step 1, like a project manager.
Just-in-Time Planning: The agent decides on the next step only after seeing the result of the current one, like a scout.
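The difference between the two styles can be sketched in a few lines. Everything here is a hypothetical stand-in: `plan_up_front` and `plan_next` play the role of the LLM planner, and `execute` pretends to run a step. A real agent would call a model and real tools instead.

```python
from typing import List, Optional


def execute(step: str) -> str:
    """Pretend to run a step and return its result (hypothetical executor)."""
    return f"done: {step}"


def plan_up_front(goal: str) -> List[str]:
    """Up-front planning: produce the whole plan before anything runs."""
    return [f"gather data for {goal}", f"analyze {goal}", f"summarize {goal}"]


def plan_next(goal: str, results: List[str]) -> Optional[str]:
    """Just-in-time planning: pick one next step from the results seen so far."""
    steps = [f"gather data for {goal}", f"analyze {goal}", f"summarize {goal}"]
    return steps[len(results)] if len(results) < len(steps) else None


def run_up_front(goal: str) -> List[str]:
    plan = plan_up_front(goal)  # the plan is fixed before execution starts
    return [execute(step) for step in plan]


def run_just_in_time(goal: str, max_steps: int = 10) -> List[str]:
    results: List[str] = []
    for _ in range(max_steps):
        step = plan_next(goal, results)  # planner sees earlier results each time
        if step is None:
            break
        results.append(execute(step))
    return results
```

In this toy version both runners produce the same results. The practical difference is that `plan_next` receives the earlier results, so a real planner could change course after a surprising observation, while the up-front plan cannot.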

3. The ReAct Pattern

The ReAct (Reasoning + Acting) pattern is one of the most widely used designs for single-agent systems. It formalizes the loop between the LLM and the outside world: the model reasons, calls a tool, reads the result, and reasons again.

graph LR
    Thought[Thought: What do I need?] --> Action[Action: Call a tool]
    Action --> Observation[Observation: Result from tool]
    Observation --> Thought

The Sequence:

Thought: "I need to know the price of Bitcoin. I will use the get_price tool."
Action: get_price(symbol="BTC")
Observation: "BTC price is $65,000."
Thought: "Now that I have the price, I can calculate the user's profit..."
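The sequence above can be sketched as a small driver loop. This is an illustration under stated assumptions, not a production agent: `scripted_model` is a hypothetical stand-in for the LLM, and `get_price` returns a canned quote taken from the example instead of calling a real price API.

```python
from typing import Callable, Dict, List


def get_price(symbol: str) -> str:
    """Hypothetical tool: returns a canned quote, not live market data."""
    prices = {"BTC": 65000}
    return f"{symbol} price is ${prices[symbol]:,}."


TOOLS: Dict[str, Callable[[str], str]] = {"get_price": get_price}


def scripted_model(transcript: List[str]) -> Dict[str, str]:
    """Stand-in for an LLM: picks the next move from the transcript so far."""
    if not any(line.startswith("Observation:") for line in transcript):
        return {"thought": "I need to know the price of Bitcoin.",
                "action": "get_price", "input": "BTC"}
    last_observation = transcript[-1][len("Observation: "):]
    return {"thought": "Now that I have the price, I can answer.",
            "final": last_observation}


def react_loop(model: Callable[[List[str]], Dict[str, str]],
               max_turns: int = 5) -> List[str]:
    """Run Thought -> Action -> Observation until the model gives a final answer."""
    transcript: List[str] = []
    for _ in range(max_turns):
        step = model(transcript)
        transcript.append(f"Thought: {step['thought']}")
        if "final" in step:
            transcript.append(f"Final Answer: {step['final']}")
            return transcript
        observation = TOOLS[step["action"]](step["input"])  # act in the world
        transcript.append(f"Action: {step['action']}({step['input']!r})")
        transcript.append(f"Observation: {observation}")  # feed result back
    raise RuntimeError("no final answer within max_turns")


for line in react_loop(scripted_model):
    print(line)
```

The `max_turns` cap is a design choice worth keeping in real systems: it stops the loop if the model never reaches a final answer.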

4. Code Example: A ReAct Prompt Template

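This is how we "program" an LLM to be an agent. The decision logic is not hard-coded in Python if statements; instead, a structured prompt tells the model which format to follow, and the model chooses each step itself: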

Solve the following task. You have access to these tools: [Search, Calculator].

Use the following format:
Thought: your internal reasoning about what to do next.
Action: the tool to call (always one of [Search, Calculator]).
Action Input: the input to the tool.
Observation: the result of the tool.
... (this Thought/Action/Observation can repeat N times)
Thought: I now know the final answer.
Final Answer: the final answer to the original input question.

Begin!
Task: Who is the CEO of Nvidia and what is his current net worth?
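The model's reply to a prompt like this is plain text, so the surrounding program still needs a little ordinary code to read it. Below is a minimal parsing sketch, assuming the model follows the format exactly. The name `parse_step` and the returned dictionary shape are hypothetical, and a real system would need to handle malformed replies more gracefully.

```python
import re
from typing import Dict

# Patterns mirror the labels used in the prompt template.
ACTION_RE = re.compile(r"Action:\s*(?P<tool>.+?)\s*\nAction Input:\s*(?P<input>.+)")
FINAL_RE = re.compile(r"Final Answer:\s*(?P<answer>.+)", re.DOTALL)


def parse_step(text: str) -> Dict[str, str]:
    """Turn one model reply into either a tool call or a final answer."""
    final = FINAL_RE.search(text)
    if final:
        return {"type": "final", "answer": final.group("answer").strip()}
    action = ACTION_RE.search(text)
    if action:
        return {
            "type": "action",
            "tool": action.group("tool").strip(),
            "input": action.group("input").strip(),
        }
    raise ValueError("reply does not match the Thought/Action format")


# Hand-written replies standing in for real model output.
step = parse_step("Thought: I should look this up.\nAction: Search\nAction Input: CEO of Nvidia")
done = parse_step("Thought: I now know the final answer.\nFinal Answer: 42")
```

When the parsed step is an action, the program runs the named tool and appends the result as an Observation line before asking the model to continue.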

5. Why "Thinking" Improves Performance

Researchers have found that when an LLM is allowed to "think" before it commits to an action, its success rate on complex tasks can rise substantially; gains of over 40% have been reported, though the size of the improvement depends on the task and the model. The "thinking" space acts as a working memory where the model can catch its own mistakes and adjust its plan.

Key Takeaways

Reasoning is the "Internal Monologue" that guides the agent.
Planning breaks high-level goals into executable steps.
ReAct is the fundamental loop of Thought -> Action -> Observation.
Agents use Observations to update their Mental Model of the problem.

Module 1 Lesson 3: Reasoning, Planning, and Acting