
Task Instructions: Decomposing Goals into Actionable Steps
Learn the science of task decomposition for AI agents. Master the techniques for breaking down complex user goals into granular, sequential, and parallelizable sub-tasks that Gemini can execute with high precision.
While System Instructions define who the agent is, Task Instructions define what the agent needs to do in a specific moment. For complex agents, a user's goal is rarely a single step. "Build me a website" or "Research this company" are High-Level Goals that require an agent to perform dozens of sub-tasks.
In this lesson, we will explore the art and science of Task Decomposition. We will learn how to structure prompts that guide Gemini through a logical sequence of actions, how to handle "Parallel vs. Sequential" workflows, and how to define clear "Stop Conditions" so the agent doesn't wander off-task.
1. Goal vs. Task: The Decomposition Gap
A Goal is the desired final state (e.g., "A completed market report"). A Task is a single, atomic unit of work required to reach that state (e.g., "Search for the company's 2025 revenue").
The primary failure mode of AI agents is attempting to jump straight from the Goal to the Result without a plan. The Gemini ADK helps bridge this gap by encouraging a "Plan-First" architecture.
```mermaid
graph TD
    A[User Goal: Analyze Competitor X] --> B{Decomposition}
    B --> C[Task 1: Search for Financials]
    B --> D[Task 2: Identify Key Products]
    B --> E[Task 3: Find Executive Team]
    C --> F[Synthesis]
    D --> F
    E --> F
    F --> G[Final Result]
    style B fill:#4285F4,color:#fff
```
2. Techniques for Effective Task Prompting
To ensure the agent executes these tasks accurately, we use three primary techniques.
A. Chain of Thought (CoT)
Instruct the agent to "think out loud" before acting.
- Prompt: "Before using any tools, write a 3-step plan for how you will solve this request."
- Why it works: It forces the model to generate explicit planning tokens up front, which then act as a roadmap that conditions its subsequent actions.
B. Least-to-Most Prompting
For extremely complex tasks, you don't give the whole goal at once. You give the agent the easiest sub-task first, and use its output to trigger the next, more complex task.
- Example: 1. List the files -> 2. Summarize each file -> 3. Compare the summaries.
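This cascade can be sketched as a simple loop where each sub-task's output becomes the next sub-task's context. Here `run_subtask` is a stub standing in for a real Gemini call:

```python
# Least-to-Most: each sub-task's output feeds the next, harder one.
# `run_subtask` is a stub standing in for a real model call.

def run_subtask(instruction: str, context: str = "") -> str:
    """Placeholder for a model call; echoes what it was asked to do."""
    return f"[result of: {instruction} | context: {context}]"

def least_to_most(steps: list[str]) -> str:
    context = ""
    for step in steps:
        # The previous result is threaded in as context for the next step.
        context = run_subtask(step, context)
    return context

final = least_to_most([
    "List the files",
    "Summarize each file",
    "Compare the summaries",
])
```

Because the chain threads context forward, the final result carries the accumulated work of every earlier step.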
C. The "Success Template"
Tell the agent exactly what a "Successful Sub-Task" looks like.
- Prompt: "For each competitor you find, your output MUST include: Company Name, URL, and Core Offering."
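A success template is only useful if you check it. A minimal validator, using the field names from the example prompt above (adjust to your own schema):

```python
# Validate a sub-task's output against the "Success Template".
# Field names mirror the example prompt; swap in your own schema.
REQUIRED_FIELDS = ["Company Name", "URL", "Core Offering"]

def is_successful_subtask(output: str) -> bool:
    """A sub-task 'passes' only if every required field appears."""
    return all(field in output for field in REQUIRED_FIELDS)

good = "Company Name: Acme\nURL: https://acme.example\nCore Offering: Widgets"
bad = "Acme makes widgets."
```

Outputs that fail the check can be routed back to the model with a "retry" instruction instead of being passed downstream.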
3. Sequential vs. Parallel Task Execution
One of the most powerful features of the Gemini ADK is its ability to orchestrate multiple tasks, either one after another or concurrently.
Sequential Tasks (Dependencies)
Task B cannot start until Task A is finished.
- Example: You can't summarize a document until you've successfully read it.
- Prompting Strategy: Use ordinal numbers (First, Second, Third) and explicit dependencies ("Using the result from Step 1, perform Step 2").
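The dependency rule can also be enforced in code rather than only in the prompt. A minimal sketch, using a plain dictionary as the task registry (the task names are illustrative):

```python
# Sequential execution with explicit dependencies: a task may run only
# once every task it depends on has produced a result.
tasks = {
    "read_document": {"depends_on": [], "done": False},
    "summarize": {"depends_on": ["read_document"], "done": False},
}

def try_run(name: str) -> bool:
    """Run a task only if all of its dependencies are complete."""
    task = tasks[name]
    if all(tasks[dep]["done"] for dep in task["depends_on"]):
        task["done"] = True  # a real agent would execute the step here
        return True
    return False
```

Attempting `try_run("summarize")` before `read_document` has finished simply refuses to run, which is exactly the guarantee the prompt-level ordinals are trying to approximate.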
Parallel Tasks (Efficiency)
Multiple tasks can be performed at the same time.
- Example: Searching for "Competitor A" and "Competitor B" can happen simultaneously.
- Prompting Strategy: "Identify the top 3 competitors and research each of them independently." Gemini's Parallel Function Calling capability is specifically designed for this.
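The fan-out pattern looks like this when orchestrated from Python. The `research` function is a stub; in practice each call would invoke a search tool or one branch of a parallel function call:

```python
# Parallel fan-out: independent research tasks run concurrently.
from concurrent.futures import ThreadPoolExecutor

def research(competitor: str) -> str:
    """Stub for an independent, side-effect-free research task."""
    return f"profile of {competitor}"

competitors = ["Competitor A", "Competitor B", "Competitor C"]
with ThreadPoolExecutor() as pool:
    # map preserves input order, so results line up with competitors.
    profiles = list(pool.map(research, competitors))
```

Parallel execution is only safe when the tasks are truly independent; if one branch's output feeds another, fall back to the sequential pattern.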
4. Handling Variables and Dynamic Data
Task instructions often need to be "templated."
The Placeholder Pattern
Instead of writing a new prompt for every request, you use placeholders.
```python
# The Task Template
task_template = """
You are currently investigating the company: {company_name}.
Your task is to find their latest {data_type} and format it as a table.
"""

# At Runtime
current_task = task_template.format(company_name="Google", data_type="revenue")
```
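One failure mode of `str.format` templates is a placeholder that is never supplied at runtime. A small guard can make that loud instead of silent; this sketch redefines the same template so it is self-contained:

```python
import string

# Same template as above, repeated here so the sketch runs on its own.
task_template = """
You are currently investigating the company: {company_name}.
Your task is to find their latest {data_type} and format it as a table.
"""

def fill_template(template: str, **values: str) -> str:
    """Fail loudly if any placeholder in the template is left unfilled."""
    needed = {name for _, name, _, _ in string.Formatter().parse(template) if name}
    missing = needed - values.keys()
    if missing:
        raise ValueError(f"Missing template variables: {sorted(missing)}")
    return template.format(**values)
```

A missing variable now raises a clear `ValueError` at dispatch time, rather than sending the agent a half-filled prompt.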
5. Defining the "Stop Condition"
An agent without a stop condition is a "Token Vampire." It will keep "thinking" and "refining" until it runs out of budget.
Types of Stop Conditions:
- Logical Completion: "Once you have found at least 3 sources, provide your final answer and end the session."
- Negative Outcome: "If you cannot find the file after 2 attempts, inform the user and stop."
- Confidence Threshold: "If your confidence in the answer is below 80%, ask the user for clarification instead of proceeding."
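Stop conditions are most reliable when they live in the control loop, not only in the prompt. A minimal sketch combining logical completion (enough sources) with a negative outcome (attempt budget); `search_for_sources` is a stub for a real search tool:

```python
# Stop conditions enforced outside the model: the loop, not the prompt,
# decides when the agent is done or must give up.
def search_for_sources(query: str, attempt: int) -> list[str]:
    """Stub: pretend each attempt yields two sources."""
    return [f"{query}-source-{attempt}-{i}" for i in range(2)]

def run_until_done(query: str, min_sources: int = 3, max_attempts: int = 2) -> dict:
    sources: list[str] = []
    for attempt in range(1, max_attempts + 1):
        sources += search_for_sources(query, attempt)
        if len(sources) >= min_sources:  # logical completion
            return {"status": "complete", "sources": sources}
    # negative outcome: attempt budget exhausted
    return {"status": "gave_up", "sources": sources}
```

Even if the model ignores the prompt-level stop instruction, the loop guarantees the session ends.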
6. Implementation: The Multi-Stage Task Script
Let's build a small agent that uses Task Decomposition to research a topic.
```python
import google.generativeai as genai

# Assumes the API key has already been set, e.g. via genai.configure(api_key=...)
model = genai.GenerativeModel('gemini-1.5-pro')

def research_agent(topic: str):
    # 1. State the Goal
    goal = f"Research and summarize the impact of {topic} on the environment."

    # 2. Provide the 'Decomposition' Instruction
    decomposition_prompt = f"""
    Goal: {goal}

    Task Instructions:
    1. Break this goal into 3 logical research questions.
    2. Search for each question and extract key facts.
    3. Synthesize the facts into a 200-word summary.

    Format your response as:
    PLAN: <your 3 steps>
    FINDINGS: <your raw data>
    SUMMARY: <your final answer>
    """

    # In a real ADK app, this would be a multi-turn chat
    response = model.generate_content(decomposition_prompt)
    return response.text

# print(research_agent("Solid State Batteries"))
```
7. Common Pitfalls in Task Prompting
- The "Vague Request": "Do some research." Gemini won't know when to stop. Fix: "Find the top 5 news articles from today regarding X."
- Step-Skipping: Gemini might try to give a summary before it has actually finished the "Finding" step. Fix: Use a State Checker middleware to ensure Step N is complete before allowing Step N+1.
- Context Overload: Giving the agent a task that requires more information than fits in the window (though rare with Gemini). Fix: Use RAG or Recursive Summarization.
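The "State Checker" fix for step-skipping can be as small as a gate that refuses to accept step N until steps 1 through N-1 are marked complete. A sketch, using the PLAN/FINDINGS/SUMMARY stages from the script above:

```python
# Sketch of a "State Checker": output for a step is accepted only when
# every earlier step has already been completed.
STEPS = ["PLAN", "FINDINGS", "SUMMARY"]

def allowed_to_emit(step: str, completed: set[str]) -> bool:
    """Permit a step only when all preceding steps are done."""
    earlier = STEPS[:STEPS.index(step)]
    return all(s in completed for s in earlier)
```

If the model jumps straight to SUMMARY, the checker rejects the output and the orchestrator can re-prompt for the missing FINDINGS step.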
8. Summary and Exercises
Task instructions are the Tactical Map of the agent.
- Decompose goals into atomic actions.
- Use CoT to force planning.
- Understand the difference between Sequential and Parallel steps.
- Always define a Stop Condition.
Exercises
- Decomposition Practice: A user says: "I want to start a podcast about AI. Give me everything I need." Decompose this into 5 distinct, sequential agent tasks.
- Prompt Hardening: Write a task instruction that forces an agent to "Double-Check" its math. How would you tell it to verify its own work before presenting it?
- Template Design: Create a Pydantic model or a Python dictionary structure that represents a "Task Packet" for an agent. What fields (e.g., `id`, `priority`, `dependency`) are necessary?
In the next lesson, we will look at Guardrails and Constraints, the vital "Brakes" that keep our agents from making dangerous or prohibited moves.