
The Library of Action: High-Cardinality Tool Sets
Scale your agent's capabilities without overwhelming the model. Learn the 'Tool RAG' pattern to manage systems with hundreds or thousands of available actions.
Handling High-Cardinality Tool Sets
Most beginner agent tutorials show 2 or 3 tools. But what if you are building an agent for an enterprise like Salesforce or AWS, where there are thousands of possible actions?
Sending 1,000 tool descriptions to an LLM on every request is impractical. It can exhaust the context window, it inflates token cost, and the model becomes more likely to pick the wrong tool or hallucinate one. This is the challenge of High-Cardinality.
In this lesson, we will learn the Tool RAG pattern, a widely used approach for scaling agents to very large tool sets.
1. The Bottleneck: Tool-Induced Confusion
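The more tools you have, the higher the Collision Probability: the chance that two tools look similar enough for the model to confuse them.
- With 5 tools, the model picks the right one almost every time (say, 99% as an illustrative figure).
- With 500 tools, that probability drops significantly, because many tools end up with overlapping names and descriptions.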
2. The Solution: Tool RAG (Retrieval Augmented Generation)
Instead of giving the agent every tool, we store the tool descriptions in a Vector Database (such as Chroma or Pinecone) and retrieve only the few that match each request.
The Tool RAG Workflow:
- User Query: "Check the status of invoice 543 and email it to Sarah."
- Retrieval Step: A small, fast model (like Haiku) searches the Tool Vector DB for the most relevant tool descriptions.
- Filter: It finds get_invoice_status and send_email_v2.
- Dynamic Injection: Only these 2 tools are injected into the prompt for the "Primary" agent.
- Execution: The agent performs the task, unaware that 998 other tools exist.
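The workflow above can be sketched in a few lines of Python. This is a minimal illustration, not a production setup: the tool catalog and descriptions are hypothetical, and the `embed` function is a toy bag-of-words vector standing in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter

# Hypothetical tool catalog (name -> description). A real catalog would hold hundreds of entries.
TOOLS = {
    "get_invoice_status": "Look up the payment status of an invoice by invoice number.",
    "send_email_v2": "Send an email message to a recipient, optionally with an attachment.",
    "create_calendar_event": "Schedule a meeting on a calendar.",
    "aws_s3_list_buckets": "List cloud storage buckets in an S3 account.",
}

def embed(text: str) -> Counter:
    """Toy embedding: word counts. An assumption standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# The "Tool Vector DB": one vector per tool, built from its name and description.
TOOL_INDEX = {name: embed(f"{name.replace('_', ' ')} {desc}") for name, desc in TOOLS.items()}

def retrieve_tools(query: str, k: int = 2) -> list[str]:
    """Retrieval + filter steps: return the k tools most similar to the query."""
    q = embed(query)
    ranked = sorted(TOOL_INDEX, key=lambda name: cosine(q, TOOL_INDEX[name]), reverse=True)
    return ranked[:k]

def build_prompt(query: str, tool_names: list[str]) -> str:
    """Dynamic injection: only the retrieved tools appear in the primary agent's prompt."""
    tool_lines = "\n".join(f"- {name}: {TOOLS[name]}" for name in tool_names)
    return f"You may call only these tools:\n{tool_lines}\n\nUser request: {query}"

query = "Check the status of invoice 543 and email it to Sarah."
selected = retrieve_tools(query, k=2)
print(build_prompt(query, selected))
```

With this toy catalog, the two retrieved tools are `get_invoice_status` and `send_email_v2`. Word-count similarity is brittle (it misses synonyms entirely), which is why a real deployment would swap in learned embeddings.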
3. Designing a "Tool Index"
To make Tool RAG work, your tool descriptions must be Semantically Rich.
The Vectorized Docstring
- Tool Name: aws_s3_list_buckets
- Keywords: "storage", "cloud files", "listing", "buckets", "s3".
- Usage Example: "Use this when the user asks 'What files do I have?'"
Including these "Searchable" terms in the text you embed makes the retrieval step far more likely to match the way users actually phrase their requests.
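One way to assemble such an index entry is to flatten the name, keywords, and usage examples into a single text that gets embedded. The sketch below assumes a hypothetical `ToolDoc` structure and an invented description string. Only the tool name, keywords, and example request come from the lesson.

```python
from dataclasses import dataclass, field

@dataclass
class ToolDoc:
    """Hypothetical record for one tool in the index."""
    name: str
    description: str
    keywords: list[str] = field(default_factory=list)
    usage_examples: list[str] = field(default_factory=list)

    def to_index_text(self) -> str:
        """Flatten every searchable field into the text that will be embedded."""
        parts = [
            f"Tool: {self.name}",
            f"Description: {self.description}",
            f"Keywords: {', '.join(self.keywords)}",
        ]
        parts += [f"Example request: {example}" for example in self.usage_examples]
        return "\n".join(parts)

doc = ToolDoc(
    name="aws_s3_list_buckets",
    description="List the S3 buckets in the account.",  # assumed wording, not from the lesson
    keywords=["storage", "cloud files", "listing", "buckets", "s3"],
    usage_examples=["What files do I have?"],
)
print(doc.to_index_text())
```

Keeping the fields separate in the record and joining them only at indexing time lets you tune keywords or examples later without touching the tool's actual schema.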
4. Hierarchy: The "Registry" Pattern
Another way to handle high-cardinality is to group tools into Registries or Packages.
- Level 1: A "Router" agent decides which Domain the task belongs to (e.g., Finance vs. HR).
- Level 2: The "Finance Agent" is loaded with its 20 specific tools.
graph TD
  User --> Router{Router Agent}
  Router -->|Finance| FinanceAgent[Finance Specialist]
  Router -->|HR| HRAgent[HR Specialist]
  subgraph Finance_Domain
    FinanceAgent --> Tool1[Get Balance]
    FinanceAgent --> Tool2[Send Wire]
  end
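The two-level routing in the diagram can be sketched as follows. This is an illustration under stated assumptions: the router here is a simple keyword matcher standing in for the "Router" agent (which would normally be an LLM call), and the registry contents and keyword lists are hypothetical.

```python
import re

# Hypothetical registries: each domain owns a small, fixed set of tools.
REGISTRIES: dict[str, list[str]] = {
    "finance": ["get_balance", "send_wire"],
    "hr": ["get_vacation_days", "create_job_posting"],
}

# Hypothetical routing hints. A real router agent would classify with a model instead.
DOMAIN_KEYWORDS: dict[str, set[str]] = {
    "finance": {"balance", "wire", "invoice", "payment"},
    "hr": {"vacation", "pto", "hiring", "employee"},
}

def route(query: str) -> str:
    """Level 1: pick the domain whose keywords overlap most with the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    scores = {domain: len(words & keywords) for domain, keywords in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

def tools_for(query: str) -> list[str]:
    """Level 2: load only the tools registered for the chosen domain."""
    return REGISTRIES.get(route(query), [])

print(tools_for("Send a wire to the vendor"))
```

Returning an empty list for an unknown domain gives the caller a clear signal to fall back, for example by asking the user a clarifying question.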
5. Metadata Tagging
Each tool in your database should have metadata tags that help with the filtering logic:
- permissions_required: "Admin"
- latency_tier: "High" (Don't use in real-time chats)
- cost_tier: "Paid"
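A minimal sketch of how these tags might drive filtering before (or after) the similarity search. The field names match the tags above. The catalog entries, the allowed tag values, and the `eligible_tools` function are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolMeta:
    name: str
    permissions_required: str  # assumed values: "User" or "Admin"
    latency_tier: str          # assumed values: "Low" or "High"
    cost_tier: str             # assumed values: "Free" or "Paid"

# Hypothetical catalog entries.
CATALOG = [
    ToolMeta("search", "User", "Low", "Free"),
    ToolMeta("export_full_audit_log", "Admin", "High", "Paid"),
    ToolMeta("get_invoice_status", "User", "Low", "Free"),
]

def eligible_tools(catalog: list[ToolMeta], *, user_role: str,
                   realtime: bool, allow_paid: bool) -> list[str]:
    """Drop tools the current request is not allowed, or not suited, to use."""
    result = []
    for tool in catalog:
        if tool.permissions_required == "Admin" and user_role != "Admin":
            continue  # caller lacks the required permission
        if realtime and tool.latency_tier == "High":
            continue  # too slow for a real-time chat
        if tool.cost_tier == "Paid" and not allow_paid:
            continue  # paid tools are disabled for this request
        result.append(tool.name)
    return result

print(eligible_tools(CATALOG, user_role="User", realtime=True, allow_paid=False))
```

Many vector databases support metadata filters natively, so in practice this check can often be pushed into the query itself rather than run in application code.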
6. Cold Start and Tool Caching
If you use Tool RAG, you add one extra "Hop" to your latency. To optimize:
- Cache Common Tools: Always include the most popular 5 tools (like Search) in the base prompt.
- Embed at Build Time: Don't calculate the tool embeddings on every request. Compute them once, at build time or when the server starts, and reuse them.
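Both optimizations can be sketched together. In this illustration the catalog, the `PINNED_TOOLS` list, and the toy word-set `embed` function are all assumptions. The point is only where the work happens: tool embeddings are computed once at module load, and each request embeds just the query.

```python
import re

# Hypothetical catalog. "search" stands in for a tool that nearly every request needs.
TOOLS = {
    "search": "Search the web for general information.",
    "get_invoice_status": "Look up the payment status of an invoice.",
    "send_email_v2": "Send an email message to a recipient.",
    "create_calendar_event": "Schedule a meeting on a calendar.",
}
PINNED_TOOLS = ["search"]  # always in the base prompt, so no retrieval is needed for them

def embed(text: str) -> frozenset[str]:
    """Toy embedding (a set of lowercase words) standing in for a real embedding model."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Set-overlap similarity between two toy embeddings."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Computed once when the module is imported (server start), not inside the request handler.
TOOL_INDEX = {name: embed(desc) for name, desc in TOOLS.items() if name not in PINNED_TOOLS}

def select_tools(query: str, k: int = 2) -> list[str]:
    """Per request: embed only the query, then combine pinned and retrieved tools."""
    q = embed(query)
    ranked = sorted(TOOL_INDEX, key=lambda name: jaccard(q, TOOL_INDEX[name]), reverse=True)
    return PINNED_TOOLS + ranked[:k]

print(select_tools("Send an email to Sarah about the invoice status"))
```

Excluding pinned tools from the index avoids returning the same tool twice and keeps the retrieval budget (`k`) for tools that actually need to be looked up.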
Summary and Mental Model
Think of High-Cardinality like a Giant Library.
- You don't try to read every book to find an answer.
- You look at the Index (Vector DB) to find the 2 right books.
- You bring those 2 books to your desk (The Prompt) and work with them.
Tool RAG allows an agent to stay focused even in a world of infinite options.
Exercise: Tool RAG Strategy
- The Index: You are building an agent with 1,000 tools for GitHub Management.
  - Draft the "Search Keywords" for a tool that merges_a_pull_request.
  - Which words will ensure the user can find it using terms like "close code" or "approve changes"?
- Ambiguity: If a user says "Get me the report," and the Tool RAG finds two tools: get_daily_report and get_monthly_report.
  - How should the agent proceed?
  - (Hint: It should ask a "Clarification Question" before picking a tool).
- Architecture: Why is a Sub-graph (Module 6.4) often better than a "Tool RAG" for fixed business processes like "Onboarding a new employee"?
Ready to let your agents run on your own hardware? Next module: Fully Local Agent Architectures.