Dynamic Tool Discovery: Scaling to 100+ Tools

Master the art of large-scale tool management. Learn how to implement dynamic tool discovery, semantic search for tools, and hierarchical toolkits to prevent agent confusion and optimize context window usage.

When you start building with the Gemini ADK, you might have 3 or 5 tools. At this scale, you can just pass them all in the tools=[] list. But what happens when your enterprise agent needs to access 100 different microservices? Or 500 distinct SQL tables?

If you try to bind 500 tools to a single Gemini instance, you will encounter "Tool Saturation":

  1. Context Bloat: The tool definitions consume a massive amount of your context window tokens.
  2. Model Confusion: Gemini becomes more likely to pick the wrong tool or hallucinate arguments when faced with too many similar options.
  3. Latency: The model takes longer to calculate attention across a massive tool schema.

In this lesson, we will explore Dynamic Tool Discovery, the architectural pattern for managing tools at scale.


1. The Strategy: Two-Stage Tool Selection

Instead of giving Gemini ALL the tools, we use a Retriever to find the "Best Tools" for the current user query.

The Workflow:

  1. The Query: User says "Find the latest invoice for Client X."
  2. The Discovery Step: A small, fast "Search" (often using Vector Search or a Keyword Match) scans a Tool Catalog to find tools related to "invoices."
  3. The Binding Step: The ADK dynamically registers only the 5 most relevant tools (e.g., list_invoices, get_invoice_detail) for that specific conversation turn.
  4. The Execution: Gemini chooses among only those 5 tools, so irrelevant ones (like restart_server or send_slack) never enter its context and cannot be picked by mistake.
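
The four steps above can be sketched in a few lines. This is a minimal illustration, not the ADK's own mechanism: the four tool functions are hypothetical stubs, and plain keyword overlap stands in for the discovery search.

```python
import re

# Hypothetical stand-ins for real tools; only their names and docstrings are used.
def list_invoices(client):
    """List invoice IDs for a client."""

def get_invoice_detail(invoice_id):
    """Fetch the detail of one invoice."""

def restart_server(host):
    """Restart a server by host name."""

def send_slack(channel, text):
    """Send a Slack message to a channel."""

TOOL_CATALOG = [list_invoices, get_invoice_detail, restart_server, send_slack]
STOP_WORDS = {"a", "the", "for", "of", "to", "by", "one"}

def keywords(text):
    """Lowercase the text and return its words, minus a few stop words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOP_WORDS

def discover_tools(query, k=5):
    """Stage 1: rank catalog entries by keyword overlap with the query."""
    query_words = keywords(query)
    scored = []
    for tool in TOOL_CATALOG:
        overlap = len(query_words & keywords(f"{tool.__name__} {tool.__doc__}"))
        if overlap > 0:
            scored.append((overlap, tool))
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep catalog order
    return [tool for _, tool in scored[:k]]

# Stage 2 would pass only this short list to the model's tools parameter.
selected = discover_tools("Find the latest invoice for Client X.")
print([tool.__name__ for tool in selected])
# → ['list_invoices', 'get_invoice_detail']
```

Tools with no overlap are dropped entirely rather than padded in to reach k, which keeps unrelated tools such as restart_server out of the bound set.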

2. Tools as "Embeddings" (Semantic Search)

To make discovery work, we treat our Tool Descriptions as searchable text.

  • Storage: We store the name and docstring of every tool in a Vector Database.
  • Search: When a user asks a question, we "embed" that question and find the most similar tool docstrings.
  • Result: Even if the user says "Get the bill," the semantic search will find the get_invoice tool, because the embeddings of "bill" and "invoice" lie close together in the vector space.

The same flow as a Mermaid diagram:

graph TD
    A[User Prompt] --> B[Embedding Model]
    B -->|Search Vector| C[(Tool Catalog Vector DB)]
    C -->|Top 5 Tools| D[Dynamic Binder]
    D -->|Bind Selected Tools| E[Gemini Agent]
    E --> F[Final Action]
    
    style C fill:#4285F4,color:#fff
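
To make the "close together in the vector space" idea concrete, here is a small sketch of the similarity search itself. The three-dimensional vectors are hand-written for illustration only; a real system would get them from an embedding model and store them in a vector database. The tool names other than get_invoice are assumptions.

```python
import math

# Toy "embeddings", invented for this example. Real vectors come from an
# embedding model and typically have hundreds of dimensions.
TOOL_VECTORS = {
    "get_invoice": [0.9, 0.1, 0.0],
    "restart_server": [0.0, 0.2, 0.9],
    "send_slack": [0.1, 0.9, 0.1],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_tools(query_vector, k=5):
    """Return the k tool names whose vectors are most similar to the query."""
    ranked = sorted(
        TOOL_VECTORS,
        key=lambda name: cosine(query_vector, TOOL_VECTORS[name]),
        reverse=True,
    )
    return ranked[:k]

# Pretend this is the embedding of "Get the bill": it points in nearly the
# same direction as get_invoice even though the words differ.
bill_query = [0.8, 0.2, 0.1]
print(top_k_tools(bill_query, k=1))
# → ['get_invoice']
```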

3. Hierarchical Toolkits (Namespacing)

Another way to handle scale is to group tools into Toolkits based on domain.

  • FinanceToolkit: 20 tools for money and billing.
  • DevOpsToolkit: 30 tools for servers and code.
  • HRToolkit: 15 tools for employee benefits.

The "Router Agent" Pattern:

You use a high-level Router Agent to identify the Domain first.

  • Router: "This is a Finance question."
  • System: Automatically binds the FinanceToolkit and ignores the others.
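
A rough sketch of the routing step follows. In practice the Router would be an LLM call; here a keyword count stands in for it so the example runs on its own. The toolkit contents and keyword sets are assumptions made up for illustration.

```python
import re

# Hypothetical toolkits: each domain maps to the tool names it would bind.
TOOLKITS = {
    "finance": ["list_invoices", "get_invoice_detail"],
    "devops": ["restart_server", "deploy_service"],
    "hr": ["get_benefits", "request_leave"],
}

# Stand-in for the Router Agent's judgment; a real router would ask a model.
DOMAIN_KEYWORDS = {
    "finance": {"invoice", "billing", "payment", "refund"},
    "devops": {"server", "deploy", "build", "logs"},
    "hr": {"benefits", "leave", "vacation"},
}

def route_domain(query):
    """Pick the domain with the most keyword hits, or None if nothing matches."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    best_domain, best_hits = None, 0
    for domain, domain_words in DOMAIN_KEYWORDS.items():
        hits = len(words & domain_words)
        if hits > best_hits:
            best_domain, best_hits = domain, hits
    return best_domain

def tools_for(query):
    """Bind only the toolkit of the routed domain; bind nothing if unrouted."""
    return TOOLKITS.get(route_domain(query), [])

print(tools_for("Why was my invoice payment declined?"))
# → ['list_invoices', 'get_invoice_detail']
```

Returning an empty list for unrouted queries is one possible design choice; a fallback to semantic search across all toolkits would be another.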

4. The Tool Catalog Pattern

A Tool Catalog is a central repository (JSON or YAML) that describes every available tool in your organization.

Example: catalog.yaml

tools:
  - name: "get_weather"
    description: "Returns the current temperature."
    endpoint: "https://api.weather.com"
    tags: ["environment", "real-time"]
    
  - name: "post_tweet"
    description: "Sends a message to Twitter/X."
    endpoint: "https://api.twitter.com"
    tags: ["social", "write-action"]

By maintaining this catalog, you can build a Developer Portal where humans can discover what tools the AI is using, and you can programmatically filter which ones are available to specific agents.
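
As one way to do that programmatic filtering, the sketch below mirrors catalog.yaml as a Python dictionary (so no YAML parser is needed) and hides every tool carrying a blocked tag. The function name tools_for_agent and the idea of a read-only agent are assumptions for illustration.

```python
# The same entries as catalog.yaml, written as a Python literal.
CATALOG = {
    "tools": [
        {
            "name": "get_weather",
            "description": "Returns the current temperature.",
            "endpoint": "https://api.weather.com",
            "tags": ["environment", "real-time"],
        },
        {
            "name": "post_tweet",
            "description": "Sends a message to Twitter/X.",
            "endpoint": "https://api.twitter.com",
            "tags": ["social", "write-action"],
        },
    ]
}

def tools_for_agent(catalog, blocked_tags):
    """Return the names of tools that carry none of the blocked tags."""
    blocked = set(blocked_tags)
    return [
        tool["name"]
        for tool in catalog["tools"]
        if blocked.isdisjoint(tool["tags"])
    ]

# A read-only agent must never see tools tagged as write actions.
print(tools_for_agent(CATALOG, {"write-action"}))
# → ['get_weather']
```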


5. Performance Impact of Saturated Context

Every tool definition is composed of:

  • Function Name
  • Description
  • Parameter Names
  • Parameter Types
  • Parameter Descriptions

A single complex tool can consume 200-500 tokens.

  • 10 tools = 2,000 to 5,000 tokens (Minimal impact).
  • 100 tools = 20,000 to 50,000 tokens (Significant impact on time to first token (TTFT) and cost).
  • Goal: Always aim to keep the active tool definitions under 2,000 tokens, which is roughly 4 to 10 tools of this size.
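
The arithmetic behind these figures is simple enough to script. The sketch below takes the 200-500 tokens-per-tool range quoted above as its only input; the function names are made up for this example.

```python
# Range quoted above for a single complex tool definition.
MIN_TOKENS_PER_TOOL = 200
MAX_TOKENS_PER_TOOL = 500

def schema_cost(n_tools):
    """Return the (low, high) token estimate for n_tools bound at once."""
    return (n_tools * MIN_TOKENS_PER_TOOL, n_tools * MAX_TOKENS_PER_TOOL)

def max_tools_within(budget):
    """Return how many tools fit in a token budget: (if all heavy, if all light)."""
    return (budget // MAX_TOKENS_PER_TOOL, budget // MIN_TOKENS_PER_TOOL)

print(schema_cost(10))         # → (2000, 5000)
print(schema_cost(100))        # → (20000, 50000)
print(max_tools_within(2000))  # → (4, 10)
```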

6. Implementation: A Semantic Tool Retriever

Let's look at a conceptual Python example for retrieving the right tools at runtime.

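import google.generativeai as genai  # assumes the google-generativeai SDK with an API key already configured

# A flat list of all possible "Worker" functions.
# Each is assumed to be defined elsewhere with type hints and a docstring.
ALL_TOOLS = [get_weather, get_flight_status, book_hotel, cancel_subscription, ...]

def semantic_tool_retriever(user_query: str) -> list:
    """Return the tools relevant to the query."""
    # In a real app, this would use an embedding search.
    # For now, we simulate discovery with simple keyword matching.
    query = user_query.lower()
    discovery_list = []

    if "weather" in query:
        discovery_list.append(get_weather)
    if "hotel" in query or "flight" in query or "travel" in query:
        discovery_list.append(get_flight_status)
        discovery_list.append(book_hotel)

    return discovery_list

def run_dynamic_agent(query: str) -> str:
    # 1. Discover the tools
    relevant_tools = semantic_tool_retriever(query)

    # 2. Bind only what's necessary
    model = genai.GenerativeModel(
        model_name='gemini-1.5-flash',
        tools=relevant_tools or None,  # None if no tools found
    )

    # 3. Execute. Reading generate_content(query).text directly fails when the
    # model answers with a function call, so a chat session with automatic
    # function calling runs the selected tools and returns the final text.
    chat = model.start_chat(enable_automatic_function_calling=bool(relevant_tools))
    return chat.send_message(query).text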

7. Versioning and "Tool Obsolescence"

In a large system, APIs change. When an API version is updated, you have two choices:

  1. Update the Tool: Change the Python code in the connector.
  2. Deprecate the Tool: Create a "v2" tool and use the Retriever to slowly move traffic away from the "v1" tool.

The Dynamic Discovery pattern makes it easy to "hot-swap" tools without restarting your entire application infrastructure.
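
One way the Retriever could "slowly move traffic" is a deterministic percentage rollout, sketched below. The tool names get_invoice_v1 and get_invoice_v2, the conversation_id key, and the hashing scheme are all assumptions for illustration, not part of any SDK.

```python
import hashlib

def pick_version(conversation_id, v2_percent):
    """Send roughly v2_percent% of conversations to the v2 tool.

    Hashing the conversation ID keeps the choice stable, so one conversation
    never flips between versions from turn to turn.
    """
    digest = hashlib.sha256(conversation_id.encode("utf-8")).digest()
    bucket = (digest[0] * 256 + digest[1]) % 100  # a value from 0 to 99
    return "get_invoice_v2" if bucket < v2_percent else "get_invoice_v1"

# Raising v2_percent in configuration shifts traffic without a redeploy.
print(pick_version("conv-123", 0))    # → get_invoice_v1
print(pick_version("conv-123", 100))  # → get_invoice_v2
```

Once v2_percent reaches 100, the v1 entry can be removed from the catalog.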


8. Summary and Exercises

Dynamic Tool Discovery is the Routing Layer of Enterprise AI.

  • Tool Saturation causes latency, confusion, and high cost.
  • Semantic Discovery enables scaling to hundreds of tools.
  • Hierarchical Toolkits provide domain-specific isolation.
  • Dynamic Binding ensures the context window remains lean and focused.

Exercises

  1. Search Strategy: Imagine you have 50 tools for "File Management" (Delete, Move, Rename, Zip, Unzip, etc.). How would you categorize these in a Tool Catalog to make them easy to find for a retriever?
  2. Context Calculation: Estimate the token cost of a tool definition for a function with 10 parameters. (Hint: Each parameter has a name, a type, and a description).
  3. Ambiguity Challenge: What happens if the Retriever finds two tools that do almost the same thing? How should the Supervisor Agent handle the conflict?

In the next lesson, we will look at Testing and Debugging Tools, learning how to ensure our tools work every time, even when the LLM is being creative.
