
How to Set Up OpenClaw Multi-Agent Routing: Run Multiple AI Models

Learn how to configure OpenClaw multi-agent routing to use different AI models for different tasks. Route coding questions to Claude, creative writing to GPT, and research to Gemini — all from one gateway.

By Cody New · TheBomb® Editorial

[Figure: Network diagram showing multiple AI models connected through an OpenClaw routing hub]

Why limit yourself to one AI model when you can use the best one for every task? OpenClaw’s multi-agent routing lets you configure multiple LLM providers and intelligently route requests based on the type of question, the channel it comes from, or explicit user commands.

In this guide, we’ll set up a multi-agent OpenClaw deployment that uses Claude for coding, GPT for creative tasks, and Gemini for research — all through a single gateway.


Why Multi-Agent Routing?

Different AI models have different strengths:

┌───────────────────────┬──────────────────────────────────────────────────────────────┐
│ Model                 │ Best For                                                     │
├───────────────────────┼──────────────────────────────────────────────────────────────┤
│ Claude 3.5 Sonnet     │ Code generation, technical analysis, long-context reasoning │
│ GPT-4o                │ Creative writing, conversational tasks, image understanding │
│ Gemini 2.0 Flash      │ Web search, research, real-time information, speed          │
│ Local models (Ollama) │ Privacy-sensitive tasks, offline use, cost savings          │
└───────────────────────┴──────────────────────────────────────────────────────────────┘

Multi-agent routing gives you the best of all worlds without switching between apps or interfaces.


Prerequisites

  • OpenClaw installed and running (see the installation guide)
  • API keys from at least two LLM providers
  • OpenClaw v3.1 or later (multi-agent routing was added in this version)

Step 1: Configure Multiple Providers

Add all your AI providers to the OpenClaw config:

# ~/.openclaw/config.yaml
ai:
  # Default model used when no routing rule matches
  default_provider: "anthropic"
  default_model: "claude-3.5-sonnet"

  providers:
    anthropic:
      api_key: "${ANTHROPIC_API_KEY}"
      models:
        - name: "claude-3.5-sonnet"
          max_tokens: 8192
          temperature: 0.3
        - name: "claude-3-haiku"
          max_tokens: 4096
          temperature: 0.5

    openai:
      api_key: "${OPENAI_API_KEY}"
      models:
        - name: "gpt-4o"
          max_tokens: 4096
          temperature: 0.7
        - name: "gpt-4o-mini"
          max_tokens: 2048
          temperature: 0.5

    google:
      api_key: "${GOOGLE_API_KEY}"
      models:
        - name: "gemini-2.0-flash"
          max_tokens: 8192
          temperature: 0.4

    ollama:
      host: "http://localhost:11434"
      models:
        - name: "llama3.2:70b"
          max_tokens: 4096
          temperature: 0.6
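
The `${ANTHROPIC_API_KEY}`-style references above are the usual convention for pulling secrets from environment variables at config-load time, so keys never sit in the file itself. As a rough illustration of that convention (a hypothetical sketch, not OpenClaw's actual loader), the expansion could look like:

```python
import os
import re

def expand_env_vars(value: str) -> str:
    """Replace ${VAR} placeholders with values from the environment.

    Hypothetical sketch of the ${VAR} convention used in the config
    above; unresolved variables expand to an empty string here.
    """
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), value)

# Demo: pretend the key is set in the shell environment
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-example"
print(expand_env_vars("${ANTHROPIC_API_KEY}"))  # sk-ant-example
```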

Step 2: Define Routing Rules

Routing rules determine which model handles each request. Rules are evaluated in order — the first match wins.

# ~/.openclaw/config.yaml
routing:
  rules:
    # Rule 1: Code-related requests → Claude
    - name: "coding"
      match:
        keywords: ["code", "function", "debug", "error", "implement", "refactor",
                    "javascript", "python", "typescript", "rust", "api", "sql",
                    "git", "deploy", "docker", "css", "html", "react"]
      route:
        provider: "anthropic"
        model: "claude-3.5-sonnet"
      description: "Technical and coding tasks"

    # Rule 2: Creative tasks → GPT-4o
    - name: "creative"
      match:
        keywords: ["write", "story", "poem", "creative", "brainstorm", "ideas",
                    "name", "slogan", "tagline", "marketing", "copy", "email draft"]
      route:
        provider: "openai"
        model: "gpt-4o"
      description: "Creative writing and ideation"

    # Rule 3: Research → Gemini Flash
    - name: "research"
      match:
        keywords: ["search", "find", "latest", "news", "research", "compare",
                    "what is", "who is", "when did", "how many", "statistics"]
      route:
        provider: "google"
        model: "gemini-2.0-flash"
      description: "Research and real-time information"

    # Rule 4: Quick/simple questions → Haiku (fast + cheap)
    - name: "quick"
      match:
        max_input_tokens: 100  # Short questions
      route:
        provider: "anthropic"
        model: "claude-3-haiku"
      description: "Quick simple responses"

    # Rule 5: Privacy-sensitive → Local model
    - name: "private"
      match:
        keywords: ["private", "confidential", "secret", "personal", "password",
                    "credential", "salary", "medical"]
      route:
        provider: "ollama"
        model: "llama3.2:70b"
      description: "Privacy-sensitive queries stay local"

Step 3: Channel-Based Routing

You can also route based on which messaging channel the request comes from:

routing:
  channel_overrides:
    # Dev team Discord channel always uses Claude
    discord:
      channels:
        "dev-chat":
          provider: "anthropic"
          model: "claude-3.5-sonnet"
        "marketing":
          provider: "openai"
          model: "gpt-4o"

    # Telegram defaults to fast model
    telegram:
      default:
        provider: "google"
        model: "gemini-2.0-flash"

Step 4: Manual Model Selection

Users can explicitly choose a model using prefixes:

routing:
  manual_prefixes:
    "claude:": { provider: "anthropic", model: "claude-3.5-sonnet" }
    "gpt:": { provider: "openai", model: "gpt-4o" }
    "gemini:": { provider: "google", model: "gemini-2.0-flash" }
    "local:": { provider: "ollama", model: "llama3.2:70b" }

Usage in any channel:

claude: Write a TypeScript function for binary search
gpt: Write a compelling product description for wireless earbuds
gemini: What are the latest AI developments this week?
local: Review this contract clause (stays completely local)
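Prefix parsing is simple string matching: check the start of the message against each configured prefix, and strip the prefix before sending the rest to the model. A sketch of that step (hypothetical helper names, not OpenClaw's API):

```python
# Mirrors the manual_prefixes block above (hypothetical sketch)
MANUAL_PREFIXES = {
    "claude:": ("anthropic", "claude-3.5-sonnet"),
    "gpt:":    ("openai", "gpt-4o"),
    "gemini:": ("google", "gemini-2.0-flash"),
    "local:":  ("ollama", "llama3.2:70b"),
}

def parse_prefix(message: str):
    """Return ((provider, model), remaining_text), or (None, message)
    when no manual prefix is present."""
    lowered = message.lower()
    for prefix, route in MANUAL_PREFIXES.items():
        if lowered.startswith(prefix):
            # Strip the prefix but keep the original casing of the request
            return route, message[len(prefix):].strip()
    return None, message
```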

Step 5: Fallback Configuration

What happens when a provider is down? Configure fallbacks:

routing:
  fallback:
    # If primary provider fails, try these in order
    chain:
      - provider: "anthropic"
        model: "claude-3.5-sonnet"
      - provider: "openai"
        model: "gpt-4o"
      - provider: "google"
        model: "gemini-2.0-flash"
      - provider: "ollama"
        model: "llama3.2:70b"

    # Retry configuration
    max_retries: 2
    retry_delay_ms: 1000

    # Notify on failover
    notify_on_failover: true
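
The chain semantics are: retry the current provider up to `max_retries` times with a delay between attempts, then fail over to the next provider in order. A sketch of that loop (hypothetical, with the delay in seconds rather than the config's milliseconds):

```python
import time

def call_with_fallback(prompt, call, chain, max_retries=2, retry_delay_s=1.0):
    """Try each (provider, model) in order; retry each before failing over.

    `call(provider, model, prompt)` is a stand-in for the real API call
    (hypothetical signature). Raises only if every provider fails.
    """
    for provider, model in chain:
        for attempt in range(max_retries + 1):
            try:
                return call(provider, model, prompt)
            except Exception:
                if attempt < max_retries:
                    time.sleep(retry_delay_s)
        # This provider exhausted its retries; fail over to the next one
    raise RuntimeError("all providers in the fallback chain failed")
```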

Step 6: Cost Tracking & Budgets

Multi-model usage can get expensive. Set budgets per provider:

routing:
  budgets:
    daily_limit_usd: 25.00
    
    per_provider:
      anthropic:
        daily_limit_usd: 15.00
        alert_at_usd: 12.00
      openai:
        daily_limit_usd: 8.00
        alert_at_usd: 6.00
      google:
        daily_limit_usd: 5.00
        alert_at_usd: 4.00
      ollama:
        daily_limit_usd: 0  # Free (local)

    # When budget is exceeded
    over_budget_action: "fallback_to_cheapest"  # or "block" or "notify"
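
The budget check itself reduces to comparing today's spend against two thresholds per provider: the alert level and the hard limit. A sketch of that decision (hypothetical, not OpenClaw's internal accounting):

```python
# Mirrors the per_provider budgets above (hypothetical sketch)
BUDGETS = {
    "anthropic": {"daily_limit_usd": 15.00, "alert_at_usd": 12.00},
    "openai":    {"daily_limit_usd": 8.00,  "alert_at_usd": 6.00},
    "google":    {"daily_limit_usd": 5.00,  "alert_at_usd": 4.00},
}

def budget_status(provider: str, spent_today_usd: float) -> str:
    """Return 'over_budget', 'alert', or 'ok' for a provider's daily spend."""
    budget = BUDGETS.get(provider)
    if budget is None:
        return "ok"  # e.g. ollama: local, no spend to track
    if spent_today_usd >= budget["daily_limit_usd"]:
        return "over_budget"  # would trigger over_budget_action
    if spent_today_usd >= budget["alert_at_usd"]:
        return "alert"
    return "ok"
```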

Check your usage anytime:

openclaw usage --today
┌──────────────┬──────────┬────────────┬─────────┐
│ Provider     │ Requests │ Tokens     │ Cost    │
├──────────────┼──────────┼────────────┼─────────┤
│ Anthropic    │ 47       │ 124,500    │ $4.21   │
│ OpenAI       │ 12       │ 38,200     │ $1.89   │
│ Google       │ 31       │ 89,000     │ $0.67   │
│ Ollama       │ 8        │ 22,100     │ $0.00   │
├──────────────┼──────────┼────────────┼─────────┤
│ Total        │ 98       │ 273,800    │ $6.77   │
└──────────────┴──────────┴────────────┴─────────┘

Step 7: A/B Testing Models

Want to compare model quality? Enable A/B routing:

routing:
  ab_testing:
    enabled: true
    experiments:
      - name: "code-quality-test"
        match:
          keywords: ["code", "function", "implement"]
        variants:
          - provider: "anthropic"
            model: "claude-3.5-sonnet"
            weight: 50
          - provider: "openai"
            model: "gpt-4o"
            weight: 50
        log_responses: true
        log_path: "~/.openclaw/data/ab-results/"

After running the experiment, review results:

openclaw ab-results code-quality-test --summary
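
Under the hood, weighted A/B routing is just a weighted random draw over the variants. A sketch of how a 50/50 split like the one above could be selected (hypothetical, not OpenClaw's actual sampler):

```python
import random

# Mirrors the variants in the experiment above (hypothetical sketch)
VARIANTS = [
    {"route": ("anthropic", "claude-3.5-sonnet"), "weight": 50},
    {"route": ("openai", "gpt-4o"),               "weight": 50},
]

def pick_variant(variants, rng=random):
    """Pick a (provider, model) at random, proportionally to weight."""
    routes = [v["route"] for v in variants]
    weights = [v["weight"] for v in variants]
    return rng.choices(routes, weights=weights, k=1)[0]
```

Over many requests, each variant receives traffic in proportion to its weight, so the logged responses can be compared on a like-for-like sample.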

Restart and Verify

After configuring multi-agent routing:

openclaw gateway restart

# Verify all providers are connected
openclaw diagnostics --providers
┌──────────────┬──────────┬──────────────────┐
│ Provider     │ Status   │ Latency (avg)    │
├──────────────┼──────────┼──────────────────┤
│ Anthropic    │ ● Online │ 1.2s             │
│ OpenAI       │ ● Online │ 0.9s             │
│ Google       │ ● Online │ 0.6s             │
│ Ollama       │ ● Online │ 2.1s             │
└──────────────┴──────────┴──────────────────┘

Conclusion

Multi-agent routing is one of OpenClaw’s most powerful features. Instead of being locked into one model’s strengths and weaknesses, you get the right tool for every job — automatically. Pair it with budget controls and fallback chains, and you’ve built a resilient, cost-effective AI infrastructure.

Need help designing a multi-model AI architecture? Our engineering team specializes in AI infrastructure that scales.
