MCP Server Framework and Tool Development library for building custom capabilities into agents.
Find a file
jottakka 98fad93d21
Adding MCP Servers supports to Arcade Evals (#689)
# MCP Server Tool Evaluation Support

## Overview
Add support for evaluating tools from remote MCP servers without
requiring Python callables. Enables direct evaluation of any
MCP-compatible tool server.

## What's New

### Core Features
- **`MCPToolRegistry`**: Evaluate tools from a single MCP server
- **`CompositeMCPRegistry`**: Evaluate tools from multiple MCP servers
simultaneously
- **Automatic loaders**: `load_from_stdio()` and `load_from_http()` to
fetch tools from running servers
- **Automatic namespacing**: Tools prefixed with server name (e.g.,
`server_tool_name`)
- **Smart name resolution**: Use short names if unique, full names if
ambiguous
- **OpenAI strict mode**: Automatic schema conversion prevents parameter
hallucinations

### Usage

**Automatic Loading:**
```python
from arcade_evals import load_from_stdio, MCPToolRegistry

# Load tools automatically from MCP server
tools = load_from_stdio(["npx", "-y", "@modelcontextprotocol/server-github"])
registry = MCPToolRegistry(tools)
```

**Single MCP Server:**
```python
from arcade_evals import MCPToolRegistry, ExpectedToolCall

registry = MCPToolRegistry(mcp_tools)
suite = EvalSuite(catalog=registry)

suite.add_case(
    expected_tool_calls=[
        ExpectedToolCall(tool_name="tool_name", args={...})
    ]
)
```

**Multiple MCP Servers:**
```python
from arcade_evals import CompositeMCPRegistry, load_from_stdio

# Load from multiple servers
github_tools = load_from_stdio(["npx", "-y", "@modelcontextprotocol/server-github"])
slack_tools = load_from_stdio(["npx", "-y", "@modelcontextprotocol/server-slack"])

composite = CompositeMCPRegistry(
    tool_lists={
        "github": github_tools,
        "slack": slack_tools,
    }
)

suite = EvalSuite(catalog=composite)

suite.add_case(
    expected_tool_calls=[
        ExpectedToolCall(tool_name="github_list_issues", args={...})
    ]
)
```

## Implementation

### Files Changed
- **`libs/arcade-evals/arcade_evals/registry.py`** (NEW): Registry
abstractions and implementations
- **`libs/arcade-evals/arcade_evals/loaders.py`** (NEW): Automatic tool
loading from MCP servers
- **`libs/arcade-evals/arcade_evals/eval.py`** (MODIFIED): Enhanced
`ExpectedToolCall` and evaluation logic
- **`libs/arcade-evals/arcade_evals/__init__.py`** (MODIFIED): Exported
new registries and loaders

### Key Technical Details
- Added `BaseToolRegistry` interface for abstraction
- `MCPToolRegistry` handles single server tools
- `CompositeMCPRegistry` manages multiple servers with collision
detection
- `load_from_stdio()` and `load_from_http()` for automatic tool
discovery
- Fixed name normalization bug: MCP tools use underscores (not dots)
- Optimized tool copying: 2.5x faster via shallow copy

## Testing
-  41 tests passing (25 new tests added)
-  `test_eval_mcp_registry.py`: MCPToolRegistry functionality
-  `test_eval_composite_mcp.py`: CompositeMCPRegistry with multiple
servers
-  Verified backward compatibility with Python tools

## Backward Compatibility
 **100% backward compatible** - No breaking changes


## Breaking Changes
**None**


<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Adds end-to-end eval UX: examples, a robust CLI runner, and rich
outputs.
> 
> - **New examples**: `eval_arcade_gateway.py`,
`eval_stdio_mcp_server.py`, `eval_http_mcp_server.py`,
`eval_comprehensive_comparison.py` with timeouts, error handling, and
track-based comparisons; detailed `README.md`
> - **CLI runner**: `arcade_cli/evals_runner.py` to execute
evals/capture in parallel with progress, error isolation, failed-only
filtering, context inclusion, and multi-provider/model support
> - **Output formatters**: `arcade_cli/formatters/` (txt, md, html,
json) for evals and capture; comparative and multi-model HTML with tabs
and context rendering
> - **Display refactor**: `display.py` now supports writing multiple
formats, failed-only disclaimers, include-context, and improved console
summaries
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
ff8acf9c34a6b61462a019a1ee9df081006517d0. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Francisco Liberal <francisco@arcade.dev>
Co-authored-by: Mateo Torres <torresmateo@gmail.com>
2026-01-07 20:26:23 -03:00
.cursor/rules Cursor versioning rules (#715) 2025-12-09 11:31:15 -08:00
.github Run all tests with dependabot (#727) 2025-12-12 17:35:15 -08:00
.vscode Rename some 'toolkit' references to 'server' (#624) 2025-10-14 18:42:27 -07:00
contrib Fix broken links (#738) 2026-01-05 13:27:16 -08:00
examples Adding MCP Servers supports to Arcade Evals (#689) 2026-01-07 20:26:23 -03:00
libs Adding MCP Servers supports to Arcade Evals (#689) 2026-01-07 20:26:23 -03:00
schemas/preview Tool Metadata (#357) 2025-04-16 19:17:36 -08:00
toolkits Bump authlib from 1.3.0 to 1.6.5 (#724) 2025-12-12 17:16:14 -08:00
.editorconfig Fix ruff (#64) 2024-09-25 09:47:30 -07:00
.gitignore Re-import arcade_core errors into arcade_mcp_server (#620) 2025-10-13 17:48:54 -07:00
.pre-commit-config.yaml MCP Local (#563) 2025-09-25 15:28:15 -07:00
.prettierignore Fix ruff (#64) 2024-09-25 09:47:30 -07:00
.prettierrc.toml Fix ruff (#64) 2024-09-25 09:47:30 -07:00
.ruff.toml MCP Local (#563) 2025-09-25 15:28:15 -07:00
CONTRIBUTING.md Rename some 'toolkit' references to 'server' (#624) 2025-10-14 18:42:27 -07:00
core Dependency conflict rich library (#729) 2025-12-14 10:46:03 -08:00
cspell.config.yaml Replace arcade.client with arcadepy (#119) 2024-10-23 15:29:02 -07:00
LICENSE Update README and LICENSE (#220) 2025-01-23 19:43:48 -08:00
Makefile Rename _meta requirements field to arcade_requirements (#616) 2025-10-14 19:01:05 -07:00
pyproject.toml Adding MCP Servers supports to Arcade Evals (#689) 2026-01-07 20:26:23 -03:00
README.md Fix broken links (#738) 2026-01-05 13:27:16 -08:00
SECURITY.md Fix broken links (#738) 2026-01-05 13:27:16 -08:00
uv_setup.sh MCP Local (#563) 2025-09-25 15:28:15 -07:00

Prebuilt ToolsContact Us

Arcade MCP Server Framework

  • To see example servers built with Arcade MCP Server Framework (this repo), check out our examples

  • To learn more about the Arcade MCP Server Framework (this repo), check out our Arcade MCP documentation

  • To learn more about other offerings from Arcade.dev, check out our documentation.

Pst. hey, you, give us a star if you like it!

GitHub stars

Quick Start: Create a New Server

The fastest way to get started is with the arcade new CLI command, which creates a complete MCP server project:

# Install the CLI
uv tool install arcade-mcp

# Create a new server project
arcade new my_server

# Navigate to the project
cd my_server/src/my_server

This generates a project with:

  • server.py - Main server file with MCPApp and example tools

  • pyproject.toml - Dependencies and project configuration

  • .env.example - Example .env file containing a secret required by one of the generated tools in server.py

The generated server.py includes proper command-line argument handling and three example tools:

#!/usr/bin/env python3
"""simple_server MCP server"""

import sys
from typing import Annotated

import httpx
from arcade_mcp_server import Context, MCPApp
from arcade_mcp_server.auth import Reddit

app = MCPApp(name="simple_server", version="1.0.0", log_level="DEBUG")


@app.tool
def greet(name: Annotated[str, "The name of the person to greet"]) -> str:
    """Greet a person by name."""
    return f"Hello, {name}!"


# To use this tool locally, you need to either set the secret in the .env file or as an environment variable
@app.tool(requires_secrets=["MY_SECRET_KEY"])
def whisper_secret(context: Context) -> Annotated[str, "The last 4 characters of the secret"]:
    """Reveal the last 4 characters of a secret"""
    # Secrets are injected into the context at runtime.
    # LLMs and MCP clients cannot see or access your secrets
    # You can define secrets in a .env file.
    try:
        secret = context.get_secret("MY_SECRET_KEY")
    except Exception as e:
        return str(e)

    return "The last 4 characters of the secret are: " + secret[-4:]

# To use this tool locally, you need to install the Arcade CLI (uv tool install arcade-mcp)
# and then run 'arcade login' to authenticate.
@app.tool(requires_auth=Reddit(scopes=["read"]))
async def get_posts_in_subreddit(
    context: Context, subreddit: Annotated[str, "The name of the subreddit"]
) -> dict:
    """Get posts from a specific subreddit"""
    # Normalize the subreddit name
    subreddit = subreddit.lower().replace("r/", "").replace(" ", "")

    # Prepare the httpx request
    # OAuth token is injected into the context at runtime.
    # LLMs and MCP clients cannot see or access your OAuth tokens.
    oauth_token = context.get_auth_token_or_empty()
    headers = {
        "Authorization": f"Bearer {oauth_token}",
        "User-Agent": "{{ toolkit_name }}-mcp-server",
    }
    params = {"limit": 5}
    url = f"https://oauth.reddit.com/r/{subreddit}/hot"

    # Make the request
    async with httpx.AsyncClient() as client:
        response = await client.get(url, headers=headers, params=params)
        response.raise_for_status()

        # Return the response
        return response.json()

# Run with specific transport
if __name__ == "__main__":
    # Get transport from command line argument, default to "stdio"
    # - "stdio" (default): Standard I/O for Claude Desktop, CLI tools, etc.
    #   Supports tools that require_auth or require_secrets out-of-the-box
    # - "http": HTTPS streaming for Cursor, VS Code, etc.
    #   Does not support tools that require_auth or require_secrets unless the server is deployed
    #   using 'arcade deploy' or added in the Arcade Developer Dashboard with 'Arcade' server type
    transport = sys.argv[1] if len(sys.argv) > 1 else "stdio"

    # Run the server
    app.run(transport=transport, host="127.0.0.1", port=8000)

This approach gives you:

  • Complete Project Setup - Everything you need in one command

  • Best Practices - Proper dependency management with pyproject.toml

  • Example Code - Learn from working examples of common patterns

  • Production Ready - Structured for growth and deployment

Running Your Server

Run your server directly with Python:

# Run with stdio transport (default)
uv run server.py

# Run with http transport via command line argument
uv run server.py http

# Or use python directly
python server.py http
python server.py stdio

Your server will start and listen for connections. With HTTP transport, you can access the API docs at http://127.0.0.1:8000/docs.

Configure MCP Clients

Once your server is running, connect it to your favorite AI assistant:

arcade configure claude # Configure Claude Desktop to connect to your stdio server in your current directory
arcade configure cursor --transport http --port 8080 # Configure Cursor to connect to your local HTTP server on port 8080
arcade configure vscode --entrypoint my_server.py # Configure VSCode to connect to your stdio server that will run when my_server.py is executed directly

Installing this Repo from Source

git clone https://github.com/ArcadeAI/arcade-mcp.git && cd arcade-mcp && make install

Support and Community