MCP Server Framework and Tool Development library for building custom capabilities into agents.

jottakka 98fad93d21 Adding MCP Servers supports to Arcade Evals (#689 )	2026-01-07 20:26:23 -03:00
.cursor/rules	Cursor versioning rules (#715 )	2025-12-09 11:31:15 -08:00
.github	Run all tests with dependabot (#727 )	2025-12-12 17:35:15 -08:00
.vscode	Rename some 'toolkit' references to 'server' (#624 )	2025-10-14 18:42:27 -07:00
contrib	Fix broken links (#738 )	2026-01-05 13:27:16 -08:00
examples	Adding MCP Servers supports to Arcade Evals (#689 )	2026-01-07 20:26:23 -03:00
libs	Adding MCP Servers supports to Arcade Evals (#689 )	2026-01-07 20:26:23 -03:00
schemas/preview	Tool Metadata (#357 )	2025-04-16 19:17:36 -08:00
toolkits	Bump authlib from 1.3.0 to 1.6.5 (#724 )	2025-12-12 17:16:14 -08:00
.editorconfig	Fix ruff (#64 )	2024-09-25 09:47:30 -07:00
.gitignore	Re-import arcade_core errors into arcade_mcp_server (#620 )	2025-10-13 17:48:54 -07:00
.pre-commit-config.yaml	MCP Local (#563 )	2025-09-25 15:28:15 -07:00
.prettierignore	Fix ruff (#64 )	2024-09-25 09:47:30 -07:00
.prettierrc.toml	Fix ruff (#64 )	2024-09-25 09:47:30 -07:00
.ruff.toml	MCP Local (#563 )	2025-09-25 15:28:15 -07:00
CONTRIBUTING.md	Rename some 'toolkit' references to 'server' (#624 )	2025-10-14 18:42:27 -07:00
core	Dependency conflict rich library (#729 )	2025-12-14 10:46:03 -08:00
cspell.config.yaml	Replace arcade.client with arcadepy (#119 )	2024-10-23 15:29:02 -07:00
LICENSE	Update README and LICENSE (#220 )	2025-01-23 19:43:48 -08:00
Makefile	Rename _meta requirements field to arcade_requirements (#616 )	2025-10-14 19:01:05 -07:00
pyproject.toml	Adding MCP Servers supports to Arcade Evals (#689 )	2026-01-07 20:26:23 -03:00
README.md	Fix broken links (#738 )	2026-01-05 13:27:16 -08:00
SECURITY.md	Fix broken links (#738 )	2026-01-05 13:27:16 -08:00
uv_setup.sh	MCP Local (#563 )	2025-09-25 15:28:15 -07:00

README.md

Prebuilt Tools • Contact Us

Arcade MCP Server Framework

To see example servers built with the Arcade MCP Server Framework (this repo), check out our examples.
To learn more about the Arcade MCP Server Framework (this repo), check out our Arcade MCP documentation.
To learn more about other offerings from Arcade.dev, check out our documentation.

Psst. Hey, you, give us a star if you like it!

Quick Start: Create a New Server

The fastest way to get started is with the arcade new CLI command, which creates a complete MCP server project:

# Install the CLI
uv tool install arcade-mcp

# Create a new server project
arcade new my_server

# Navigate to the project
cd my_server/src/my_server

This generates a project with:

server.py - Main server file with MCPApp and example tools
pyproject.toml - Dependencies and project configuration
.env.example - Example .env file containing a secret required by one of the generated tools in server.py
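
One of the generated tools in server.py reads a secret named MY_SECRET_KEY, which can be set in the .env file or as an environment variable. Below is a minimal pre-flight sketch for checking that the secret is defined before starting the server. It assumes the .env file uses plain KEY=VALUE lines. The helpers read_env_file and secret_is_available are hypothetical names for illustration and are not part of the Arcade CLI or arcade_mcp_server; how the framework itself loads .env is not shown here.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines; skip blanks and '#' comments (assumed format)."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def secret_is_available(name, env_file=".env", environ=None):
    """Return True if the secret is non-empty in the environment or the .env file."""
    environ = os.environ if environ is None else environ
    if environ.get(name):
        return True
    return bool(read_env_file(env_file).get(name))


if __name__ == "__main__":
    print(secret_is_available("MY_SECRET_KEY"))
```

Run from the directory that contains your .env file, this prints True when MY_SECRET_KEY is set in either place.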

The generated server.py includes proper command-line argument handling and three example tools:

#!/usr/bin/env python3
"""simple_server MCP server"""

import sys
from typing import Annotated

import httpx
from arcade_mcp_server import Context, MCPApp
from arcade_mcp_server.auth import Reddit

app = MCPApp(name="simple_server", version="1.0.0", log_level="DEBUG")


@app.tool
def greet(name: Annotated[str, "The name of the person to greet"]) -> str:
    """Greet a person by name."""
    return f"Hello, {name}!"


# To use this tool locally, set the secret either in the .env file or as an environment variable.
@app.tool(requires_secrets=["MY_SECRET_KEY"])
def whisper_secret(context: Context) -> Annotated[str, "The last 4 characters of the secret"]:
    """Reveal the last 4 characters of a secret."""
    # Secrets are injected into the context at runtime.
    # LLMs and MCP clients cannot see or access your secrets.
    # You can define secrets in a .env file.
    try:
        secret = context.get_secret("MY_SECRET_KEY")
    except Exception as e:
        return str(e)

    return "The last 4 characters of the secret are: " + secret[-4:]


# To use this tool locally, you need to install the Arcade CLI (uv tool install arcade-mcp)
# and then run 'arcade login' to authenticate.
@app.tool(requires_auth=Reddit(scopes=["read"]))
async def get_posts_in_subreddit(
    context: Context, subreddit: Annotated[str, "The name of the subreddit"]
) -> dict:
    """Get posts from a specific subreddit"""
    # Normalize the subreddit name
    subreddit = subreddit.lower().replace("r/", "").replace(" ", "")

    # Prepare the httpx request
    # OAuth token is injected into the context at runtime.
    # LLMs and MCP clients cannot see or access your OAuth tokens.
    oauth_token = context.get_auth_token_or_empty()
    headers = {
        "Authorization": f"Bearer {oauth_token}",
        "User-Agent": "simple_server-mcp-server",
    }
    params = {"limit": 5}
    url = f"https://oauth.reddit.com/r/{subreddit}/hot"

    # Make the request
    async with httpx.AsyncClient() as client:
        response = await client.get(url, headers=headers, params=params)
        response.raise_for_status()

        # Return the response
        return response.json()


# Run with specific transport
if __name__ == "__main__":
    # Get transport from command line argument, default to "stdio"
    # - "stdio" (default): Standard I/O for Claude Desktop, CLI tools, etc.
    #   Supports tools that use requires_auth or requires_secrets out-of-the-box
    # - "http": HTTP streaming for Cursor, VS Code, etc.
    #   Does not support tools that use requires_auth or requires_secrets unless the server is deployed
    #   using 'arcade deploy' or added in the Arcade Developer Dashboard with 'Arcade' server type
    transport = sys.argv[1] if len(sys.argv) > 1 else "stdio"

    # Run the server
    app.run(transport=transport, host="127.0.0.1", port=8000)

This approach gives you:

Complete Project Setup - Everything you need in one command
Best Practices - Proper dependency management with pyproject.toml
Example Code - Learn from working examples of common patterns
Production Ready - Structured for growth and deployment
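
The normalization step in get_posts_in_subreddit can be tried in isolation. The sketch below copies that one expression into a hypothetical helper, normalize_subreddit, so its effect on a few inputs is easy to see. Note that it removes every "r/" occurrence and every space, and that it leaves a leading slash in place.

```python
def normalize_subreddit(subreddit: str) -> str:
    """Same expression as in the generated get_posts_in_subreddit tool."""
    return subreddit.lower().replace("r/", "").replace(" ", "")


if __name__ == "__main__":
    for raw in ["r/Python", "Ask Reddit", "/r/python"]:
        print(repr(raw), "->", repr(normalize_subreddit(raw)))
```

If your tool may receive inputs such as "/r/python", you may want to strip the leading slash as well before building the request URL.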

Running Your Server

Run your server directly with Python:

# Run with stdio transport (default)
uv run server.py

# Run with http transport via command line argument
uv run server.py http

# Or use python directly
python server.py http
python server.py stdio

Your server will start and listen for connections. With HTTP transport, you can access the API docs at http://127.0.0.1:8000/docs.
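
The generated server.py passes the first command-line argument straight to app.run. Below is a minimal sketch of validating that argument first. It assumes "stdio" and "http" are the only accepted values, as listed in the generated comments; pick_transport is a hypothetical helper, not part of arcade_mcp_server.

```python
import sys

# Assumption: only the two transports named in the generated server.py comments.
SUPPORTED_TRANSPORTS = ("stdio", "http")


def pick_transport(argv, default="stdio"):
    """Return the transport named by the first CLI argument, or the default."""
    transport = argv[1].strip().lower() if len(argv) > 1 else default
    if transport not in SUPPORTED_TRANSPORTS:
        raise ValueError(
            f"Unknown transport {transport!r}; expected one of {SUPPORTED_TRANSPORTS}"
        )
    return transport


if __name__ == "__main__":
    try:
        print(pick_transport(sys.argv))
    except ValueError as exc:
        # Exit with a readable message instead of a traceback.
        sys.exit(str(exc))
```

In server.py, transport = pick_transport(sys.argv) could stand in for the one-line lookup before app.run, so a mistyped transport fails early with a clear message.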

Configure MCP Clients

Once your server is running, connect it to your favorite AI assistant:

arcade configure claude # Configure Claude Desktop to connect to your stdio server in your current directory
arcade configure cursor --transport http --port 8000 # Configure Cursor to connect to your local HTTP server on port 8000
arcade configure vscode --entrypoint my_server.py # Configure VSCode to connect to your stdio server that will run when my_server.py is executed directly

Installing this Repo from Source

git clone https://github.com/ArcadeAI/arcade-mcp.git && cd arcade-mcp && make install

Support and Community

Discord: Join our Discord community for real-time support and discussions.
GitHub: Contribute or report issues on the Arcade GitHub repository.
Documentation: Find in-depth guides and API references at Arcade Documentation.