arcade-mcp

634 commits 1 branch 0 tags 23 MiB

Author	SHA1	Message	Date
jottakka	bcee0f556f	Left over fixes for Windows Papercut PR (#781)
> [!NOTE]
> Low Risk
> Mostly CI/test and CLI output tweaks, plus a small refactor to reuse existing subprocess termination logic; low risk with minor potential for CI environment/version compatibility issues.
>
> Overview
> Expands CI coverage by adding Python `3.13` and `3.14` to the GitHub Actions matrices (main tests, install test, and no-auth CLI integration), and removes a redundant editable install step in the no-auth workflow.
>
> Cleans up Windows subprocess handling by dropping `arcade_cli.deploy._graceful_terminate` and calling the shared `arcade_core.subprocess_utils.graceful_terminate_process` directly, with corresponding test updates.
>
> Improves `arcade new` scaffolding guidance by printing numbered "Next steps" with explicit stdio/HTTP run options, and adds/updates CLI tests to assert this output. Also bumps package version to `1.11.2` and tightens pre-commit `ruff` excludes (no longer excluding `_scratch`).
>
> Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 55c2ae106f13e5657acdbebf63e00d74c171181f.	2026-02-26 13:24:15 -03:00
jottakka	98fad93d21	Adding MCP Servers supports to Arcade Evals (#689)
# MCP Server Tool Evaluation Support
## Overview
Add support for evaluating tools from remote MCP servers without requiring Python callables. Enables direct evaluation of any MCP-compatible tool server.
## What's New
### Core Features
- `MCPToolRegistry`: Evaluate tools from a single MCP server
- `CompositeMCPRegistry`: Evaluate tools from multiple MCP servers simultaneously
- Automatic loaders: `load_from_stdio()` and `load_from_http()` to fetch tools from running servers
- Automatic namespacing: Tools prefixed with server name (e.g., `server_tool_name`)
- Smart name resolution: Use short names if unique, full names if ambiguous
- OpenAI strict mode: Automatic schema conversion prevents parameter hallucinations
### Usage
Automatic Loading:
```python
from arcade_evals import load_from_stdio, MCPToolRegistry

# Load tools automatically from MCP server
tools = load_from_stdio(["npx", "-y", "@modelcontextprotocol/server-github"])
registry = MCPToolRegistry(tools)
```
Single MCP Server:
```python
from arcade_evals import MCPToolRegistry, ExpectedToolCall

registry = MCPToolRegistry(mcp_tools)
suite = EvalSuite(catalog=registry)
suite.add_case(
    expected_tool_calls=[
        ExpectedToolCall(tool_name="tool_name", args={...})
    ]
)
```
Multiple MCP Servers:
```python
from arcade_evals import CompositeMCPRegistry, load_from_stdio

# Load from multiple servers
github_tools = load_from_stdio(["npx", "-y", "@modelcontextprotocol/server-github"])
slack_tools = load_from_stdio(["npx", "-y", "@modelcontextprotocol/server-slack"])
composite = CompositeMCPRegistry(
    tool_lists={
        "github": github_tools,
        "slack": slack_tools,
    }
)
suite = EvalSuite(catalog=composite)
suite.add_case(
    expected_tool_calls=[
        ExpectedToolCall(tool_name="github_list_issues", args={...})
    ]
)
```
## Implementation
### Files Changed
- `libs/arcade-evals/arcade_evals/registry.py` (NEW): Registry abstractions and implementations
- `libs/arcade-evals/arcade_evals/loaders.py` (NEW): Automatic tool loading from MCP servers
- `libs/arcade-evals/arcade_evals/eval.py` (MODIFIED): Enhanced `ExpectedToolCall` and evaluation logic
- `libs/arcade-evals/arcade_evals/__init__.py` (MODIFIED): Exported new registries and loaders
### Key Technical Details
- Added `BaseToolRegistry` interface for abstraction
- `MCPToolRegistry` handles single server tools
- `CompositeMCPRegistry` manages multiple servers with collision detection
- `load_from_stdio()` and `load_from_http()` for automatic tool discovery
- Fixed name normalization bug: MCP tools use underscores (not dots)
- Optimized tool copying: 2.5x faster via shallow copy
## Testing
- ✅ 41 tests passing (25 new tests added)
- ✅ `test_eval_mcp_registry.py`: MCPToolRegistry functionality
- ✅ `test_eval_composite_mcp.py`: CompositeMCPRegistry with multiple servers
- ✅ Verified backward compatibility with Python tools
## Backward Compatibility
✅ 100% backward compatible - No breaking changes
## Breaking Changes
None
---
> [!NOTE]
> Adds end-to-end eval UX: examples, a robust CLI runner, and rich outputs.
>
> - New examples: `eval_arcade_gateway.py`, `eval_stdio_mcp_server.py`, `eval_http_mcp_server.py`, `eval_comprehensive_comparison.py` with timeouts, error handling, and track-based comparisons; detailed `README.md`
> - CLI runner: `arcade_cli/evals_runner.py` to execute evals/capture in parallel with progress, error isolation, failed-only filtering, context inclusion, and multi-provider/model support
> - Output formatters: `arcade_cli/formatters/` (txt, md, html, json) for evals and capture; comparative and multi-model HTML with tabs and context rendering
> - Display refactor: `display.py` now supports writing multiple formats, failed-only disclaimers, include-context, and improved console summaries
>
> Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit ff8acf9c34a6b61462a019a1ee9df081006517d0.
---------
Co-authored-by: Francisco Liberal <francisco@arcade.dev>
Co-authored-by: Mateo Torres <torresmateo@gmail.com>	2026-01-07 20:26:23 -03:00
Eric Gustin	7fb097f20f	Use monkeypatch for tests that use ARCADE_WORKER_SECRET (#694) Reverts the updates to unit tests in https://github.com/ArcadeAI/arcade-mcp/pull/691 and replaces with monkeypatch. They inadvertently changed global process state during the test run causing failure of post-merge and failure of PyPI publish. See https://github.com/ArcadeAI/arcade-mcp/actions/runs/19651637906/job/56283833231 to see what failed	2025-11-24 17:22:17 -08:00
Eric Gustin	44660d18ce	Only serve worker endpoints if secret is set (#691) Default to `ARCADE_WORKER_SECRET` being unset. This env var must be explicitly set now. Once it is set, the `worker/` endpoints will be served.	2025-11-24 14:39:14 -08:00
Sam Partee	b6b4cd0a4c	🏗️ Restructure: Multi-Package Architecture + uv Migration (#412)
### Overview
Major restructuring from monolithic `arcade-ai` package to modular library architecture with standardized uv-based dependency management.
![arcade-ai Monorepo (2)](https://github.com/user-attachments/assets/25f102b0-bb87-4a04-9701-d227d05664b1)
### New Package Structure
- `arcade-tdk` - Lightweight toolkit development kit (core decorators, auth)
- `arcade-core` - Core execution engine and catalog functionality
- `arcade-serve` - FastAPI/MCP server components
- `arcade-ai` - Meta package that includes CLI functionality. Optionally include evals via the `evals` extra. Optionally include all packages via the `all` extra.
### Key Benefits
- Lighter Dependencies: Toolkits now depend only on `arcade-tdk` (~2 deps) vs full `arcade-ai` (~30+ deps)
- Faster Builds: uv provides 10-100x faster dependency resolution and installation
- Better Modularity: Clear separation of concerns, consumers import only what they need
- Standard Tooling: Eliminates custom poetry scripts, uses standard Python packaging
### Migration Impact
- All 20 toolkits converted from poetry → uv with `arcade-tdk` dependencies plus `arcade-ai[evals]` and `arcade-serve` dev dependencies. When developing locally, devs should install toolkits via `make install-local`.
- Modern Python 3.10+ type hints throughout
- Standardized build system with hatchling backend
- Enhanced Makefile with robust toolkit management commands
- Removed `arcade dev` CLI command
- Reduce the number of files created by `arcade new` and add an option to not generate a tests and evals folder.

This foundation enables faster development cycles and cleaner dependency chains for the growing toolkit ecosystem.
### Todo After this PR is merged
- [ ] Post-merge workflow(s) (release & publish containers, etc)
- [ ] Release order plan. @EricGustin suggests releasing in the following order:
  1. `arcade-core` version 0.1.0
  2. `arcade-serve` version 0.1.0 and `arcade-tdk` version 0.1.0
  3. `arcade-ai` version 2.0.0
  4. Patch release for all toolkits (all changes in toolkits are internal refactors)
- [ ] [Update docs](https://github.com/ArcadeAI/docs/pull/318)
---------
Co-authored-by: Eric Gustin <eric@arcade.dev>
Co-authored-by: Eric Gustin <34000337+EricGustin@users.noreply.github.com>	2025-06-11 16:48:17 -07:00
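
The two `ARCADE_WORKER_SECRET` commits in this history (#691, #694) describe gating the `worker/` endpoints on an explicitly set env var and testing that with pytest's `monkeypatch` fixture instead of mutating global process state. A minimal sketch of that pattern follows. Only the env var name comes from the commit messages; `worker_endpoints_enabled` and both test names are hypothetical and are not taken from the arcade-mcp codebase.

```python
import os

# Env var name as given in the commit messages.
WORKER_SECRET_ENV = "ARCADE_WORKER_SECRET"


def worker_endpoints_enabled() -> bool:
    """Hypothetical helper: serve worker/ endpoints only if the secret is set."""
    return bool(os.environ.get(WORKER_SECRET_ENV))


# pytest injects its built-in `monkeypatch` fixture by parameter name and
# reverts every setenv/delenv when the test finishes, so no state leaks
# into later tests in the same process.
def test_endpoints_disabled_by_default(monkeypatch):
    monkeypatch.delenv(WORKER_SECRET_ENV, raising=False)
    assert worker_endpoints_enabled() is False


def test_endpoints_enabled_when_secret_set(monkeypatch):
    monkeypatch.setenv(WORKER_SECRET_ENV, "test-secret")
    assert worker_endpoints_enabled() is True
```

Assigning to `os.environ` directly inside a test would keep the value set for the rest of the run, which is the kind of leaked global state that #694 says broke the post-merge and PyPI publish jobs.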

Renamed from arcade/tests/worker/test_worker_base.py

5 commits