<!-- CURSOR_SUMMARY -->
> [!NOTE]
> **Medium Risk**
> Touches authentication/login flow, credentials-file permissions, and
subprocess lifecycle behavior across platforms; while mostly defensive,
regressions could impact login or process management on Windows/macOS
runners.
>
> **Overview**
> Improves Windows/cross-platform reliability across the CLI and MCP
server: OAuth login now binds the callback server to `127.0.0.1`, avoids
slow loopback reverse-DNS, adds a configurable callback timeout
(`--timeout` + env default), and opens URLs via a Windows-friendly
`_open_browser` to avoid flashing console windows.
>
> Centralizes CLI output via a shared `console` that forces UTF-8 on
Windows, standardizes UTF-8 file reads/writes throughout, tightens
credentials-file permissions on Windows using `icacls`, and adds shared
Windows subprocess helpers for **no-window** process creation and
graceful termination (used by `deploy`, MCP reload, and usage-tracking
worker).
>
> Updates client configuration UX/robustness (Windows AppData resolution
via `platformdirs`, Cursor config path fallbacks + compatibility writes,
overwrite warnings, absolute `uv` path for GUI clients, safer path
display) and improves `deploy` child-process handling to avoid
pipe-buffer deadlocks while giving better debug-aware error messages.
>
> Expands CI to run tests on Linux/Windows/macOS, adds a no-auth CLI
integration workflow, disables usage tracking in toolkits CI, and adds
extensive regression tests for Windows signals, subprocess cleanup,
UTF-8, and config-path edge cases; bumps `arcade-core` to `4.4.2` and
`arcade-mcp-server` to `1.17.2` (with updated dependency pin).
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
0fabd8ca1cd647039ba6ddbdf3f7809c330bab9e. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
@EricGustin you can use this cli command:
```
uv run arcade evals mcp_building_evals_results/eval_toolkit_iteration_dict.py \
-p openai:gpt-4o,gpt-4o-mini \
-p anthropic:claude-sonnet-4-20250514 \
-k openai:$OPENAI_API_KEY \
-k anthropic:$ANTHROPIC_API_KEY \
-d \
--num-runs 3 \
--seed random \
--multi-run-pass-rule majority \
--max-concurrent 6 \
-o mcp_building_evals_results/results
```
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Touches core eval execution and all result formatters while adding new
CLI inputs and output schema (`run_stats`/`critic_stats` and capture
`runs`), so regressions could affect evaluation results and report
compatibility despite being additive and validated.
>
> **Overview**
> Adds **multi-run evaluation support** to `arcade evals` via new flags
`--num-runs`, `--seed`, and `--multi-run-pass-rule`, with upfront
validation and plumbing through the CLI runner into eval/capture suite
execution.
>
> Fixes provider selection UX/bug by making `--use-provider/-p`
**repeatable** (instead of a space-delimited string), updates
docs/examples accordingly, and extends capture mode to optionally record
**per-run tool calls** (`CapturedRun`) when `num_runs > 1`.
>
> Enhances all output formatters (HTML/Markdown/Text/JSON) to
**propagate and display** per-case `run_stats` and `critic_stats`,
including new HTML UI for run tabs/cards and comparative tables showing
mean ± stddev when multi-run data is present.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
2ee1654b7d1fbb9538373507355636164b16a066. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
1. Resolves
[TOO-363](https://linear.app/arcadedev/issue/TOO-363/arcade-deploy-fails-when-additional-deps-are-added-to-the-server).
2. Resolves
[TOO-364](https://linear.app/arcadedev/issue/TOO-364/arcade-cores-tool-skip-logic-is-missing-case-for-direct-execution).
3. Resolves
[TOO-358](https://linear.app/arcadedev/issue/TOO-358/missing-evals-error-message-shows-wrong-command).
4. Resolves
[TOO-365](https://linear.app/arcadedev/issue/TOO-365/arcade-evals-unit-tests-are-hanging).
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Medium risk because it changes how `arcade deploy` spawns the server
process and adjusts toolkit discovery skip logic, which can affect
deployments and tool discovery; however, the changes are small and
covered by new unit/integration tests.
>
> **Overview**
> `arcade deploy` now starts the validation server using the project’s
`.venv` interpreter (via `find_python_interpreter`) instead of the CLI’s
own `sys.executable`, preventing missing dependency failures when the
CLI is installed in an isolated env.
>
> `arcade-core`’s `Toolkit.tools_from_directory` skip logic is hardened
to also skip the currently executing entrypoint by module name
(`__main__.__spec__.name`) when file paths don’t match (e.g., bundled
execution). CLI error printing now escapes plain messages to avoid rich
markup issues, and `arcade-evals` lock acquisition accepts an optional
timeout default.
>
> Adds unit tests for the new toolkit skip behavior and an integration
test that boots the MCP server via direct Python invocation to mirror
deployment behavior, and bumps `arcade-core`, `arcade-mcp-server`, and
root dependency versions accordingly.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
e7785634c231c059f2e0bd1bc73a56bd7470a494. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
Versions:
* arcade-mcp\==1.0.0rc1
* arcade-mcp-server\==1.0.0rc1
* arcade-core\==2.5.0rc1
* arcade-tdk\==2.6.0rc1
* arcade-serve\==2.2.0rc1
### Summary
Adds first-class MCP support across Arcade, introduces a new MCP server
and CLI, unifies the project under the arcade-mcp name, overhauls
templates/scaffolding, and improves developer tooling, secrets
management, and examples.
### Highlights
- **MCP Server & Core**
- New MCP server with stdio and HTTP/SSE transports, session management,
resumability, and lifecycle handling.
- FastAPI-like `MCPApp` for building servers with lazy init; integrated
worker+MCP HTTP app option.
- Middleware system (logging and error handling), robust exception
hierarchy, and Pydantic-based settings.
- Async-safe managers for tools, resources, and prompts backed by
registries and locks.
- Developer-facing, transport-agnostic runtime context interfaces (logs,
tools, prompts, resources, sampling, UI, notifications).
- Conversion from Arcade ToolDefinition to MCP tool schema; OpenAI JSON
tool schema converter.
- Parser supports `@app.tool`/`@app.tool(...)` decorators.
- **CLI**
- New `mcp` command to run MCP servers with stdio or HTTP/SSE.
- New `secret` command to set/list/unset tool secrets (supports .env
input, preserves original casing for lookups).
- `new` command refactored; option to create a full toolkit package with
scaffolding.
- `chat` command removed.
- `serve.py` imports updated to `arcade_serve.fastapi.telemetry`;
version retrieval now uses `arcade-mcp`.
- `show.py` refactor to use new local catalog utilities.
- `display_tool_details` improved: adds “Default” column and handles
nested properties.
- **Configuration & Discovery**
- New `configure.py` to set up Claude Desktop, Cursor, and VS Code to
connect to local or Arcade Cloud MCP servers.
- Discovery utilities to find/install toolkits, build `ToolCatalog`s,
analyze files for tools, load kits from directories (pyproject parsing),
and build minimal toolkits.
- Better handling of provider API key resolution and evaluation suite
loading.
- **Templates & Scaffolding**
- Reorganized template structure (minimal vs full); moved
`.pre-commit-config.yaml`, `.ruff.toml`, license, Makefile, README,
tests, and tools layout to correct paths.
- Minimal template adds `.env.example` for runtime secret injection.
- Template pyproject updated for MCP servers; includes sample server
with greeting and secret-reveal tools.
- Authorization flow in templates simplified.
- **Repo-wide Renaming & Examples**
- Migrates references from `arcade-ai` to `arcade-mcp` across READMEs,
scripts, and package metadata.
- Examples updated (LangChain/LangGraph/AI SDK/TypeScript) and package
name changed to `arcade-mcp-sdk`.
- **Evals & Core Utilities**
- Evals now use OpenAI tooling format (`OpenAIToolList`, `to_openai`);
`tool_eval` takes `provider_api_key`.
- Core utilities: fixed `does_function_return_value` by dedenting before
parse; version bump to `2.5.0rc1` and dependency cleanup.
- **Tooling & CI**
- `setup-uv-env` action splits toolkit vs contrib dependency
installation.
- Pre-commit: excludes `libs/arcade-mcp-server/mkdocs.yml` and
`libs/tests/` from YAML and Ruff hooks; Ruff per-file ignores (e.g.,
C901 in `libs/**/*.py`, TRY400 in server docs paths).
- Makefile updates for uv env setup, quality checks, tests, builds, and
new `shell` target.
- Added Makefile to MCP server library to streamline dev workflow.
- **Cleanup**
- Removed `claude.json` config.
- Simplified stdio entrypoint; removed unused imports (`arcade_gmail`,
`arcade_search`).
### Breaking Changes
- **CLI**: `chat` command removed; use `mcp`, `secret`, and updated
`new`.
- **Naming**: All users should update references from `arcade-ai` to
`arcade-mcp`.
- **Templates**: File paths moved; downstream scripts referencing old
template locations may need updates.
### Getting Started
- Run an MCP server:
- `arcade mcp --stdio --toolkits your_toolkit`
- `arcade mcp --http --toolkits your_toolkit`
- Manage secrets:
- `arcade secret set your_toolkit KEY=value`
- `arcade secret list your_toolkit`
- `arcade secret unset your_toolkit KEY`
- Configure clients:
- `arcade configure` to set up Claude Desktop, Cursor, and VS Code for
local/Arcade Cloud MCP.
---------
Co-authored-by: Sam Partee <sam@arcade-ai.com>
Co-authored-by: Shub <125150494+shubcodes@users.noreply.github.com>
### Overview
Major restructuring from monolithic `arcade-ai` package to modular
library architecture with standardized uv-based dependency management.

### New Package Structure
- **`arcade-tdk`** - Lightweight toolkit development kit (core
decorators, auth)
- **`arcade-core`** - Core execution engine and catalog functionality
- **`arcade-serve`** - FastAPI/MCP server components
- **`arcade-ai`** - Meta package that includes CLI functionality.
Optionally include evals via the `evals` extra. Optionally include all
packages via the `all` extra.
### Key Benefits
- **Lighter Dependencies**: Toolkits now depend only on `arcade-tdk` (~2
deps) vs full `arcade-ai` (~30+ deps)
- **Faster Builds**: uv provides 10-100x faster dependency resolution
and installation
- **Better Modularity**: Clear separation of concerns, consumers import
only what they need
- **Standard Tooling**: Eliminates custom poetry scripts, uses standard
Python packaging
### Migration Impact
- All 20 toolkits converted from poetry → uv with `arcade-tdk`
dependencies plus `arcade-ai[evals]` and `arcade-serve` dev
dependencies. When developing locally, devs should install toolkits via
`make install-local`.
- Modern Python 3.10+ type hints throughout
- Standardized build system with hatchling backend
- Enhanced Makefile with robust toolkit management commands
- Removed `arcade dev` CLI command
- Reduce the number of files created by `arcade new` and add an option
to not generate a tests and evals folder.
This foundation enables faster development cycles and cleaner dependency
chains for the growing toolkit ecosystem.
### Todo After this PR is merged
- [ ] Post-merge workflow(s) (release & publish containers, etc)
- [ ] Release order plan. @EricGustin suggests releasing in the
following order:
1. `arcade-core` version 0.1.0
2. `arcade-serve` version 0.1.0 and `arcade-tdk` version 0.1.0
3. `arcade-ai` version 2.0.0
4. Patch release for all toolkits (all changes in toolkits are internal
refactors)
- [ ] [Update docs](https://github.com/ArcadeAI/docs/pull/318)
---------
Co-authored-by: Eric Gustin <eric@arcade.dev>
Co-authored-by: Eric Gustin <34000337+EricGustin@users.noreply.github.com>