## Summary
Adds two strictly opt-in env vars that let toolkit developers see
`developer_message` / `stacktrace` content *in* the agent-facing error
message while debugging. Off by default; activation requires a specific
acknowledgement string, not a boolean — `true`/`1` is explicitly
rejected with a warning log.
- `ARCADE_UNSAFE_DEBUG_LEAK_DEVELOPER_MESSAGE_TO_AGENT`
- `ARCADE_UNSAFE_DEBUG_LEAK_STACKTRACE_TO_AGENT`
- Magic ack: `yes-i-accept-leaking-internals-to-the-agent`
Everything goes through a single funnel — `ToolOutputFactory.fail` /
`fail_retry` in `arcade_core/output.py` — so the behavior covers both
the MCP server path and the Arcade Worker path with no call-site
changes. A loud `logger.warning` fires once per process on activation,
and a big header comment in `output.py` tells future maintainers not to
add more flags of this shape (debug info belongs in `logger.debug`, not
in a field that gets shipped to the model and often to end users).
Bumps `arcade-core` 4.6.2 → 4.7.0. Non-breaking, additive.
## Why
Today the project does a lot of work to keep `developer_message` and
`stacktrace` off the agent's context. That's the right default, but it
makes iterating on a new toolkit painful — you end up adding temporary
logging or rebuilds just to see what blew up. This gives toolkit authors
a safe, ugly, loud-on-activation escape hatch.
## Safety design
- Two separate flags so you only leak what you need.
- Magic string (not a boolean) activates the flag. Boolean-style values
are rejected and log a pointer to `output.py`.
- First activation logs a `WARNING` identifying the flag and the risk.
- Flags documented only in `CLAUDE.md`, not in the public README.
- Top-of-file banner in `output.py` explicitly tells maintainers not to
add more flags of this shape.
## Test plan
- [x] Existing test suite passes (1154 tests —
`libs/tests/{core,tool,arcade_mcp_server}`).
- [x] End-to-end smoke test against the built `arcade_core-4.7.0` wheel,
driven through `ToolExecutor.run` (same path toolkits hit). Covered
cases:
- flags off → message unchanged
- `ARCADE_UNSAFE_..._DEVELOPER_MESSAGE_TO_AGENT=true` → flag rejected,
warning logged, message unchanged
- `ARCADE_UNSAFE_..._DEVELOPER_MESSAGE_TO_AGENT=<magic>` → `[DEBUG]
developer_message: ...` appended
- both flags with magic, `ToolRuntimeError` path → developer_message
appended (stacktrace absent because `ToolRuntimeError.stacktrace()`
returned `None`, which is existing behavior)
- stacktrace flag with magic, generic `Exception` path → full
`traceback.format_exc()` appended, activation `WARNING` visible
Made with [Cursor](https://cursor.com)
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Adds an opt-in path to include `developer_message` and stacktraces in
agent-facing MCP error messages, which could leak sensitive data if
misconfigured; safeguards (magic ack string + CI/pre-commit guard)
reduce but don’t eliminate risk.
>
> **Overview**
> Adds `arcade_mcp_server/_debug_exposure.py` with two env-gated debug
flags that, only when set to a specific acknowledgement string, append
`developer_message` and/or `stacktrace` into the agent-visible MCP tool
error `message` (and logs one-shot warnings on rejection/activation).
>
> Wires this into the MCP error path in `MCPServer._handle_call_tool`,
documents the flags in `CLAUDE.md`, bumps `arcade-mcp-server` to
`1.21.0`, and adds unit + integration tests plus a pre-commit hook and
GitHub Actions workflow (`scripts/check_debug_leak_flags_off.py`) to
ensure the magic ack string can’t be committed outside a small
allowlist.
>
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
30e242c454128ec7cc62e169c2afd116be735cb5. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
<!-- CURSOR_SUMMARY -->
> [!NOTE]
> **Low Risk**
> Mostly CI/test and CLI output tweaks, plus a small refactor to reuse
existing subprocess termination logic; low risk with minor potential for
CI environment/version compatibility issues.
>
> **Overview**
> Expands CI coverage by adding Python `3.13` and `3.14` to the GitHub
Actions matrices (main tests, install test, and no-auth CLI
integration), and removes a redundant editable install step in the
no-auth workflow.
>
> Cleans up Windows subprocess handling by dropping
`arcade_cli.deploy._graceful_terminate` and calling the shared
`arcade_core.subprocess_utils.graceful_terminate_process` directly, with
corresponding test updates.
>
> Improves `arcade new` scaffolding guidance by printing numbered “Next
steps” with explicit stdio/HTTP run options, and adds/updates CLI tests
to assert this output. Also bumps package version to `1.11.2` and
tightens pre-commit `ruff` excludes (no longer excluding `_scratch`).
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
55c2ae106f13e5657acdbebf63e00d74c171181f. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
<!-- CURSOR_SUMMARY -->
> [!NOTE]
> **Medium Risk**
> Touches authentication/login flow, credentials-file permissions, and
subprocess lifecycle behavior across platforms; while mostly defensive,
regressions could impact login or process management on Windows/macOS
runners.
>
> **Overview**
> Improves Windows/cross-platform reliability across the CLI and MCP
server: OAuth login now binds the callback server to `127.0.0.1`, avoids
slow loopback reverse-DNS, adds a configurable callback timeout
(`--timeout` + env default), and opens URLs via a Windows-friendly
`_open_browser` to avoid flashing console windows.
>
> Centralizes CLI output via a shared `console` that forces UTF-8 on
Windows, standardizes UTF-8 file reads/writes throughout, tightens
credentials-file permissions on Windows using `icacls`, and adds shared
Windows subprocess helpers for **no-window** process creation and
graceful termination (used by `deploy`, MCP reload, and usage-tracking
worker).
>
> Updates client configuration UX/robustness (Windows AppData resolution
via `platformdirs`, Cursor config path fallbacks + compatibility writes,
overwrite warnings, absolute `uv` path for GUI clients, safer path
display) and improves `deploy` child-process handling to avoid
pipe-buffer deadlocks while giving better debug-aware error messages.
>
> Expands CI to run tests on Linux/Windows/macOS, adds a no-auth CLI
integration workflow, disables usage tracking in toolkits CI, and adds
extensive regression tests for Windows signals, subprocess cleanup,
UTF-8, and config-path edge cases; bumps `arcade-core` to `4.4.2` and
`arcade-mcp-server` to `1.17.2` (with updated dependency pin).
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
0fabd8ca1cd647039ba6ddbdf3f7809c330bab9e. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
Versions:
* arcade-mcp\==1.0.0rc1
* arcade-mcp-server\==1.0.0rc1
* arcade-core\==2.5.0rc1
* arcade-tdk\==2.6.0rc1
* arcade-serve\==2.2.0rc1
### Summary
Adds first-class MCP support across Arcade, introduces a new MCP server
and CLI, unifies the project under the arcade-mcp name, overhauls
templates/scaffolding, and improves developer tooling, secrets
management, and examples.
### Highlights
- **MCP Server & Core**
- New MCP server with stdio and HTTP/SSE transports, session management,
resumability, and lifecycle handling.
- FastAPI-like `MCPApp` for building servers with lazy init; integrated
worker+MCP HTTP app option.
- Middleware system (logging and error handling), robust exception
hierarchy, and Pydantic-based settings.
- Async-safe managers for tools, resources, and prompts backed by
registries and locks.
- Developer-facing, transport-agnostic runtime context interfaces (logs,
tools, prompts, resources, sampling, UI, notifications).
- Conversion from Arcade ToolDefinition to MCP tool schema; OpenAI JSON
tool schema converter.
- Parser supports `@app.tool`/`@app.tool(...)` decorators.
- **CLI**
- New `mcp` command to run MCP servers with stdio or HTTP/SSE.
- New `secret` command to set/list/unset tool secrets (supports .env
input, preserves original casing for lookups).
- `new` command refactored; option to create a full toolkit package with
scaffolding.
- `chat` command removed.
- `serve.py` imports updated to `arcade_serve.fastapi.telemetry`;
version retrieval now uses `arcade-mcp`.
- `show.py` refactor to use new local catalog utilities.
- `display_tool_details` improved: adds “Default” column and handles
nested properties.
- **Configuration & Discovery**
- New `configure.py` to set up Claude Desktop, Cursor, and VS Code to
connect to local or Arcade Cloud MCP servers.
- Discovery utilities to find/install toolkits, build `ToolCatalog`s,
analyze files for tools, load kits from directories (pyproject parsing),
and build minimal toolkits.
- Better handling of provider API key resolution and evaluation suite
loading.
- **Templates & Scaffolding**
- Reorganized template structure (minimal vs full); moved
`.pre-commit-config.yaml`, `.ruff.toml`, license, Makefile, README,
tests, and tools layout to correct paths.
- Minimal template adds `.env.example` for runtime secret injection.
- Template pyproject updated for MCP servers; includes sample server
with greeting and secret-reveal tools.
- Authorization flow in templates simplified.
- **Repo-wide Renaming & Examples**
- Migrates references from `arcade-ai` to `arcade-mcp` across READMEs,
scripts, and package metadata.
- Examples updated (LangChain/LangGraph/AI SDK/TypeScript) and package
name changed to `arcade-mcp-sdk`.
- **Evals & Core Utilities**
- Evals now use OpenAI tooling format (`OpenAIToolList`, `to_openai`);
`tool_eval` takes `provider_api_key`.
- Core utilities: fixed `does_function_return_value` by dedenting before
parse; version bump to `2.5.0rc1` and dependency cleanup.
- **Tooling & CI**
- `setup-uv-env` action splits toolkit vs contrib dependency
installation.
- Pre-commit: excludes `libs/arcade-mcp-server/mkdocs.yml` and
`libs/tests/` from YAML and Ruff hooks; Ruff per-file ignores (e.g.,
C901 in `libs/**/*.py`, TRY400 in server docs paths).
- Makefile updates for uv env setup, quality checks, tests, builds, and
new `shell` target.
- Added Makefile to MCP server library to streamline dev workflow.
- **Cleanup**
- Removed `claude.json` config.
- Simplified stdio entrypoint; removed unused imports (`arcade_gmail`,
`arcade_search`).
### Breaking Changes
- **CLI**: `chat` command removed; use `mcp`, `secret`, and updated
`new`.
- **Naming**: All users should update references from `arcade-ai` to
`arcade-mcp`.
- **Templates**: File paths moved; downstream scripts referencing old
template locations may need updates.
### Getting Started
- Run an MCP server:
- `arcade mcp --stdio --toolkits your_toolkit`
- `arcade mcp --http --toolkits your_toolkit`
- Manage secrets:
- `arcade secret set your_toolkit KEY=value`
- `arcade secret list your_toolkit`
- `arcade secret unset your_toolkit KEY`
- Configure clients:
- `arcade configure` to set up Claude Desktop, Cursor, and VS Code for
local/Arcade Cloud MCP.
---------
Co-authored-by: Sam Partee <sam@arcade-ai.com>
Co-authored-by: Shub <125150494+shubcodes@users.noreply.github.com>
### Overview
Major restructuring from monolithic `arcade-ai` package to modular
library architecture with standardized uv-based dependency management.

### New Package Structure
- **`arcade-tdk`** - Lightweight toolkit development kit (core
decorators, auth)
- **`arcade-core`** - Core execution engine and catalog functionality
- **`arcade-serve`** - FastAPI/MCP server components
- **`arcade-ai`** - Meta package that includes CLI functionality.
Optionally include evals via the `evals` extra. Optionally include all
packages via the `all` extra.
### Key Benefits
- **Lighter Dependencies**: Toolkits now depend only on `arcade-tdk` (~2
deps) vs full `arcade-ai` (~30+ deps)
- **Faster Builds**: uv provides 10-100x faster dependency resolution
and installation
- **Better Modularity**: Clear separation of concerns, consumers import
only what they need
- **Standard Tooling**: Eliminates custom poetry scripts, uses standard
Python packaging
### Migration Impact
- All 20 toolkits converted from poetry → uv with `arcade-tdk`
dependencies plus `arcade-ai[evals]` and `arcade-serve` dev
dependencies. When developing locally, devs should install toolkits via
`make install-local`.
- Modern Python 3.10+ type hints throughout
- Standardized build system with hatchling backend
- Enhanced Makefile with robust toolkit management commands
- Removed `arcade dev` CLI command
- Reduce the number of files created by `arcade new` and add an option
to not generate a tests and evals folder.
This foundation enables faster development cycles and cleaner dependency
chains for the growing toolkit ecosystem.
### Todo After this PR is merged
- [ ] Post-merge workflow(s) (release & publish containers, etc)
- [ ] Release order plan. @EricGustin suggests releasing in the
following order:
1. `arcade-core` version 0.1.0
2. `arcade-serve` version 0.1.0 and `arcade-tdk` version 0.1.0
3. `arcade-ai` version 2.0.0
4. Patch release for all toolkits (all changes in toolkits are internal
refactors)
- [ ] [Update docs](https://github.com/ArcadeAI/docs/pull/318)
---------
Co-authored-by: Eric Gustin <eric@arcade.dev>
Co-authored-by: Eric Gustin <34000337+EricGustin@users.noreply.github.com>
# PR Description
This PR is a part of the community contributed toolkits story.
* `arcade new` now uses jinja templates
* `arcade new` now creates a "cookiecutter" toolkit equipped with
everything a community contributed toolkit needs to be easily tested,
published to PyPi, etc. as its own Github repo
* I created the following toolkit with `arcade new`:
- [PyPi](https://pypi.org/project/arcade-local-file-management/0.1.5/)
-
[Github](https://github.com/EricGustin/local_file_management/tree/0.1.5)
On the last few PRs I have noticed two problems:
1. `ruff format` fails even though it seems OK on our local machines
(sometimes, not always)
2. Nate's and Sam's machines kept flip-flopping a specific piece of
formatting back and forth, indicating a subtle difference of config
hiding somewhere
3. This was reproducible by running `ruff format` in the terminal,
followed by `make check`. The former would edit files, and then `make
check` would edit them back!
This PR addresses both issues, and further standardizes our editor &
linter configs to be super stable.
Specifically:
1. The main fix for the above, the pre-commit hook was pinned to a super
old version of ruff.
This resulted in subtle differences in behavior between our machines,
and on CI.
2. Moved ruff settings from `pyproject.toml` to `.ruff.toml`
pyproject files in subdirectories (e.g. `toolkits/**`) were overriding
the main pyproject file and erasing the custom ruff config we set at the
root. This meant that our ruff config was applied to `arcade` but not to
any of the other packages.
By moving the config to `.ruff.toml` at the root, all projects will
inherit the same ruff linting & formatting config.
4. Un-ignored the `.vscode/` directory so that we can share
vscode/cursor workspace settings.
This is valuable for standardizing settings like the default formatter
(ruff) and default test framework (pytest).
However, it's important that going forward we _only_ commit things here
that should apply across all of our machines.
5. To avoid any conflict between prettier and ruff, prettier now
explicitly ignores *.py files
6. Finally, `ruff format` and `make check` agree. A number of files are
newly auto-formatted.
MyPy compliance for the whole codebase
- systematic way of executing tools (`executor.py`)
- support for using pydantic models in tool inputs and outputs
- mypy compliance (most of the changes)
- removal of unused code (from previous iterations)
Co-authored-by: Nate Barbettini <nate@arcade-ai.com>