arcade-mcp

Author	SHA1	Message	Date
Francisco Or Something	c866620435	fix(arcade-mcp-server): report missing debug stacktraces (#836 ) ## Summary - Return an explicit `[DEBUG] stacktrace: unavailable ...` note when the stacktrace debug flag is enabled but the tool error payload has no stacktrace. - Preserve existing behavior for real stacktraces and for developer messages, including not leaking developer details unless the developer-message flag is enabled. - Clarify the toolkit-author docs around when stacktraces exist, such as unhandled exceptions or chained `raise ... from exc` errors. ## Test plan - `pre-commit run --files CLAUDE.md libs/arcade-mcp-server/arcade_mcp_server/_debug_exposure.py libs/tests/arcade_mcp_server/test_debug_exposure.py libs/tests/arcade_mcp_server/test_debug_exposure_integration.py` - `uv run --with pytest --with pytest-asyncio --with pytest-cov pytest libs/tests/arcade_mcp_server/test_debug_exposure.py libs/tests/arcade_mcp_server/test_debug_exposure_integration.py -v` - `ruff format --check libs/arcade-mcp-server/arcade_mcp_server/_debug_exposure.py libs/tests/arcade_mcp_server/test_debug_exposure.py libs/tests/arcade_mcp_server/test_debug_exposure_integration.py` - `ruff check libs/arcade-mcp-server/arcade_mcp_server/_debug_exposure.py libs/tests/arcade_mcp_server/test_debug_exposure.py libs/tests/arcade_mcp_server/test_debug_exposure_integration.py` <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Low Risk > Low risk: changes are limited to debug-only error-message augmentation when an explicit env flag is enabled; default runtime behavior is unchanged. Main risk is only in local debugging scenarios where the new note could affect log parsing or expected error text. > > Overview > When `ARCADE_DEBUG_EXPOSE_STACKTRACE_IN_TOOL_ERROR_RESPONSES` is enabled, tool error messages now always include a stacktrace debug section: either the actual stacktrace (when present) or an explicit `[DEBUG] stacktrace: unavailable ...` note when the tool error payload had no stacktrace. 
> > Adds/updates unit + integration coverage for the missing-stacktrace case and adjusts expectations around “flag enabled but no content.” Updates toolkit-author docs to clarify when stacktraces exist, and bumps `arcade-mcp-server` patch version to `1.21.2`. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 7d85196a30d8d29be98ffb252a13ef2a78057742. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-04-30 20:03:53 -03:00
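The stacktrace behavior described in #836 can be sketched roughly as follows. This is a minimal illustration, not the code in `_debug_exposure.py`: the helper name `with_stacktrace_section` and the layout of the "real stacktrace" section are assumptions, and the unavailable note is kept exactly as abbreviated in the commit message (its full wording is elided there).

```python
from typing import Optional

# Copied verbatim from the commit message; the text after "unavailable" is elided in the source.
UNAVAILABLE_NOTE = "[DEBUG] stacktrace: unavailable ..."


def with_stacktrace_section(message: str, stacktrace: Optional[str], flag_enabled: bool) -> str:
    """Hypothetical helper: append a stacktrace debug section to a tool error message."""
    if not flag_enabled:
        # Default behavior: the agent-facing message is left untouched.
        return message
    if stacktrace:
        # Assumed layout for a real stacktrace section.
        return f"{message}\n[DEBUG] stacktrace:\n{stacktrace}"
    # Flag enabled but the error payload carried no stacktrace: say so explicitly.
    return f"{message}\n{UNAVAILABLE_NOTE}"
```

With the flag off the message is returned unchanged, which matches the "default runtime behavior is unchanged" claim in the risk note.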
Francisco Or Something	dc4607daa4	feat(telemetry): add developer messages to tool error spans (#831 ) ## Summary - Add shared span attributes for tool error diagnostics, including developer-facing messages when present. - Wire those attributes through MCP server, worker RunTool, and HTTP CallTool spans while keeping default MCP response content public-only. - Cover no-leak response behavior, non-recording spans, outputless worker responses, and the shared attribute contract. ## Verification - `uv run ruff format ...` - `uv run ruff check ...` - `uv run pytest -W ignore libs/tests/arcade_mcp_server/test_debug_exposure_integration.py libs/tests/core/test_log_extras.py libs/tests/worker/test_worker_base.py` Made with [Cursor](https://cursor.com) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Adds new telemetry attributes that propagate tool error messages (including optional developer_message) into active spans across MCP server and worker execution paths; risk is mainly around potential leakage of sensitive developer messages into tracing backends and changes to observability contracts. > > Overview > Adds a shared `arcade_core.log_extras.build_tool_error_span_attributes()` helper and wires it into tool error paths so the current OpenTelemetry span is annotated with stable `tool_error_*` attributes (including `developer_message` when present). > > MCP tool calls now record these span attributes on failure while keeping default MCP response content sanitized, and `arcade-serve` records the same attributes on both `RunTool` and HTTP `CallTool` spans (handling `output=None`). Versions and dependency constraints are bumped to consume the new core helper, with tests added/updated to lock the span-attribute contract and verify behavior for non-recording spans and no-leak responses. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 33a53991d72140a662152f508dc53e9b769b9f07. Bugbot is set up for automated code reviews on this repo. 
Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-04-29 20:41:07 -03:00
Eric Gustin	cbe68462df	fix: mypy errors silently dropped during CI (#832 ) Resolves https://linear.app/arcadedev/issue/TOO-788/mypy-failures-are-silently-dropped-during-arcade-mcp-ci <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Low Risk > Low risk: primarily CI/Makefile behavior and type-annotation tweaks; functional logic is unchanged aside from stricter failure propagation in `make check`. > > Overview > Stops CI from silently ignoring mypy failures. The `make check` target now runs `mypy` across `libs/arcade*/` and exits non-zero if any package fails, reporting the failed libs. > > Separately tightens typing to satisfy `mypy` (removing `type: ignore` on OAuth helpers, adding `cast()`/`Any` annotations for JSON response shapes and subprocess kwargs, and handling non-`str` `server_address` hosts), and bumps patch versions for `arcade-mcp` and `arcade-mcp-server`. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit e79575b13a2d03adf3548104a0064c643f1e21b1. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-04-28 13:25:44 -07:00
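The failure propagation described in #832 lives in the Makefile; the same idea can be sketched in Python, assuming a hypothetical `check_all` helper that runs a checker per package, collects every failure instead of stopping at the first, and lets the caller exit non-zero. Invoking mypy as `python -m mypy <path>` is standard; everything else here is illustrative.

```python
import subprocess
import sys
from typing import Callable, Iterable, List


def run_mypy(lib: str) -> int:
    """Run mypy on one package directory and return its exit code."""
    return subprocess.run([sys.executable, "-m", "mypy", lib]).returncode


def check_all(libs: Iterable[str], run: Callable[[str], int] = run_mypy) -> List[str]:
    """Run the checker on every package and return the ones that failed."""
    return [lib for lib in libs if run(lib) != 0]


def exit_code(failed: List[str]) -> int:
    """Report failed packages and map them to a process exit code."""
    if failed:
        print("mypy failed for: " + ", ".join(failed), file=sys.stderr)
        return 1
    return 0
```

A caller would end with something like `sys.exit(exit_code(check_all(paths)))`, so a single failing package fails the whole check.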
Eric Gustin	be52f07930	chore(arcade-core): update PostHog project token (#834 ) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Low Risk > Low risk: only rotates the PostHog project API key used for CLI telemetry; main risk is misconfiguration causing events to be sent to the wrong project or dropped. > > Overview > Updates `UsageService` to use a new PostHog project API key for CLI usage telemetry, redirecting all `alias` and background `capture` events to the new PostHog project. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 8cb1c7f1cc5bccd22cb2c73469848d88a703f27e. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-04-28 13:23:28 -07:00
Francisco Or Something	70515e3356	feat(arcade-core): opt-in debug leak flags for toolkit authors (#826 ) ## Summary Adds two strictly opt-in env vars that let toolkit developers see `developer_message` / `stacktrace` content in the agent-facing error message while debugging. Off by default; activation requires a specific acknowledgement string, not a boolean — `true`/`1` is explicitly rejected with a warning log. - `ARCADE_UNSAFE_DEBUG_LEAK_DEVELOPER_MESSAGE_TO_AGENT` - `ARCADE_UNSAFE_DEBUG_LEAK_STACKTRACE_TO_AGENT` - Magic ack: `yes-i-accept-leaking-internals-to-the-agent` Everything goes through a single funnel — `ToolOutputFactory.fail` / `fail_retry` in `arcade_core/output.py` — so the behavior covers both the MCP server path and the Arcade Worker path with no call-site changes. A loud `logger.warning` fires once per process on activation, and a big header comment in `output.py` tells future maintainers not to add more flags of this shape (debug info belongs in `logger.debug`, not in a field that gets shipped to the model and often to end users). Bumps `arcade-core` 4.6.2 → 4.7.0. Non-breaking, additive. ## Why Today the project does a lot of work to keep `developer_message` and `stacktrace` off the agent's context. That's the right default, but it makes iterating on a new toolkit painful — you end up adding temporary logging or rebuilds just to see what blew up. This gives toolkit authors a safe, ugly, loud-on-activation escape hatch. ## Safety design - Two separate flags so you only leak what you need. - Magic string (not a boolean) activates the flag. Boolean-style values are rejected and log a pointer to `output.py`. - First activation logs a `WARNING` identifying the flag and the risk. - Flags documented only in `CLAUDE.md`, not in the public README. - Top-of-file banner in `output.py` explicitly tells maintainers not to add more flags of this shape. ## Test plan - [x] Existing test suite passes (1154 tests — `libs/tests/{core,tool,arcade_mcp_server}`). 
- [x] End-to-end smoke test against the built `arcade_core-4.7.0` wheel, driven through `ToolExecutor.run` (same path toolkits hit). Covered cases: - flags off → message unchanged - `ARCADE_UNSAFE_..._DEVELOPER_MESSAGE_TO_AGENT=true` → flag rejected, warning logged, message unchanged - `ARCADE_UNSAFE_..._DEVELOPER_MESSAGE_TO_AGENT=<magic>` → `[DEBUG] developer_message: ...` appended - both flags with magic, `ToolRuntimeError` path → developer_message appended (stacktrace absent because `ToolRuntimeError.stacktrace()` returned `None`, which is existing behavior) - stacktrace flag with magic, generic `Exception` path → full `traceback.format_exc()` appended, activation `WARNING` visible Made with [Cursor](https://cursor.com) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Adds an opt-in path to include `developer_message` and stacktraces in agent-facing MCP error messages, which could leak sensitive data if misconfigured; safeguards (magic ack string + CI/pre-commit guard) reduce but don’t eliminate risk. > > Overview > Adds `arcade_mcp_server/_debug_exposure.py` with two env-gated debug flags that, only when set to a specific acknowledgement string, append `developer_message` and/or `stacktrace` into the agent-visible MCP tool error `message` (and logs one-shot warnings on rejection/activation). > > Wires this into the MCP error path in `MCPServer._handle_call_tool`, documents the flags in `CLAUDE.md`, bumps `arcade-mcp-server` to `1.21.0`, and adds unit + integration tests plus a pre-commit hook and GitHub Actions workflow (`scripts/check_debug_leak_flags_off.py`) to ensure the magic ack string can’t be committed outside a small allowlist. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 30e242c454128ec7cc62e169c2afd116be735cb5. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-04-25 11:40:26 -03:00
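The acknowledgement-string gate from #826 can be sketched as below. The flag name and magic string come from the commit message; the function name `flag_enabled`, the log wording, and the `_warned` bookkeeping are assumptions made for illustration and are not the contents of `_debug_exposure.py`.

```python
import logging
import os
from typing import Mapping, Optional, Set

logger = logging.getLogger(__name__)

MAGIC_ACK = "yes-i-accept-leaking-internals-to-the-agent"
DEVELOPER_MESSAGE_FLAG = "ARCADE_UNSAFE_DEBUG_LEAK_DEVELOPER_MESSAGE_TO_AGENT"

_warned: Set[str] = set()  # keys already logged, so each warning fires once per process


def _warn_once(key: str, message: str, *args: object) -> None:
    if key not in _warned:
        _warned.add(key)
        logger.warning(message, *args)


def flag_enabled(name: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when the variable equals the acknowledgement string."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        return False  # unset or empty: off by default
    if value == MAGIC_ACK:
        _warn_once(name + ":on", "%s is active; internals will be shown to the agent", name)
        return True
    # Any other value, including "true" or "1", is rejected with a warning.
    _warn_once(name + ":rejected", "%s ignored: boolean-style values are not accepted", name)
    return False
```

Requiring an exact string rather than a truthy value means a stray `=1` copied from another project's environment cannot switch the leak on by accident.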
Pascal Matthiesen	40e05af27c	fix: claude, provide more options and remove apikey auth (#825 ) <!-- CURSOR_SUMMARY --> > [!NOTE] > Medium Risk > Medium risk because it changes how `arcade connect` authenticates (removes API-key flow) and rewrites user config files via new atomic/backup logic across multiple clients/formats (JSON/TOML). Mis-shaped entries or write/permission issues could break client integrations despite added tests. > > Overview > `arcade connect` is OAuth-only now: the `--api-key` flag and project API-key creation flow were removed, and connect always writes gateway configs without bearer tokens. > > Client support was expanded and corrected: Claude is now targeted as `claude-code` (writing to `~/.claude.json`), and new gateway config writers were added for `codex` (TOML upsert in `~/.codex/config.toml`), `opencode`, and `gemini`, while Cursor’s remote entry format was changed to match docs (no `type`). > > All config updates now use atomic writes with a single `.bak` backup and (on POSIX) tighten permissions to protect tokens; extensive tests were added to pin each client’s documented config shape and ensure unrelated existing config content is preserved and not corrupted on failures. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 19784e9311a00ed5dcedc7f27373ee9b0b842cf8. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-04-24 10:31:28 -07:00
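The "atomic writes with a single `.bak` backup" approach from #825 might look roughly like this. It is a sketch under assumptions: the function name `atomic_write_with_backup` and the `0o600` mode are illustrative choices, not taken from the repository.

```python
import os
import shutil
import tempfile
from pathlib import Path


def atomic_write_with_backup(path: Path, text: str) -> None:
    """Replace `path` atomically, keeping one `.bak` copy of the previous content."""
    path.parent.mkdir(parents=True, exist_ok=True)
    if path.exists():
        # Single backup: each write overwrites the previous .bak file.
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    # Write to a temp file in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        if os.name == "posix":
            os.chmod(tmp_name, 0o600)  # owner-only, since configs may hold tokens
        os.replace(tmp_name, path)  # atomic rename over the target
    except BaseException:
        # On any failure, drop the temp file and leave the original untouched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Because the original file is only ever swapped out by `os.replace`, a crash midway leaves either the old config or the new one, never a half-written file.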
Francisco Or Something	8f5d0ff54e	Improve typed httpx error mapping and adapter guidance (#820 ) ## Summary Routes HTTP adapter exceptions to the right error class instead of shoe-horning everything into `UpstreamError`. Addresses Eric's earlier feedback that several exceptions this PR was wrapping as `UpstreamError` didn't satisfy the "something happened with the upstream" claim (local pool exhaustion, client-side request construction, local TLS failures). ### Scope - `UpstreamError` (unchanged) — upstream responded with an HTTP status code. - `NetworkTransportError` (new sibling in `arcade-core`) — no complete response was received. `status_code=None`. Three kinds: `NETWORK_TRANSPORT_RUNTIME_TIMEOUT`, `_UNREACHABLE`, `_UNMAPPED`. - `FatalToolError` (existing) — client construction bugs (`InvalidURL`, `UnsupportedProtocol`, `MissingSchema`, `InvalidHeader`, `LocalProtocolError`, …) and local TLS/cert config failures. Never retried. --- ## Before / After (per Eric's request) Shows the error payload a tool produces for each exception, before this PR vs. after. "Before" = current `main` (exceptions without real HTTP responses fall through to the generic `@tool` `FatalToolError` catch-all with `message=str(exc)`). 
### No-response transport failures

| Exception | Before (class; message; kind) | After (class; message; kind) |
|---|---|---|
| `httpx.PoolTimeout` | `FatalToolError`; `str(exc)` leaks raw detail; `TOOL_RUNTIME_FATAL`, not retryable | `NetworkTransportError`; `"HTTP request timed out before a complete response was received."`; `NETWORK_TRANSPORT_RUNTIME_TIMEOUT`, retryable |
| `httpx.ConnectTimeout` | same as above | same as PoolTimeout; `TIMEOUT`, retryable |
| `httpx.ConnectError` (refused / DNS) | `FatalToolError`; `str(exc)` | `NetworkTransportError`; `"HTTP request failed before reaching the upstream service."`; `UNREACHABLE`, retryable |
| `httpx.RemoteProtocolError` (upstream sent bad HTTP) | `FatalToolError`; `str(exc)` | `NetworkTransportError`; same message as ConnectError; `UNREACHABLE`, retryable |
| `httpx.DecodingError` | `FatalToolError`; `str(exc)` | `NetworkTransportError`; `"HTTP response from upstream could not be decoded."`; `UNMAPPED`, retryable |
| `httpx.TooManyRedirects` | `FatalToolError`; `str(exc)` | `NetworkTransportError`; `"HTTP redirect limit exceeded before a final response was received."`; `UNMAPPED`, not retryable |

### Client construction / local env bugs

| Exception | Before | After |
|---|---|---|
| `httpx.UnsupportedProtocol`, `httpx.InvalidURL`, `httpx.LocalProtocolError` | `FatalToolError` with `message=str(exc)` (may leak scheme / URL content) | `FatalToolError`; `"Tool constructed an invalid HTTP request — likely a tool-authoring bug."`; `TOOL_RUNTIME_FATAL`, not retryable |
| `requests.MissingSchema`, `InvalidURL`, `InvalidHeader`, `InvalidSchema`, `InvalidProxyURL`, `URLRequired` | same as above | same as above |
| `requests.SSLError` | `FatalToolError`; `str(exc)` often contains raw cert chain detail | `FatalToolError`; `"TLS handshake failed — likely a local certificate or trust configuration issue."`; `TOOL_RUNTIME_FATAL`, not retryable |

### Real HTTP response errors (UNCHANGED, same behavior)

| Exception | Class | Message | Kind | Retryable |
|---|---|---|---|---|
| `httpx.HTTPStatusError` 404 | `UpstreamError` | `"Upstream HTTP request failed (Not Found, client error)."` | `UPSTREAM_RUNTIME_NOT_FOUND` | No |
| `httpx.HTTPStatusError` 429 (w/ Retry-After: 60) | `UpstreamRateLimitError` | `"Upstream HTTP request failed (Too Many Requests, client error). Retry after 60 second(s)."` | `UPSTREAM_RUNTIME_RATE_LIMIT` | Yes |
| `httpx.HTTPStatusError` 500 | `UpstreamError` | `"Upstream HTTP request failed (Internal Server Error, server error)."` | `UPSTREAM_RUNTIME_SERVER_ERROR` | Yes |

### What's no longer in the message

- Raw exception `str(exc)` output (which frequently includes the full URL with query-string tokens, connection pool details, or cert chains) is no longer the agent-facing `message`. It's preserved in `developer_message` for server-side diagnostics.
- The misleading "Upstream HTTP…" prefix is gone from network-transport and construction-bug messages. Those messages now honestly describe what happened on the tool side.
- For 429s without a `Retry-After` header, we still show "Retry after N seconds." (pre-existing behavior; see follow-up notes).

---

## Companion PRs

- [ArcadeAI/arcade-mcp#823](https://github.com/ArcadeAI/arcade-mcp/pull/823): introduces `NetworkTransportError` in `arcade-core`
- [ArcadeAI/monorepo#911](https://github.com/ArcadeAI/monorepo/pull/911): adds the 3 `ErrorKind` constants to the Go engine and Datadog dashboards
- [ArcadeAI/docs#920](https://github.com/ArcadeAI/docs/pull/920): documents the new hierarchy and adapter routing

## Follow-ups (out of scope for this PR)

A short investigation surfaced several pre-existing issues that are worth fixing separately. A full list is in `NETWORK_TRANSPORT_ERROR_FOLLOWUPS.md` (shared offline). Summary: 1.
`requests.HTTPError` with `response is None` returns `None` from the adapter; should fall through to the `NetworkTransportError(UNMAPPED)` fallback instead of becoming a generic `FatalToolError`. 2. `developer_message` can leak URL query strings (and therefore tokens) since it stores raw `str(exc)`. 3. `_sanitize_uri` does not strip userinfo (credentials in URL path). 4. `_parse_retry_ms` misinterprets epoch-style `x-ratelimit-reset` headers. 5. 429 responses without `Retry-After` synthesize a fabricated "Retry after 1 second(s)." suffix. 6. `UPSTREAM_RUNTIME_VALIDATION_ERROR` is defined but never emitted. 7. `UpstreamError` silently accepts out-of-range status codes. 8. `requests.HTTPError` branch re-extracts `request_url` / `request_method` inconsistently (dead work). ## Test plan - [x] Existing `libs/tests/sdk/test_httpx_adapter.py` + `test_graphql_adapter.py` updated; every no-response / construction-bug test asserts the new class + kind + `can_retry`. - [x] Full test suite passes locally. - [x] mypy clean on `arcade-core`, `arcade-tdk`, `arcade-mcp-server`. - [x] Smoke-tested 21 exception routing cases end-to-end against real httpx / requests exceptions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Changes core error classification and retryability for `httpx`/`requests`/GraphQL transport failures, which can affect tool retry behavior and telemetry. Risk is mitigated by extensive new/updated tests covering the new mappings and privacy expectations. > > Overview > Improves error adapter behavior to be more semantically correct and privacy-safe. The HTTP adapter now distinguishes real HTTP responses (`UpstreamError`/`UpstreamRateLimitError`) from no-response failures (`NetworkTransportError` with `ErrorKind` + retryability) and from client construction/local TLS issues (`FatalToolError`). > > Reduces sensitive data exposure in agent-facing messages. 
Status-based errors now emit standardized messages derived from status phrase/class, while preserving raw exception detail in `developer_message`; Google/Microsoft/Slack fallback paths similarly switch to `unhandled <ExceptionType>` messages and move `str(exc)` into `developer_message`. GraphQL transport connection/protocol errors are reclassified from `UpstreamError` (502) to `NetworkTransportError`, and transport/server messages are standardized. > > Bumps `arcade-tdk` version to `3.8.0` and expands/updates the SDK test suite to assert new classes, `kind`, `can_retry`, request metadata extraction, and privacy behavior. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 1041cb1bec4fa3b0bae3e7c6b860b84cf376cf9a. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 20:32:17 -03:00
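The "standardized messages derived from status phrase/class" in #820 can be reproduced with the standard library alone. This is a sketch, not the adapter's code: `upstream_message` is a hypothetical name, and it only handles status codes that `http.HTTPStatus` knows about.

```python
from http import HTTPStatus
from typing import Optional


def upstream_message(status_code: int, retry_after_seconds: Optional[int] = None) -> str:
    """Build an agent-facing message from the status code only, never from str(exc)."""
    phrase = HTTPStatus(status_code).phrase  # raises ValueError for unknown codes
    status_class = "client error" if 400 <= status_code < 500 else "server error"
    message = f"Upstream HTTP request failed ({phrase}, {status_class})."
    if retry_after_seconds is not None:
        message += f" Retry after {retry_after_seconds} second(s)."
    return message


print(upstream_message(404))
print(upstream_message(429, retry_after_seconds=60))
print(upstream_message(500))
```

Because the text depends only on the numeric status, nothing from the request URL or the exception string can reach the agent-facing message.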
Francisco Or Something	d9812621de	feat: add NetworkTransportError for no-response HTTP failures (#823 ) ## Summary - Adds `NetworkTransportError` — a new sibling to `UpstreamError` under `ToolExecutionError` — for failures where no complete HTTP response was received from the upstream service (timeouts, connection errors, pool exhaustion, DNS failures, decoding issues, redirect exhaustion) - Routes client-construction bugs (`InvalidURL`, `UnsupportedProtocol`, `MissingSchema`, `SSLError`, `InvalidHeader`, etc.) to existing `FatalToolError` instead of `UpstreamError` - Adds 3 new `ErrorKind` values: `NETWORK_TRANSPORT_RUNTIME_TIMEOUT`, `_UNREACHABLE`, `_UNMAPPED` — operationally distinct telemetry slices matching the UpstreamError pattern - `UpstreamError` is unchanged and reserved for real HTTP responses with status codes Addresses Eric's feedback on #820: the `include_status_code=False` post-init null-out workaround is replaced by a clean class hierarchy where `NetworkTransportError.status_code` is natively `None`. 
### Changes

| File | What |
|---|---|
| `arcade-core/errors.py` | 3 new `ErrorKind` values, `NetworkTransportError` class, `is_network_transport_error` helper |
| `arcade-tdk/providers/http/error_adapter.py` | Full rewrite of httpx + requests exception routing with 3-way split |
| `arcade-tdk/providers/graphql/error_adapter.py` | `TransportConnectionFailed`/`TransportProtocolError` → `NetworkTransportError` |
| `arcade-tdk/errors.py`, `arcade-mcp-server/exceptions.py` | Re-exports |
| `pyproject.toml` × 3 | Version bumps: core 4.7.0, tdk 3.7.0, mcp-server 1.20.0 |
| Tests × 3 | 33 new tests, 3 updated (2659 passed, 0 failures) |

### Exception routing table

| Exception | Target | Kind | can_retry |
|---|---|---|---|
| `httpx.HTTPStatusError`, `requests.HTTPError` (with response) | `UpstreamError` | status-derived | status-derived |
| `httpx.TimeoutException`, `requests.Timeout` | `NetworkTransportError` | `TIMEOUT` | ✅ |
| `httpx.TransportError`, `requests.ConnectionError` | `NetworkTransportError` | `UNREACHABLE` | ✅ |
| `httpx.DecodingError`, `TooManyRedirects`, fallback | `NetworkTransportError` | `UNMAPPED` | varies |
| `httpx.InvalidURL`/`UnsupportedProtocol`/`LocalProtocolError`, `requests.MissingSchema`/`SSLError`/etc. | `FatalToolError` | `TOOL_RUNTIME_FATAL` | ❌ |

### Engine companion PR

ArcadeAI/monorepo (`feat/network-transport-error-kinds`) adds the 3 `ErrorKind` constants to Go schemas + OpenAPI docs. No engine logic changes needed (ErrorKind is a string alias, retry uses `can_retry` flag only, telemetry auto-slices).
## Test plan

- [x] 2659 existing tests pass (0 failures)
- [x] 33 new routing + class tests added
- [x] mypy clean on arcade-core, arcade-tdk
- [ ] Verify engine telemetry dashboard auto-surfaces new `NETWORK_TRANSPORT_` kinds after deploy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Medium Risk
> Changes the error taxonomy and classification helpers used for retries/telemetry, so misclassification could affect operational behavior, but the change is additive and covered by new tests.
>
> Overview
> Adds a new error category for outbound request failures that never yield a complete upstream response: `NetworkTransportError` (sibling to `UpstreamError`) plus `ErrorKind.NETWORK_TRANSPORT_RUNTIME_{TIMEOUT,UNREACHABLE,UNMAPPED}` and matching `is_network_transport_error` classification helpers on both `ToolkitError` and the wire-model `ToolCallError`.
>
> Re-exports `NetworkTransportError` from `arcade-tdk` and `arcade-mcp-server`, bumps package versions (`arcade-core` 4.7.0, `arcade-tdk` 3.7.0, `arcade-mcp-server` 1.20.0) and dependency minimums, and expands `core/test_errors.py` to cover the new kind invariants/defaults and classification behavior.
>
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit d2b89078729c6a67ba42684dc98445352238bc1d. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup>

<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 18:29:13 -03:00
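The httpx half of the routing table in #823 can be sketched as an ordered lookup. This is not the adapter in `error_adapter.py`: the `Route` tuple, the `ROUTES` list, and matching on class names are assumptions chosen so the sketch runs without httpx installed. The ordering matters because, in httpx, `LocalProtocolError` and `UnsupportedProtocol` are themselves `TransportError` subclasses, so the fatal cases have to be checked first.

```python
from typing import List, NamedTuple, Optional, Tuple


class Route(NamedTuple):
    target: str
    kind: str
    can_retry: Optional[bool]  # None: not fixed by the table ("status-derived" or "varies")


TRANSPORT = "NetworkTransportError"

# Checked top to bottom; the first class name found in the exception's MRO wins.
ROUTES: List[Tuple[str, Route]] = [
    ("InvalidURL", Route("FatalToolError", "TOOL_RUNTIME_FATAL", False)),
    ("UnsupportedProtocol", Route("FatalToolError", "TOOL_RUNTIME_FATAL", False)),
    ("LocalProtocolError", Route("FatalToolError", "TOOL_RUNTIME_FATAL", False)),
    ("HTTPStatusError", Route("UpstreamError", "status-derived", None)),
    ("TimeoutException", Route(TRANSPORT, "NETWORK_TRANSPORT_RUNTIME_TIMEOUT", True)),
    ("TransportError", Route(TRANSPORT, "NETWORK_TRANSPORT_RUNTIME_UNREACHABLE", True)),
    ("DecodingError", Route(TRANSPORT, "NETWORK_TRANSPORT_RUNTIME_UNMAPPED", True)),
    ("TooManyRedirects", Route(TRANSPORT, "NETWORK_TRANSPORT_RUNTIME_UNMAPPED", False)),
]

# The table lists the fallback's retryability as "varies", so it is left unset here.
FALLBACK = Route(TRANSPORT, "NETWORK_TRANSPORT_RUNTIME_UNMAPPED", None)


def route(exc: BaseException) -> Route:
    """Pick the error class, kind, and retryability for an HTTP client exception."""
    names = {cls.__name__ for cls in type(exc).__mro__}
    for class_name, target in ROUTES:
        if class_name in names:
            return target
    return FALLBACK
```

A real adapter would use `isinstance` checks against the library's classes; name matching is used here only to keep the example self-contained.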
Pascal Matthiesen	8f4fb1ad77	feat: added connect cli command (#819 ) Summary - New arcade connect command that logs in, creates/reuses an Arcade Cloud gateway, and configures your MCP client in one step - Supports 5 clients: Claude Desktop, Cursor, VS Code, Windsurf, Amazon Q - Selection modes: --toolkit, --tool, --preset, --gateway, --all, or interactive picker - Reuses existing gateways when one already covers the requested tools - Resolves gateway names to slugs (--gateway opencode finds slug pascal_opencode) - OAuth auth by default, --api-key fallback with auto-created project key - --slug option to set a custom gateway slug on creation - Tool catalog cached to ~/.arcade/cache/tools.json (5min TTL, scoped to org/project) - Fills in the three previously placeholder configure__arcade() functions ```bash ❯ uv run arcade connect cursor --toolkit x Fetching tool catalog... Setting up gateway for toolkits: x Checking existing gateways... Found existing gateway: quickstart-x (slug: gw_3CHqdAlQXSSQ28soevSheOJvXzs) Configuring cursor to connect to gateway: gw_3CHqdAlQXSSQ28soevSheOJvXzs Configured Cursor with Arcade gateway 'x' Gateway URL: https://api.arcade.dev/mcp/gw_3CHqdAlQXSSQ28soevSheOJvXzs Config file: /Users/pascal/.cursor/mcp.json Restart Cursor for changes to take effect. Setup complete! Gateway URL: https://api.arcade.dev/mcp/gw_3CHqdAlQXSSQ28soevSheOJvXzs Auth: OAuth (handled by your MCP client) Try asking your AI assistant: - Post a tweet saying 'Hello from Arcade!' - Search recent tweets about AI tools ``` <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk* > Adds a new end-to-end flow that performs OAuth login, calls Arcade Engine/Coordinator APIs (gateway + API key creation), and writes MCP client config files, so failures could affect remote resource creation and local client configuration. 
> > Overview > Adds a new `arcade connect` CLI command that logs in (if needed), fetches/caches the user’s tool catalog, creates or reuses an Arcade Cloud gateway (optionally with a custom `--slug`), and writes the appropriate MCP client config to point at the gateway. > > Implements real Arcade Cloud gateway configuration for `claude`, `cursor`, and `vscode` (replacing prior placeholders) and extends support to Windsurf and Amazon Q, including optional `--api-key` mode that auto-creates a project API key and writes it as a `Bearer` header. > > Refocuses `arcade configure` on local filesystem servers (and nudges remote usage to `connect`), adds toolkit config helpers, expands test coverage for gateway/toolkit configuration and the new connect flow, and bumps the package version to `1.14.0`. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit d9357c144a8bddd05dfb39f9f922f577bdbb8bf0. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-04-15 13:16:50 -07:00
Francisco Or Something	1492c80fc5	TOO-627: Improve error messages for agents and Datadog (#814 ) ## Summary - Improve tool call error messages across 4 libraries (arcade-core, arcade-tdk, arcade-mcp-server, arcade-serve) so agents can self-correct and Datadog can facet on structured fields - Guard empty error messages, enrich input validation errors with field-level detail, fix `@tool` decorator fallback formatting, surface `additional_prompt_content` in MCP responses, and add structured log extras for Datadog - Addresses the 3 worst error patterns: generic "Error in tool input deserialization", bare `KeyError` values, and empty `FatalToolError` messages Linear: TOO-627 Plan: `docs/plans/2026-04-08-improve-error-messages-handoff.md` ## Tasks - [ ] Task 1: Guard empty error messages (arcade-core) - [ ] Task 2: Enrich input validation error messages (arcade-core) - [ ] Task 3: Improve `@tool` decorator error fallback (arcade-tdk) - [ ] Task 4: Fix MCP agent-facing error response (arcade-mcp-server) - [ ] Task 5: Add structured log extras in BaseWorker (arcade-serve) - [ ] Task 6: Add structured log extras in MCP server (arcade-mcp-server) ## Test plan - [ ] Each task has dedicated unit tests verifying the new behavior - [ ] `make test` passes after all tasks - [ ] `make check` (ruff + mypy) passes - [ ] Verify the 3 worst error patterns now produce actionable messages 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Touches cross-library error formatting and logging behavior used in production tool execution paths; while mostly additive/guardrails, it changes agent-visible messages and Datadog log facets, which could impact client expectations and alerting. > > Overview > Improves tool-call error handling across core/runtime, MCP transport, worker transport, and the TDK to make agent-visible failures more actionable while reducing sensitive-data leakage. 
> > In `arcade-core`, empty error messages now get placeholders, `ToolOutputFactory.fail` defaults blank messages, and input validation errors are rewritten as field-level summaries that intentionally omit rejected values (avoiding Pydantic echo of secrets). The `@tool` fallback in `arcade-tdk` no longer surfaces `str(exception)` to agents; it returns exception type-only* in `message` while preserving full detail in `developer_message`. > > Adds a shared `build_tool_error_log_extra` helper and updates `arcade-serve` + `arcade-mcp-server` to emit consistent structured WARNING logs (`error_*`, `tool_name`, optional toolkit/version) for Datadog, while MCP error responses now append `additional_prompt_content` and force `structuredContent=None` on failures per spec. Includes extensive new tests and bumps package versions (`arcade-core` 4.6.2, `arcade-tdk` 3.6.1, `arcade-mcp-server` 1.19.3, `arcade-serve` 3.2.3). > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit e5c7ebcaf56176cfbd8e6d1f2b6295352abd0ec0. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 20:10:51 -03:00
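The field-level validation summaries from #814 can be illustrated with a small formatter. It assumes error entries shaped like the dictionaries returned by Pydantic v2's `ValidationError.errors()` (keys `loc`, `type`, `msg`, `input`); the function name and the output wording are hypothetical, and the real formatter in `arcade-core` may differ.

```python
from typing import Any, Iterable, Mapping


def summarize_validation_errors(errors: Iterable[Mapping[str, Any]]) -> str:
    """Summarize which fields failed and why, without echoing the rejected values."""
    parts = []
    for error in errors:
        # "loc" is a path such as ("user", "age"); join it into a dotted field name.
        field = ".".join(str(piece) for piece in error.get("loc", ())) or "<root>"
        # Use the machine-readable error type; "input" is deliberately never read.
        parts.append(f"{field}: {error.get('type', 'invalid')}")
    return "Invalid tool input: " + "; ".join(parts)
```

Only the field path and error type reach the message, so a secret passed in the wrong field is never echoed back to the agent.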
Eric Gustin	05682d54fe	Don't return structuredContent when error (#817) We recently added outputSchema support for our MCP tools (not yet for worker routes). Today, we always return structuredContent. On tool execution errors we return structuredContent: {"error": "..."} with isError: true, even when that shape does not match the tool’s declared outputSchema. Because the MCP spec says clients SHOULD validate structuredContent against outputSchema, some clients reject these responses. Since structuredContent is optional, we’re going to omit it when isError: true. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Changes the shape of tool error responses across the MCP server, which may break clients or tools that previously relied on `structuredContent["error"]` for failures. Behavior is more spec-compliant but touches core request/response paths and test expectations. > > Overview > Prevents MCP tool error responses from violating a tool’s declared `outputSchema` by always setting `structuredContent=None` when `isError=True` (server execution errors, unknown tools, middleware exceptions, and `Context.tools.call_raw` JSON-RPC errors). > > Updates requirement-failure error formatting to put the human-friendly message in `content[0]` and (when present) serialize extra machine-readable fields (e.g. `authorization_url`, `llm_instructions`) into an additional `content` item. Examples and integration/unit tests are updated to read errors from `content[0].text`, and `arcade-mcp-server` is bumped to `1.19.2`. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 4213bdd4aa44362de85c30f5f31c576243c132d5. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-04-10 15:27:07 -07:00
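The response shape described in #817 can be sketched with plain dictionaries. The field names `content`, `isError`, and `structuredContent` are those used in the commit message; the builder function itself is hypothetical and stands in for whatever the server uses internally.

```python
from typing import Any, Dict, Optional


def build_call_tool_result(
    text: str,
    is_error: bool,
    structured: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a tool-call result, leaving out structuredContent whenever isError is true."""
    result: Dict[str, Any] = {
        "content": [{"type": "text", "text": text}],
        "isError": is_error,
    }
    # Error payloads would not match the tool's outputSchema, so they carry text only.
    if not is_error and structured is not None:
        result["structuredContent"] = structured
    return result
```

Clients that previously read `structuredContent["error"]` would read the failure from `content[0]["text"]` instead.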
Eric Gustin	3204201360	fix: TypedDict total=False output breaks validation (#816 ) When a tool’s output TypedDict uses total=False, MCP clients reject the response with: ``` MCP error -32602: Structured content does not match the tool's output schema ``` Note that the bug also exists for the Engine transport (/worker/tools/execute), but since the engine doesn't validate the output schema, the bug never surfaced. This PR addresses the problem holistically (MCP and Engine) in preparation for a future where the Engine transport validates output schemas. Two bugs combined to cause this: 1. Schema: The outputSchema had no required array and declared all fields as strict types (e.g. "type": "string"), making every field look mandatory and non-null. 2. Serialization: model_dump() on TypedDict-derived Pydantic models emitted None for absent optional fields. A tool returning {"name": "hello"} produced {"name": "hello", "optional_field": null} which is a value the schema forbids. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Adjusts core schema generation and MCP JSON Schema conversion for TypedDicts, affecting how tool input/output contracts are emitted and validated across clients; mistakes could break compatibility or validation behavior. > > Overview > Fixes MCP/engine validation failures for `TypedDict(total=False)` outputs by ensuring absent optional keys are omitted from serialized output and that emitted schemas correctly describe required vs optional keys. > > `arcade-core` now tracks `required_keys`/`inner_required_keys` and per-field `nullable` in `ValueSchema`, derives required sets from TypedDict `__required_keys__`, and unwraps `Optional[T]` to support optional nested TypedDicts; TypedDict-derived Pydantic models now `model_dump(exclude_unset=True)` to avoid leaking missing fields as `null`. 
> > `arcade-mcp-server` JSON Schema conversion now emits `required` arrays (including for arrays of objects), supports `nullable` by generating `type: [<type>, "null"]` (and `enum` including `None`), and treats nullable top-level objects as valid unwrapped output schemas. Adds focused unit/end-to-end tests plus an expanded example server demonstrating total-false, mixed required/optional, nullable, and optional-nested TypedDict outputs, and bumps package versions/dependencies accordingly. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 53fe8365f613053599130520b75f30b614b465ca. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-04-09 17:47:57 -07:00
Eric Gustin	987a4eaef9	Add `[tool.uv] exclude-newer` config to `arcade new`'s full template (#809 ) templates the changes in https://github.com/ArcadeAI/monorepo/pull/765 <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Low Risk > Low risk: adds `uv` configuration to generated templates and bumps the package version, with no runtime logic changes. > > Overview > Updates the `arcade new` full template `pyproject.toml` to include a new `[tool.uv]` section that pins resolver behavior via `exclude-newer = "1 week"` while exempting key `arcade-*` packages. > > Bumps the root project version from `1.12.2` to `1.12.3`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit fd47343d52ee51700affad5c3f1701b3e143002f. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-04-08 11:54:12 -07:00
Eric Gustin	9f904a4ad6	Suppress 500 during deployment warm-up period (#810 )	2026-04-07 11:51:40 -07:00
Eric Gustin	82d6661dd9	Better error message (#789 ) We should be checking if the entrypoint file is in the current directory before we validate that it's a valid python file. Otherwise, the error message will be "invalid name for python file" when someone provides a path to a valid python file. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Low Risk > Low risk: only adjusts input validation/error messaging for `arcade configure` stdio entrypoints and bumps the package patch version. > > Overview > Improves `arcade configure` stdio entrypoint validation by rejecting path-like values (e.g., containing `/` or `\\`) before checking filename format, producing a clearer error when users pass a path instead of a file in the current directory. > > Bumps the project version from `1.13.0` to `1.13.1`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 4d02ee947740f839fd46d9faf62cfa766d5dae47. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-04-02 16:27:23 -07:00
Eric Gustin	d31a81ef3f	Add background update check & notification for arcade CLI (#800 ) ## Summary - On every CLI command invocation (except `update`/`upgrade`/`mcp`), a detached subprocess checks PyPI for newer versions (throttled to once per 4 hours) and caches the result at `~/.arcade/update_cache.json` - On the next invocation, if a newer version is known, a yellow one-liner notification is printed suggesting `arcade update` - Respects `ARCADE_DISABLE_AUTOUPDATE=1` environment variable to opt out entirely <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Adds a background PyPI version check that spawns detached subprocesses and may print update notifications on most CLI invocations; mistakes could impact CLI reliability or corrupt MCP stdio output (mitigated by explicit command exclusions). > > Overview > Adds `arcade update` (and hidden `arcade upgrade` alias) to self-upgrade the `arcade-mcp` CLI by detecting the original install method (`uv tool`, `pipx`, `uv pip`, or `pip`) and running the appropriate upgrade command. > > Introduces a throttled background update check on most CLI invocations: a detached subprocess queries PyPI, writes `~/.arcade/update_cache.json`, and on subsequent runs prints a one-line notification when a newer version is cached; this is disabled via `ARCADE_DISABLE_AUTOUPDATE=1` and explicitly skipped for `update`/`upgrade`/`mcp` to avoid MCP stdio output corruption. > > Bumps the package version to `1.13.0`, adds a `packaging` dependency, and includes comprehensive tests covering PyPI/yanked/prerelease handling, install-method detection, caching, and callback integration. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 2d9646ecc2211e8cfecd6e4901d14b1f5b7bb306. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-02 11:30:55 -07:00
Eric Gustin	7ce7d6892f	Add 'PRODUCT_ANALYTICS' to `ServiceDomain` enum (#806 ) We will be adding some product analytics toolkits in the near future. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Low Risk > Low risk: adds a new `ServiceDomain` enum value and bumps the package version, with minimal behavioral impact beyond any downstream enum matching/serialization expectations. > > Overview > Adds `ServiceDomain.PRODUCT_ANALYTICS` (`"product_analytics"`) to tool metadata classification to support upcoming product analytics integrations. > > Bumps `arcade-core` version from `4.5.0` to `4.6.0`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 84666eaf997401559f8025dbe43563fdd03acd49. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-03-27 17:12:23 -07:00
Eric Gustin	9eec003c72	Add full support for MCP Resources (#803 ) Resolves https://linear.app/arcadedev/issue/TOO-590/add-resources-support-to-server-framework <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Adds new resource registration/reading semantics (including URI templates and duplicate/multiple-match policies) and changes JSON Schema generation for tool I/O, which may affect MCP client compatibility and runtime behavior across servers. > > Overview > Adds first-class MCP Resources support across `arcade-mcp-server`. `MCPApp` can now register resources at build time via `add_resource`/`@resource` plus convenience `add_text_resource` and `add_file_resource`, and passes these through to `MCPServer` for startup loading (including `ResourceTemplate` URIs with `{param}` and `{param}` matching). > > Extends `ResourceManager` behavior. Resource reads now coerce handler return types (including raw `bytes` to base64 `BlobResourceContents`), support template matching with overlap/multiple-match detection, and introduce configurable duplicate handling policies. > > Improves tool schema + MCP Apps linking. Tool input/output JSON Schema generation is refactored to recursively expand nested `json` schemas and ensure `outputSchema` is always an object (wrapping non-object returns in a `result` property); `MCPApp` also supports attaching arbitrary tool `_meta` extensions (e.g., `ui.resourceUri`) applied at server start. > > Adds two new example servers (`resources`, `tools_with_output_schema`) and broad test coverage for resource templates, static/file resources, meta extensions, and schema wrapping/recursion. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit e785bee79d74110727519b00b81dcad6e9b74212. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-27 15:27:57 -07:00
Eric Gustin	bbba7aec90	Update `arcade new --full` (#805 ) Monorepo has new linting and formatting preferences. Updated the `--full` to reflect that. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Low Risk > Low risk: changes only affect the generated `--full` project template and CLI help surface (flag is now hidden), plus a patch version bump. > > Overview > Updates `arcade new --full` to be internal-only by hiding the flag and revises the full template to match monorepo conventions (Ruff/pre-commit versions and hook IDs, `.ruff.toml` now extends the repo config, `pyproject.toml` formatting/booleans, adds `pytest` `asyncio_mode`, and removes `pre-commit install` from the template `Makefile`). > > Adds a regression test ensuring generated full-template files match these conventions, and bumps the root package version to `1.12.2`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 2c1a285752d67dc4dd1aa8e0b6f25ca2f0a33fa2. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-03-27 14:22:32 -07:00
Sankara R. Avula	78c8e6fb99	feat: Add TelemetryPassbackMiddleware for serverExecutionTelemetry capability (#797 ) Implements: [SEP-2448: server execution telemetry] (https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2448) Description: The Observability Gap (The Problem) MCP clients propagate trace context to servers, but server-side execution remains a black box. The client sees a single tools/call or resources/read span; everything the server does (auth checks, policy evaluation, API calls, sub-tool invocations) is invisible. In cross-organization deployments, clients and servers use separate observability backends with no shared collector access, making traditional span export useless. <img width="1015" height="450" alt="Screenshot 2026-03-23 at 3 43 21 PM" src="https://github.com/user-attachments/assets/58c817b5-fee6-46a3-9877-d523a25368ad" /> Server Execution Telemetry (The Solution) Servers advertise serverExecutionTelemetry and return a curated slice of their execution spans directly in _meta.otel of the response. Clients ingest these verbatim OTLP spans into their own collector, stitching server-side execution into their distributed trace; no shared infrastructure required. The black box becomes transparent. <img width="945" height="574" alt="Screenshot 2026-03-23 at 3 43 44 PM" src="https://github.com/user-attachments/assets/38d97c94-aa73-4e62-9b4e-3264600e5ed0" /> . Summary: Implement MCP serverExecutionTelemetry capability that enables cross-organization distributed tracing by returning server-side OpenTelemetry spans to clients inline via _meta.otel.traces. 
Server-side (middleware): - TelemetryPassbackMiddleware intercepts tools/call and resources/read - ContextVarSpanCollector isolates span collection per-request via ContextVar - Propagates traceparent from client request for distributed trace stitching - Serializes collected spans to verbatim OTLP JSON (resourceSpans format), directly POSTable to /v1/traces - Top-level span filtering by default; full span tree via detailed opt-in - Middleware advertises capabilities via get_capabilities() on the Middleware base class - Provisional API: FutureWarning emitted until SEP-2448 is ratified Client-side (reference agent): - LangChain ReAct agent connects to MCP server via streamable_http_client with OAuth 2.1 - Detects serverExecutionTelemetry capability at initialization - Dynamically wraps discovered MCP tools with traceparent propagation and _meta.otel span request - Ingests returned server spans into Jaeger (OTLP JSON) and Galileo (OTLP protobuf) - Two-act demo: --no-passback (black box) vs default (full server-side visibility) Dependencies: - opentelemetry-api and opentelemetry-sdk added to arcade-mcp-server Bump arcade-mcp-server version to 1.18.0.	2026-03-25 15:57:50 -07:00
Eric Gustin	9bbdbe2b46	Fix outputSchema to conform to MCP spec's object type requirement (#799 ) When a stdio server had a tool that didn't return a dict, then: ``` { "code": "invalid_value", "values": [ "object" ], "path": [ "tools", 2, "outputSchema", "type" ], "message": "Invalid input: expected \"object\"" } ``` <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Changes the generated `outputSchema` shape for all non-`json` return types by wrapping them under a `result` property, which may affect clients/tests expecting primitive/array schemas despite being spec-correct. > > Overview > Adjusts MCP tool `outputSchema` generation to always emit an object schema, per the MCP spec that `structuredContent` must be a JSON object. > > `json` outputs remain a direct object schema, while primitive/array outputs are now wrapped as `{ "type": "object", "properties": { "result": <inner> } }` (preserving `enum`/`items`), and tests are expanded to cover these cases. Bumps `arcade-mcp-server` version to `1.18.0`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 7dd13bd33d6fdf6ebb778e1a3d9167ca89806f55. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-03-20 15:50:54 -07:00
jottakka	9c47f73602	[TOO-518] Enforce semver for MCPApp Versioning (#793 ) Here's the PR summary: --- ## Enforce semver validation for `MCPApp` versioning ### Problem `MCPApp.__init__` accepted any string as `version` with no validation. Invalid versions like `"1.0.0dev"` or `"latest"` silently propagated to the Engine, where `compareToolVersions` fell back to lexicographic `strings.Compare` instead of `semver.Compare` — causing incorrect ordering (e.g. `1.10.0 < 1.9.0`). ### Solution Validate and normalize `version` at `MCPApp` instantiation time using the same acceptance rules as Go's `golang.org/x/mod/semver v0.31.0` (the exact version used by the Engine). ### Changes `arcade_mcp_server/_validation.py` (new file) - Shared regex constants: `SEMVER_PATTERN` (semver.org spec), `SHORT_VERSION_PATTERN`, `MAJOR_ONLY_PATTERN` `arcade_mcp_server/mcp_app.py` - Added `_validate_version()` mirroring the existing `_validate_name()` pattern - Added `version` property + setter (validates on mutation too) - `__init__` now stores `self._version` via `_validate_version()` `arcade_mcp_server/settings.py` - Added `@field_validator("version")` on `ServerSettings` — covers the `MCP_SERVER_VERSION` env var path - Fixed default from `"0.1.0dev"` → `"0.1.0"` (the old default was itself invalid) `pyproject.toml` — bumped `arcade-mcp-server` `1.17.4` → `1.17.5` ### Normalization pipeline All inputs are normalized to canonical `MAJOR.MINOR.PATCH` before storage: \| Input \| Stored as \| \|-------\|-----------\| \| `v1.0.0` \| `1.0.0` \| \| `1.0` / `v1.0` \| `1.0.0` \| \| `1` / `v1` \| `1.0.0` \| ### Verification Validated against `golang.org/x/mod/semver v0.31.0` (Engine's exact pinned version) — 40/40 accept/reject cases match. The Engine's own `store_test.go` uses `"1.0"` and `"1.1"` as `ToolkitVersion` values, confirming short forms are intentionally supported. ### Breaking change Any user currently passing a non-semver version string (e.g. 
`"1.0.0dev"`, `"latest"`) will get a `ValueError` on upgrade. This is intentional — those versions were silently causing incorrect tool ordering in the Engine. Closes TOO-518 <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Introduces stricter version validation/normalization that will raise errors for previously-accepted non-semver inputs (including via env vars), which may break existing consumers depending on lax version strings. > > Overview > Enforces semver for server versioning across both `MCPApp` and `ServerSettings`, rejecting invalid strings and normalizing accepted inputs (e.g., stripping leading `v`, expanding `1`/`1.2` to `1.0.0`/`1.2.0`). > > Adds shared `normalize_version` logic in `arcade_mcp_server/_validation.py`, updates `MCPApp` to validate on init and via a new `version` property/setter, and adds a Pydantic `version` validator so `MCP_SERVER_VERSION` is checked. Defaults are updated from `0.1.0dev` to `0.1.0`, tests are expanded to cover accept/reject cases, and the package version is bumped to `1.17.5`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 2ceabacb25372e67eef9720b901c1ee2b214868f. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Eric Gustin <34000337+EricGustin@users.noreply.github.com>	2026-03-16 16:06:25 -07:00
jottakka	bf6bfa83f1	[TOO-522] Suppress chardet noisy versioning warning (#792 ) ## [TOO-522] Suppress chardet warning and fix OpenTelemetry telemetry ### Summary Reduces noisy chardet/urllib3 warnings in telemetry and updates the OpenTelemetry logger API to match the current SDK. ### Changes `libs/arcade-serve/arcade_serve/fastapi/telemetry.py` - Add `warnings.filterwarnings` to ignore `RequestsDependencyWarning` when chardet≥6 is present (requests uses charset-normalizer regardless) - Replace `_logs.set_logger_provider` with `set_logger_provider` from `opentelemetry._logs` (API change in OpenTelemetry 1.15+) `.ruff.toml` - Add per-file ignore for E402 on `telemetry.py` because `warnings.filterwarnings` must run before the opentelemetry imports that pull in requests `libs/arcade-serve/pyproject.toml` - Bump version 3.2.1 → 3.2.2 --- Closes TOO-522 <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Low Risk > Low risk: changes are limited to telemetry initialization (warning filtering and OpenTelemetry logger-provider wiring) plus a patch version bump, with minimal impact outside observability. > > Overview > Reduces telemetry startup noise by filtering `requests` `chardet`-related warnings before OpenTelemetry imports, and updates logging initialization to use `opentelemetry._logs.set_logger_provider` instead of the deprecated `_logs.set_logger_provider` call. > > Adds a targeted Ruff `E402` per-file ignore for `telemetry.py` to allow the early warning filter, and bumps `arcade-serve` version from `3.2.1` to `3.2.2`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 5166c51be7cdfb05f86df18490a0c98b44f771c2. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-03-13 15:56:15 -07:00
jottakka	3ed66e663c	Removing flaky tests (#790 ) <!-- CURSOR_SUMMARY --> > [!NOTE] > Medium Risk > Test-only change, but it removes coverage for concurrent tool execution over HTTP, which could let concurrency regressions slip through unnoticed. > > Overview > Removes three async end-to-end integration tests in `test_end_to_end.py` that asserted parallel/concurrent tool execution timing across the HTTP `POST /mcp` JSON-RPC route, the `POST /worker/tools/invoke` route, and a mixed MCP+Worker scenario. > > No production code changes; remaining HTTP and stdio E2E coverage stays in place, but the suite no longer validates HTTP concurrency behavior. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 3846363488c935771e79e0bb9b946b98137fdc55. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-03-06 16:03:49 -03:00
Renato Byrro	3e9ffb6bd9	Fix deploy timeout and improve error messages (#770 ) - update_deployment() was using httpx default timeout (5s) instead of the 360s used by deploy_server_to_engine(), causing "The write operation timed out" errors on larger packages - Catch httpx.TimeoutException in both deploy paths with an actionable error message that points to package size as the likely cause - Add proper error handling (ConnectError, HTTPStatusError) and client.close() to update_deployment(), matching deploy_server_to_engine() - Add unit tests covering timeout handling and timeout constant usage	2026-03-06 10:03:48 -03:00
Eric Gustin	4d48bb765d	`arcade new <name> --full` generates an `MCPApp` (#787 ) <!-- CURSOR_SUMMARY --> > [!NOTE] > Medium Risk > Moderate risk because it changes the default `arcade new --full` scaffolding (dependencies, entry points, Makefile workflow, licensing) and removes interactive prompts, which could break existing expectations for generated projects. > > Overview > `arcade new --full` scaffolding is simplified to be non-interactive and always generate an `arcade_<name>` package, with derived name variants (title/hyphenated) and updated next-step instructions (`make install/dev/test/lint`). > > The full template is updated to produce a runnable `arcade-mcp-server` `MCPApp` (`__main__.py` + `project.scripts`), switch sample code/tests/evals to a new async Reddit example tool (auth + metadata) using `httpx`, and refresh dev tooling (new `Makefile`, ruff config tweaks, added pytest `conftest.py`). > > Template licensing/metadata is standardized to Arcade proprietary (removes community/official conditionals and the templated README), and the repo version is bumped to `1.12.0`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit fd39c9ed9beba068fe85cf96979f04a31a40daa4. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-03-05 17:50:10 -08:00
jottakka	bcee0f556f	Left over fixes for Windows Papercut PR (#781 ) <!-- CURSOR_SUMMARY --> > [!NOTE] > Low Risk > Mostly CI/test and CLI output tweaks, plus a small refactor to reuse existing subprocess termination logic; low risk with minor potential for CI environment/version compatibility issues. > > Overview > Expands CI coverage by adding Python `3.13` and `3.14` to the GitHub Actions matrices (main tests, install test, and no-auth CLI integration), and removes a redundant editable install step in the no-auth workflow. > > Cleans up Windows subprocess handling by dropping `arcade_cli.deploy._graceful_terminate` and calling the shared `arcade_core.subprocess_utils.graceful_terminate_process` directly, with corresponding test updates. > > Improves `arcade new` scaffolding guidance by printing numbered “Next steps” with explicit stdio/HTTP run options, and adds/updates CLI tests to assert this output. Also bumps package version to `1.11.2` and tightens pre-commit `ruff` excludes (no longer excluding `_scratch`). > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 55c2ae106f13e5657acdbebf63e00d74c171181f. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-02-26 13:24:15 -03:00
Eric Gustin	4a737b9710	Improve `.env` discovery (#737 ) Resolves TOO-201 Documentation PR for this is here: https://github.com/ArcadeAI/docs/pull/626 <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Changes how environment variables/secrets are discovered and loaded, which can subtly alter runtime behavior depending on directory structure and existing env vars; bounded traversal and added tests reduce but don’t eliminate this risk. > > Overview > Improves `.env` discovery across the MCP server and CLI. Adds `find_env_file()` (bounded by the nearest `pyproject.toml` by default) and switches settings loading, `arcade deploy`, `arcade configure` stdio env injection, and provider API-key resolution to use it. > > Updates dev reload to also watch the discovered `.env` even when it lives outside the current working directory, adjusts `deploy --secrets all` to only run when a `.env` was found, and moves the minimal scaffold’s `.env.example` to the project root with updated tests/integration checks. Version bumps align examples and top-level deps with `arcade-mcp-server` `1.17.4` and `arcade-mcp` `1.11.2`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 40cff1738c14674ce01f09fd325ece9c874cd072. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-25 23:20:28 -08:00
Eric Gustin	36584942f7	Fix runtime warning (#771 ) When `python -m arcade_mcp_server` was executed, we would get the following Runtime Warning: ``` <frozen runpy>:128: RuntimeWarning: 'arcade_mcp_server.__main__' found in sys.modules after import of package 'arcade_mcp_server', but prior to execution of 'arcade_mcp_server.__main__'; this may result in unpredictable behaviour ``` This PR resolves this. This PR is mainly just moving existing functions to new locations; a refactor <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Low Risk > Primarily a module-organization refactor with minimal behavior change; main risk is import-path regressions for internal callers and stdio/CLI startup wiring. > > Overview > Fixes the `python -m arcade_mcp_server` runtime warning by refactoring `arcade_mcp_server.__main__` to be a thin CLI entrypoint and moving its reusable logic into import-safe modules. > > Extracts stdio execution and tool discovery into a new `arcade_mcp_server.stdio_runner` (`initialize_tool_catalog`, `run_stdio_server`) and moves `setup_logging` into `logging_utils`, updating `MCPApp`, the FastAPI `worker`, and tests to import from the new locations. Bumps package version to `1.17.3`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 210475acea7c5df44fc66be2bde06f1f0c806c4e. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-02-25 09:55:37 -08:00
Eric Gustin	25267ab6ee	JSON-safety validation for `ToolMetadata.extras` (#773 ) <!-- CURSOR_SUMMARY --> > [!NOTE] > Medium Risk > Tightens validation on tool metadata, which may break existing tools that relied on non-JSON-serializable `extras` values or keys; changes are localized and well-covered by tests. > > Overview > Adds JSON-safety enforcement for `ToolMetadata.extras`: top-level keys must be strings at model construction, and `validate_for_tool()` now recursively rejects non-JSON-native values (including non-finite floats) with path-rich `ToolDefinitionError` messages. > > Expands tests to cover valid/invalid nested `extras` cases and error-message quality, and bumps `arcade-core` version to `4.5.0`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 2bab0db3c17f0ddb97868764d10494da543b39e5. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-02-25 09:48:04 -08:00
jottakka	fe8ddfd500	[TOO-326] Windows papercuts (#768 ) <!-- CURSOR_SUMMARY --> > [!NOTE] > Medium Risk > Touches authentication/login flow, credentials-file permissions, and subprocess lifecycle behavior across platforms; while mostly defensive, regressions could impact login or process management on Windows/macOS runners. > > Overview > Improves Windows/cross-platform reliability across the CLI and MCP server: OAuth login now binds the callback server to `127.0.0.1`, avoids slow loopback reverse-DNS, adds a configurable callback timeout (`--timeout` + env default), and opens URLs via a Windows-friendly `_open_browser` to avoid flashing console windows. > > Centralizes CLI output via a shared `console` that forces UTF-8 on Windows, standardizes UTF-8 file reads/writes throughout, tightens credentials-file permissions on Windows using `icacls`, and adds shared Windows subprocess helpers for no-window process creation and graceful termination (used by `deploy`, MCP reload, and usage-tracking worker). > > Updates client configuration UX/robustness (Windows AppData resolution via `platformdirs`, Cursor config path fallbacks + compatibility writes, overwrite warnings, absolute `uv` path for GUI clients, safer path display) and improves `deploy` child-process handling to avoid pipe-buffer deadlocks while giving better debug-aware error messages. > > Expands CI to run tests on Linux/Windows/macOS, adds a no-auth CLI integration workflow, disables usage tracking in toolkits CI, and adds extensive regression tests for Windows signals, subprocess cleanup, UTF-8, and config-path edge cases; bumps `arcade-core` to `4.4.2` and `arcade-mcp-server` to `1.17.2` (with updated dependency pin). > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 0fabd8ca1cd647039ba6ddbdf3f7809c330bab9e. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-02-25 13:18:16 -03:00
Evan Tahler	7a28e7f988	Remove secret hints from CLI (#774 ) ## Summary - Removes the `Hint` column from `arcade secret list` table output — secret hints were removed from the API in [monorepo#322](https://github.com/ArcadeAI/monorepo/pull/322), causing a `KeyError: 'hint'` crash - Adds a test for `print_secret_table` with populated secrets (verifying no hint column) - Bumps version from 1.10.0 → 1.11.0 Closes TOO-444 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 08:31:18 -08:00
Eric Gustin	a918eef037	Add Tool Metadata (#766 )	2026-02-17 14:31:45 -08:00
emmithood	b928f52445	Add Attio wellknown auth class (#769 ) Add Attio to the wellknown OAuth2 provider classes so toolkits can use Attio(scopes=[...]) instead of OAuth2(id=..., scopes=[...]). --------- Co-authored-by: Eric Gustin <eric@arcade.dev>	2026-02-12 11:05:43 -08:00
jottakka	7472b18106	Fixing bug with multiple providers + stats for multiple runs (#752 ) @EricGustin you can use this cli command: ``` uv run arcade evals mcp_building_evals_results/eval_toolkit_iteration_dict.py \ -p openai:gpt-4o,gpt-4o-mini \ -p anthropic:claude-sonnet-4-20250514 \ -k openai:$OPENAI_API_KEY \ -k anthropic:$ANTHROPIC_API_KEY \ -d \ --num-runs 3 \ --seed random \ --multi-run-pass-rule majority \ --max-concurrent 6 \ -o mcp_building_evals_results/results ``` <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Touches core eval execution and all result formatters while adding new CLI inputs and output schema (`run_stats`/`critic_stats` and capture `runs`), so regressions could affect evaluation results and report compatibility despite being additive and validated. > > Overview > Adds multi-run evaluation support to `arcade evals` via new flags `--num-runs`, `--seed`, and `--multi-run-pass-rule`, with upfront validation and plumbing through the CLI runner into eval/capture suite execution. > > Fixes provider selection UX/bug by making `--use-provider/-p` repeatable (instead of a space-delimited string), updates docs/examples accordingly, and extends capture mode to optionally record per-run tool calls (`CapturedRun`) when `num_runs > 1`. > > Enhances all output formatters (HTML/Markdown/Text/JSON) to propagate and display per-case `run_stats` and `critic_stats`, including new HTML UI for run tabs/cards and comparative tables showing mean ± stddev when multi-run data is present. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 2ee1654b7d1fbb9538373507355636164b16a066. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-02-09 14:25:28 -03:00
Eric Gustin	859f2989be	ToolExecutionError description (#762 ) we actually don't want to deprecate ToolExecutionError <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Low Risk > Low risk: behavior change is limited to no longer emitting a `DeprecationWarning` when `ToolExecutionError` is instantiated, plus a patch version bump. > > Overview > `ToolExecutionError` is no longer treated as deprecated: the `warnings` import and runtime `DeprecationWarning` emission were removed, and the class docstring was updated to describe it as the base exception for errors raised from within a tool body. > > Bumps `arcade-core` version from `4.2.2` to `4.2.3`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 250d20e393a8a4d8dc20fad673a7faea1cba4797. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-02-03 16:04:52 -08:00
Eric Gustin	d7d765343e	Fix multiple worker log level bug (#758 ) When running `arcade_mcp_server` with `workers > 1`, uvicorn spawns worker subprocesses that directly call `create_arcade_mcp_factory()` without going through `main()`. Since `setup_logging()` is only called in `main()`, these subprocesses have no logging configuration, causing: 1. Standard Python logging not intercepted by Loguru 2. DEBUG-level logs from libraries like urllib3 appearing when OTEL is enabled 3. Inconsistent log formats between main process and workers <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Touches process-wide logging initialization for uvicorn worker subprocesses, which can affect log levels/handlers and output across the server. Functional impact is limited to observability but could change verbosity when OTEL or libraries emit logs. > > Overview > Fixes multi-worker/reload mode logging by configuring Loguru inside `create_arcade_mcp_factory()` (using `ARCADE_MCP_DEBUG` to set `INFO` vs `DEBUG`) so uvicorn-spawned worker subprocesses get the same logging/interception as `main()`. > > Adds regression tests that assert the factory filters DEBUG logs by default and enables them when `ARCADE_MCP_DEBUG=true`, and bumps `arcade-mcp-server` to `1.15.2`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 0c262eb9716ecbd589f1524842243a7aed80666e. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-01-30 15:37:52 -08:00
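The idea behind this fix, configuring logging inside the factory so that every worker process picks it up, can be sketched as below. This sketch swaps in the standard-library `logging` module for Loguru to stay self-contained, and `configure_logging` and `create_app_factory` are hypothetical names; only the `ARCADE_MCP_DEBUG` variable comes from the commit.

```python
import logging
import os


def configure_logging() -> int:
    """Set the root log level from ARCADE_MCP_DEBUG and return it."""
    debug = os.environ.get("ARCADE_MCP_DEBUG", "").lower() == "true"
    level = logging.DEBUG if debug else logging.INFO
    # force=True replaces any handlers installed earlier in this process.
    logging.basicConfig(
        level=level,
        format="%(levelname)s %(name)s %(message)s",
        force=True,
    )
    return level


def create_app_factory() -> dict:
    """Hypothetical stand-in for the factory that uvicorn workers call directly."""
    # Workers never run main(), so logging has to be configured here.
    level = configure_logging()
    return {"log_level": logging.getLevelName(level)}
```

Because the factory is the one entry point shared by the main process and the workers, placing the setup there keeps log levels and formats consistent across processes.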
Eric Gustin	a4160dd9fe	Four bug fixes (#754 ) 1. Resolves [TOO-363](https://linear.app/arcadedev/issue/TOO-363/arcade-deploy-fails-when-additional-deps-are-added-to-the-server). 2. Resolves [TOO-364](https://linear.app/arcadedev/issue/TOO-364/arcade-cores-tool-skip-logic-is-missing-case-for-direct-execution). 3. Resolves [TOO-358](https://linear.app/arcadedev/issue/TOO-358/missing-evals-error-message-shows-wrong-command). 4. Resolves [TOO-365](https://linear.app/arcadedev/issue/TOO-365/arcade-evals-unit-tests-are-hanging). <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Medium risk because it changes how `arcade deploy` spawns the server process and adjusts toolkit discovery skip logic, which can affect deployments and tool discovery; however, the changes are small and covered by new unit/integration tests. > > Overview > `arcade deploy` now starts the validation server using the project’s `.venv` interpreter (via `find_python_interpreter`) instead of the CLI’s own `sys.executable`, preventing missing dependency failures when the CLI is installed in an isolated env. > > `arcade-core`’s `Toolkit.tools_from_directory` skip logic is hardened to also skip the currently executing entrypoint by module name (`__main__.__spec__.name`) when file paths don’t match (e.g., bundled execution). CLI error printing now escapes plain messages to avoid rich markup issues, and `arcade-evals` lock acquisition accepts an optional timeout default. > > Adds unit tests for the new toolkit skip behavior and an integration test that boots the MCP server via direct Python invocation to mirror deployment behavior, and bumps `arcade-core`, `arcade-mcp-server`, and root dependency versions accordingly. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit e7785634c231c059f2e0bd1bc73a56bd7470a494. This will update automatically on new commits. 
Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-01-29 15:12:06 -08:00
Eric Gustin	28c1863ee3	Support Ed25519 Algorithm (#742 ) Ed25519 is needed for Arcade AS. This required migrating from `python-jose` to `joserfc`, because `python-jose` didn't seem to support Ed25519	2026-01-16 15:55:05 -08:00
Eric Gustin	7c448aaf2e	Fix PostHog dependency issue (#740 ) The 'MCP server started' events would fail to send to posthog if the CLI was not installed. This PR fixes this by moving PostHog from being a dependency of the CLI to a dependency of arcade-core. <!-- CURSOR_SUMMARY --> > [!NOTE] > Aligns versions and dependency ranges across the CLI and server packages; removes an unnecessary dependency. > > - Bump `arcade-mcp-server` to `1.14.2` and `arcade-mcp` to `1.8.1` > - Update `arcade-core` constraint to `>=4.2.1,<5.0.0`; CLI now requires `arcade-mcp-server>=1.14.2` > - Remove `posthog` from CLI dependencies > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 69f8bb397737d4c01f57630863762109819dbc4f. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-01-12 10:48:47 -08:00
Evan Tahler	c034046735	Replace fcntl with cross-platform portalocker (fix win/powershell errors) (#739 ) So even importing `fcntl` causes problems on windows. This PR replaces fcntl with portalocker. Tests all pass, so I think we are good. ref: https://arcade-ai.slack.com/archives/C08K1SJ072S/p1767897850450239?thread_ts=1766186586.406019&cid=C08K1SJ072S Closes ENGTOP-8 <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Cross-platform file locking > > - Replace `fcntl` with `portalocker` in `arcade_core/usage/identity.py` (shared/exclusive locks); switch to atomic `os.replace()` > - Add `portalocker` dependency and bump `arcade-core` to `4.2.1` > > Installation/CI > > - New GitHub Actions workflow `test-install.yml` runs install/CLI checks on macOS, Windows, and Linux for Python 3.10/3.12 > - Add `tests/install/test_install.py` and README to verify install, `arcade` CLI availability, and `portalocker` locking behavior > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 3fe98fbcbf177f51fdb0b7fc51b20060f7fc85ad. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-01-09 12:34:36 -08:00
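The atomic `os.replace()` write mentioned in this commit can be illustrated with a standard-library sketch. `write_json_atomically` is a hypothetical helper, not the code in `arcade_core/usage/identity.py`, and the `portalocker` locking is left out so the example has no third-party dependency.

```python
import json
import os
import tempfile


def write_json_atomically(path: str, data: dict) -> None:
    """Write data to path so readers never observe a partially written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace overwrites an existing destination on both POSIX and Windows.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A lock is still needed when several processes read, modify, and write the same file; the atomic swap only guarantees that each individual write appears all at once.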
jottakka	98fad93d21	Adding MCP Servers supports to Arcade Evals (#689 ) # MCP Server Tool Evaluation Support ## Overview Add support for evaluating tools from remote MCP servers without requiring Python callables. Enables direct evaluation of any MCP-compatible tool server. ## What's New ### Core Features - `MCPToolRegistry`: Evaluate tools from a single MCP server - `CompositeMCPRegistry`: Evaluate tools from multiple MCP servers simultaneously - Automatic loaders: `load_from_stdio()` and `load_from_http()` to fetch tools from running servers - Automatic namespacing: Tools prefixed with server name (e.g., `server_tool_name`) - Smart name resolution: Use short names if unique, full names if ambiguous - OpenAI strict mode: Automatic schema conversion prevents parameter hallucinations ### Usage Automatic Loading: ```python from arcade_evals import load_from_stdio, MCPToolRegistry # Load tools automatically from MCP server tools = load_from_stdio(["npx", "-y", "@modelcontextprotocol/server-github"]) registry = MCPToolRegistry(tools) ``` Single MCP Server: ```python from arcade_evals import MCPToolRegistry, ExpectedToolCall registry = MCPToolRegistry(mcp_tools) suite = EvalSuite(catalog=registry) suite.add_case( expected_tool_calls=[ ExpectedToolCall(tool_name="tool_name", args={...}) ] ) ``` Multiple MCP Servers: ```python from arcade_evals import CompositeMCPRegistry, load_from_stdio # Load from multiple servers github_tools = load_from_stdio(["npx", "-y", "@modelcontextprotocol/server-github"]) slack_tools = load_from_stdio(["npx", "-y", "@modelcontextprotocol/server-slack"]) composite = CompositeMCPRegistry( tool_lists={ "github": github_tools, "slack": slack_tools, } ) suite = EvalSuite(catalog=composite) suite.add_case( expected_tool_calls=[ ExpectedToolCall(tool_name="github_list_issues", args={...}) ] ) ``` ## Implementation ### Files Changed - `libs/arcade-evals/arcade_evals/registry.py` (NEW): Registry abstractions and implementations - 
`libs/arcade-evals/arcade_evals/loaders.py` (NEW): Automatic tool loading from MCP servers - `libs/arcade-evals/arcade_evals/eval.py` (MODIFIED): Enhanced `ExpectedToolCall` and evaluation logic - `libs/arcade-evals/arcade_evals/__init__.py` (MODIFIED): Exported new registries and loaders ### Key Technical Details - Added `BaseToolRegistry` interface for abstraction - `MCPToolRegistry` handles single server tools - `CompositeMCPRegistry` manages multiple servers with collision detection - `load_from_stdio()` and `load_from_http()` for automatic tool discovery - Fixed name normalization bug: MCP tools use underscores (not dots) - Optimized tool copying: 2.5x faster via shallow copy ## Testing - ✅ 41 tests passing (25 new tests added) - ✅ `test_eval_mcp_registry.py`: MCPToolRegistry functionality - ✅ `test_eval_composite_mcp.py`: CompositeMCPRegistry with multiple servers - ✅ Verified backward compatibility with Python tools ## Backward Compatibility ✅ 100% backward compatible - No breaking changes ## Breaking Changes None <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Adds end-to-end eval UX: examples, a robust CLI runner, and rich outputs. > > - New examples: `eval_arcade_gateway.py`, `eval_stdio_mcp_server.py`, `eval_http_mcp_server.py`, `eval_comprehensive_comparison.py` with timeouts, error handling, and track-based comparisons; detailed `README.md` > - CLI runner: `arcade_cli/evals_runner.py` to execute evals/capture in parallel with progress, error isolation, failed-only filtering, context inclusion, and multi-provider/model support > - Output formatters: `arcade_cli/formatters/` (txt, md, html, json) for evals and capture; comparative and multi-model HTML with tabs and context rendering > - Display refactor: `display.py` now supports writing multiple formats, failed-only disclaimers, include-context, and improved console summaries > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit ff8acf9c34a6b61462a019a1ee9df081006517d0. 
This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Francisco Liberal <francisco@arcade.dev> Co-authored-by: Mateo Torres <torresmateo@gmail.com>	2026-01-07 20:26:23 -03:00
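The namespacing and name-resolution rules listed in this commit (prefix each tool with its server name, accept a short name only when it is unique) might look roughly like this. `build_name_index` and `resolve` are hypothetical helpers written for illustration; they are not part of `arcade_evals`, and the real `CompositeMCPRegistry` may behave differently.

```python
def build_name_index(tool_lists: dict[str, list[str]]) -> dict[str, str]:
    """Map every accepted name (namespaced or unique short) to a namespaced name."""
    index: dict[str, str] = {}
    short_owners: dict[str, list[str]] = {}
    for server, tools in tool_lists.items():
        for tool in tools:
            full = f"{server}_{tool}"  # underscore separator, as in the commit
            index[full] = full
            short_owners.setdefault(tool, []).append(full)
    for short, owners in short_owners.items():
        # A short name is usable only when exactly one server exposes it.
        if len(owners) == 1:
            index.setdefault(short, owners[0])
    return index


def resolve(index: dict[str, str], name: str) -> str:
    """Return the namespaced tool name, or raise if it is unknown or ambiguous."""
    if name not in index:
        raise KeyError(f"unknown or ambiguous tool name: {name}")
    return index[name]


index = build_name_index(
    {"github": ["list_issues", "search"], "slack": ["search", "post_message"]}
)
print(resolve(index, "list_issues"))
print(resolve(index, "slack_search"))
```

Here `search` is exposed by both servers, so it has to be requested as `github_search` or `slack_search`.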
Eric Gustin	25309c4e15	Fix broken links (#738 ) https://github.com/ArcadeAI/docs/pull/622 moved a lot of files to new URLs <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Updates references to Arcade docs after site restructure and bumps package versions. > > - Update docs URLs in `README.md`, `SECURITY.md`, contrib READMEs (CrewAI, LangChain), and CLI template README to new `/en/...` paths > - Update `documentation_url` in `arcade_mcp_server/server.py` error message to the new "compare server types" doc > - Bump versions: `arcade-mcp-server` to `1.14.1` and root `arcade-mcp` to `1.7.2` > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 673b1ee7c2e5be6885ffd64914e7600b4685aaac. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-01-05 13:27:16 -08:00
jottakka	7a06bdfa7e	PagerDuty typed OAuth object (#718 ) <!-- CURSOR_SUMMARY --> > [!NOTE] > Adds a typed `PagerDuty` OAuth2 provider and wires it through TDK/MCP exports, with tests and coordinated version/dependency bumps. > > - Auth (core): > - Add typed OAuth2 provider `PagerDuty` (`provider_id="pagerduty"`) in `arcade_core/auth.py`. > - TDK & MCP Server: > - Re-export `PagerDuty` in `arcade_tdk/auth/__init__.py` and `arcade_mcp_server/auth/__init__.py`. > - Tests: > - Extend `test_tool_decorator.py` and `test_create_tool_definition.py` to cover `PagerDuty` success/failure and tool requirement generation. > - Versioning/Deps: > - Bump versions: `arcade-core`→`4.1.0`, `arcade-tdk`→`3.4.0`, `arcade-mcp-server`→`1.14.0`, root `arcade-mcp`→`1.7.1`. > - Update dependency ranges to require the bumped versions. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 2b60261b1962586ea58831ccb6ea66e57053ac86. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Francisco Liberal <francisco@arcade.dev>	2025-12-15 17:42:11 -03:00
Sterling Dreyer	069ce70fb2	Instrumentation for outbound requests (#726 ) <!-- CURSOR_SUMMARY --> > [!NOTE] > Adds OpenTelemetry instrumentation for outbound HTTP (httpx, aiohttp, requests), guards span environment attribute, and updates package/version deps. > > - Telemetry: > - Instrument outbound HTTP clients with OpenTelemetry: `httpx`, `aiohttp-client`, and `requests`. > - Tighten excluded span types using `Literal`. > - Core: > - Guard setting `environment` span attribute in `CallToolComponent` only if present on `worker`. > - Packaging: > - Bump `arcade-serve` to `3.2.1`. > - Add new dependencies for HTTP client instrumentation and `aiohttp`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 9ab57f3c33d6033ff9ec4c6a40445a85328b169a. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2025-12-12 15:30:11 -08:00
Eric Gustin	4d54b28926	Bump some versions (#723 ) `arcade-mcp-server` version was not bumped in https://github.com/ArcadeAI/arcade-mcp/pull/717, so this PR bumps `arcade-mcp-server`, and then updates `arcade-mcp`'s dependency on `arcade-mcp-server` by increasing the minimum version <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Bumps arcade-mcp-server to 1.13.0, updates arcade-mcp to 1.6.2, and raises related dependency minimums (including example auth server). > > - Versions: > - Bump `libs/arcade-mcp-server` project version from `1.12.0` to `1.13.0`. > - Bump `arcade-mcp` package version from `1.6.1` to `1.6.2`. > - Dependencies: > - Raise `arcade-mcp` dependency on `arcade-mcp-server` to `>=1.13.0` in `pyproject.toml` (including `all` extra). > - Increase example server `examples/mcp_servers/authorization/pyproject.toml` minimum `arcade-mcp-server` to `>=1.12.0`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 8a4f606bd8d0b48dd50e3e8e836d31bb679c6eba. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2025-12-11 14:09:23 -08:00
Nate Barbettini	aae9b3a49c	feat: Support multiple orgs & projects in Arcade CLI (#717 ) Fixes [PLT-720: Refactor CLI to support multiple orgs + projects](https://linear.app/arcadedev/issue/PLT-720/refactor-cli-to-support-multiple-orgs-projects) This PR removes the legacy login flow (login to get an API key) from Arcade CLI. Believe it or not, this flow predates the ability to get an API key from the Dashboard, or even the Dashboard itself! Notable changes: Legacy handling - When a user with an existing `credentials.yaml` updates the CLI, they will get instructions on fixing their old credentials: <img width="978" height="146" alt="Screenshot 2025-12-08 at 10 10 37" src="https://github.com/user-attachments/assets/5aeaef2c-bef7-4642-a2f7-f917b257c94b" /> Any commands that require login (non-public commands) will be blocked with the above message until `arcade logout / arcade login` is performed again. New login flow ```sh arcade login Opening a browser to log you in... ✅ Logged in as nate@arcade.dev. Active project: Nate Barbettini's organization / Default project Run 'arcade org list' or 'arcade project list' to see available options. ``` List and set the active organization ```sh arcade org list ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┓ ┃ Name ┃ ID ┃ Default ┃ Active ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━┩ │ Nate Barbettini's organization │ 1c64968e-fdc5-4c55-8612-2ce46cd7881b │ ✓ │ ✓ │ │ Sergio 743 │ 1f1f6184-58dc-4bac-bdde-b9184e43fdf3 │ │ │ └────────────────────────────────┴──────────────────────────────────────┴─────────┴────────┘ Use 'arcade org set <org_id>' to switch organizations. 
``` ```sh arcade org set 1c64968e-fdc5-4c55-8612-2ce46cd7881b ✓ Switched to organization: Nate Barbettini's organization Active project: Default project ``` List and set the active project ```sh arcade project list Active organization: Nate Barbettini's organization Use 'arcade org list' and 'arcade org set <org_id>' to switch organizations. ┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┓ ┃ Name ┃ ID ┃ Default ┃ Active ┃ ┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━┩ │ Default project │ 35166bf3-6e68-481e-bf16-f747fadc6c22 │ ✓ │ ✓ │ │ Second project │ 62963205-31ea-4fda-9fc4-af10db89c06f │ │ │ └─────────────────┴──────────────────────────────────────┴─────────┴────────┘ Use 'arcade project set <project_id>' to switch projects. ``` ```sh arcade project set 35166bf3-6e68-481e-bf16-f747fadc6c22 ✓ Switched to project: Default project ``` <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Migrates CLI to OAuth2 (PKCE) with saved org/project context, adds org/project commands, rewrites Engine calls to org-scoped endpoints, and bumps core packages. > > - Auth & Config > - Implement OAuth2 Authorization Code + PKCE (`arcade_cli/authn.py`) with local callback server and Jinja templates. > - Persist tokens and active `context` (org/project) in `credentials.yaml` via updated config models (`arcade_core/config_model.py`). > - Add token refresh and CLI config fetch utilities (`arcade_core/auth_tokens.py`). > - Detect legacy API-key credentials and block protected commands until re-login; add `whoami` command. > - Org/Project Management > - New subcommands: `arcade org list\|set`, `arcade project list\|set` (fetch via Coordinator). > - Engine API usage (org-scoped) > - Introduce org/project URL rewriting transports (`arcade_core/network/org_transport.py`) and helpers (`get_org_scoped_url`, `get_arcade_client`, `get_auth_headers`). 
> - Update `deploy`, `server`, and `secret` commands to use Bearer tokens and org-scoped paths; adjust log streaming/status, secrets CRUD, and deployment workflows. > - CLI UX > - Replace legacy login URLs/constants; add success/failure HTML templates for browser callback. > - Tweak `dashboard` to health-check without credentials. > - Usage tracking now includes `org_id`/`project_id` properties. > - Tests > - Update tests for dashboard, secrets, utils, and usage identity (OAuth `/whoami`). > - Dependencies & Versions > - Bump packages: `arcade-core@4.0.0`, `arcade-mcp-server@1.12.0`, `arcade-serve@3.2.0`, `arcade-tdk@3.3.0`; add `authlib`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 49702c2f74b9db15bb286d3ec71179b4e74a9134. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2025-12-11 12:58:55 -08:00
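The PKCE part of the new login flow rests on a verifier/challenge pair as defined in RFC 7636. A minimal standard-library sketch of the S256 method is shown below; `challenge_for` and `make_pkce_pair` are hypothetical helper names, and this is not the code in `arcade_cli/authn.py`, which adds `authlib` as a dependency.

```python
import base64
import hashlib
import secrets


def challenge_for(verifier: str) -> str:
    """Return the S256 code_challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Return a fresh (code_verifier, code_challenge) pair."""
    # 32 random bytes encode to a 43-character URL-safe verifier,
    # the minimum length RFC 7636 allows.
    verifier = secrets.token_urlsafe(32)
    return verifier, challenge_for(verifier)


verifier, challenge = make_pkce_pair()
print(len(verifier), len(challenge))
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code cannot be redeemed without the verifier.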
Eric Gustin	98fd13c4ed	Front-Door Auth (#696 ) # Valuable references for the reviewer: - Docs PR: https://github.com/ArcadeAI/docs/pull/583 - Implements Phase 1 of the following planning doc: https://linear.app/arcadedev/project/arcade-mcp-supports-mcp-auth-front-door-auth-7cbaa20cb054/overview https://github.com/user-attachments/assets/79ad43fd-f5e8-4793-a1dd-18b35acefdc3 # PR Description Adds OAuth 2.1 Resource Server authentication to arcade-mcp-server, enabling HTTP MCP servers to validate Bearer tokens on every request. This unlocks tool-level authorization and secrets support for HTTP servers. - Multiple authorization server support - Granular token validation options (verify_exp, verify_iat, verify_iss) - Environment variable configuration - OAuth discovery metadata endpoint (/.well-known/oauth-protected-resource) - Extracts sub claim from token as context.user_id - Lifts transport restrictions for tools requiring auth/secrets on HTTP when protected ```python from arcade_mcp_server import MCPApp from arcade_mcp_server.resource_server import ResourceServerAuth, AuthorizationServerEntry resource_server_auth = ResourceServerAuth( canonical_url="http://127.0.0.1:8000/mcp", authorization_servers=[ AuthorizationServerEntry( authorization_server_url="https://auth.example.com", issuer="https://auth.example.com", jwks_uri="https://auth.example.com/jwks", ) ], ) app = MCPApp(name="my_server", version="1.0.0", auth=resource_server_auth) ``` # Testing Beyond the comprehensive unit tests, I also manually tested end-to-end with WorkOS Authkit (DCR) and KeyCloak (non-DCR). # Future Work - CIMD support - An `ArcadeResourceServer` to make adding front-door auth super easy when using Arcade's Auth Server <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Adds OAuth 2.1 front-door auth (JWKS validation + OAuth discovery) and propagates user identity to tools, enabling auth/secret-requiring tools over HTTP. 
> > - Authentication (Front-Door OAuth 2.1) > - New `resource_server` module with `ResourceServerAuth` (multi-authorization-server, metadata) and `JWKSTokenValidator` (JWKS-based JWT validation) plus granular validation options. > - ASGI `ResourceServerMiddleware` validates Bearer tokens on every HTTP request and injects `resource_owner`. > - OAuth discovery endpoint via FastAPI router at `/.well-known/oauth-protected-resource[/<path>]`. > - Integration > - `MCPApp`/`worker` accept `auth`/`resource_server_validator`, mount middleware, expose discovery; logs accepted auth servers. > - HTTP transport (`http_streamable`) carries `SessionMessage` with `resource_owner` from request → session. > - `Context`/`Session`/`Server` plumb `resource_owner`; `Server` selects `user_id` preferring token `sub`. > - Behavior Changes > - HTTP transport restriction lifted for tools requiring `authorization`/`secrets` when request is authenticated; otherwise blocked with actionable error. > - Configuration > - Env-var based auth config via `MCP_RESOURCE_SERVER_` in `MCPSettings.ResourceServerSettings`; `.env` auto-load. > - Telemetry* > - Usage tracking records `resource_server_type` on server start. > - Examples > - New `examples/mcp_servers/authorization` sample server (HTTP auth, secrets, Reddit tool) with Docker setup. > - Tests > - Extensive unit tests for validators, middleware, env config, multi-AS, transport rules, and app integration. > - Version > - Bump `arcade-mcp-server` to `1.12.0`; minor docstring tweak in `__init__.py`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit d1116cdcafb0c7cb8f91e66682eb1fbae380da31. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> Resolves TOO-152	2025-12-11 12:51:20 -08:00
Sterling Dreyer	99c22f0ebb	Ability to run multiple uvicorn workers (#721 ) <!-- CURSOR_SUMMARY --> > [!NOTE] > Adds --workers to HTTP mode with validation, refactors server startup/discovery for multi-process uvicorn, and removes all Docker-related files/configs. > > - MCP Server (HTTP mode) > - Add `--workers` arg to run multiple uvicorn workers; block `workers > 1` with `stdio`, and `reload` with multiple workers. > - Refactor startup: move tool discovery/config into `create_arcade_mcp_factory()` driven by env vars; use `uvicorn.run(..., workers=...)` for multi-worker/reload; retain `serve_with_force_quit()` only for single-worker. > - Adjust CLI to only discover tools in `stdio` path; HTTP path now delegates discovery to the factory. > - MCPApp > - Minor run path cleanup; continue using `serve_with_force_quit()` for single-worker HTTP. > - Ops/Packaging > - Remove `docker/` directory and all Dockerfiles, compose/configs, and docs. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit c5700ac8855173c1e82c6f7e41b30ca173aaec14. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2025-12-10 11:06:24 -08:00
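The option checks described in this commit (no multiple workers with `stdio`, no `reload` with multiple workers) can be sketched as a small validation function. `validate_server_options` and its error messages are hypothetical and written for illustration; the actual checks in `arcade-mcp-server` may be structured differently.

```python
def validate_server_options(transport: str, workers: int, reload: bool) -> None:
    """Raise ValueError for option combinations the server cannot run with."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if transport == "stdio" and workers > 1:
        # stdio speaks over a single stdin/stdout pair, so it cannot be split.
        raise ValueError("multiple workers are only supported in HTTP mode")
    if reload and workers > 1:
        raise ValueError("reload cannot be combined with multiple workers")


validate_server_options(transport="http", workers=4, reload=False)
```

Running the checks before the server starts lets the CLI fail fast with a clear message instead of surfacing an error from a spawned worker process.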
Evan Tahler	0fc9e21308	Improve error messages with fix instructions (#713 ) Improve user-facing error messages to provide actionable fix instructions, enhancing developer experience and reducing support queries. --- Linear Issue: [TOO-199](https://linear.app/arcadedev/issue/TOO-199/audit-error-messages-for-actionable-fix-instructions) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Enhances MCP server/session error responses with clear, actionable guidance across JSON-RPC, tools, and resources; updates tests to assert new messages. > > - Server (`arcade_mcp_server/server.py`) > - Actionable JSON-RPC errors: Rich messages for `Invalid request`, `Not initialized`, `Method not found`, and internal errors with troubleshooting steps. > - Tools: > - `tools/list`/`tools/call`: Improved internal error messages; user-facing guidance on failures. > - Unknown tool: returns detailed fix instructions. > - Transport restrictions: explicit "Unsupported transport" guidance for HTTP vs `stdio` with docs link. > - Auth flow: messages for missing API key, pending authorization (with `authorization_url`), and authorization errors; includes next steps. > - Secrets: clear "Missing secret(s)" with `.env`/env-var setup instructions. 
> - Resources/Prompts: > - `resources/list`, `resources/templates/list`, `resources/read`, `prompts/list`, `prompts/get`: Detailed failure and not-found messages with guidance. > - Session (`arcade_mcp_server/session.py`) > - Enhanced internal error response formatting with troubleshooting steps. > - Tests (`libs/tests/arcade_mcp_server/test_server.py`) > - Updated assertions to match new, descriptive messages (e.g., "Authorization required", "Missing Arcade API key", "Unsupported transport"). > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 97a6db4ec80a1ea9597f3364b6325d47948c94e0. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Eric Gustin <34000337+EricGustin@users.noreply.github.com>	2025-12-10 10:16:38 -08:00

150 commits