## Summary
- Add shared span attributes for tool error diagnostics, including
developer-facing messages when present.
- Wire those attributes through MCP server, worker RunTool, and HTTP
CallTool spans while keeping default MCP response content public-only.
- Cover no-leak response behavior, non-recording spans, outputless
worker responses, and the shared attribute contract.
## Verification
- `uv run ruff format ...`
- `uv run ruff check ...`
- `uv run pytest -W ignore
libs/tests/arcade_mcp_server/test_debug_exposure_integration.py
libs/tests/core/test_log_extras.py
libs/tests/worker/test_worker_base.py`
Made with [Cursor](https://cursor.com)
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Adds new telemetry attributes that propagate tool error messages
(including optional developer_message) into active spans across MCP
server and worker execution paths; risk is mainly around potential
leakage of sensitive developer messages into tracing backends and
changes to observability contracts.
>
> **Overview**
> Adds a shared
`arcade_core.log_extras.build_tool_error_span_attributes()` helper and
wires it into tool error paths so the current OpenTelemetry span is
annotated with stable `tool_error_*` attributes (including
`developer_message` when present).
>
> MCP tool calls now record these span attributes on failure while
keeping default MCP response content sanitized, and `arcade-serve`
records the same attributes on both `RunTool` and HTTP `CallTool` spans
(handling `output=None`). Versions and dependency constraints are bumped
to consume the new core helper, with tests added/updated to lock the
span-attribute contract and verify behavior for non-recording spans and
no-leak responses.
>
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
33a53991d72140a662152f508dc53e9b769b9f07. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Low Risk**
> Low risk: only rotates the PostHog project API key used for CLI
telemetry; main risk is misconfiguration causing events to be sent to
the wrong project or dropped.
>
> **Overview**
> Updates `UsageService` to use a new PostHog project API key for CLI
usage telemetry, redirecting all `alias` and background `capture` events
to the new PostHog project.
>
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
8cb1c7f1cc5bccd22cb2c73469848d88a703f27e. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
## Summary
- Adds `NetworkTransportError` — a new sibling to `UpstreamError` under
`ToolExecutionError` — for failures where no complete HTTP response was
received from the upstream service (timeouts, connection errors, pool
exhaustion, DNS failures, decoding issues, redirect exhaustion)
- Routes client-construction bugs (`InvalidURL`, `UnsupportedProtocol`,
`MissingSchema`, `SSLError`, `InvalidHeader`, etc.) to existing
`FatalToolError` instead of `UpstreamError`
- Adds 3 new `ErrorKind` values: `NETWORK_TRANSPORT_RUNTIME_TIMEOUT`,
`_UNREACHABLE`, `_UNMAPPED` — operationally distinct telemetry slices
matching the UpstreamError pattern
- `UpstreamError` is unchanged and reserved for real HTTP responses with
status codes
Addresses Eric's feedback on #820: the `include_status_code=False`
post-init null-out workaround is replaced by a clean class hierarchy
where `NetworkTransportError.status_code` is natively `None`.
### Changes
| File | What |
|---|---|
| `arcade-core/errors.py` | 3 new `ErrorKind` values,
`NetworkTransportError` class, `is_network_transport_error` helper |
| `arcade-tdk/providers/http/error_adapter.py` | Full rewrite of httpx +
requests exception routing with 3-way split |
| `arcade-tdk/providers/graphql/error_adapter.py` |
`TransportConnectionFailed`/`TransportProtocolError` →
`NetworkTransportError` |
| `arcade-tdk/errors.py`, `arcade-mcp-server/exceptions.py` | Re-exports
|
| `pyproject.toml` × 3 | Version bumps: core 4.7.0, tdk 3.7.0,
mcp-server 1.20.0 |
| Tests × 3 | 33 new tests, 3 updated (2659 passed, 0 failures) |
### Exception routing table
| Exception | Target | Kind | can_retry |
|---|---|---|---|
| `httpx.HTTPStatusError`, `requests.HTTPError` (with response) |
`UpstreamError` | status-derived | status-derived |
| `httpx.TimeoutException`, `requests.Timeout` | `NetworkTransportError`
| `TIMEOUT` | ✅ |
| `httpx.TransportError`, `requests.ConnectionError` |
`NetworkTransportError` | `UNREACHABLE` | ✅ |
| `httpx.DecodingError`, `TooManyRedirects`, fallback |
`NetworkTransportError` | `UNMAPPED` | varies |
| `httpx.InvalidURL`/`UnsupportedProtocol`/`LocalProtocolError`,
`requests.MissingSchema`/`SSLError`/etc. | `FatalToolError` |
`TOOL_RUNTIME_FATAL` | ❌ |
### Engine companion PR
ArcadeAI/monorepo — `feat/network-transport-error-kinds` adds the 3
`ErrorKind` constants to Go schemas + OpenAPI docs. No engine logic
changes needed (ErrorKind is a string alias, retry uses `can_retry` flag
only, telemetry auto-slices).
## Test plan
- [x] 2659 existing tests pass (0 failures)
- [x] 33 new routing + class tests added
- [x] mypy clean on arcade-core, arcade-tdk
- [ ] Verify engine telemetry dashboard auto-surfaces new
`NETWORK_TRANSPORT_*` kinds after deploy
🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Changes the error taxonomy and classification helpers used for
retries/telemetry, so misclassification could affect operational
behavior, but the change is additive and covered by new tests.
>
> **Overview**
> Adds a new error category for outbound request failures that never
yield a complete upstream response: `NetworkTransportError` (sibling to
`UpstreamError`) plus
`ErrorKind.NETWORK_TRANSPORT_RUNTIME_{TIMEOUT,UNREACHABLE,UNMAPPED}` and
matching `is_network_transport_error` classification helpers on both
`ToolkitError` and the wire-model `ToolCallError`.
>
> Re-exports `NetworkTransportError` from `arcade-tdk` and
`arcade-mcp-server`, bumps package versions (`arcade-core` 4.7.0,
`arcade-tdk` 3.7.0, `arcade-mcp-server` 1.20.0) and dependency minimums,
and expands `core/test_errors.py` to cover the new kind
invariants/defaults and classification behavior.
>
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
d2b89078729c6a67ba42684dc98445352238bc1d. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary
- Improve tool call error messages across 4 libraries (arcade-core,
arcade-tdk, arcade-mcp-server, arcade-serve) so agents can self-correct
and Datadog can facet on structured fields
- Guard empty error messages, enrich input validation errors with
field-level detail, fix `@tool` decorator fallback formatting, surface
`additional_prompt_content` in MCP responses, and add structured log
extras for Datadog
- Addresses the 3 worst error patterns: generic "Error in tool input
deserialization", bare `KeyError` values, and empty `FatalToolError`
messages
**Linear:** TOO-627
**Plan:** `docs/plans/2026-04-08-improve-error-messages-handoff.md`
## Tasks
- [ ] Task 1: Guard empty error messages (arcade-core)
- [ ] Task 2: Enrich input validation error messages (arcade-core)
- [ ] Task 3: Improve `@tool` decorator error fallback (arcade-tdk)
- [ ] Task 4: Fix MCP agent-facing error response (arcade-mcp-server)
- [ ] Task 5: Add structured log extras in BaseWorker (arcade-serve)
- [ ] Task 6: Add structured log extras in MCP server
(arcade-mcp-server)
## Test plan
- [ ] Each task has dedicated unit tests verifying the new behavior
- [ ] `make test` passes after all tasks
- [ ] `make check` (ruff + mypy) passes
- [ ] Verify the 3 worst error patterns now produce actionable messages
🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Touches cross-library error formatting and logging behavior used in
production tool execution paths; while mostly additive/guardrails, it
changes agent-visible messages and Datadog log facets, which could
impact client expectations and alerting.
>
> **Overview**
> Improves tool-call error handling across core/runtime, MCP transport,
worker transport, and the TDK to make agent-visible failures more
actionable while *reducing sensitive-data leakage*.
>
> In `arcade-core`, empty error messages now get placeholders,
`ToolOutputFactory.fail*` defaults blank messages, and input validation
errors are rewritten as field-level summaries that intentionally omit
rejected values (avoiding Pydantic echo of secrets). The `@tool`
fallback in `arcade-tdk` no longer surfaces `str(exception)` to agents;
it returns exception *type-only* in `message` while preserving full
detail in `developer_message`.
>
> Adds a shared `build_tool_error_log_extra` helper and updates
`arcade-serve` + `arcade-mcp-server` to emit consistent structured
WARNING logs (`error_*`, `tool_name`, optional toolkit/version) for
Datadog, while MCP error responses now append
`additional_prompt_content` and force `structuredContent=None` on
failures per spec. Includes extensive new tests and bumps package
versions (`arcade-core` 4.6.2, `arcade-tdk` 3.6.1, `arcade-mcp-server`
1.19.3, `arcade-serve` 3.2.3).
>
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
e5c7ebcaf56176cfbd8e6d1f2b6295352abd0ec0. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a tool’s output TypedDict uses total=False, MCP clients reject the
response with:
```
MCP error -32602: Structured content does not match the tool's output schema
```
Note that the bug also exists for the Engine transport
(/worker/tools/execute), but since the engine doesn't validate the
output schema, the bug never surfaced. This PR addresses the problem
holistically (MCP and Engine) in preparation for a future where the
Engine transport validates output schemas.
Two bugs combined to cause this:
1. Schema: The outputSchema had no required array and declared all
fields as strict types (e.g. "type": "string"), making every field look
mandatory and non-null.
2. Serialization: model_dump() on TypedDict-derived Pydantic models
emitted None for absent optional fields. A tool returning {"name":
"hello"} produced {"name": "hello", "optional_field": null} which is a
value the schema forbids.
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Adjusts core schema generation and MCP JSON Schema conversion for
TypedDicts, affecting how tool input/output contracts are emitted and
validated across clients; mistakes could break compatibility or
validation behavior.
>
> **Overview**
> Fixes MCP/engine validation failures for `TypedDict(total=False)`
outputs by ensuring absent optional keys are **omitted from serialized
output** and that emitted schemas correctly describe **required vs
optional** keys.
>
> `arcade-core` now tracks `required_keys`/`inner_required_keys` and
per-field `nullable` in `ValueSchema`, derives required sets from
TypedDict `__required_keys__`, and unwraps `Optional[T]` to support
optional nested TypedDicts; TypedDict-derived Pydantic models now
`model_dump(exclude_unset=True)` to avoid leaking missing fields as
`null`.
>
> `arcade-mcp-server` JSON Schema conversion now emits `required` arrays
(including for arrays of objects), supports `nullable` by generating
`type: [<type>, "null"]` (and `enum` including `None`), and treats
nullable top-level objects as valid unwrapped output schemas. Adds
focused unit/end-to-end tests plus an expanded example server
demonstrating total-false, mixed required/optional, nullable, and
optional-nested TypedDict outputs, and bumps package
versions/dependencies accordingly.
>
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
53fe8365f613053599130520b75f30b614b465ca. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
We will be adding some product analytics toolkits in the near future
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Low Risk**
> Low risk: adds a new `ServiceDomain` enum value and bumps the package
version, with minimal behavioral impact beyond any downstream enum
matching/serialization expectations.
>
> **Overview**
> Adds `ServiceDomain.PRODUCT_ANALYTICS` (`"product_analytics"`) to tool
metadata classification to support upcoming product analytics
integrations.
>
> Bumps `arcade-core` version from `4.5.0` to `4.6.0`.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
84666eaf997401559f8025dbe43563fdd03acd49. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
<!-- CURSOR_SUMMARY -->
> [!NOTE]
> **Medium Risk**
> Tightens validation on tool metadata, which may break existing tools
that relied on non-JSON-serializable `extras` values or keys; changes
are localized and well-covered by tests.
>
> **Overview**
> Adds **JSON-safety enforcement** for `ToolMetadata.extras`: top-level
keys must be strings at model construction, and `validate_for_tool()`
now recursively rejects non-JSON-native values (including non-finite
floats) with path-rich `ToolDefinitionError` messages.
>
> Expands tests to cover valid/invalid nested `extras` cases and
error-message quality, and bumps `arcade-core` version to `4.5.0`.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
2bab0db3c17f0ddb97868764d10494da543b39e5. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
<!-- CURSOR_SUMMARY -->
> [!NOTE]
> **Medium Risk**
> Touches authentication/login flow, credentials-file permissions, and
subprocess lifecycle behavior across platforms; while mostly defensive,
regressions could impact login or process management on Windows/macOS
runners.
>
> **Overview**
> Improves Windows/cross-platform reliability across the CLI and MCP
server: OAuth login now binds the callback server to `127.0.0.1`, avoids
slow loopback reverse-DNS, adds a configurable callback timeout
(`--timeout` + env default), and opens URLs via a Windows-friendly
`_open_browser` to avoid flashing console windows.
>
> Centralizes CLI output via a shared `console` that forces UTF-8 on
Windows, standardizes UTF-8 file reads/writes throughout, tightens
credentials-file permissions on Windows using `icacls`, and adds shared
Windows subprocess helpers for **no-window** process creation and
graceful termination (used by `deploy`, MCP reload, and usage-tracking
worker).
>
> Updates client configuration UX/robustness (Windows AppData resolution
via `platformdirs`, Cursor config path fallbacks + compatibility writes,
overwrite warnings, absolute `uv` path for GUI clients, safer path
display) and improves `deploy` child-process handling to avoid
pipe-buffer deadlocks while giving better debug-aware error messages.
>
> Expands CI to run tests on Linux/Windows/macOS, adds a no-auth CLI
integration workflow, disables usage tracking in toolkits CI, and adds
extensive regression tests for Windows signals, subprocess cleanup,
UTF-8, and config-path edge cases; bumps `arcade-core` to `4.4.2` and
`arcade-mcp-server` to `1.17.2` (with updated dependency pin).
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
0fabd8ca1cd647039ba6ddbdf3f7809c330bab9e. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
Add Attio to the wellknown OAuth2 provider classes so toolkits can use
Attio(scopes=[...]) instead of OAuth2(id=..., scopes=[...]).
---------
Co-authored-by: Eric Gustin <eric@arcade.dev>
we actually don't want to deprecate ToolExecutionError
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Low Risk**
> Low risk: behavior change is limited to no longer emitting a
`DeprecationWarning` when `ToolExecutionError` is instantiated, plus a
patch version bump.
>
> **Overview**
> `ToolExecutionError` is no longer treated as deprecated: the
`warnings` import and runtime `DeprecationWarning` emission were
removed, and the class docstring was updated to describe it as the base
exception for errors raised from within a tool body.
>
> Bumps `arcade-core` version from `4.2.2` to `4.2.3`.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
250d20e393a8a4d8dc20fad673a7faea1cba4797. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
1. Resolves
[TOO-363](https://linear.app/arcadedev/issue/TOO-363/arcade-deploy-fails-when-additional-deps-are-added-to-the-server).
2. Resolves
[TOO-364](https://linear.app/arcadedev/issue/TOO-364/arcade-cores-tool-skip-logic-is-missing-case-for-direct-execution).
3. Resolves
[TOO-358](https://linear.app/arcadedev/issue/TOO-358/missing-evals-error-message-shows-wrong-command).
4. Resolves
[TOO-365](https://linear.app/arcadedev/issue/TOO-365/arcade-evals-unit-tests-are-hanging).
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Medium risk because it changes how `arcade deploy` spawns the server
process and adjusts toolkit discovery skip logic, which can affect
deployments and tool discovery; however, the changes are small and
covered by new unit/integration tests.
>
> **Overview**
> `arcade deploy` now starts the validation server using the project’s
`.venv` interpreter (via `find_python_interpreter`) instead of the CLI’s
own `sys.executable`, preventing missing dependency failures when the
CLI is installed in an isolated env.
>
> `arcade-core`’s `Toolkit.tools_from_directory` skip logic is hardened
to also skip the currently executing entrypoint by module name
(`__main__.__spec__.name`) when file paths don’t match (e.g., bundled
execution). CLI error printing now escapes plain messages to avoid rich
markup issues, and `arcade-evals` lock acquisition accepts an optional
timeout default.
>
> Adds unit tests for the new toolkit skip behavior and an integration
test that boots the MCP server via direct Python invocation to mirror
deployment behavior, and bumps `arcade-core`, `arcade-mcp-server`, and
root dependency versions accordingly.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
e7785634c231c059f2e0bd1bc73a56bd7470a494. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
The 'MCP server started' events would fail to send to posthog if the CLI
was not installed. This PR fixes this by moving PostHog from being a
dependency of the CLI to a dependency of arcade-core.
<!-- CURSOR_SUMMARY -->
> [!NOTE]
> Aligns versions and dependency ranges across the CLI and server
packages; removes an unnecessary dependency.
>
> - Bump `arcade-mcp-server` to `1.14.2` and `arcade-mcp` to `1.8.1`
> - Update `arcade-core` constraint to `>=4.2.1,<5.0.0`; CLI now
requires `arcade-mcp-server>=1.14.2`
> - Remove `posthog` from CLI dependencies
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
69f8bb397737d4c01f57630863762109819dbc4f. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
So even importing `fcntl` causes problems on windows. This PR replaces
fcntl with portalocker. Tests all pass, so I think we are good.
ref:
https://arcade-ai.slack.com/archives/C08K1SJ072S/p1767897850450239?thread_ts=1766186586.406019&cid=C08K1SJ072S
<img width="934" height="501" alt="Screenshot 2026-01-08 at 2 57 46 PM"
src="https://github.com/user-attachments/assets/1375b6b2-116c-44bd-bbe1-2157dd243d29"
/>
Closes ENGTOP-8
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Cross-platform file locking**
>
> - Replace `fcntl` with `portalocker` in
`arcade_core/usage/identity.py` (shared/exclusive locks); switch to
atomic `os.replace()`
> - Add `portalocker` dependency and bump `arcade-core` to `4.2.1`
>
> **Installation/CI**
>
> - New GitHub Actions workflow `test-install.yml` runs install/CLI
checks on macOS, Windows, and Linux for Python 3.10/3.12
> - Add `tests/install/test_install.py` and README to verify install,
`arcade` CLI availability, and `portalocker` locking behavior
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
3fe98fbcbf177f51fdb0b7fc51b20060f7fc85ad. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
Fixes [PLT-720: Refactor CLI to support multiple orgs +
projects](https://linear.app/arcadedev/issue/PLT-720/refactor-cli-to-support-multiple-orgs-projects)
This PR removes the legacy login flow (login to get an API key) from
Arcade CLI. Believe it or not, this flow predates the ability to get an
API key from the Dashboard, or even the Dashboard itself!
Notable changes:
**Legacy handling** - When a user with an existing `credentials.yaml`
updates the CLI, they will get instructions on fixing their old
credentials:
<img width="978" height="146" alt="Screenshot 2025-12-08 at 10 10 37"
src="https://github.com/user-attachments/assets/5aeaef2c-bef7-4642-a2f7-f917b257c94b"
/>
Any commands that require login (non-public commands) will be blocked
with the above message until `arcade logout / arcade login` is performed
again.
**New login flow**
```sh
arcade login
Opening a browser to log you in...
✅ Logged in as nate@arcade.dev.
Active project: Nate Barbettini's organization / Default project
Run 'arcade org list' or 'arcade project list' to see available options.
```
**List and set the active organization**
```sh
arcade org list
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┓
┃ Name ┃ ID ┃ Default ┃ Active ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━┩
│ Nate Barbettini's organization │ 1c64968e-fdc5-4c55-8612-2ce46cd7881b │ ✓ │ ✓ │
│ Sergio 743 │ 1f1f6184-58dc-4bac-bdde-b9184e43fdf3 │ │ │
└────────────────────────────────┴──────────────────────────────────────┴─────────┴────────┘
Use 'arcade org set <org_id>' to switch organizations.
```
```sh
arcade org set 1c64968e-fdc5-4c55-8612-2ce46cd7881b
✓ Switched to organization: Nate Barbettini's organization
Active project: Default project
```
**List and set the active project**
```sh
arcade project list
Active organization: Nate Barbettini's organization
Use 'arcade org list' and 'arcade org set <org_id>' to switch organizations.
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┓
┃ Name ┃ ID ┃ Default ┃ Active ┃
┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━┩
│ Default project │ 35166bf3-6e68-481e-bf16-f747fadc6c22 │ ✓ │ ✓ │
│ Second project │ 62963205-31ea-4fda-9fc4-af10db89c06f │ │ │
└─────────────────┴──────────────────────────────────────┴─────────┴────────┘
Use 'arcade project set <project_id>' to switch projects.
```
```sh
arcade project set 35166bf3-6e68-481e-bf16-f747fadc6c22
✓ Switched to project: Default project
```
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Migrates CLI to OAuth2 (PKCE) with saved org/project context, adds
org/project commands, rewrites Engine calls to org-scoped endpoints, and
bumps core packages.
>
> - **Auth & Config**
> - Implement OAuth2 Authorization Code + PKCE (`arcade_cli/authn.py`)
with local callback server and Jinja templates.
> - Persist tokens and active `context` (org/project) in
`credentials.yaml` via updated config models
(`arcade_core/config_model.py`).
> - Add token refresh and CLI config fetch utilities
(`arcade_core/auth_tokens.py`).
> - Detect legacy API-key credentials and block protected commands until
re-login; add `whoami` command.
> - **Org/Project Management**
> - New subcommands: `arcade org list|set`, `arcade project list|set`
(fetch via Coordinator).
> - **Engine API usage (org-scoped)**
> - Introduce org/project URL rewriting transports
(`arcade_core/network/org_transport.py`) and helpers
(`get_org_scoped_url`, `get_arcade_client`, `get_auth_headers`).
> - Update `deploy`, `server`, and `secret` commands to use Bearer
tokens and org-scoped paths; adjust log streaming/status, secrets CRUD,
and deployment workflows.
> - **CLI UX**
> - Replace legacy login URLs/constants; add success/failure HTML
templates for browser callback.
> - Tweak `dashboard` to health-check without credentials.
> - Usage tracking now includes `org_id`/`project_id` properties.
> - **Tests**
> - Update tests for dashboard, secrets, utils, and usage identity
(OAuth `/whoami`).
> - **Dependencies & Versions**
> - Bump packages: `arcade-core@4.0.0`, `arcade-mcp-server@1.12.0`,
`arcade-serve@3.2.0`, `arcade-tdk@3.3.0`; add `authlib`.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
49702c2f74b9db15bb286d3ec71179b4e74a9134. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
Closes TOO-192
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Adds a Figma OAuth2 auth provider and wires it through TDK and MCP
server, with tests updated and package versions bumped.
>
> - **Auth**:
> - Add `Figma` OAuth2 provider in
`libs/arcade-core/arcade_core/auth.py`.
> - **Exports**:
> - Expose `Figma` in
`libs/arcade-mcp-server/arcade_mcp_server/auth/__init__.py` and
`libs/arcade-tdk/arcade_tdk/auth/__init__.py` (`__all__`).
> - **Tests**:
> - Add Figma auth requirement test case in
`libs/tests/tool/test_create_tool_definition.py` and import `Figma`.
> - **Versioning**:
> - Bump `arcade-mcp-server` to `1.10.2` and `arcade-tdk` to `3.2.0`.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
2bacfdc5695b3e7fc5e4532dbd360c3b2263130e. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Francisco Liberal <francisco@arcade.dev>
This PR does three things:
1. Executes synchronous tool calls in thread pool allowing for up to 4 +
# of CPUs executions in parallel.
2. Makes force quitting via double SIGINT/SIGTERM possible and via
single SIGINT/SIGTERM + graceful shutdown timeout expiry possible, even
if there are active connections.
3. Sets `timeout_graceful_shutdown` to
`ARCADE_UVICORN_TIMEOUT_GRACEFUL_SHUTDOWN` env var if set, else defaults
to 15.
4. Disable the worker health check span to reduce noise
Tradeoffs:
Since this PR introduces executing synchronous tools via `await
asyncio.to_thread(func, **func_args)`, this means that there is no way
for the thread to be killed until it finishes. The ramifications of this
is that the force quitting logic that is also implemented in this PR has
to be very harsh `os._exit(1)` just in case there is a sync tool
actively executing. This means that `MCPApp` teardown logic will not
execute when force quitting is required. Although this was already the
case because we weren't previously able to force quit! This tradeoff is
justified for now since "parallel" tool executions will relieve us of
many worker timeouts that we are seeing in prod.
Future work:
Minimize/eliminate the need for `os._exit(1)` such that `MCPApp`
teardown logic will always execute, even when force quitting. The
solution will likely be moving away from `await asyncio.to_thread(func,
**func_args)` (while maintaining "parallelism" and then utilize the
`TaskTrackerMiddleware` introduced in this PR to cancel all of the
active HTTP requests.
Resolves PLT-713
Since servers managed by Arcade use the `/worker` routes under the hood,
tools that use MCP-specific properties of `Context` will fail.
This PR helps reduce the 'blast radius' of the above fact. For
properties that were deemed 'non-critical' to the execution of a
deployed tool, we simply no-op. For properties that were deemed
'critical' to the execution of a deployed tool, we raise an error that
informs the caller that the feature is not supported for Arcade managed
servers.
- Non-critical property: A context property that returns None
- Critical property: A context property that may return something that
could be necessary for a tool execution to succeed.
#672 was a quick fix. This PR makes it a long term fix.
Whether a tool is added via `MCPApp.add_tools_from_module`,
`MCPApp.add_tool`, or `@app.tool`, the server's version and description
will be the same.
### The Bug:
When an entrypoint file imports its parent package and
calls add_tools_from_module() on that package, and the same entrypoint
file also defines tools using @app.tool or @tool decorators, then the
server fails to start with an `AttributeError`. This is because the
tools would be discovered via AST parsing, but those tools weren't added
to the module's namespace yet because the file is still executing.
For example, this would fail on startup:
```py
#!/usr/bin/env python3
"""local_filesystem MCP server"""
import sys
from typing import Annotated
from arcade_mcp_server import MCPApp
import local_filesystem
app = MCPApp(name="eric_server", version="1.0.0", log_level="DEBUG")
app.add_tools_from_module(local_filesystem)
@app.tool
def eric(name: Annotated[str, "The name of the person to greet"]) -> str:
"""Greet a person by name."""
return "return"
if __name__ == "__main__":
transport = sys.argv[1] if len(sys.argv) > 1 else "stdio"
app.run(transport="http", host="127.0.0.1", port=8074)
```
### The fix:
Skip the entrypoint file. This means that any tool defined inside of the
entrypoint file must be added via MCPApp.add_tool(...) or instead use
the recommended @app.tool.
# PR Description
Consider this PR the result of a full pass through of this repository.
## Add helper for adding tools to an `MCPApp`
You can now add all of the tools in a module to an `MCPApp` via
`app.add_tools_from_module(...)`
## Edit what `arcade new` generates
First, I updated the backend to use hatchling.
Second, the structure generated before this PR was simple, but did not
create a proper Python module.
This hindered developers in the following ways:
1. Difficult to add the tools in your server to an evaluation suite
2. Difficult to add more than one tool to an MCPApp at a time
3. All other niceties that come with being able to import modules
```
# Before
server/
├── .env.example
├── server.py
└── pyproject.toml
```
This PR updates the structure generated such that a valid Python module
is generated:
```
# After
server/
├── pyproject.toml
└── src/
└── server/
├── __init__.py
├── .env.example
└── server.py
```
## Fix Tool Chaining
`self._ctx.server.executor.run(...)` was being called, but `MCPServer`
does not have an instance of `ToolExecutor` (and it's not intended to be
an instance anyways). I updated `Tool.call_raw` to pass the programmatic
tool call through the `MCPServer._handle_call_tool`. This means that the
programmatic tool calls now go through the same steps that a typical
tool call (initiated by the MCP client) would.
This means that **toolA**, which specifies **requirementsA**, is
permitted to call **toolB**, which specifies **requirementsB**, without
needing to explicitly declare or satisfy **requirementsB**. I believe
this is acceptable because the secrets and/or auth token associated with
**toolB's** `Context` are not exposed to **toolA**, and the secrets
and/or auth token associated with **toolA's** `Context` are not exposed
to **toolB**.
## Fix User Elicitation
1. The read & write streams were created with a maximum queue size of 0.
I increased this to 100.
2. I updated `ServerSession`'s run loop to both read messages from the
stream & process them concurrently. This enables server initiated
requests (like user elicitation and progress reporting) to be handled
while tools are being executed. Otherwise, the server initiated requests
would wait for the tool to finish executing and the tool execution would
wait for the server initiated request to finish.
3.
## Fix Progress Reporting
Progress tokens sent by the client were not being stored. Therefore
there was no way to notify a client with progress updates. I am now
storing the `progressToken`, along with other `_meta` sent from the
client, in the `ServerSession`'s `_request_meta`. I am setting
`_request_meta` whenever the `MCPServer` is handling an incoming message
from a client.
## Fix handling of server names with spaces
Before:
Server name: "The simple server name"
Tool name: whisper_secret
Name seen by client: "The_simple_server_name_WhisperSecret"
After
Server name: "The simple server name"
Tool name: whisper_secret
Name seen by client: "TheSimpleServerName_WhisperSecret"
## Add Integration Tests
The stdio integration test is much more comprehensive than the http
integration test. These tests will let me sleep a bit more at night
## Add Example MCP Servers
Example servers for sampling, user-elicitation, progress reporting,
logging, tool chaining, combining prebuilt tools with custom tools, tool
secrets, tool auth, evaluations, and more!
## Add Docker template
Added a Docker template for running an MCP server in Docker (and removed
the old docker stuff)
1. Refactored the core usage logic from `arcade_cli` to `arcade_core`
2. Add "MCP server started" event
As always, opt out by setting `ARCADE_USAGE_TRACKING` to 0.
There are many more instances of toolkit within this repo, but the goal
of this PR is to get rid of user facing references as much as possible.
---------
Co-authored-by: Nate Barbettini <nate@arcade.dev>
Seeing that arcade-ai==2.2.3 doesn't allow for core, serve, or tdk
versions 3.x.x and that it doesn't know about arcade-mcp-server or
arcade-mcp, I feel confident that we can get this past the release
candidate stage. The current state of our documentation
(docs.arcade.dev) still references the 'old way' of doing things, so we
can gradually introduce these new packages to users without the hassle
of specifying pre release flags when installing
### New packages:
arcade-mcp==1.0.0
arcade-mcp-server==1.0.0
### Breaking change with major bump:
arcade-core==3.0.0 from 2.4.0
arcade-serve==3.0.0 from 2.1.0
arcade-tdk==3.0.0 from 2.5.0
### Deprecated:
arcade-ai==2.2.3
# Release Candidate 2
## This PR:
- [x] No more confusing 307 redirect logs when using `/mcp` instead of
`/mcp/` (requested by @shubcodes)
- [x] Fix bug in `arcade configure` for Python < 3.12 (reported by
@evantahler
- [x] Fix bug where tools with unsatisfied secret requirements could
still be executed (reported by @evantahler, @shubcodes)
- [x] Auth providers can now be imported via `from
arcade_mcp_server.auth import Reddit` (requested by @shubcodes)
- [x] Add complete E2E oauth flow for tool calls with informational
errors about how to log into arcade and where to go to authorize
(requested by @evantahler, @shubcodes)
- [x] Add OAuth tool in `arcade new`'s generated server (requested by
@shubcodes)
- [x] Standardize on defaulting to running servers on port 8000
- [x] Improve credentials.yaml reading logic
- [x] CLI user friendliness (requested by @Spartee)
- [x] Remove `arcade serve` CLI command
- [x] Fix race condition in `arcade logout`
- [x] Update docs for desired developer onboarding flow
## Next PRs:
- Get `arcade deploy` working for MCP servers. (Command is hidden for
now)
- Rename all occurrences of `toolkit` to `server`/`tools` and rename all
occurrences of `worker` to `server`
Versions:
* arcade-mcp\==1.0.0rc1
* arcade-mcp-server\==1.0.0rc1
* arcade-core\==2.5.0rc1
* arcade-tdk\==2.6.0rc1
* arcade-serve\==2.2.0rc1
### Summary
Adds first-class MCP support across Arcade, introduces a new MCP server
and CLI, unifies the project under the arcade-mcp name, overhauls
templates/scaffolding, and improves developer tooling, secrets
management, and examples.
### Highlights
- **MCP Server & Core**
- New MCP server with stdio and HTTP/SSE transports, session management,
resumability, and lifecycle handling.
- FastAPI-like `MCPApp` for building servers with lazy init; integrated
worker+MCP HTTP app option.
- Middleware system (logging and error handling), robust exception
hierarchy, and Pydantic-based settings.
- Async-safe managers for tools, resources, and prompts backed by
registries and locks.
- Developer-facing, transport-agnostic runtime context interfaces (logs,
tools, prompts, resources, sampling, UI, notifications).
- Conversion from Arcade ToolDefinition to MCP tool schema; OpenAI JSON
tool schema converter.
- Parser supports `@app.tool`/`@app.tool(...)` decorators.
- **CLI**
- New `mcp` command to run MCP servers with stdio or HTTP/SSE.
- New `secret` command to set/list/unset tool secrets (supports .env
input, preserves original casing for lookups).
- `new` command refactored; option to create a full toolkit package with
scaffolding.
- `chat` command removed.
- `serve.py` imports updated to `arcade_serve.fastapi.telemetry`;
version retrieval now uses `arcade-mcp`.
- `show.py` refactor to use new local catalog utilities.
- `display_tool_details` improved: adds “Default” column and handles
nested properties.
- **Configuration & Discovery**
- New `configure.py` to set up Claude Desktop, Cursor, and VS Code to
connect to local or Arcade Cloud MCP servers.
- Discovery utilities to find/install toolkits, build `ToolCatalog`s,
analyze files for tools, load kits from directories (pyproject parsing),
and build minimal toolkits.
- Better handling of provider API key resolution and evaluation suite
loading.
- **Templates & Scaffolding**
- Reorganized template structure (minimal vs full); moved
`.pre-commit-config.yaml`, `.ruff.toml`, license, Makefile, README,
tests, and tools layout to correct paths.
- Minimal template adds `.env.example` for runtime secret injection.
- Template pyproject updated for MCP servers; includes sample server
with greeting and secret-reveal tools.
- Authorization flow in templates simplified.
- **Repo-wide Renaming & Examples**
- Migrates references from `arcade-ai` to `arcade-mcp` across READMEs,
scripts, and package metadata.
- Examples updated (LangChain/LangGraph/AI SDK/TypeScript) and package
name changed to `arcade-mcp-sdk`.
- **Evals & Core Utilities**
- Evals now use OpenAI tooling format (`OpenAIToolList`, `to_openai`);
`tool_eval` takes `provider_api_key`.
- Core utilities: fixed `does_function_return_value` by dedenting before
parse; version bump to `2.5.0rc1` and dependency cleanup.
- **Tooling & CI**
- `setup-uv-env` action splits toolkit vs contrib dependency
installation.
- Pre-commit: excludes `libs/arcade-mcp-server/mkdocs.yml` and
`libs/tests/` from YAML and Ruff hooks; Ruff per-file ignores (e.g.,
C901 in `libs/**/*.py`, TRY400 in server docs paths).
- Makefile updates for uv env setup, quality checks, tests, builds, and
new `shell` target.
- Added Makefile to MCP server library to streamline dev workflow.
- **Cleanup**
- Removed `claude.json` config.
- Simplified stdio entrypoint; removed unused imports (`arcade_gmail`,
`arcade_search`).
### Breaking Changes
- **CLI**: `chat` command removed; use `mcp`, `secret`, and updated
`new`.
- **Naming**: All users should update references from `arcade-ai` to
`arcade-mcp`.
- **Templates**: File paths moved; downstream scripts referencing old
template locations may need updates.
### Getting Started
- Run an MCP server:
- `arcade mcp --stdio --toolkits your_toolkit`
- `arcade mcp --http --toolkits your_toolkit`
- Manage secrets:
- `arcade secret set your_toolkit KEY=value`
- `arcade secret list your_toolkit`
- `arcade secret unset your_toolkit KEY`
- Configure clients:
- `arcade configure` to set up Claude Desktop, Cursor, and VS Code for
local/Arcade Cloud MCP.
---------
Co-authored-by: Sam Partee <sam@arcade-ai.com>
Co-authored-by: Shub <125150494+shubcodes@users.noreply.github.com>
# Improvements to Arcade TDK Error Handling
I tried my very best to not make any breaking changes in this PR. So,
you will notice various "Deprecation" notices throughout.
### Instructions for PR reviewers
1. Pull down this PR's branch
2. Pull down the Engine's tool error handling PR's branch
3. Update your installed arcadepy to have the following:
- In `arcadepy/resources/tools/tools.py`, if you want to test out
including stacktraces, then you need to update `ToolsResource.execute`
to accept a `include_error_stacktrace` argument and also include the
"include_error_stacktrace" argument to the POST to the Engine inside of
the function's execute method's body.
- In `arcadepy/types/execute_tool_response.py` add the following enum
```py
class ErrorKind(str, Enum):
"""Error kind that is comprised of
- the who (toolkit, tool, upstream)
- the when (load time, definition parsing time, runtime)
- the what (bad_definition, bad_input, bad_output, retry,
context_required, fatal, etc.)"""
TOOLKIT_LOAD_FAILED = "TOOLKIT_LOAD_FAILED"
TOOL_DEFINITION_BAD_DEFINITION = "TOOL_DEFINITION_BAD_DEFINITION"
TOOL_DEFINITION_BAD_INPUT_SCHEMA = "TOOL_DEFINITION_BAD_INPUT_SCHEMA"
TOOL_DEFINITION_BAD_OUTPUT_SCHEMA = "TOOL_DEFINITION_BAD_OUTPUT_SCHEMA"
TOOL_RUNTIME_BAD_INPUT_VALUE = "TOOL_RUNTIME_BAD_INPUT_VALUE"
TOOL_RUNTIME_BAD_OUTPUT_VALUE = "TOOL_RUNTIME_BAD_OUTPUT_VALUE"
TOOL_RUNTIME_RETRY = "TOOL_RUNTIME_RETRY"
TOOL_RUNTIME_CONTEXT_REQUIRED = "TOOL_RUNTIME_CONTEXT_REQUIRED"
TOOL_RUNTIME_FATAL = "TOOL_RUNTIME_FATAL"
UPSTREAM_RUNTIME_BAD_REQUEST = "UPSTREAM_RUNTIME_BAD_REQUEST"
UPSTREAM_RUNTIME_AUTH_ERROR = "UPSTREAM_RUNTIME_AUTH_ERROR"
UPSTREAM_RUNTIME_NOT_FOUND = "UPSTREAM_RUNTIME_NOT_FOUND"
UPSTREAM_RUNTIME_VALIDATION_ERROR = "UPSTREAM_RUNTIME_VALIDATION_ERROR"
UPSTREAM_RUNTIME_RATE_LIMIT = "UPSTREAM_RUNTIME_RATE_LIMIT"
UPSTREAM_RUNTIME_SERVER_ERROR = "UPSTREAM_RUNTIME_SERVER_ERROR"
UPSTREAM_RUNTIME_UNMAPPED = "UPSTREAM_RUNTIME_UNMAPPED"
UNKNOWN = "UNKNOWN"
```
- In `arcadepy/types/execute_tool_response.py` add the following fields
to OutputError:
```py
kind: ErrorKind
status_code: Optional[int] = None
stacktrace: Optional[str] = None
extra: Optional[dict[str, Any]] = None
```
### Example Client Usage
```py
# Example of handling an upstream rate limit
error = response.output.error
if error and error.kind == ErrorKind.UPSTREAM_RUNTIME_RATE_LIMIT:
sleep_time = error.retry_after_ms / 1000
time.sleep(sleep_time)
# and then execute again
```
```py
# Examples of determining what type of runtime error it is
error = response.output.error
if error:
is_retryable_error = error.kind == ErrorKind.TOOL_RUNTIME_RETRY
is_a_bug_in_the_tool = error.kind == ErrorKind.TOOL_RUNTIME_FATAL
is_additional_context_required = error.kind == ErrorKind.TOOL_RUNTIME_CONTEXT_REQUIRED
```
### Example Tool Usage
```py
# EXAMPLE 1 letting Arcade handle upstream error handling for you
reddit_client.post(params) # Arcade's httpx adapter will handle error handling for you!
# ------------------------------------
# EXAMPLE 2 handling upstream bad request yourself, but letting Arcade handle the rest
try:
reddit_client.post(params)
except httpx.HTTPStatusError as e:
if e.status_code == 400:
raise UpstreamError("My extra custom message) from e
raise
```
```py
# EXAMPLE 1 letting Arcade handle it for you
risky_element = my_risky_list[42] # Arcade will raise a FatalToolError for you
# ------------------------------------
# EXAMPLE 2 handling it yourself for extra flexibility
try:
risky_element = my_risky_list[42]
except IndexError as e:
raise FatalToolError("My extra custom message") from e
```
### Non-runtime Error Message Examples
Example ToolkitLoadError Messages:
```
- [TOOLKIT_LOAD_FAILED] ToolkitLoadError when loading toolkit 'sample_tool': Could not import module mock_module. Reason: Mock import error
- [TOOLKIT_LOAD_FAILED] ToolkitLoadError when loading toolkit 'test_toolkit': Tool 'ValidTool' in toolkit 'test_toolkit' already exists in the catalog.
```
Example ToolDefinitionError Messages
```
- [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_missing_description': Tool 'tool_missing_description' is missing a description
- [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_with_invalid_secret_type': Secret keys must be strings (error in tool ToolWithInvalidSecretType).
- [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_with_empty_secret': Secrets must have a non-empty key (error in tool ToolWithEmptySecret).
- [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_with_invalid_metadata_type': Metadata must be strings (error in tool ToolWithInvalidMetadataType).
- [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_with_metadata_requiring_auth_without_auth': Tool ToolWithMetadataRequiringAuthWithoutAuth declares metadata key 'client_id', which requires that the tool has an auth requirement, but no auth requirement was provided. Please specify an auth requirement.
- [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_with_empty_metadata': Metadata must have a non-empty key (error in tool ToolWithEmptyMetadata).
- [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_with_unsupported_param_type': Unsupported parameter type: <class 'test_catalog.MyFancyTestClass'>
```
Example ToolInputSchemaError Messages
```
- [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_missing_input_parameter_annotation': Parameter 'input_text' is missing a description
- [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_no_type_annotation': Parameter param has no type annotation.
- [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_invalid_param_name': Invalid parameter name: '123invalid' is not a valid identifier. Identifiers must start with a letter or underscore, and can only contain letters, digits, or underscores.
- [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_too_many_annotations': Parameter param: Annotated[str, 'name', 'desc', 'extra'] has too many string annotations. Expected 0, 1, or 2, got 3.
- [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_required_union_param': Parameter param is a union type. Only optional types are supported.
- [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_non_callable_default_factory': Default factory for parameter param: Annotated[str, 'Parameter'] = FieldInfo(annotation=NoneType, required=False, default_factory=str) is not callable.
- [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_multiple_tool_contexts': Only one ToolContext parameter is supported, but tool tool_with_multiple_tool_contexts has multiple.
```
Example ToolOutputSchemaError Messages
```
- [TOOL_DEFINITION_BAD_OUTPUT_SCHEMA] ToolOutputSchemaError in definition of tool 'tool_missing_return_type_hint': Tool 'ToolMissingReturnTypeHint' must have a return type
- [TOOL_DEFINITION_BAD_OUTPUT_SCHEMA] ToolOutputSchemaError in definition of tool 'tool_with_unsupported_output_type': Unsupported output type '<class 'test_catalog.MyFancyTestClass'>'. Only built-in Python types, TypedDicts, Pydantic models, and standard collections are supported as tool output types.
```
### Runtime Error Message Examples
Example Tool Runtime Error Messages
```
- [TOOL_RUNTIME_FATAL] FatalToolError during execution of tool 'get_posts_in_subreddit': list index out of range
- [TOOL_RUNTIME_CONTEXT_REQUIRED] ContextRequiredToolError during execution of tool 'get_posts_in_subreddit': Ambiguous username. Please provide a more specific username
- [TOOL_RUNTIME_RETRY] RetryableToolError during execution of tool 'get_posts_in_subreddit': Retry with subreddit=learnpython or subreddit=learnprogramming
```
Example Upstream Runtime Error Messages
```
- [UPSTREAM_RUNTIME_RATE_LIMIT] UpstreamRateLimitError during execution of tool 'get_posts_in_subreddit': 429 Client Error: Too Many Requests
- [UPSTREAM_RUNTIME_BAD_REQUEST] UpstreamError during execution of tool 'get_posts_in_subreddit': 400 Client Error: Bad request. Missing 'id' parameter.
- [UPSTREAM_RUNTIME_BAD_REQUEST] UpstreamError during execution of tool 'search_files': Upstream Google API error: Invalid value '-23'. Values must be within the range: [value: 1\n, value: 1000\n]
```
Improve Pydantic and Typedict support and add a bunch of tets.
1. Fixed the test failure where TypeDict was being serialized as a list
of tuples instead of a dict by:
- Adding proper handling for BaseModel instances in the output.py file
- Converting BaseModel results (from TypeDict conversion) to dicts using
model_dump()
- Handling lists containing BaseModel objects
2. Fixed None handling to ensure None results are converted to empty
strings as expected
3. Updated the schema.py to allow dict and list types in
ToolCallOutput.value
4. new tests
- TypeDict output execution tests
- Output factory tests
- Executor tests with TypeDict support
- Schema validation tests
The key changes were:
- In ``arcade_core/output.py``: Added BaseModel conversion logic in the
success method
- In ``arcade_core/schema.py``: Changed ToolCallOutput.value type from
list[str] to list to support complex types
TODO
- [ ] Confirm engine compatibility without changes made to engine
---------
Co-authored-by: Eric Gustin <eric@arcade.dev>
## Before
### Tool: ``GoogleNews.SearchNewsStories``
```python
@tool(requires_secrets=["SERP_API_KEY"])
async def search_news_stories(
context: ToolContext,
keywords: Annotated[
str,
"Keywords to search for news articles. E.g. 'Apple launches new iPhone'.",
],
country_code: Annotated[
CountryCode | None,
"2-character country code to search for news articles. "
"E.g. 'us' (United States). "
f"Defaults to '{DEFAULT_GOOGLE_NEWS_COUNTRY}'.",
] = None,
language_code: Annotated[
LanguageCode,
"2-character language code to search for news articles. E.g. 'en' (English). "
f"Defaults to '{DEFAULT_GOOGLE_NEWS_LANGUAGE}'.",
] = DEFAULT_GOOGLE_NEWS_LANGUAGE,
limit: Annotated[
int | None,
"Maximum number of news articles to return. Defaults to None "
"(returns all results found by the API).",
] = None,
) -> Annotated[dict[str, Any]]:
"""Search for news articles related to a given query."""
...
```
### Tool Definition: ``GoogleNews.SearchNewsStories``
```
{
"name": "SearchNewsStories",
"fully_qualified_name": "GoogleNews.SearchNewsStories",
"description": "Search for news articles related to a given query.",
"toolkit": {
"name": "GoogleNews",
"description": "Arcade.dev LLM tools for getting new via Google News",
"version": "2.0.0"
},
"input": {
"parameters": [
{
"name": "keywords",
"required": true,
"description": "Keywords to search for news articles. E.g. 'Apple launches new iPhone'.",
"value_schema": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
},
"inferrable": true
},
{
"name": "country_code",
"required": false,
"description": "2-character country code to search for news articles. E.g. 'us' (United States). Defaults to 'None'.",
"value_schema": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
},
"inferrable": true
},
{
"name": "language_code",
"required": false,
"description": "2-character language code to search for news articles. E.g. 'en' (English). Defaults to 'en'.",
"value_schema": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
},
"inferrable": true
},
{
"name": "limit",
"required": false,
"description": "Maximum number of news articles to return. Defaults to None (returns all results found by the API).",
"value_schema": {
"val_type": "integer",
"inner_val_type": null,
"enum": null,
},
"inferrable": true
}
]
},
"output": {
"description": "News search results with article details.",
"available_modes": [
"value",
"error"
],
"value_schema": {
"val_type": "json"
}
},
"requirements": {
"authorization": null,
"secrets": [
{
"key": "serp_api_key"
}
],
"metadata": null
},
"deprecation_message": null
},
```
## After
### Enhanced Tool: ``GoogleNews.SearchNewsStories``
```python
"""Type definitions for Google News API responses and parameters."""
from typing_extensions import TypedDict
CountryCode = str
LanguageCode = str
class SearchNewsParams(TypedDict):
"""Input parameters for searching news articles."""
keywords: str
"""Search query terms to find relevant news articles \
(e.g., 'Apple launches new iPhone')."""
country_code: CountryCode | None
"""Optional 2-letter country code to filter news by region \
(e.g., 'us' for United States, 'uk' for United Kingdom)."""
language_code: LanguageCode | None
"""Optional 2-letter language code to filter news by language \
(e.g., 'en' for English, 'es' for Spanish)."""
limit: int | None
"""Optional maximum number of news articles to return. \
If not specified, returns all results from the API."""
class SourceInfo(TypedDict, total=False):
"""Information about the news source/publication."""
name: str
"""Name of the publication (e.g., 'CNN', 'BBC News', 'The New York Times')."""
icon: str
"""URL to the source's favicon or logo image."""
authors: list[str]
"""List of author names for the article, if available."""
class NewsResult(TypedDict, total=False):
"""Individual news article from the Google News API response."""
position: int
"""Ranking position of this result in the search results."""
title: str
"""Headline or title of the news article."""
link: str
"""Full URL to the original news article."""
source: SourceInfo
"""Information about the publication source."""
date: str
"""Publication date and time (e.g., '2 hours ago', 'Dec 15, 2023')."""
snippet: str
"""Brief excerpt or summary from the article content."""
thumbnail: str
"""URL to a high-resolution thumbnail image for the article."""
thumbnail_small: str
"""URL to a low-resolution thumbnail image for the article."""
story_token: str
"""Token for accessing full coverage of this news story across multiple sources."""
stories: list["NewsResult"]
"""Related news stories from other sources covering the same topic."""
highlight: dict
"""Additional highlighted information about the story."""
class SearchMetadata(TypedDict, total=False):
"""Metadata about the search request and processing."""
id: str
"""Unique identifier for this search request within SerpApi."""
status: str
"""Current processing status ('Processing', 'Success', or 'Error')."""
json_endpoint: str
"""URL to retrieve the JSON results for this search."""
created_at: str
"""Timestamp when the search request was created."""
processed_at: str
"""Timestamp when the search request was processed."""
google_news_url: str
"""Original Google News URL that would return these results."""
total_time_taken: float
"""Total time in seconds taken to process this search."""
class SearchParameters(TypedDict, total=False):
"""Parameters used for the search request."""
engine: str
"""Search engine used (always 'google_news' for this API)."""
q: str
"""Search query string."""
gl: str
"""Country code used for geographic filtering."""
hl: str
"""Language code used for language filtering."""
topic_token: str
"""Token for accessing specific news topics (e.g., 'World', 'Business', 'Technology')."""
publication_token: str
"""Token for accessing news from specific publishers."""
class MenuLink(TypedDict):
"""Navigation link for news categories or topics."""
title: str
"""Display text for the menu item (e.g., 'Technology', 'Sports', 'Business')."""
topic_token: str
"""Token to access this specific topic or category."""
serpapi_link: str
"""SerpApi URL to search within this topic."""
class TopStoriesLink(TypedDict):
"""Link to top stories section."""
topic_token: str
"""Token to access top stories."""
serpapi_link: str
"""SerpApi URL to retrieve top stories."""
class GoogleNewsResponse(TypedDict, total=False):
"""Complete response from the Google News API."""
search_metadata: SearchMetadata
"""Metadata about the search request and processing."""
search_parameters: SearchParameters
"""Parameters that were used for this search."""
news_results: list[NewsResult]
"""List of news articles matching the search criteria."""
menu_links: list[MenuLink]
"""Navigation links to different news categories and topics."""
top_stories_link: TopStoriesLink
"""Link to access top stories."""
title: str
"""Title of the page or topic being displayed."""
class SimplifiedNewsResult(TypedDict):
"""Simplified news article format for tool output."""
title: str
"""Headline of the news article."""
link: str
"""URL to the full article."""
source: str | None
"""Name of the publication source."""
date: str | None
"""When the article was published."""
snippet: str | None
"""Brief excerpt from the article."""
class SearchNewsOutput(TypedDict):
"""Output format for the search_news_stories tool."""
news_results: list[SimplifiedNewsResult]
"""List of news articles in simplified format."""
@tool(requires_secrets=["SERP_API_KEY"])
async def search_news_stories(
context: ToolContext,
keywords: Annotated[
str,
"Keywords to search for news articles. E.g. 'Apple launches new iPhone'.",
],
country_code: Annotated[
CountryCode | None,
"2-character country code to search for news articles. "
"E.g. 'us' (United States). "
f"Defaults to '{DEFAULT_GOOGLE_NEWS_COUNTRY}'.",
] = None,
language_code: Annotated[
LanguageCode,
"2-character language code to search for news articles. E.g. 'en' (English). "
f"Defaults to '{DEFAULT_GOOGLE_NEWS_LANGUAGE}'.",
] = DEFAULT_GOOGLE_NEWS_LANGUAGE,
limit: Annotated[
int | None,
"Maximum number of news articles to return. Defaults to None "
"(returns all results found by the API).",
] = None,
) -> Annotated[SearchNewsOutput, "News search results with article details."]:
"""Search for news articles related to a given query."""
...
```
### Enhanced Tool Definition: ``GoogleNews.SearchNewsStories``
```json
{
"name": "SearchNewsStories",
"fully_qualified_name": "GoogleNews.SearchNewsStories",
"description": "Search for news articles related to a given query.",
"toolkit": {
"name": "GoogleNews",
"description": "Arcade.dev LLM tools for getting new via Google News",
"version": "2.0.0"
},
"input": {
"parameters": [
{
"name": "keywords",
"required": true,
"description": "Keywords to search for news articles. E.g. 'Apple launches new iPhone'.",
"value_schema": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": null
},
"inferrable": true
},
{
"name": "country_code",
"required": false,
"description": "2-character country code to search for news articles. E.g. 'us' (United States). Defaults to 'None'.",
"value_schema": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": null
},
"inferrable": true
},
{
"name": "language_code",
"required": false,
"description": "2-character language code to search for news articles. E.g. 'en' (English). Defaults to 'en'.",
"value_schema": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": null
},
"inferrable": true
},
{
"name": "limit",
"required": false,
"description": "Maximum number of news articles to return. Defaults to None (returns all results found by the API).",
"value_schema": {
"val_type": "integer",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": null
},
"inferrable": true
}
]
},
"output": {
"description": "News search results with article details.",
"available_modes": [
"value",
"error"
],
"value_schema": {
"val_type": "json",
"inner_val_type": null,
"enum": null,
"properties": {
"news_results": {
"val_type": "array",
"inner_val_type": "json",
"enum": null,
"properties": null,
"inner_properties": {
"title": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": "Headline of the news article."
},
"link": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": "URL to the full article."
},
"source": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": "Name of the publication source."
},
"date": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": "When the article was published."
},
"snippet": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": "Brief excerpt from the article."
}
},
"description": "List of news articles in simplified format."
}
},
"inner_properties": null,
"description": null
}
},
"requirements": {
"authorization": null,
"secrets": [
{
"key": "serp_api_key"
}
],
"metadata": null
},
"deprecation_message": null
},
```
---------
Co-authored-by: Eric Gustin <eric@arcade.dev>
## Summary
This PR removes the requirement that all toolkits must have the arcade_
prefix and introduces a more flexible toolkit discovery system using
Python entry points.
### 🏷️ Flexible Toolkit Naming
* Community toolkits: Only add arcade_ prefix when the user is in
arcade-ai/toolkits/ directory and explicitly chooses to create a
community contribution.
* External toolkits: No prefix requirement - developers can name their
toolkits however they want
* Toolkit names are now determined by user choice rather than enforced
automatically
### 🔍 Entry Point Discovery
* Added find_arcade_toolkits_from_entrypoints() method to discover
toolkits via entry points
* Entry point group: arcade_toolkits with name: toolkit_name
* Updated pyproject.toml template to include entry point configuration
* Entry point discovery takes precedence over prefix-based discovery for
deduplication
### 📦 Backward Compatibility
* Existing arcade_* prefixed toolkits continue to work via
find_arcade_toolkits_from_prefix()
find_all_arcade_toolkits() now combines both discovery methods
* Deduplication logic prefers entry point toolkits over prefix-based
ones when package names match
### 🛠️ `arcade new` Template Updates
* pyproject.toml template for `arcade new` now includes entry point
configuration: [project.entry-points.arcade_toolkits]
### 🔧 Minor Improvements
* Refactored _strip_arcade_prefix() into a separate method for
reusability
* Updated variable naming for clarity (community_toolkit →
is_community_toolkit)
### Benefits
* Developer Freedom: Toolkit developers are no longer forced to use the
arcade_ prefix. They are also no longer forced to use the package name
as the toolkit name.
* Cleaner Naming: External toolkits can use more natural names (e.g.,
my_company_toolkit instead of arcade_my_company_toolkit)
* Better Discovery: Entry points provide a more standard Python
mechanism for plugin discovery
* Flexible Distribution: Toolkits can be distributed with any package
name while still being discoverable
### Testing
* Added comprehensive tests for the new entry point functionality
* Tests cover edge cases like deduplication, error handling, and
backward compatibility
### Version Bumps
arcade-core: 2.0.0 → 2.1.0
arcade-ai: 2.0.5 → 2.1.0
This change makes the Arcade toolkit ecosystem more flexible and
developer-friendly while maintaining full backward compatibility with
existing toolkits.
---------
Co-authored-by: Mateo Torres <mateo@arcade.dev>
This is the first of a few PRs. Deploy to staging will fail until we
have `arcade-core`, `arcade-serve`, and `arcade-ai` released to PyPI.
This PR will release `arcade-core` to PyPI.
### PR Description
* Adds workflow that checks for changes in any pyproject.toml, and if
its version has changed, then tests, builds wheel, then publishes to
PyPI
* Updates the Dockerfile for our new structure
* Updates porter yamls
* Updates `make full-dist`
* Removes a couple unused workflows
Check out https://github.com/ArcadeAI/arcade-ai/actions/runs/15622059209
to see how the new workflow works (note that it failed publishing to
PyPI on purpose)
### Overview
Major restructuring from monolithic `arcade-ai` package to modular
library architecture with standardized uv-based dependency management.

### New Package Structure
- **`arcade-tdk`** - Lightweight toolkit development kit (core
decorators, auth)
- **`arcade-core`** - Core execution engine and catalog functionality
- **`arcade-serve`** - FastAPI/MCP server components
- **`arcade-ai`** - Meta package that includes CLI functionality.
Optionally include evals via the `evals` extra. Optionally include all
packages via the `all` extra.
### Key Benefits
- **Lighter Dependencies**: Toolkits now depend only on `arcade-tdk` (~2
deps) vs full `arcade-ai` (~30+ deps)
- **Faster Builds**: uv provides 10-100x faster dependency resolution
and installation
- **Better Modularity**: Clear separation of concerns, consumers import
only what they need
- **Standard Tooling**: Eliminates custom poetry scripts, uses standard
Python packaging
### Migration Impact
- All 20 toolkits converted from poetry → uv with `arcade-tdk`
dependencies plus `arcade-ai[evals]` and `arcade-serve` dev
dependencies. When developing locally, devs should install toolkits via
`make install-local`.
- Modern Python 3.10+ type hints throughout
- Standardized build system with hatchling backend
- Enhanced Makefile with robust toolkit management commands
- Removed `arcade dev` CLI command
- Reduce the number of files created by `arcade new` and add an option
to not generate a tests and evals folder.
This foundation enables faster development cycles and cleaner dependency
chains for the growing toolkit ecosystem.
### Todo After this PR is merged
- [ ] Post-merge workflow(s) (release & publish containers, etc)
- [ ] Release order plan. @EricGustin suggests releasing in the
following order:
1. `arcade-core` version 0.1.0
2. `arcade-serve` version 0.1.0 and `arcade-tdk` version 0.1.0
3. `arcade-ai` version 2.0.0
4. Patch release for all toolkits (all changes in toolkits are internal
refactors)
- [ ] [Update docs](https://github.com/ArcadeAI/docs/pull/318)
---------
Co-authored-by: Eric Gustin <eric@arcade.dev>
Co-authored-by: Eric Gustin <34000337+EricGustin@users.noreply.github.com>