arcade-mcp

665 commits 1 branch 0 tags 23 MiB

Author	SHA1	Message	Date
Francisco Or Something	dc4607daa4	feat(telemetry): add developer messages to tool error spans (#831 ) ## Summary - Add shared span attributes for tool error diagnostics, including developer-facing messages when present. - Wire those attributes through MCP server, worker RunTool, and HTTP CallTool spans while keeping default MCP response content public-only. - Cover no-leak response behavior, non-recording spans, outputless worker responses, and the shared attribute contract. ## Verification - `uv run ruff format ...` - `uv run ruff check ...` - `uv run pytest -W ignore libs/tests/arcade_mcp_server/test_debug_exposure_integration.py libs/tests/core/test_log_extras.py libs/tests/worker/test_worker_base.py` Made with [Cursor](https://cursor.com) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Adds new telemetry attributes that propagate tool error messages (including optional developer_message) into active spans across MCP server and worker execution paths; risk is mainly around potential leakage of sensitive developer messages into tracing backends and changes to observability contracts. > > Overview > Adds a shared `arcade_core.log_extras.build_tool_error_span_attributes()` helper and wires it into tool error paths so the current OpenTelemetry span is annotated with stable `tool_error_*` attributes (including `developer_message` when present). > > MCP tool calls now record these span attributes on failure while keeping default MCP response content sanitized, and `arcade-serve` records the same attributes on both `RunTool` and HTTP `CallTool` spans (handling `output=None`). Versions and dependency constraints are bumped to consume the new core helper, with tests added/updated to lock the span-attribute contract and verify behavior for non-recording spans and no-leak responses. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 33a53991d72140a662152f508dc53e9b769b9f07. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-04-29 20:41:07 -03:00
Francisco Or Something	1492c80fc5	TOO-627: Improve error messages for agents and Datadog (#814 ) ## Summary - Improve tool call error messages across 4 libraries (arcade-core, arcade-tdk, arcade-mcp-server, arcade-serve) so agents can self-correct and Datadog can facet on structured fields - Guard empty error messages, enrich input validation errors with field-level detail, fix `@tool` decorator fallback formatting, surface `additional_prompt_content` in MCP responses, and add structured log extras for Datadog - Addresses the 3 worst error patterns: generic "Error in tool input deserialization", bare `KeyError` values, and empty `FatalToolError` messages Linear: TOO-627 Plan: `docs/plans/2026-04-08-improve-error-messages-handoff.md` ## Tasks - [ ] Task 1: Guard empty error messages (arcade-core) - [ ] Task 2: Enrich input validation error messages (arcade-core) - [ ] Task 3: Improve `@tool` decorator error fallback (arcade-tdk) - [ ] Task 4: Fix MCP agent-facing error response (arcade-mcp-server) - [ ] Task 5: Add structured log extras in BaseWorker (arcade-serve) - [ ] Task 6: Add structured log extras in MCP server (arcade-mcp-server) ## Test plan - [ ] Each task has dedicated unit tests verifying the new behavior - [ ] `make test` passes after all tasks - [ ] `make check` (ruff + mypy) passes - [ ] Verify the 3 worst error patterns now produce actionable messages 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Touches cross-library error formatting and logging behavior used in production tool execution paths; while mostly additive/guardrails, it changes agent-visible messages and Datadog log facets, which could impact client expectations and alerting. > > Overview > Improves tool-call error handling across core/runtime, MCP transport, worker transport, and the TDK to make agent-visible failures more actionable while reducing sensitive-data leakage. > > In `arcade-core`, empty error messages now get placeholders, `ToolOutputFactory.fail` defaults blank messages, and input validation errors are rewritten as field-level summaries that intentionally omit rejected values (avoiding Pydantic echo of secrets). The `@tool` fallback in `arcade-tdk` no longer surfaces `str(exception)` to agents; it returns exception type-only* in `message` while preserving full detail in `developer_message`. > > Adds a shared `build_tool_error_log_extra` helper and updates `arcade-serve` + `arcade-mcp-server` to emit consistent structured WARNING logs (`error_*`, `tool_name`, optional toolkit/version) for Datadog, while MCP error responses now append `additional_prompt_content` and force `structuredContent=None` on failures per spec. Includes extensive new tests and bumps package versions (`arcade-core` 4.6.2, `arcade-tdk` 3.6.1, `arcade-mcp-server` 1.19.3, `arcade-serve` 3.2.3). > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit e5c7ebcaf56176cfbd8e6d1f2b6295352abd0ec0. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 20:10:51 -03:00
Evan Tahler	a921b76ce9	Add tool name and version to otel spans for worker (#690 ) Knowing which tools we called seems important <img width="1554" height="293" alt="Screenshot 2025-11-20 at 1 36 48 PM" src="https://github.com/user-attachments/assets/1b3cf297-749f-4ea1-9d1c-606707540d5b" /> Closes PLT-748	2025-11-21 15:53:54 -08:00
Eric Gustin	3424ec8219	MCP Local (#563 ) Versions: * arcade-mcp\==1.0.0rc1 * arcade-mcp-server\==1.0.0rc1 * arcade-core\==2.5.0rc1 * arcade-tdk\==2.6.0rc1 * arcade-serve\==2.2.0rc1 ### Summary Adds first-class MCP support across Arcade, introduces a new MCP server and CLI, unifies the project under the arcade-mcp name, overhauls templates/scaffolding, and improves developer tooling, secrets management, and examples. ### Highlights - MCP Server & Core - New MCP server with stdio and HTTP/SSE transports, session management, resumability, and lifecycle handling. - FastAPI-like `MCPApp` for building servers with lazy init; integrated worker+MCP HTTP app option. - Middleware system (logging and error handling), robust exception hierarchy, and Pydantic-based settings. - Async-safe managers for tools, resources, and prompts backed by registries and locks. - Developer-facing, transport-agnostic runtime context interfaces (logs, tools, prompts, resources, sampling, UI, notifications). - Conversion from Arcade ToolDefinition to MCP tool schema; OpenAI JSON tool schema converter. - Parser supports `@app.tool`/`@app.tool(...)` decorators. - CLI - New `mcp` command to run MCP servers with stdio or HTTP/SSE. - New `secret` command to set/list/unset tool secrets (supports .env input, preserves original casing for lookups). - `new` command refactored; option to create a full toolkit package with scaffolding. - `chat` command removed. - `serve.py` imports updated to `arcade_serve.fastapi.telemetry`; version retrieval now uses `arcade-mcp`. - `show.py` refactor to use new local catalog utilities. - `display_tool_details` improved: adds “Default” column and handles nested properties. - Configuration & Discovery - New `configure.py` to set up Claude Desktop, Cursor, and VS Code to connect to local or Arcade Cloud MCP servers. - Discovery utilities to find/install toolkits, build `ToolCatalog`s, analyze files for tools, load kits from directories (pyproject parsing), and build minimal toolkits. - Better handling of provider API key resolution and evaluation suite loading. - Templates & Scaffolding - Reorganized template structure (minimal vs full); moved `.pre-commit-config.yaml`, `.ruff.toml`, license, Makefile, README, tests, and tools layout to correct paths. - Minimal template adds `.env.example` for runtime secret injection. - Template pyproject updated for MCP servers; includes sample server with greeting and secret-reveal tools. - Authorization flow in templates simplified. - Repo-wide Renaming & Examples - Migrates references from `arcade-ai` to `arcade-mcp` across READMEs, scripts, and package metadata. - Examples updated (LangChain/LangGraph/AI SDK/TypeScript) and package name changed to `arcade-mcp-sdk`. - Evals & Core Utilities - Evals now use OpenAI tooling format (`OpenAIToolList`, `to_openai`); `tool_eval` takes `provider_api_key`. - Core utilities: fixed `does_function_return_value` by dedenting before parse; version bump to `2.5.0rc1` and dependency cleanup. - Tooling & CI - `setup-uv-env` action splits toolkit vs contrib dependency installation. - Pre-commit: excludes `libs/arcade-mcp-server/mkdocs.yml` and `libs/tests/` from YAML and Ruff hooks; Ruff per-file ignores (e.g., C901 in `libs/*/.py`, TRY400 in server docs paths). - Makefile updates for uv env setup, quality checks, tests, builds, and new `shell` target. - Added Makefile to MCP server library to streamline dev workflow. - Cleanup - Removed `claude.json` config. - Simplified stdio entrypoint; removed unused imports (`arcade_gmail`, `arcade_search`). ### Breaking Changes - CLI: `chat` command removed; use `mcp`, `secret`, and updated `new`. - Naming: All users should update references from `arcade-ai` to `arcade-mcp`. - Templates: File paths moved; downstream scripts referencing old template locations may need updates. ### Getting Started - Run an MCP server: - `arcade mcp --stdio --toolkits your_toolkit` - `arcade mcp --http --toolkits your_toolkit` - Manage secrets: - `arcade secret set your_toolkit KEY=value` - `arcade secret list your_toolkit` - `arcade secret unset your_toolkit KEY` - Configure clients: - `arcade configure` to set up Claude Desktop, Cursor, and VS Code for local/Arcade Cloud MCP. --------- Co-authored-by: Sam Partee <sam@arcade-ai.com> Co-authored-by: Shub <125150494+shubcodes@users.noreply.github.com>	2025-09-25 15:28:15 -07:00
Eric Gustin	f4558ef3a8	Tool Error Handling (#539 ) # Improvements to Arcade TDK Error Handling I tried my very best to not make any breaking changes in this PR. So, you will notice various "Deprecation" notices throughout. ### Instructions for PR reviewers 1. Pull down this PR's branch 2. Pull down the Engine's tool error handling PR's branch 3. Update your installed arcadepy to have the following: - In `arcadepy/resources/tools/tools.py`, if you want to test out including stacktraces, then you need to update `ToolsResource.execute` to accept a `include_error_stacktrace` argument and also include the "include_error_stacktrace" argument to the POST to the Engine inside of the function's execute method's body. - In `arcadepy/types/execute_tool_response.py` add the following enum ```py class ErrorKind(str, Enum): """Error kind that is comprised of - the who (toolkit, tool, upstream) - the when (load time, definition parsing time, runtime) - the what (bad_definition, bad_input, bad_output, retry, context_required, fatal, etc.)""" TOOLKIT_LOAD_FAILED = "TOOLKIT_LOAD_FAILED" TOOL_DEFINITION_BAD_DEFINITION = "TOOL_DEFINITION_BAD_DEFINITION" TOOL_DEFINITION_BAD_INPUT_SCHEMA = "TOOL_DEFINITION_BAD_INPUT_SCHEMA" TOOL_DEFINITION_BAD_OUTPUT_SCHEMA = "TOOL_DEFINITION_BAD_OUTPUT_SCHEMA" TOOL_RUNTIME_BAD_INPUT_VALUE = "TOOL_RUNTIME_BAD_INPUT_VALUE" TOOL_RUNTIME_BAD_OUTPUT_VALUE = "TOOL_RUNTIME_BAD_OUTPUT_VALUE" TOOL_RUNTIME_RETRY = "TOOL_RUNTIME_RETRY" TOOL_RUNTIME_CONTEXT_REQUIRED = "TOOL_RUNTIME_CONTEXT_REQUIRED" TOOL_RUNTIME_FATAL = "TOOL_RUNTIME_FATAL" UPSTREAM_RUNTIME_BAD_REQUEST = "UPSTREAM_RUNTIME_BAD_REQUEST" UPSTREAM_RUNTIME_AUTH_ERROR = "UPSTREAM_RUNTIME_AUTH_ERROR" UPSTREAM_RUNTIME_NOT_FOUND = "UPSTREAM_RUNTIME_NOT_FOUND" UPSTREAM_RUNTIME_VALIDATION_ERROR = "UPSTREAM_RUNTIME_VALIDATION_ERROR" UPSTREAM_RUNTIME_RATE_LIMIT = "UPSTREAM_RUNTIME_RATE_LIMIT" UPSTREAM_RUNTIME_SERVER_ERROR = "UPSTREAM_RUNTIME_SERVER_ERROR" UPSTREAM_RUNTIME_UNMAPPED = "UPSTREAM_RUNTIME_UNMAPPED" UNKNOWN = "UNKNOWN" ``` - In `arcadepy/types/execute_tool_response.py` add the following fields to OutputError: ```py kind: ErrorKind status_code: Optional[int] = None stacktrace: Optional[str] = None extra: Optional[dict[str, Any]] = None ``` ### Example Client Usage ```py # Example of handling an upstream rate limit error = response.output.error if error and error.kind == ErrorKind.UPSTREAM_RUNTIME_RATE_LIMIT: sleep_time = error.retry_after_ms / 1000 time.sleep(sleep_time) # and then execute again ``` ```py # Examples of determining what type of runtime error it is error = response.output.error if error: is_retryable_error = error.kind == ErrorKind.TOOL_RUNTIME_RETRY is_a_bug_in_the_tool = error.kind == ErrorKind.TOOL_RUNTIME_FATAL is_additional_context_required = error.kind == ErrorKind.TOOL_RUNTIME_CONTEXT_REQUIRED ``` ### Example Tool Usage ```py # EXAMPLE 1 letting Arcade handle upstream error handling for you reddit_client.post(params) # Arcade's httpx adapter will handle error handling for you! # ------------------------------------ # EXAMPLE 2 handling upstream bad request yourself, but letting Arcade handle the rest try: reddit_client.post(params) except httpx.HTTPStatusError as e: if e.status_code == 400: raise UpstreamError("My extra custom message) from e raise ``` ```py # EXAMPLE 1 letting Arcade handle it for you risky_element = my_risky_list[42] # Arcade will raise a FatalToolError for you # ------------------------------------ # EXAMPLE 2 handling it yourself for extra flexibility try: risky_element = my_risky_list[42] except IndexError as e: raise FatalToolError("My extra custom message") from e ``` ### Non-runtime Error Message Examples Example ToolkitLoadError Messages: ``` - [TOOLKIT_LOAD_FAILED] ToolkitLoadError when loading toolkit 'sample_tool': Could not import module mock_module. Reason: Mock import error - [TOOLKIT_LOAD_FAILED] ToolkitLoadError when loading toolkit 'test_toolkit': Tool 'ValidTool' in toolkit 'test_toolkit' already exists in the catalog. ``` Example ToolDefinitionError Messages ``` - [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_missing_description': Tool 'tool_missing_description' is missing a description - [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_with_invalid_secret_type': Secret keys must be strings (error in tool ToolWithInvalidSecretType). - [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_with_empty_secret': Secrets must have a non-empty key (error in tool ToolWithEmptySecret). - [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_with_invalid_metadata_type': Metadata must be strings (error in tool ToolWithInvalidMetadataType). - [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_with_metadata_requiring_auth_without_auth': Tool ToolWithMetadataRequiringAuthWithoutAuth declares metadata key 'client_id', which requires that the tool has an auth requirement, but no auth requirement was provided. Please specify an auth requirement. - [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_with_empty_metadata': Metadata must have a non-empty key (error in tool ToolWithEmptyMetadata). - [TOOL_DEFINITION_BAD_DEFINITION] ToolDefinitionError in definition of tool 'tool_with_unsupported_param_type': Unsupported parameter type: <class 'test_catalog.MyFancyTestClass'> ``` Example ToolInputSchemaError Messages ``` - [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_missing_input_parameter_annotation': Parameter 'input_text' is missing a description - [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_no_type_annotation': Parameter param has no type annotation. - [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_invalid_param_name': Invalid parameter name: '123invalid' is not a valid identifier. Identifiers must start with a letter or underscore, and can only contain letters, digits, or underscores. - [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_too_many_annotations': Parameter param: Annotated[str, 'name', 'desc', 'extra'] has too many string annotations. Expected 0, 1, or 2, got 3. - [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_required_union_param': Parameter param is a union type. Only optional types are supported. - [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_non_callable_default_factory': Default factory for parameter param: Annotated[str, 'Parameter'] = FieldInfo(annotation=NoneType, required=False, default_factory=str) is not callable. - [TOOL_DEFINITION_BAD_INPUT_SCHEMA] ToolInputSchemaError in definition of tool 'tool_with_multiple_tool_contexts': Only one ToolContext parameter is supported, but tool tool_with_multiple_tool_contexts has multiple. ``` Example ToolOutputSchemaError Messages ``` - [TOOL_DEFINITION_BAD_OUTPUT_SCHEMA] ToolOutputSchemaError in definition of tool 'tool_missing_return_type_hint': Tool 'ToolMissingReturnTypeHint' must have a return type - [TOOL_DEFINITION_BAD_OUTPUT_SCHEMA] ToolOutputSchemaError in definition of tool 'tool_with_unsupported_output_type': Unsupported output type '<class 'test_catalog.MyFancyTestClass'>'. Only built-in Python types, TypedDicts, Pydantic models, and standard collections are supported as tool output types. ``` ### Runtime Error Message Examples Example Tool Runtime Error Messages ``` - [TOOL_RUNTIME_FATAL] FatalToolError during execution of tool 'get_posts_in_subreddit': list index out of range - [TOOL_RUNTIME_CONTEXT_REQUIRED] ContextRequiredToolError during execution of tool 'get_posts_in_subreddit': Ambiguous username. Please provide a more specific username - [TOOL_RUNTIME_RETRY] RetryableToolError during execution of tool 'get_posts_in_subreddit': Retry with subreddit=learnpython or subreddit=learnprogramming ``` Example Upstream Runtime Error Messages ``` - [UPSTREAM_RUNTIME_RATE_LIMIT] UpstreamRateLimitError during execution of tool 'get_posts_in_subreddit': 429 Client Error: Too Many Requests - [UPSTREAM_RUNTIME_BAD_REQUEST] UpstreamError during execution of tool 'get_posts_in_subreddit': 400 Client Error: Bad request. Missing 'id' parameter. - [UPSTREAM_RUNTIME_BAD_REQUEST] UpstreamError during execution of tool 'search_files': Upstream Google API error: Invalid value '-23'. Values must be within the range: [value: 1\n, value: 1000\n] ```	2025-09-10 10:45:18 -07:00
Sam Partee	b6b4cd0a4c	🏗️ Restructure: Multi-Package Architecture + uv Migration (#412 ) ### Overview Major restructuring from monolithic `arcade-ai` package to modular library architecture with standardized uv-based dependency management. ![arcade-ai Monorepo (2)](https://github.com/user-attachments/assets/25f102b0-bb87-4a04-9701-d227d05664b1) ### New Package Structure - `arcade-tdk` - Lightweight toolkit development kit (core decorators, auth) - `arcade-core` - Core execution engine and catalog functionality - `arcade-serve` - FastAPI/MCP server components - `arcade-ai` - Meta package that includes CLI functionality. Optionally include evals via the `evals` extra. Optionally include all packages via the `all` extra. ### Key Benefits - Lighter Dependencies: Toolkits now depend only on `arcade-tdk` (~2 deps) vs full `arcade-ai` (~30+ deps) - Faster Builds: uv provides 10-100x faster dependency resolution and installation - Better Modularity: Clear separation of concerns, consumers import only what they need - Standard Tooling: Eliminates custom poetry scripts, uses standard Python packaging ### Migration Impact - All 20 toolkits converted from poetry → uv with `arcade-tdk` dependencies plus `arcade-ai[evals]` and `arcade-serve` dev dependencies. When developing locally, devs should install toolkits via `make install-local`. - Modern Python 3.10+ type hints throughout - Standardized build system with hatchling backend - Enhanced Makefile with robust toolkit management commands - Removed `arcade dev` CLI command - Reduce the number of files created by `arcade new` and add an option to not generate a tests and evals folder. This foundation enables faster development cycles and cleaner dependency chains for the growing toolkit ecosystem. ### Todo After this PR is merged - [ ] Post-merge workflow(s) (release & publish containers, etc) - [ ] Release order plan. @EricGustin suggests releasing in the following order: 1. `arcade-core` version 0.1.0 2. `arcade-serve` version 0.1.0 and `arcade-tdk` version 0.1.0 3. `arcade-ai` version 2.0.0 4. Patch release for all toolkits (all changes in toolkits are internal refactors) - [ ] [Update docs](https://github.com/ArcadeAI/docs/pull/318) --------- Co-authored-by: Eric Gustin <eric@arcade.dev> Co-authored-by: Eric Gustin <34000337+EricGustin@users.noreply.github.com>	2025-06-11 16:48:17 -07:00

Renamed from arcade/arcade/worker/core/base.py (Browse further)

6 commits