arcade-mcp/libs/arcade-mcp-server/arcade_mcp_server/logging_utils.py
Francisco Or Something 1492c80fc5
TOO-627: Improve error messages for agents and Datadog (#814)
## Summary

- Improve tool call error messages across 4 libraries (arcade-core,
arcade-tdk, arcade-mcp-server, arcade-serve) so agents can self-correct
and Datadog can facet on structured fields
- Guard empty error messages, enrich input validation errors with
field-level detail, fix `@tool` decorator fallback formatting, surface
`additional_prompt_content` in MCP responses, and add structured log
extras for Datadog
- Addresses the 3 worst error patterns: generic "Error in tool input
deserialization", bare `KeyError` values, and empty `FatalToolError`
messages

**Linear:** TOO-627
**Plan:** `docs/plans/2026-04-08-improve-error-messages-handoff.md`

## Tasks

- [ ] Task 1: Guard empty error messages (arcade-core)
- [ ] Task 2: Enrich input validation error messages (arcade-core)
- [ ] Task 3: Improve `@tool` decorator error fallback (arcade-tdk)
- [ ] Task 4: Fix MCP agent-facing error response (arcade-mcp-server)
- [ ] Task 5: Add structured log extras in BaseWorker (arcade-serve)
- [ ] Task 6: Add structured log extras in MCP server
(arcade-mcp-server)

## Test plan

- [ ] Each task has dedicated unit tests verifying the new behavior
- [ ] `make test` passes after all tasks
- [ ] `make check` (ruff + mypy) passes
- [ ] Verify the 3 worst error patterns now produce actionable messages

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Touches cross-library error formatting and logging behavior used in
production tool execution paths; while mostly additive/guardrails, it
changes agent-visible messages and Datadog log facets, which could
impact client expectations and alerting.
> 
> **Overview**
> Improves tool-call error handling across core/runtime, MCP transport,
worker transport, and the TDK to make agent-visible failures more
actionable while *reducing sensitive-data leakage*.
> 
> In `arcade-core`, empty error messages now get placeholders,
`ToolOutputFactory.fail*` defaults blank messages, and input validation
errors are rewritten as field-level summaries that intentionally omit
rejected values (avoiding Pydantic echo of secrets). The `@tool`
fallback in `arcade-tdk` no longer surfaces `str(exception)` to agents;
it returns exception *type-only* in `message` while preserving full
detail in `developer_message`.
> 
> Adds a shared `build_tool_error_log_extra` helper and updates
`arcade-serve` + `arcade-mcp-server` to emit consistent structured
WARNING logs (`error_*`, `tool_name`, optional toolkit/version) for
Datadog, while MCP error responses now append
`additional_prompt_content` and force `structuredContent=None` on
failures per spec. Includes extensive new tests and bumps package
versions (`arcade-core` 4.6.2, `arcade-tdk` 3.6.1, `arcade-mcp-server`
1.19.3, `arcade-serve` 3.2.3).
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
e5c7ebcaf56176cfbd8e6d1f2b6295352abd0ec0. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 20:10:51 -03:00

122 lines
4.7 KiB
Python

"""Shared logging utilities for MCP server."""
import logging
import sys
from typing import ClassVar
from loguru import logger
class LoguruInterceptHandler(logging.Handler):
"""Intercept standard logging and route to Loguru.
This handler bridges the standard Python logging module with Loguru,
ensuring all logs (from both systems) use the same formatting.
Preserves custom 'extra' fields (used for structured logging).
"""
# Standard fields populated on every ``logging.LogRecord`` by the stdlib
# logging module itself. They are NOT custom extras passed via
# ``logger.warning(..., extra={...})`` and forwarding them to loguru would
# both clobber loguru's own keys and pollute the structured log payload
# with redundant data.
#
# The set is the union of:
# * Attributes assigned in ``logging.LogRecord.__init__`` (CPython source:
# Lib/logging/__init__.py — ``LogRecord.__init__``). This covers
# ``name, msg, args, levelname, levelno, pathname, filename, module,
# exc_info, exc_text, stack_info, lineno, funcName, created, msecs,
# relativeCreated, thread, threadName, processName, process``.
# * ``taskName`` — added in Python 3.12 by ``LogRecord.__init__`` for
# asyncio task identification. Included unconditionally so the set
# stays correct on 3.10-3.11 (key just won't be present at runtime).
# * ``message`` — set lazily by ``LogRecord.getMessage()``; appears on
# ``record.__dict__`` after formatting.
# * ``asctime`` — set lazily by formatters that reference ``%(asctime)s``.
#
# Anything else in ``record.__dict__`` is, by stdlib convention, a custom
# extra passed via the ``extra={...}`` kwarg, and IS forwarded to loguru.
_STANDARD_FIELDS: ClassVar[set[str]] = {
"name",
"msg",
"args",
"created",
"msecs",
"levelname",
"levelno",
"pathname",
"filename",
"module",
"exc_info",
"exc_text",
"stack_info",
"lineno",
"funcName",
"relativeCreated",
"thread",
"threadName",
"processName",
"process",
"taskName",
"message",
"asctime",
}
def emit(self, record: logging.LogRecord) -> None:
# Resolve the loguru level: prefer the named level (DEBUG/INFO/WARNING/
# ERROR/CRITICAL — all built into loguru), and for custom or otherwise
# unknown levels fall back to the integer ``levelno``. We pass an int
# (not ``str(levelno)``) because loguru only resolves stringified
# levels via name lookup — passing ``"42"`` raises ValueError, while
# passing the integer ``42`` is accepted directly per loguru's API.
level: str | int
try:
level = logger.level(record.levelname).name
except ValueError:
level = record.levelno
# Extract custom extra fields from the log record.
# These are fields added via logger.warning(..., extra={...}).
# By preserving them and passing to loguru via bind(), they become
# part of the log record's extra dict, enabling Datadog faceting.
extras = {k: v for k, v in record.__dict__.items() if k not in self._STANDARD_FIELDS}
# Bind extras to loguru context, then log the message.
# logger.bind() merges extras into the record["extra"] dict,
# making them available to sinks (e.g., Datadog, ELK) that support
# structured logging.
logger.bind(**extras).opt(exception=record.exc_info).log(level, record.getMessage())
def intercept_standard_logging() -> None:
"""Configure standard logging to route through Loguru.
This should be called after Loguru is configured to ensure all
standard logging calls are intercepted and formatted consistently.
"""
logging.basicConfig(handlers=[LoguruInterceptHandler()], level=0, force=True)
def setup_logging(level: str = "INFO", stdio_mode: bool = False) -> None:
"""Configure logging with Loguru."""
logger.remove()
# In stdio mode, use stderr (stdout is reserved for JSON-RPC)
sink = sys.stderr if stdio_mode else sys.stdout
if level == "DEBUG":
format_str = "<level>{level: <8}</level> | <green>{time:HH:mm:ss}</green> | <cyan>{name}:{line}</cyan> | <level>{message}</level>"
else:
format_str = (
"<level>{level: <8}</level> | <green>{time:HH:mm:ss}</green> | <level>{message}</level>"
)
logger.add(
sink,
format=format_str,
level=level,
colorize=(not stdio_mode),
diagnose=(level == "DEBUG"),
)
intercept_standard_logging()