4 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
113d0d3086
|
CLI Usage (#593)
TLDR;
The philosophy of CLI usage is "fire and forget" and "best effort". You
can opt out by setting `ARCADE_USAGE_TRACKING=0`.
We are capturing two events: `CLI execution succeeded` and `CLI
execution failed`. Reporting to PostHog is a short lived (maximum 10
seconds) subprocess that does not block the main CLI execution process.
`~/.arcade/usage.json` persists two values `anon_id` and
`linked_principal_id`. The logged in status of the CLI user determines
which ID is used. Upon `arcade login`, the `anon_id` is aliased with
`linked_principal_id`. Upon `arcade logout` the `linked_principal_id` is
removed and the `anon_id` is rotated.
## CLI Usage Tracking - How It Works
The usage tracking system implements an identity management and event
tracking pipeline. Here's how the pieces work together:
### **Identity State Management (`usage.json`)**
The system maintains a persistent identity file at
`~/.arcade/usage.json` with this structure:
```json
{
"anon_id": "uuid",
"linked_principal_id": "uuid" | null
}
```
**Key mechanics:**
- **`anon_id`**: Generated once on first CLI use and persists across
sessions. This UUID tracks all anonymous activity.
- **`linked_principal_id`**: Initially `null`. Once the user logs in and
we successfully alias their identity, this field stores their
`principal_id` to indicate this `anon_id` has been linked.
- **Atomic writes**: All updates use a temp file + atomic rename pattern
to prevent corruption from concurrent CLI processes
- **File locking**: Uses `fcntl` (Unix) to coordinate reads/writes
across multiple simultaneous CLI invocations
- **In-memory cache**: The `UsageIdentity` class caches the loaded data
to avoid repeated file I/O within a single CLI invocation
### **Identity Resolution Flow**
When tracking an event, the system determines the `distinct_id` (who to
attribute the event to) via this waterfall:
1. **Check `linked_principal_id`** in `usage.json`
- If present → use it (user was previously aliased)
- This is the fastest path and avoids API calls
2. **Fetch `principal_id` from Arcade Cloud API**
- Makes HTTP request to `/api/v1/auth/validate` with the user's API key
from `~/.arcade/credentials.yaml`
- If authenticated → returns `principal_id`
- Has 2s timeout for responsiveness
3. **Fall back to `anon_id`**
- If not authenticated or API call fails → use anonymous ID
- Marks event with `is_anon=True` flag
### **The Aliasing Lifecycle**
PostHog aliasing links anonymous activity to authenticated users. Here's
the state machine:
#### **Stage 1: Anonymous User**
```
usage.json: { "anon_id": "abc-123", "linked_principal_id": null }
All events → sent with distinct_id="abc-123" and is_anon=True
```
#### **Stage 2: Login Event**
1. User runs `arcade login`
2. Command completes successfully (auth token saved)
3. `CommandTracker` detects successful login
4. Fetches `principal_id` from API
5. Checks `should_alias()` → returns `True` because
`linked_principal_id` is `null`
6. **Calls `alias()` synchronously** (blocking):
```python
posthog.alias(previous_id="abc-123", distinct_id="zyx-321")
```
7. Updates `usage.json`:
```json
{ "anon_id": "abc-123", "linked_principal_id": "zyx-321" }
```
8. PostHog backend merges all events with `distinct_id="abc-123"` into
the user profile for `"zyx-321"`
#### **Stage 3: Authenticated User**
```
usage.json: { "anon_id": "abc-123", "linked_principal_id": "zyx-321" }
All events → sent with distinct_id="zyx-321" and is_anon=False
```
- Events are directly attributed to the authenticated user
- No more API calls needed (uses cached `linked_principal_id`)
#### **Stage 4: Logout Event**
1. User runs `arcade logout`
2. Logout event is sent with the authenticated `distinct_id`
3. `CommandTracker` detects successful logout
4. **Rotates identity** by calling `reset_to_anonymous()`:
```json
{ "anon_id": "xyz-789", "linked_principal_id": null }
```
5. New `anon_id` prevents cross-contamination if another user logs in
### **Critical Constraint: Alias Timing**
PostHog requires that `alias()` is called **BEFORE** any events are sent
with the new `distinct_id`. This is why:
- **`alias()` is synchronous (blocking)**: Guarantees it completes
before the login success event is sent
- **Subsequent events use `linked_principal_id`**: Once aliased, all
future events use the authenticated ID
- **Lazy aliasing**: If a user authenticates via another mechanism (not
through `arcade login`), the system detects this on the next command and
performs aliasing before sending that command's event
### **Event Capture Pipeline**
When `CommandTracker.track_command_execution()` is called:
1. **Resolve identity** → determines `distinct_id` and `is_anon` flag
2. **Build event properties**:
```python
{
"command_name": "toolkit.run",
"cli_version": "1.2.3",
"python_version": "3.11.0",
"os_type": "Darwin",
"os_release": "23.4.0",
"duration": 1250.42, # milliseconds
"error_message": "..." # if failed
}
```
3. **Call `UsageService.capture()`**:
- Serializes event data to JSON
- Spawns detached subprocess: `python -m arcade_cli.usage`
- Passes data via `ARCADE_USAGE_EVENT_DATA` env var
- **Returns immediately** (non-blocking)
4. **Detached subprocess (`__main__.py`)**:
- Runs independently, survives parent CLI exit
- Deserializes event data
- If `is_anon=True`, sets `$process_person_profile=False` (tells PostHog
not to create a full profile)
- Sends event to PostHog with 5s timeout
- Exits (hard exit after 10s max via timeout thread)
### **Concurrency Handling**
Multiple CLI processes can run simultaneously. The system handles this
via:
- **File locking** on `usage.json` (shared lock for reads, exclusive for
writes)
- **Atomic writes** via temp files ensure incomplete writes never
corrupt the file
- **Idempotent aliasing**: `should_alias()` prevents redundant alias
calls
### **Edge Cases Handled**
1. **Side-channel authentication**: User authenticates outside of
`arcade login` (e.g., manually editing credentials)
- Detected via "lazy aliasing" check on every command
- Performs alias if `linked_principal_id` doesn't match current
`principal_id`
2. **API failures during identity fetch**: Falls back to anonymous
tracking
- 2s timeout prevents hanging
- Silent failure doesn't disrupt CLI
3. **PostHog merge restrictions**: Can't alias returning users who
already have a profile
- System stores `linked_principal_id` to avoid retrying impossible
aliases
- New users (never logged in before) get full history stitched
4. **Multiple accounts on same machine**: Logout rotates `anon_id`
- User A's anonymous activity won't leak into User B's profile
### **Privacy & Performance**
- **Opt-out**: `ARCADE_USAGE_TRACKING=0` disables all tracking
- **Non-blocking**: Events never slow down CLI (detached subprocess)
- **Anonymous profiles**: `$process_person_profile=False` for `anon_id`
events minimizes data collection
- **Silent failures**: Network issues or PostHog errors never surface to
users
|
||
|
|
3424ec8219
|
MCP Local (#563)
Versions: * arcade-mcp\==1.0.0rc1 * arcade-mcp-server\==1.0.0rc1 * arcade-core\==2.5.0rc1 * arcade-tdk\==2.6.0rc1 * arcade-serve\==2.2.0rc1 ### Summary Adds first-class MCP support across Arcade, introduces a new MCP server and CLI, unifies the project under the arcade-mcp name, overhauls templates/scaffolding, and improves developer tooling, secrets management, and examples. ### Highlights - **MCP Server & Core** - New MCP server with stdio and HTTP/SSE transports, session management, resumability, and lifecycle handling. - FastAPI-like `MCPApp` for building servers with lazy init; integrated worker+MCP HTTP app option. - Middleware system (logging and error handling), robust exception hierarchy, and Pydantic-based settings. - Async-safe managers for tools, resources, and prompts backed by registries and locks. - Developer-facing, transport-agnostic runtime context interfaces (logs, tools, prompts, resources, sampling, UI, notifications). - Conversion from Arcade ToolDefinition to MCP tool schema; OpenAI JSON tool schema converter. - Parser supports `@app.tool`/`@app.tool(...)` decorators. - **CLI** - New `mcp` command to run MCP servers with stdio or HTTP/SSE. - New `secret` command to set/list/unset tool secrets (supports .env input, preserves original casing for lookups). - `new` command refactored; option to create a full toolkit package with scaffolding. - `chat` command removed. - `serve.py` imports updated to `arcade_serve.fastapi.telemetry`; version retrieval now uses `arcade-mcp`. - `show.py` refactor to use new local catalog utilities. - `display_tool_details` improved: adds “Default” column and handles nested properties. - **Configuration & Discovery** - New `configure.py` to set up Claude Desktop, Cursor, and VS Code to connect to local or Arcade Cloud MCP servers. - Discovery utilities to find/install toolkits, build `ToolCatalog`s, analyze files for tools, load kits from directories (pyproject parsing), and build minimal toolkits. - Better handling of provider API key resolution and evaluation suite loading. - **Templates & Scaffolding** - Reorganized template structure (minimal vs full); moved `.pre-commit-config.yaml`, `.ruff.toml`, license, Makefile, README, tests, and tools layout to correct paths. - Minimal template adds `.env.example` for runtime secret injection. - Template pyproject updated for MCP servers; includes sample server with greeting and secret-reveal tools. - Authorization flow in templates simplified. - **Repo-wide Renaming & Examples** - Migrates references from `arcade-ai` to `arcade-mcp` across READMEs, scripts, and package metadata. - Examples updated (LangChain/LangGraph/AI SDK/TypeScript) and package name changed to `arcade-mcp-sdk`. - **Evals & Core Utilities** - Evals now use OpenAI tooling format (`OpenAIToolList`, `to_openai`); `tool_eval` takes `provider_api_key`. - Core utilities: fixed `does_function_return_value` by dedenting before parse; version bump to `2.5.0rc1` and dependency cleanup. - **Tooling & CI** - `setup-uv-env` action splits toolkit vs contrib dependency installation. - Pre-commit: excludes `libs/arcade-mcp-server/mkdocs.yml` and `libs/tests/` from YAML and Ruff hooks; Ruff per-file ignores (e.g., C901 in `libs/**/*.py`, TRY400 in server docs paths). - Makefile updates for uv env setup, quality checks, tests, builds, and new `shell` target. - Added Makefile to MCP server library to streamline dev workflow. - **Cleanup** - Removed `claude.json` config. - Simplified stdio entrypoint; removed unused imports (`arcade_gmail`, `arcade_search`). ### Breaking Changes - **CLI**: `chat` command removed; use `mcp`, `secret`, and updated `new`. - **Naming**: All users should update references from `arcade-ai` to `arcade-mcp`. - **Templates**: File paths moved; downstream scripts referencing old template locations may need updates. ### Getting Started - Run an MCP server: - `arcade mcp --stdio --toolkits your_toolkit` - `arcade mcp --http --toolkits your_toolkit` - Manage secrets: - `arcade secret set your_toolkit KEY=value` - `arcade secret list your_toolkit` - `arcade secret unset your_toolkit KEY` - Configure clients: - `arcade configure` to set up Claude Desktop, Cursor, and VS Code for local/Arcade Cloud MCP. --------- Co-authored-by: Sam Partee <sam@arcade-ai.com> Co-authored-by: Shub <125150494+shubcodes@users.noreply.github.com> |
||
|
|
27a6cd31a3
|
Support Tool Output in ValueSchema of ToolDefinition (#487)
## Before
### Tool: ``GoogleNews.SearchNewsStories``
```python
@tool(requires_secrets=["SERP_API_KEY"])
async def search_news_stories(
context: ToolContext,
keywords: Annotated[
str,
"Keywords to search for news articles. E.g. 'Apple launches new iPhone'.",
],
country_code: Annotated[
CountryCode | None,
"2-character country code to search for news articles. "
"E.g. 'us' (United States). "
f"Defaults to '{DEFAULT_GOOGLE_NEWS_COUNTRY}'.",
] = None,
language_code: Annotated[
LanguageCode,
"2-character language code to search for news articles. E.g. 'en' (English). "
f"Defaults to '{DEFAULT_GOOGLE_NEWS_LANGUAGE}'.",
] = DEFAULT_GOOGLE_NEWS_LANGUAGE,
limit: Annotated[
int | None,
"Maximum number of news articles to return. Defaults to None "
"(returns all results found by the API).",
] = None,
) -> Annotated[dict[str, Any]]:
"""Search for news articles related to a given query."""
...
```
### Tool Definition: ``GoogleNews.SearchNewsStories``
```
{
"name": "SearchNewsStories",
"fully_qualified_name": "GoogleNews.SearchNewsStories",
"description": "Search for news articles related to a given query.",
"toolkit": {
"name": "GoogleNews",
"description": "Arcade.dev LLM tools for getting new via Google News",
"version": "2.0.0"
},
"input": {
"parameters": [
{
"name": "keywords",
"required": true,
"description": "Keywords to search for news articles. E.g. 'Apple launches new iPhone'.",
"value_schema": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
},
"inferrable": true
},
{
"name": "country_code",
"required": false,
"description": "2-character country code to search for news articles. E.g. 'us' (United States). Defaults to 'None'.",
"value_schema": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
},
"inferrable": true
},
{
"name": "language_code",
"required": false,
"description": "2-character language code to search for news articles. E.g. 'en' (English). Defaults to 'en'.",
"value_schema": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
},
"inferrable": true
},
{
"name": "limit",
"required": false,
"description": "Maximum number of news articles to return. Defaults to None (returns all results found by the API).",
"value_schema": {
"val_type": "integer",
"inner_val_type": null,
"enum": null,
},
"inferrable": true
}
]
},
"output": {
"description": "News search results with article details.",
"available_modes": [
"value",
"error"
],
"value_schema": {
"val_type": "json"
}
},
"requirements": {
"authorization": null,
"secrets": [
{
"key": "serp_api_key"
}
],
"metadata": null
},
"deprecation_message": null
},
```
## After
### Enhanced Tool: ``GoogleNews.SearchNewsStories``
```python
"""Type definitions for Google News API responses and parameters."""
from typing_extensions import TypedDict
CountryCode = str
LanguageCode = str
class SearchNewsParams(TypedDict):
"""Input parameters for searching news articles."""
keywords: str
"""Search query terms to find relevant news articles \
(e.g., 'Apple launches new iPhone')."""
country_code: CountryCode | None
"""Optional 2-letter country code to filter news by region \
(e.g., 'us' for United States, 'uk' for United Kingdom)."""
language_code: LanguageCode | None
"""Optional 2-letter language code to filter news by language \
(e.g., 'en' for English, 'es' for Spanish)."""
limit: int | None
"""Optional maximum number of news articles to return. \
If not specified, returns all results from the API."""
class SourceInfo(TypedDict, total=False):
"""Information about the news source/publication."""
name: str
"""Name of the publication (e.g., 'CNN', 'BBC News', 'The New York Times')."""
icon: str
"""URL to the source's favicon or logo image."""
authors: list[str]
"""List of author names for the article, if available."""
class NewsResult(TypedDict, total=False):
"""Individual news article from the Google News API response."""
position: int
"""Ranking position of this result in the search results."""
title: str
"""Headline or title of the news article."""
link: str
"""Full URL to the original news article."""
source: SourceInfo
"""Information about the publication source."""
date: str
"""Publication date and time (e.g., '2 hours ago', 'Dec 15, 2023')."""
snippet: str
"""Brief excerpt or summary from the article content."""
thumbnail: str
"""URL to a high-resolution thumbnail image for the article."""
thumbnail_small: str
"""URL to a low-resolution thumbnail image for the article."""
story_token: str
"""Token for accessing full coverage of this news story across multiple sources."""
stories: list["NewsResult"]
"""Related news stories from other sources covering the same topic."""
highlight: dict
"""Additional highlighted information about the story."""
class SearchMetadata(TypedDict, total=False):
"""Metadata about the search request and processing."""
id: str
"""Unique identifier for this search request within SerpApi."""
status: str
"""Current processing status ('Processing', 'Success', or 'Error')."""
json_endpoint: str
"""URL to retrieve the JSON results for this search."""
created_at: str
"""Timestamp when the search request was created."""
processed_at: str
"""Timestamp when the search request was processed."""
google_news_url: str
"""Original Google News URL that would return these results."""
total_time_taken: float
"""Total time in seconds taken to process this search."""
class SearchParameters(TypedDict, total=False):
"""Parameters used for the search request."""
engine: str
"""Search engine used (always 'google_news' for this API)."""
q: str
"""Search query string."""
gl: str
"""Country code used for geographic filtering."""
hl: str
"""Language code used for language filtering."""
topic_token: str
"""Token for accessing specific news topics (e.g., 'World', 'Business', 'Technology')."""
publication_token: str
"""Token for accessing news from specific publishers."""
class MenuLink(TypedDict):
"""Navigation link for news categories or topics."""
title: str
"""Display text for the menu item (e.g., 'Technology', 'Sports', 'Business')."""
topic_token: str
"""Token to access this specific topic or category."""
serpapi_link: str
"""SerpApi URL to search within this topic."""
class TopStoriesLink(TypedDict):
"""Link to top stories section."""
topic_token: str
"""Token to access top stories."""
serpapi_link: str
"""SerpApi URL to retrieve top stories."""
class GoogleNewsResponse(TypedDict, total=False):
"""Complete response from the Google News API."""
search_metadata: SearchMetadata
"""Metadata about the search request and processing."""
search_parameters: SearchParameters
"""Parameters that were used for this search."""
news_results: list[NewsResult]
"""List of news articles matching the search criteria."""
menu_links: list[MenuLink]
"""Navigation links to different news categories and topics."""
top_stories_link: TopStoriesLink
"""Link to access top stories."""
title: str
"""Title of the page or topic being displayed."""
class SimplifiedNewsResult(TypedDict):
"""Simplified news article format for tool output."""
title: str
"""Headline of the news article."""
link: str
"""URL to the full article."""
source: str | None
"""Name of the publication source."""
date: str | None
"""When the article was published."""
snippet: str | None
"""Brief excerpt from the article."""
class SearchNewsOutput(TypedDict):
"""Output format for the search_news_stories tool."""
news_results: list[SimplifiedNewsResult]
"""List of news articles in simplified format."""
@tool(requires_secrets=["SERP_API_KEY"])
async def search_news_stories(
context: ToolContext,
keywords: Annotated[
str,
"Keywords to search for news articles. E.g. 'Apple launches new iPhone'.",
],
country_code: Annotated[
CountryCode | None,
"2-character country code to search for news articles. "
"E.g. 'us' (United States). "
f"Defaults to '{DEFAULT_GOOGLE_NEWS_COUNTRY}'.",
] = None,
language_code: Annotated[
LanguageCode,
"2-character language code to search for news articles. E.g. 'en' (English). "
f"Defaults to '{DEFAULT_GOOGLE_NEWS_LANGUAGE}'.",
] = DEFAULT_GOOGLE_NEWS_LANGUAGE,
limit: Annotated[
int | None,
"Maximum number of news articles to return. Defaults to None "
"(returns all results found by the API).",
] = None,
) -> Annotated[SearchNewsOutput, "News search results with article details."]:
"""Search for news articles related to a given query."""
...
```
### Enhanced Tool Definition: ``GoogleNews.SearchNewsStories``
```json
{
"name": "SearchNewsStories",
"fully_qualified_name": "GoogleNews.SearchNewsStories",
"description": "Search for news articles related to a given query.",
"toolkit": {
"name": "GoogleNews",
"description": "Arcade.dev LLM tools for getting new via Google News",
"version": "2.0.0"
},
"input": {
"parameters": [
{
"name": "keywords",
"required": true,
"description": "Keywords to search for news articles. E.g. 'Apple launches new iPhone'.",
"value_schema": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": null
},
"inferrable": true
},
{
"name": "country_code",
"required": false,
"description": "2-character country code to search for news articles. E.g. 'us' (United States). Defaults to 'None'.",
"value_schema": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": null
},
"inferrable": true
},
{
"name": "language_code",
"required": false,
"description": "2-character language code to search for news articles. E.g. 'en' (English). Defaults to 'en'.",
"value_schema": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": null
},
"inferrable": true
},
{
"name": "limit",
"required": false,
"description": "Maximum number of news articles to return. Defaults to None (returns all results found by the API).",
"value_schema": {
"val_type": "integer",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": null
},
"inferrable": true
}
]
},
"output": {
"description": "News search results with article details.",
"available_modes": [
"value",
"error"
],
"value_schema": {
"val_type": "json",
"inner_val_type": null,
"enum": null,
"properties": {
"news_results": {
"val_type": "array",
"inner_val_type": "json",
"enum": null,
"properties": null,
"inner_properties": {
"title": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": "Headline of the news article."
},
"link": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": "URL to the full article."
},
"source": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": "Name of the publication source."
},
"date": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": "When the article was published."
},
"snippet": {
"val_type": "string",
"inner_val_type": null,
"enum": null,
"properties": null,
"inner_properties": null,
"description": "Brief excerpt from the article."
}
},
"description": "List of news articles in simplified format."
}
},
"inner_properties": null,
"description": null
}
},
"requirements": {
"authorization": null,
"secrets": [
{
"key": "serp_api_key"
}
],
"metadata": null
},
"deprecation_message": null
},
```
---------
Co-authored-by: Eric Gustin <eric@arcade.dev>
|
||
|
|
b6b4cd0a4c
|
🏗️ Restructure: Multi-Package Architecture + uv Migration (#412)
### Overview Major restructuring from monolithic `arcade-ai` package to modular library architecture with standardized uv-based dependency management.  ### New Package Structure - **`arcade-tdk`** - Lightweight toolkit development kit (core decorators, auth) - **`arcade-core`** - Core execution engine and catalog functionality - **`arcade-serve`** - FastAPI/MCP server components - **`arcade-ai`** - Meta package that includes CLI functionality. Optionally include evals via the `evals` extra. Optionally include all packages via the `all` extra. ### Key Benefits - **Lighter Dependencies**: Toolkits now depend only on `arcade-tdk` (~2 deps) vs full `arcade-ai` (~30+ deps) - **Faster Builds**: uv provides 10-100x faster dependency resolution and installation - **Better Modularity**: Clear separation of concerns, consumers import only what they need - **Standard Tooling**: Eliminates custom poetry scripts, uses standard Python packaging ### Migration Impact - All 20 toolkits converted from poetry → uv with `arcade-tdk` dependencies plus `arcade-ai[evals]` and `arcade-serve` dev dependencies. When developing locally, devs should install toolkits via `make install-local`. - Modern Python 3.10+ type hints throughout - Standardized build system with hatchling backend - Enhanced Makefile with robust toolkit management commands - Removed `arcade dev` CLI command - Reduce the number of files created by `arcade new` and add an option to not generate a tests and evals folder. This foundation enables faster development cycles and cleaner dependency chains for the growing toolkit ecosystem. ### Todo After this PR is merged - [ ] Post-merge workflow(s) (release & publish containers, etc) - [ ] Release order plan. @EricGustin suggests releasing in the following order: 1. `arcade-core` version 0.1.0 2. `arcade-serve` version 0.1.0 and `arcade-tdk` version 0.1.0 3. `arcade-ai` version 2.0.0 4. Patch release for all toolkits (all changes in toolkits are internal refactors) - [ ] [Update docs](https://github.com/ArcadeAI/docs/pull/318) --------- Co-authored-by: Eric Gustin <eric@arcade.dev> Co-authored-by: Eric Gustin <34000337+EricGustin@users.noreply.github.com> |
Renamed from arcade/arcade/cli/show.py (Browse further)