feat: Clarity principles, edge case coverage, auto-install, clean README
- Embed clarity principles in Stage 1 (self-guided, no external dependency)
- Add refactoring guidance for growing skills (architecture-guide)
- Add cross-component communication patterns for suites (architecture-guide)
- Add versioning strategy with semver rules (architecture-guide)
- Add suite orchestration patterns with routing logic (multi-agent-guide)
- Add dependency management framework, stdlib first (quality-standards)
- Add testing strategy with patterns and fixtures (quality-standards)
- Add auto-install step in Phase 5: detect platform, install, show next steps
- Rewrite README for broader audience: 788 to 318 lines, no jargon

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent 2a6f61e2ec
commit 96e546ffdd
5 changed files with 907 additions and 573 deletions
713
README.md
@@ -1,6 +1,6 @@
|
|||
# Agent Skill Creator
|
||||
|
||||
**Create cross-platform agent skills from natural language workflow descriptions.**
|
||||
**Turn any workflow into reusable AI agent software — no spec writing, no prompt engineering, no coding required.**
|
||||
|
||||
[](https://github.com/anthropics/agent-skills-spec)
|
||||
|
||||
|
|
@@ -8,94 +8,46 @@
|
|||
|
||||
---
|
||||
|
||||
## What Is This?
|
||||
## The Problem
|
||||
|
||||
Agent Skill Creator is a **Level 5 skill dark factory**. Install it once, then type `/agent-skill-creator` followed by whatever you have — workflow descriptions, documentation, links, existing code, API docs, compliance checklists, PDFs. The agent deeply reads and understands your material, generates its own internal specification, implements the skill end-to-end from that specification, validates it, security-scans it, and delivers a production-ready skill. You provide the raw material and evaluate the outcome. The agent handles everything in between.
|
||||
Every AI agent (Claude Code, GitHub Copilot, Cursor, Windsurf, Codex, Gemini) starts from zero. It doesn't know your company's processes, data sources, or compliance requirements. So every person re-explains the same workflows in every conversation. Knowledge stays in individual chat histories. New hires start from scratch.
|
||||
|
||||
Inspired by the [dark factory model](https://www.youtube.com/watch?v=bDcgHzCBgmQ) where specifications go in and working software comes out: the human removes the cognitive constraint by providing domain knowledge, the factory removes the implementation constraint by building autonomously, and the quality gates remove the trust constraint by validating automatically.
|
||||
**Agent skills fix this.** A skill is structured knowledge your agent loads automatically — like installing an app. Once installed, anyone on your team can invoke it and get consistent results, every time, on any platform.
|
||||
|
||||
**Input**: Raw material — documentation, links, code, process descriptions, PDFs, anything that captures the workflow.
|
||||
**Output**: A self-contained, validated, security-scanned skill directory ready to install on any platform and publish to the team registry.
|
||||
**The catch:** building a proper skill requires understanding the spec format, writing clear prompt instructions, designing how information loads progressively, writing functional code, and getting activation keywords right. Even simple skills take [multiple rounds of iteration](https://www.youtube.com/watch?v=izJkgLqlbN8) to get right.
|
||||
|
||||
### Built-in Quality Gates
|
||||
|
||||
Every skill goes through automated checks before it reaches your team. You don't need to trust the output blindly — the toolchain enforces quality:
|
||||
|
||||
| Gate | What It Checks | When It Runs |
|------|---------------|--------------|
| **Spec Validation** | SKILL.md exists, frontmatter is well-formed, name follows kebab-case rules, description under 1024 chars, body under 500 lines | During creation (Phase 5) and on every publish |
| **Security Scan** | No hardcoded API keys, no exposed credentials, no `eval()`/`exec()` injection risks, no sensitive files (.env, secrets.json) | During creation (Phase 5) and on every publish |
| **Naming Convention** | Directory name matches SKILL.md `name` field, no consecutive hyphens, 1-64 characters | During validation |
| **Structure Check** | Required files present, local references resolve, metadata fields populated | During validation |
|
||||
|
||||
Skills that fail validation **cannot be published**. Skills with high-severity security issues **are blocked** unless explicitly overridden. This means every skill in the registry has passed both gates — your team can install with confidence.
|
||||
|
||||
You can also run these checks independently at any time:
|
||||
|
||||
```bash
python3 scripts/validate.py ./my-skill/       # Spec compliance
python3 scripts/security_scan.py ./my-skill/  # Security audit
```
|
||||
**Agent Skill Creator removes that barrier entirely.** You pass in whatever you have — messy docs, links, code, PDFs, transcripts, vague descriptions — and it produces a validated, security-scanned skill ready to install and share. You describe what you do; it builds the software.
|
||||
|
||||
---
|
||||
|
||||
## Why Agent Skills Matter
|
||||
## Quick Start
|
||||
|
||||
AI agents (Claude Code, GitHub Copilot, Cursor, Windsurf, Codex, Gemini) are becoming the primary interface for knowledge work. But out of the box, every agent starts from zero — it doesn't know your company's processes, data sources, naming conventions, or compliance requirements.
|
||||
|
||||
**Agent skills solve this.** A skill is structured domain knowledge that an agent loads automatically. Instead of re-explaining your workflow every conversation, the agent already knows how to do it.
|
||||
|
||||
**The corporate opportunity:**
|
||||
|
||||
- **Without skills**: Every person prompts the agent differently. Knowledge stays in individual chat histories. New hires start from scratch. The same workflow gets re-explained hundreds of times.
|
||||
- **With skills**: Someone describes a workflow once. The agent-skill-creator turns it into a reusable skill. It gets published to the team registry. Now every agent on the team — regardless of platform — knows how to do that workflow. Knowledge compounds instead of evaporating.
|
||||
|
||||
**What changes in practice:**
|
||||
|
||||
1. **Operations teams** describe their runbooks. Skills get created. Now agents can execute standard procedures consistently.
|
||||
2. **Data teams** describe their analysis pipelines. Skills get created. Now any team member can run the same analysis by asking their agent.
|
||||
3. **Finance teams** describe their reporting workflows. Skills get created. Now quarterly reports follow the same methodology every time.
|
||||
4. **Engineering teams** describe their deployment processes, code review standards, testing protocols. Skills get created. Now agents enforce consistency across the organization.
|
||||
|
||||
The pattern is always the same: **capture tacit knowledge as skills, share them through the registry, and let agents scale that knowledge across the team.**
|
||||
|
||||
**Why you can trust the output:**
|
||||
|
||||
This is a Level 5 dark factory — not "prompt and pray." The agent reads your material, generates its own internal specification, implements from that specification, and then runs automated quality gates before anything is delivered. Validation checks spec compliance (structure, naming, frontmatter). Security scanning checks for hardcoded credentials and injection patterns. Skills that fail these gates cannot be published to the registry.
|
||||
|
||||
The human provides the domain knowledge (the hard part). The factory builds the skill (the tedious part). The quality gates verify the output (the trust part). Three constraints removed: cognitive, implementation, and trust.
|
||||
|
||||
This repo is the complete toolkit: create skills from raw material, validate them against the open standard, security-scan them, and share them through a git-based registry that gives you version history, access control, and review workflows for free.
|
||||
|
||||
---
|
||||
|
||||
## End-to-End Walkthrough
|
||||
|
||||
This is the full lifecycle from idea to team-wide adoption.
|
||||
|
||||
### Step 1: Install the Skill Creator
|
||||
|
||||
Clone this repo into your agent's skill directory:
|
||||
### 1. Install (one command)
|
||||
|
||||
```bash
# Claude Code
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git ~/.claude/skills/agent-skill-creator

# VS Code with GitHub Copilot
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .github/skills/agent-skill-creator

# Cursor
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .cursor/rules/agent-skill-creator

# Any other supported platform: see "Setup by Platform" below
```
|
||||
|
||||
### Step 2: Invoke the Skill Creator
|
||||
Other platforms: [see full list below](#all-platforms).
|
||||
|
||||
Open your agent and type `/agent-skill-creator` followed by whatever you have. The more context you provide, the better the skill:
|
||||
### 2. Use it
|
||||
|
||||
Open your agent and type `/agent-skill-creator` followed by whatever you have:
|
||||
|
||||
```
|
||||
/agent-skill-creator Every week I pull sales data from our CRM, clean
|
||||
duplicate entries, calculate regional totals, and generate a PDF report.
|
||||
```
|
||||
|
||||
You can pass anything — plain English, documentation links, existing code, API docs, PDFs, database schemas, transcripts. Combine multiple sources in one message. The more context, the better the result.
|
||||
|
||||
```
|
||||
/agent-skill-creator Based on our deployment runbook: https://wiki.internal/deploy-process
|
||||
```
|
||||
|
|
@@ -109,282 +61,141 @@ duplicate entries, calculate regional totals, and generate a PDF report.
|
|||
Create a skill that queries stock levels and generates reorder reports.
|
||||
```
|
||||
|
||||
```
|
||||
/agent-skill-creator Based on compliance-checklist.pdf, create a SOX audit skill
|
||||
```
|
||||
### 3. What comes out
|
||||
|
||||
You can pass in plain English descriptions, documentation links, existing code, API docs, PDFs — anything. Combine multiple sources in one message. The agent reads everything and runs the dark factory pipeline:
|
||||
A complete skill, automatically installed on your platform:
|
||||
|
||||
```
|
||||
STAGE 1: UNDERSTAND AND SPECIFY
|
||||
DISCOVERY → Reads all your material, researches APIs, data sources
|
||||
DESIGN → Generates its own internal specification (use cases, methods, edge cases)
|
||||
Skill installed successfully.
|
||||
|
||||
STAGE 2: BUILD AND VERIFY
|
||||
ARCHITECTURE → Structures the skill directory
|
||||
DETECTION → Crafts activation keywords so the skill triggers reliably
|
||||
IMPLEMENTATION → Creates all files, validates, security-scans, delivers
|
||||
To use it, open a new session and type:
|
||||
|
||||
/sales-report-builder Generate the weekly report for the West region
|
||||
|
||||
Installed at: ~/.claude/skills/sales-report-builder
|
||||
```
|
||||
|
||||
The agent deeply understands your material before writing a single line of code. It generates its own specification — a complete internal contract for what the skill must do — and then implements from that specification autonomously. The output is a complete skill directory (e.g., `./sales-report-builder/`) with functional code, documentation, and a spec-compliant SKILL.md.
|
||||
The agent detects your platform (Claude Code, Cursor, Copilot, etc.), installs the skill to the right location, and tells you exactly how to invoke it. No manual steps.
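The detection step described here can be pictured with a minimal Python sketch. It assumes detection works by looking for each platform's top-level config directory inside a project; the README does not say how the agent actually detects the platform, so the heuristic and the names `INSTALL_DIRS`, `detect_platforms`, and `install_path` are hypothetical. Only the install locations come from the clone commands in this README.

```python
from pathlib import Path

# Per-project install locations, copied from the clone commands in this README.
INSTALL_DIRS = {
    "claude-code": ".claude/skills",
    "copilot": ".github/skills",
    "cursor": ".cursor/rules",
    "windsurf": ".windsurf/skills",
    "cline": ".clinerules",
    "codex": ".codex/skills",
    "gemini": ".gemini/skills",
}


def detect_platforms(project_root: Path) -> list[str]:
    """Guess platforms by checking which top-level config directory exists.

    Assumption: a marker directory implies the platform is in use. A `.github`
    directory exists in many repos without Copilot, so this is only a heuristic.
    """
    found = []
    for platform, rel in INSTALL_DIRS.items():
        marker = project_root / Path(rel).parts[0]
        if marker.is_dir():
            found.append(platform)
    return found


def install_path(project_root: Path, platform: str, skill_name: str) -> Path:
    """Return where a skill would be placed for the given platform."""
    return project_root / INSTALL_DIRS[platform] / skill_name
```

A real installer would also need a fallback for the case where no marker directory, or more than one, is found.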
|
||||
|
||||
### Step 3: Automated Quality Gates
|
||||
The generated skill directory looks like this:
|
||||
|
||||
Every skill the pipeline produces goes through two automated checks before it's considered ready:
|
||||
|
||||
```bash
|
||||
# Spec compliance — structure, naming, frontmatter, file references
|
||||
python3 scripts/validate.py ./sales-report-builder/
|
||||
|
||||
# Security — no hardcoded keys, no credential exposure, no injection risks
|
||||
python3 scripts/security_scan.py ./sales-report-builder/
|
||||
```
|
||||
sales-report-builder/
|
||||
├── SKILL.md # Skill definition (activates with /sales-report-builder)
|
||||
├── scripts/ # Functional Python code
|
||||
├── references/ # Detailed documentation
|
||||
├── assets/ # Templates, configs
|
||||
├── install.sh # Cross-platform installer (for sharing with others)
|
||||
└── README.md # Installation instructions (for sharing with others)
|
||||
```
|
||||
|
||||
These run automatically at the end of Phase 5 (Implementation) and again when you publish to the registry. If validation fails or the security scan finds high-severity issues, the skill is blocked until the issues are fixed. You don't have to review the output manually to trust it — the toolchain does that for you.
|
||||
Your team installs it the same way they installed agent-skill-creator — one `git clone` — and invokes it with `/sales-report-builder`. The included `install.sh` auto-detects their platform too.
|
||||
|
||||
### Step 4: Publish to the Team Registry
|
||||
---
|
||||
|
||||
The registry lives inside this repo at `registry/`. Publishing copies the skill into the shared catalog:
|
||||
## How It Works
|
||||
|
||||
You don't need to understand any of this to use it. But if you're curious:
|
||||
|
||||
The agent doesn't just follow your description literally. Humans describe what they *do*, not what they *need*. "I pull sales data and make a report" hides a dozen implicit requirements — who reads the report, what format, what happens when data is missing. The agent reads all your material, uncovers these implicit requirements, and generates its own internal specification before writing any code. It builds from that deeper understanding, not from your surface description.
|
||||
|
||||
```
|
||||
UNDERSTAND Read all material → uncover real intent → generate internal spec
|
||||
BUILD Structure directory → write code and docs → craft activation keywords
|
||||
VERIFY Spec validation → security scan → block delivery if either fails
|
||||
```
|
||||
|
||||
Every skill is automatically validated (correct structure, naming, metadata) and security-scanned (no hardcoded keys, no credential exposure, no injection risks) before delivery. Skills that fail these checks are blocked.
|
||||
|
||||
---
|
||||
|
||||
## Share Skills Across Your Team
|
||||
|
||||
Once a skill is created, publish it so everyone can use it.
|
||||
|
||||
### Publish
|
||||
|
||||
```bash
|
||||
python3 scripts/skill_registry.py publish ./sales-report-builder/ --tags sales,reports,crm
|
||||
git add registry/ && git commit -m "Add sales-report-builder skill" && git push
|
||||
```
|
||||
|
||||
This validates the skill, runs the security scan, copies the files into `registry/skills/sales-report-builder/`, and updates `registry/registry.json`.
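As a rough illustration of the copy-and-index part of that sequence, here is a minimal Python sketch. The layout of `registry.json` is not documented in this README, so the `{"skills": {...}}` schema, the `publish` function name, and its parameters are all hypothetical, and the validation and security-scan steps are reduced to a single placeholder check.

```python
import json
import shutil
from pathlib import Path


def publish(skill_dir: Path, registry_root: Path, tags: list[str]) -> dict:
    """Copy a skill into registry/skills/<name>/ and record it in registry.json."""
    name = skill_dir.resolve().name

    # Placeholder for the real validation and security-scan gates.
    if not (skill_dir / "SKILL.md").is_file():
        raise FileNotFoundError(f"{skill_dir} has no SKILL.md")

    # copytree creates missing parent directories such as registry/skills/.
    dest = registry_root / "skills" / name
    shutil.copytree(skill_dir, dest, dirs_exist_ok=True)

    # Hypothetical index schema: one entry per skill, keyed by name.
    index_path = registry_root / "registry.json"
    if index_path.exists():
        index = json.loads(index_path.read_text())
    else:
        index = {"skills": {}}
    entry = {"path": f"skills/{name}", "tags": sorted(tags)}
    index["skills"][name] = entry
    index_path.write_text(json.dumps(index, indent=2))
    return entry
```

Because the result is plain files under `registry/`, sharing it is an ordinary `git add`, `git commit`, and `git push`.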
|
||||
|
||||
Then commit and push so the team can access it:
|
||||
|
||||
```bash
|
||||
git add registry/
|
||||
git commit -m "feat: Add sales-report-builder skill"
|
||||
git push
|
||||
```
|
||||
|
||||
### Step 5: Team Discovers and Installs Skills
|
||||
|
||||
Colleagues pull the latest and browse the catalog:
|
||||
### Discover
|
||||
|
||||
```bash
git pull

# What skills are available?
python3 scripts/skill_registry.py list

# Output:
# NAME                  VERSION  AUTHOR       TAGS
# sales-report-builder  1.0.0    sales-team   sales, reports, crm
# data-quality-checker  1.0.0    data-team    data, validation
# deploy-checklist      2.0.0    engineering  deploy, ci, checklist

# Search for something specific
python3 scripts/skill_registry.py search "sales"

# Get full details
python3 scripts/skill_registry.py info sales-report-builder

# Install it (auto-detects your platform)
python3 scripts/skill_registry.py install sales-report-builder
```
|
||||
|
||||
### Step 6: Use the Skill
|
||||
|
||||
After installing, the colleague opens their agent and types:
|
||||
|
||||
```
|
||||
/sales-report-builder Generate the weekly report for the West region
|
||||
```
|
||||
|
||||
The skill activates — just like `/agent-skill-creator` or `/clarity` — and executes the workflow: pulling data, cleaning it, calculating totals, generating the PDF. Same process, same quality, every time, on any platform.
|
||||
|
||||
Every skill the factory produces is a first-class citizen: installed with `git clone`, invoked with `/skill-name`, self-contained with its own scripts, references, and documentation.
|
||||
|
||||
### Step 7: Iterate
|
||||
|
||||
Skills improve over time. Someone adds error handling for API timeouts. Another person adds a new region. They publish updates to the registry, the team pulls, and everyone benefits.
|
||||
|
||||
```bash
|
||||
# Update and re-publish
|
||||
python3 scripts/skill_registry.py publish ./sales-report-builder/ --force
|
||||
git add registry/ && git commit -m "fix: Handle CRM API timeouts" && git push
|
||||
```
|
||||
|
||||
### The Result
|
||||
|
||||
Over weeks and months, the registry grows organically. Each team contributes skills from their domain. The organization builds a **living library of operational knowledge** that every agent can access — regardless of which platform (Claude Code, Cursor, Copilot, etc.) each person uses.
|
||||
|
||||
```bash
python3 scripts/skill_registry.py list

# NAME                  VERSION  AUTHOR        TAGS
# sales-report-builder  1.2.0    sales-team    sales, reports, crm
# data-quality-checker  1.0.0    data-team     data, validation
# deploy-checklist      2.1.0    engineering   deploy, ci, checklist
# quarterly-compliance  1.0.0    legal-team    compliance, audit
# customer-churn-model  3.0.0    data-science  ml, churn, prediction
# incident-runbook      1.1.0    sre-team      incidents, on-call
# onboarding-guide      1.0.0    hr-team       onboarding, new-hire

python3 scripts/skill_registry.py search "sales"
python3 scripts/skill_registry.py info sales-report-builder
```
|
||||
|
||||
This is the shift: from individual prompting to organizational capability.
|
||||
### Install
|
||||
|
||||
```bash
|
||||
python3 scripts/skill_registry.py install sales-report-builder
|
||||
```
|
||||
|
||||
Auto-detects your platform (Claude Code, Cursor, etc.) and installs to the right location.
|
||||
|
||||
### The result over time
|
||||
|
||||
Each team contributes skills from their domain. Operations teams capture runbooks. Data teams capture analysis pipelines. Finance teams capture reporting workflows. Engineering teams capture deployment processes. The organization builds a living library of operational knowledge that every agent can access.
|
||||
|
||||
---
|
||||
|
||||
## Quick Start
|
||||
## All Platforms
|
||||
|
||||
### Claude Code
|
||||
|
||||
```bash
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git ~/.claude/skills/agent-skill-creator
|
||||
```
|
||||
|
||||
### GitHub Copilot
|
||||
Works in IDEs and CLI tools. Same install, same invocation, same results.
|
||||
|
||||
### IDEs
|
||||
|
||||
```bash
|
||||
# VS Code with GitHub Copilot
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .github/skills/agent-skill-creator
|
||||
```
|
||||
|
||||
### Cursor
|
||||
|
||||
```bash
|
||||
# Cursor
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .cursor/rules/agent-skill-creator
|
||||
```
|
||||
|
||||
After installing, open your agent and type:
|
||||
|
||||
```
|
||||
/agent-skill-creator Create a skill for analyzing CSV files
|
||||
```
|
||||
|
||||
The skill creator activates and walks you through the full pipeline. You can also just describe a workflow naturally — the skill activates on phrases like "create a skill for...", "automate this workflow", etc.
|
||||
|
||||
For Windsurf, Cline, Codex CLI, Gemini CLI, and other platforms see [Setup by Platform](#setup-by-platform-complete-guide) below.
|
||||
|
||||
---
|
||||
|
||||
## Usage
|
||||
|
||||
### Invocation
|
||||
|
||||
Type `/agent-skill-creator` followed by your input:
|
||||
|
||||
```
|
||||
/agent-skill-creator Create a skill for analyzing stock market data
|
||||
/agent-skill-creator Every day I process CSV files manually, automate this
|
||||
/agent-skill-creator https://wiki.internal/weather-api-docs
|
||||
/agent-skill-creator See scripts/data_pipeline.py — make this a reusable skill
|
||||
```
|
||||
|
||||
The skill also activates on natural language without the prefix:
|
||||
|
||||
```
|
||||
Create a skill for weather alerts
|
||||
Automate this workflow
|
||||
Validate this skill
|
||||
Export this skill for Cursor
|
||||
```
|
||||
|
||||
### What Happens
|
||||
|
||||
The dark factory reads your material, generates its own spec, builds the skill, and verifies it:
|
||||
|
||||
```
|
||||
UNDERSTAND: Read all material, research domain, generate internal specification
|
||||
BUILD: Structure directory, write all code and docs, craft activation keywords
|
||||
VERIFY: Run spec validation + security scan — block delivery if either fails
|
||||
```
|
||||
|
||||
Output: a complete skill directory you can install on any supported platform.
|
||||
|
||||
---
|
||||
|
||||
## Setup by Platform (Complete Guide)
|
||||
|
||||
Each platform installs with a single `git clone` directly into the right location. Replace `agent-skill-creator` with the skill name when installing generated skills.
|
||||
|
||||
### Claude Code
|
||||
|
||||
```bash
|
||||
# Personal skill (available in all projects)
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git ~/.claude/skills/agent-skill-creator
|
||||
|
||||
# Per-project (scoped to one repo)
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .claude/skills/agent-skill-creator
|
||||
```
|
||||
|
||||
### GitHub Copilot (CLI + VS Code)
|
||||
|
||||
```bash
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .github/skills/agent-skill-creator
|
||||
```
|
||||
|
||||
### Cursor
|
||||
|
||||
```bash
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .cursor/rules/agent-skill-creator
|
||||
```
|
||||
|
||||
Cursor reads SKILL.md natively alongside its `.mdc` rules.
|
||||
|
||||
### Windsurf
|
||||
|
||||
```bash
|
||||
# Windsurf
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .windsurf/skills/agent-skill-creator
|
||||
```
|
||||
|
||||
### Cline
|
||||
|
||||
```bash
|
||||
# Cline (VS Code Extension)
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .clinerules/agent-skill-creator
|
||||
```
|
||||
|
||||
### OpenAI Codex CLI
|
||||
### CLI Tools
|
||||
|
||||
```bash
|
||||
# Claude Code (personal — available in all projects)
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git ~/.claude/skills/agent-skill-creator
|
||||
|
||||
# Claude Code (per-project)
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .claude/skills/agent-skill-creator
|
||||
|
||||
# GitHub Copilot CLI
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .github/skills/agent-skill-creator
|
||||
|
||||
# OpenAI Codex CLI
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .codex/skills/agent-skill-creator
|
||||
```
|
||||
|
||||
### Gemini CLI
|
||||
|
||||
```bash
|
||||
# Gemini CLI
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .gemini/skills/agent-skill-creator
|
||||
```
|
||||
|
||||
### Claude Desktop / claude.ai (Export)
|
||||
|
||||
These platforms use `.zip` upload instead of directory copying:
|
||||
|
||||
1. Export: `python3 scripts/export_utils.py ./agent-skill-creator/ --variant desktop`
2. Open Claude Desktop or claude.ai
3. Go to Settings > Skills > Upload skill
4. Select the generated `.zip` file
|
||||
|
||||
### Claude API (Programmatic)
|
||||
### Claude Desktop / claude.ai
|
||||
|
||||
```bash
|
||||
python3 scripts/export_utils.py ./agent-skill-creator/ --variant api
|
||||
python3 scripts/export_utils.py ./agent-skill-creator/ --variant desktop
|
||||
# Then: Settings > Skills > Upload the generated .zip
|
||||
```
|
||||
|
||||
```python
import anthropic

client = anthropic.Anthropic()

with open("agent-skill-creator-api-v4.0.0.zip", "rb") as f:
    skill = client.skills.create(file=f, name="agent-skill-creator")

response = client.messages.create(
    model="claude-sonnet-4",
    messages=[{"role": "user", "content": "Your query here"}],
    container={"type": "custom_skill", "skill_id": skill.id},
    betas=["code-execution-2025-08-25", "skills-2025-10-02"],
)
```
|
||||
|
||||
Note: API sandbox has no network access, no pip install at runtime, and an 8 MB size limit.
|
||||
|
||||
### Updating
|
||||
|
||||
To update an installed skill, just `git pull` from inside the skill directory:
|
||||
### Update
|
||||
|
||||
```bash
|
||||
cd ~/.claude/skills/agent-skill-creator && git pull
|
||||
|
|
@@ -392,285 +203,69 @@ cd ~/.claude/skills/agent-skill-creator && git pull
|
|||
|
||||
---
|
||||
|
||||
## How It Works
|
||||
## Quality Gates
|
||||
|
||||
| Phase | What Happens | Key Output |
|-------|-------------|------------|
| **Discovery** | Researches the domain, identifies APIs and data sources | Domain model, API list |
| **Design** | Defines use cases, analysis methods, output formats | Use case specs, methodology docs |
| **Architecture** | Decides simple skill vs. complex suite, plans directory structure | Architecture decision, file plan |
| **Detection** | Crafts SKILL.md description and activation keywords | SKILL.md frontmatter, trigger phrases |
| **Implementation** | Generates all code, docs, installer; validates and scans | Complete skill directory |
|
||||
Every skill goes through automated checks before delivery and on every publish:
|
||||
|
||||
For full pipeline documentation, see [references/pipeline-phases.md](references/pipeline-phases.md).
|
||||
| Gate | What It Checks |
|------|---------------|
| **Spec Validation** | SKILL.md structure, frontmatter format, naming rules, file references |
| **Security Scan** | No hardcoded API keys, no credentials, no injection patterns |
|
||||
|
||||
---
|
||||
|
||||
## Generated Skill Format
|
||||
|
||||
Every generated skill follows the Agent Skills Open Standard:
|
||||
|
||||
```
skill-name/
  SKILL.md       # Main skill file (<500 lines, spec-compliant)
  scripts/       # Functional Python code
  references/    # Detailed documentation (progressive disclosure)
  assets/        # Templates, schemas, config files
  install.sh     # Cross-platform installer
  README.md      # Multi-platform install instructions
```
|
||||
|
||||
### SKILL.md Frontmatter
|
||||
|
||||
```yaml
---
name: skill-name
description: >-
  Concise description of what the skill does (<=1024 chars).
  Includes activation trigger phrases.
license: MIT
metadata:
  author: Your Name
  version: 1.0.0
compatibility: >-
  Works on Claude Code, GitHub Copilot, Cursor, Windsurf,
  Cline, Codex CLI, Gemini CLI.
---
```
|
||||
|
||||
Followed by sections: When to Use, Overview, Workflow, Implementation Guidelines, and References.
|
||||
|
||||
**Naming rules**: `kebab-case`, 1-64 characters, pattern `^[a-z][a-z0-9-]*[a-z0-9]$`, must match directory name.
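To make the naming rules concrete, here is a minimal Python sketch that applies them. It is not the code in `scripts/validate.py`; the function name `check_skill_name` and its messages are illustrative. The regular expression and the 1-64 length limit are quoted from the rules above, and the consecutive-hyphen rule comes from the quality-gate table earlier in the README.

```python
import re
from pathlib import Path

# Pattern quoted in the naming rules above.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9-]*[a-z0-9]$")


def check_skill_name(name: str, skill_dir: str) -> list[str]:
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if not 1 <= len(name) <= 64:
        problems.append("name must be 1-64 characters")
    if not NAME_PATTERN.match(name):
        problems.append("name must be kebab-case: lowercase letters, digits, hyphens")
    if "--" in name:
        problems.append("name must not contain consecutive hyphens")
    # Path(...).name ignores a trailing slash, so "./my-skill/" yields "my-skill".
    if Path(skill_dir).name != name:
        problems.append("name must match the directory name")
    return problems
```

One detail worth noticing: the pattern as written needs at least two characters (a leading letter plus a final letter or digit), so a one-character name fails the pattern even though it falls inside the stated 1-64 range.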
|
||||
|
||||
---
|
||||
|
||||
## Tools
|
||||
|
||||
### Validate a Skill
|
||||
|
||||
Check spec compliance against the Agent Skills Open Standard:
|
||||
Run them independently anytime:
|
||||
|
||||
```bash
|
||||
python3 scripts/validate.py ./my-skill/
|
||||
|
||||
# JSON output (for CI/CD)
|
||||
python3 scripts/validate.py ./my-skill/ --json
|
||||
```
|
||||
|
||||
**Checks**: SKILL.md existence, valid frontmatter, kebab-case name (1-64 chars), description under 1024 chars, body under 500 lines, required directory structure, install.sh exists and is executable.
|
||||
|
||||
**Exit codes**: `0` = valid (may have warnings), `1` = invalid (errors found).
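Two of the checks listed above, frontmatter presence and the body line limit, can be sketched in a few lines of stdlib-only Python. This is an illustration, not the logic of `scripts/validate.py`: the helper names are hypothetical, the frontmatter is scanned line by line rather than parsed as YAML, and the description length check is left out because it needs a real YAML parser.

```python
MAX_BODY_LINES = 500  # "body under 500 lines" from the checks listed above


def split_frontmatter(text: str) -> tuple[list[str], list[str]]:
    """Split SKILL.md text into (frontmatter lines, body lines)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must open with a '---' frontmatter fence")
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return lines[1:i], lines[i + 1:]
    raise ValueError("frontmatter is never closed with '---'")


def check_skill_md(text: str) -> int:
    """Return 0 when the checks pass and 1 otherwise, like the exit codes above."""
    try:
        frontmatter, body = split_frontmatter(text)
    except ValueError as err:
        print(f"error: {err}")
        return 1
    errors = []
    if not any(line.startswith("name:") for line in frontmatter):
        errors.append("frontmatter has no top-level 'name' field")
    if len(body) >= MAX_BODY_LINES:
        errors.append(f"body has {len(body)} lines, limit is under {MAX_BODY_LINES}")
    for message in errors:
        print(f"error: {message}")
    return 1 if errors else 0
```

Returning the status as an integer keeps the function usable directly with `sys.exit(check_skill_md(text))` in a CI step.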
|
||||
|
||||
### Security Scan
|
||||
|
||||
Scan for common security issues before sharing or deploying:
|
||||
|
||||
```bash
|
||||
python3 scripts/security_scan.py ./my-skill/
|
||||
|
||||
# JSON output
|
||||
python3 scripts/security_scan.py ./my-skill/ --json
|
||||
```
|
||||
|
||||
**Detects**: hardcoded API keys (OpenAI, AWS, GitHub, GitLab), tokens and secrets, command injection patterns, unsafe file operations, credential exposure in config files.
|
||||
|
||||
**Exit codes**: `0` = clean, `1` = issues found.
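The kind of pattern matching such a scan relies on can be sketched as follows. These regular expressions are illustrative assumptions and are not the rules shipped in `scripts/security_scan.py`; the rule names and the `scan_text` function are hypothetical as well.

```python
import re

# Illustrative patterns only; the real scanner's rules are not shown in this README.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "hardcoded-secret": re.compile(
        r"(?i)\b(api_key|secret|password|token)\b\s*=\s*['\"][^'\"]{8,}['\"]"
    ),
    "dynamic-eval": re.compile(r"\b(eval|exec)\s*\("),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, rule name) pairs for every pattern hit."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings
```

Line-based regex matching of this sort is cheap but produces false positives and misses obfuscated secrets, which is one reason a blocking gate usually distinguishes severities.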
|
||||
|
||||
### Export for Other Platforms
|
||||
|
||||
Package skills for distribution:
|
||||
|
||||
```bash
|
||||
# Desktop/Web (.zip for Claude Desktop, claude.ai)
|
||||
python3 scripts/export_utils.py ./my-skill/ --variant desktop
|
||||
|
||||
# API (.zip for Claude API, <=8MB)
|
||||
python3 scripts/export_utils.py ./my-skill/ --variant api
|
||||
|
||||
# All variants
|
||||
python3 scripts/export_utils.py ./my-skill/
|
||||
```
|
||||
|
||||
Output goes to `exports/`. See [references/export-guide.md](references/export-guide.md) for full documentation.
|
||||
|
||||
### Skill Registry
|
||||
|
||||
Share and discover skills across your team. The registry lives inside this repo (`registry/`) so one `git pull` gives everyone access to all published skills.
|
||||
|
||||
**First-time setup** (once per organization):
|
||||
|
||||
```bash
|
||||
python3 scripts/skill_registry.py init --name "Acme Corp Skills"
|
||||
```
|
||||
|
||||
**Typical workflow:**
|
||||
|
||||
```bash
# Someone describes a workflow, the agent creates a skill
# "Every week I pull sales data, clean it, and make a report"
# → agent creates ./sales-report-builder/

# Publish it so the team can use it
python3 scripts/skill_registry.py publish ./sales-report-builder/ --tags sales,reports

# Browse what the team has built
python3 scripts/skill_registry.py list
python3 scripts/skill_registry.py search "sales"

# Get details about a skill
python3 scripts/skill_registry.py info sales-report-builder

# Install a skill to your platform (auto-detects Claude Code, Cursor, etc.)
python3 scripts/skill_registry.py install sales-report-builder

# Install for a specific platform or at project level
python3 scripts/skill_registry.py install sales-report-builder --platform cursor --project

# Remove a skill from the registry
python3 scripts/skill_registry.py remove sales-report-builder --force
```
|
||||
|
||||
After publishing, commit and push so colleagues can `git pull` and install the new skill.
|
||||
|
||||
All commands support `--json` for machine-readable output. Use `--force` to overwrite duplicates or bypass confirmation prompts.
|
||||
|
||||
**Exit codes**: `0` = success, `1` = error.
|
||||
Skills that fail validation cannot be published. Skills with high-severity security issues are blocked.
|
||||
|
||||
---
|
||||
|
||||
## Architecture Decisions
|
||||
## Tools Reference
|
||||
|
||||
The creator automatically decides simple vs. complex based on scope:
|
||||
|
||||
| Factor | Simple Skill | Complex Suite |
|--------|-------------|---------------|
| Workflows | 1-2 | 3+ distinct |
| Code size | <1000 lines | >2000 lines |
| Structure | Single SKILL.md | Multiple component SKILL.md files |
|
||||
|
||||
For detailed decision logic, see [references/architecture-guide.md](references/architecture-guide.md).
|
||||
|
||||
---
|
||||
|
||||
## For AI Agents (Machine-Readable Reference)
|
||||
|
||||
This section provides structured metadata for AI agents ingesting this README as context.
|
||||
|
||||
### Activation Triggers
|
||||
|
||||
```
|
||||
# Primary invocation
|
||||
/agent-skill-creator <description, links, code, docs>
|
||||
|
||||
# Natural language (also works)
|
||||
create a skill for [domain]
|
||||
automate this workflow
|
||||
every day I [task]
|
||||
I need to automate [task]
|
||||
validate this skill
|
||||
export this skill for [platform]
|
||||
```
|
||||
|
||||
### Install Commands
|
||||
### Registry Commands
|
||||
|
||||
```bash
|
||||
# Claude Code (personal)
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git ~/.claude/skills/agent-skill-creator
|
||||
# GitHub Copilot
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .github/skills/agent-skill-creator
|
||||
# Cursor
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .cursor/rules/agent-skill-creator
|
||||
# Windsurf
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .windsurf/skills/agent-skill-creator
|
||||
# Cline
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .clinerules/agent-skill-creator
|
||||
# Codex CLI
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .codex/skills/agent-skill-creator
|
||||
# Gemini CLI
|
||||
git clone https://github.com/FrancyJGLisboa/agent-skill-creator.git .gemini/skills/agent-skill-creator
|
||||
# Update
|
||||
cd <install-path>/agent-skill-creator && git pull
|
||||
python3 scripts/skill_registry.py init --name "Acme Corp Skills" # First-time setup
|
||||
python3 scripts/skill_registry.py publish ./skill/ --tags t1,t2 # Publish a skill
|
||||
python3 scripts/skill_registry.py list # Browse all skills
|
||||
python3 scripts/skill_registry.py search "query" # Search skills
|
||||
python3 scripts/skill_registry.py info skill-name # Skill details
|
||||
python3 scripts/skill_registry.py install skill-name # Install a skill
|
||||
python3 scripts/skill_registry.py remove skill-name --force # Remove a skill
|
||||
```
|
||||
|
||||
### Tool Commands
|
||||
### Validation and Security
|
||||
|
||||
```bash
|
||||
# Validate
|
||||
python3 scripts/validate.py PATH # Human output
|
||||
python3 scripts/validate.py PATH --json # Machine output
|
||||
|
||||
# Security scan
|
||||
python3 scripts/security_scan.py PATH
|
||||
python3 scripts/security_scan.py PATH --json
|
||||
|
||||
# Export
|
||||
python3 scripts/export_utils.py PATH --variant desktop
|
||||
python3 scripts/export_utils.py PATH --variant api
|
||||
|
||||
# Registry (default --registry ./registry)
|
||||
python3 scripts/skill_registry.py init --name "Team Name"
|
||||
python3 scripts/skill_registry.py publish SKILL_PATH --tags T1,T2
|
||||
python3 scripts/skill_registry.py list [--json]
|
||||
python3 scripts/skill_registry.py search QUERY [--json]
|
||||
python3 scripts/skill_registry.py install SKILL_NAME [--platform PLATFORM] [--project]
|
||||
python3 scripts/skill_registry.py info SKILL_NAME [--json]
|
||||
python3 scripts/skill_registry.py remove SKILL_NAME --force
|
||||
python3 scripts/validate.py ./skill/ # Spec compliance
|
||||
python3 scripts/validate.py ./skill/ --json # Machine-readable output
|
||||
python3 scripts/security_scan.py ./skill/ # Security audit
|
||||
python3 scripts/security_scan.py ./skill/ --json # Machine-readable output
|
||||
```
|
||||
|
||||
### Platform Paths
|
||||
### Export
|
||||
|
||||
| Platform | Path | Scope |
|
||||
|----------|------|-------|
|
||||
| Claude Code | `~/.claude/skills/` | User-level |
|
||||
| Claude Code | `.claude/skills/` | Project-level |
|
||||
| GitHub Copilot | `.github/skills/` | Project-level |
|
||||
| Cursor | `.cursor/rules/` | Workspace |
|
||||
| Windsurf | `.windsurf/skills/` | Workspace |
|
||||
| Cline | `.clinerules/` | Workspace |
|
||||
| Codex CLI | `.codex/skills/` | Workspace |
|
||||
| Gemini CLI | `.gemini/skills/` | Workspace |
|
||||
| Claude Desktop | `.zip` upload | App-level |
|
||||
| claude.ai | `.zip` upload | Web |
|
||||
| Claude API | `.zip` via API | Programmatic |
|
||||
|
||||
### SKILL.md Spec (Required Fields)
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: kebab-case-name # 1-64 chars, ^[a-z][a-z0-9-]*[a-z0-9]$
|
||||
description: >- # 1-1024 chars, include activation keywords
|
||||
What this skill does...
|
||||
license: MIT
|
||||
metadata:
|
||||
author: Author Name
|
||||
version: X.Y.Z
|
||||
---
|
||||
# Body: <500 lines. Move detailed content to references/.
|
||||
```bash
|
||||
python3 scripts/export_utils.py ./skill/ --variant desktop # For Claude Desktop
|
||||
python3 scripts/export_utils.py ./skill/ --variant api # For Claude API
|
||||
```
|
||||
|
||||
### Pipeline Phases
|
||||
|
||||
```
|
||||
DISCOVERY -> DESIGN -> ARCHITECTURE -> DETECTION -> IMPLEMENTATION
|
||||
```
|
||||
|
||||
Each phase is documented in `references/phase{1..5}-*.md`.
|
||||
All commands use exit code `0` for success, `1` for errors. All support `--json` for CI/CD integration.
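
For CI/CD use, the exit-code and `--json` conventions above could be wrapped in a small helper. This is a minimal sketch: `run_check` is a hypothetical name, and because the JSON schema is not specified here, the report is only parsed, never interpreted.

```python
import json
import subprocess
import sys


def run_check(command):
    """Run a command and return (passed, report).

    passed is True when the exit code is 0; report is the parsed JSON
    from stdout, or None when stdout is not valid JSON.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    try:
        report = json.loads(result.stdout)
    except json.JSONDecodeError:
        report = None
    return result.returncode == 0, report


# Example invocation (assumes it is run from the repository root):
# passed, report = run_check(
#     [sys.executable, "scripts/validate.py", "./skill/", "--json"]
# )
```

Relying on the exit code for pass/fail keeps the helper independent of whatever fields the JSON report happens to contain.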
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
**Skill not activating**: Ensure SKILL.md `description` field contains the trigger phrases you expect. The description is the primary activation mechanism.
|
||||
**Skill not activating**: Check that the SKILL.md `description` field contains keywords matching your query. The description is how the agent decides when to activate the skill.
|
||||
|
||||
**Validation fails on name**: Names must be kebab-case, 1-64 characters, no consecutive hyphens, no leading/trailing hyphens. Pattern: `^[a-z][a-z0-9-]*[a-z0-9]$`.
|
||||
**Validation fails on name**: Names must be lowercase, use hyphens between words, 1-64 characters. Examples: `sales-report-builder`, `deploy-checklist`.
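
To check a name before running the validator, the documented rules could be approximated as below. This is a sketch, not the validator's actual code: `is_valid_skill_name` is a hypothetical helper, and the pattern is the `^[a-z][a-z0-9-]*[a-z0-9]$` expression quoted in this document. That pattern alone does not reject consecutive hyphens and needs at least two characters, so the sketch handles the hyphen rule separately.

```python
import re

# Pattern quoted in this document; the extra checks are assumptions
# about how the remaining naming rules might be enforced.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9-]*[a-z0-9]$")


def is_valid_skill_name(name: str) -> bool:
    """Return True when the name satisfies the documented naming rules."""
    if len(name) > 64:
        return False  # longer than the 64-character limit
    if "--" in name:
        return False  # consecutive hyphens
    # Lowercase start, no trailing hyphen; one-character names fail here
    # because the quoted pattern needs at least two characters.
    return NAME_PATTERN.fullmatch(name) is not None
```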
|
||||
|
||||
**SKILL.md too long**: Body must be under 500 lines. Move detailed documentation to `references/` and link from the main SKILL.md.
|
||||
**SKILL.md too long**: Move detailed content to `references/` files and link from the main SKILL.md.
|
||||
|
||||
**Export fails with size error**: API exports have an 8 MB limit. Reduce asset sizes or exclude large files.
|
||||
|
||||
**install.sh not executable**: Run `chmod +x install.sh` before executing.
|
||||
|
||||
**Platform not auto-detected**: Use `./install.sh --platform <name>` to specify explicitly.
|
||||
**Platform not auto-detected**: Use `--platform cursor` (or copilot, windsurf, etc.) to specify explicitly.
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -678,24 +273,24 @@ Each phase is documented in `references/phase{1..5}-*.md`.
|
|||
|
||||
```
|
||||
agent-skill-creator/
|
||||
SKILL.md # Meta-skill definition (the product)
|
||||
SKILL.md # The skill definition (what the agent reads)
|
||||
README.md # This file
|
||||
scripts/
|
||||
validate.py # Spec compliance validator
|
||||
validate.py # Spec compliance checker
|
||||
security_scan.py # Security scanner
|
||||
export_utils.py # Cross-platform export tool
|
||||
skill_registry.py # Shared skill registry CLI
|
||||
install-template.sh # Template for generated install.sh
|
||||
references/
|
||||
pipeline-phases.md # Full 5-phase pipeline instructions
|
||||
architecture-guide.md # Simple skill vs. complex suite
|
||||
cross-platform-guide.md # Platform-specific details
|
||||
export-guide.md # Export system documentation
|
||||
quality-standards.md # Quality and code standards
|
||||
templates-guide.md # Template system guide
|
||||
interactive-mode.md # Interactive wizard docs
|
||||
multi-agent-guide.md # Suite creation docs
|
||||
agentdb-integration.md # Optional learning system
|
||||
export_utils.py # Cross-platform export
|
||||
skill_registry.py # Team skill registry
|
||||
install-template.sh # Template for generated installers
|
||||
references/ # Detailed docs (loaded by the agent on demand)
|
||||
pipeline-phases.md # Full creation pipeline
|
||||
architecture-guide.md # Skill structure decisions
|
||||
quality-standards.md # Code and documentation standards
|
||||
multi-agent-guide.md # Multi-skill suite creation
|
||||
cross-platform-guide.md # Platform compatibility
|
||||
export-guide.md # Export documentation
|
||||
templates-guide.md # Template system
|
||||
interactive-mode.md # Interactive wizard
|
||||
agentdb-integration.md # Learning system
|
||||
phase1-discovery.md # Phase 1 deep dive
|
||||
phase2-design.md # Phase 2 deep dive
|
||||
phase3-architecture.md # Phase 3 deep dive
|
||||
|
|
@ -703,10 +298,10 @@ agent-skill-creator/
|
|||
phase5-implementation.md # Phase 5 deep dive
|
||||
templates/ # Skill templates
|
||||
examples/stock-analyzer/ # Example skill
|
||||
registry/ # Shared skill catalog (git-tracked)
|
||||
registry.json # Skill manifest
|
||||
skills/ # Published skill directories
|
||||
exports/ # Export output directory
|
||||
registry/ # Shared skill catalog
|
||||
registry.json
|
||||
skills/
|
||||
exports/ # Export output
|
||||
```
|
||||
|
||||
---
|
||||
|
|
@ -714,24 +309,24 @@ agent-skill-creator/
|
|||
## Contributing
|
||||
|
||||
1. Fork the repository
|
||||
2. Create a feature branch (`git checkout -b feature/my-feature`)
|
||||
2. Create a feature branch
|
||||
3. Make your changes
|
||||
4. Run validation: `python3 scripts/validate.py ./`
|
||||
5. Run security scan: `python3 scripts/security_scan.py ./`
|
||||
6. Submit a pull request
|
||||
4. Run `python3 scripts/validate.py ./` and `python3 scripts/security_scan.py ./`
|
||||
5. Submit a pull request
|
||||
|
||||
---
|
||||
|
||||
## License
|
||||
|
||||
MIT License.
|
||||
MIT
|
||||
|
||||
---
|
||||
|
||||
## Links
|
||||
|
||||
- [Agent Skills Open Standard](https://github.com/anthropics/agent-skills-spec)
|
||||
- [What are Claude Skills? (video)](https://www.youtube.com/watch?v=izJkgLqlbN8)
|
||||
- [Architecture Guide](references/architecture-guide.md)
|
||||
- [Pipeline Phases Reference](references/pipeline-phases.md)
|
||||
- [Pipeline Phases](references/pipeline-phases.md)
|
||||
- [Cross-Platform Guide](references/cross-platform-guide.md)
|
||||
- [Export Guide](references/export-guide.md)
|
||||
|
|
|
|||
94
SKILL.md
|
|
@ -20,9 +20,11 @@ compatibility: >-
|
|||
---
|
||||
# /agent-skill-creator — Level 5 Skill Dark Factory
|
||||
|
||||
You are an autonomous skill factory. The user provides raw material — workflow descriptions, documentation, links, existing code, API docs, PDFs, compliance checklists, anything — and you produce a complete, production-ready, cross-platform agent skill. The human provides sources and evaluates the outcome. You handle everything in between.
|
||||
You are an autonomous skill factory. You exist because humans are cognitively incapable of writing specifications clear enough for an agent to build from without intervention. A human-written spec will never reach Level 5 — it will always be incomplete, ambiguous, and missing the requirements the human assumed were obvious. That is not a flaw to fix. That is the design constraint this factory is built around.
|
||||
|
||||
This is a Level 5 dark factory for skill creation. The user should never need to write code, review implementation details, fill out templates, or understand the skill spec. They describe what they need; you deeply understand their material, generate your own specification, implement from that specification, validate, security-scan, and deliver a self-contained skill ready for the team to use.
|
||||
The user provides raw material — workflow descriptions, documentation, links, existing code, API docs, PDFs, database schemas, transcripts, compliance checklists, vague intentions, anything — and you produce a complete, production-ready, cross-platform agent skill. The human provides sources and evaluates the outcome. You handle everything in between.
|
||||
|
||||
This is a Level 5 dark factory for skill creation. The user should never need to write code, review implementation details, fill out templates, or understand the skill spec. Any cognitively constrained human should be able to pass you whatever they have — a messy transcript, a GitHub link, a half-written doc — and receive back an opinionated piece of reusable software that makes them genuinely productive. You bridge the gap between what humans can articulate and what agents need to build.
|
||||
|
||||
## Trigger
|
||||
|
||||
|
|
@ -52,16 +54,28 @@ Raw material goes in. A validated, security-scanned, self-contained skill comes
|
|||
|
||||
### Stage 1: Understand and Specify (Phases 1-2)
|
||||
|
||||
Read every piece of material the user provides. Follow links. Read files. Parse PDFs. Study existing code. Build a deep understanding of the domain, the workflow, the data sources, the edge cases. Then generate your own internal specification — a complete description of what the skill must do, structured as a linear walkthrough:
|
||||
Read every piece of material the user provides. Follow links. Read files. Parse PDFs. Study existing code. But do not take any of it at face value.
|
||||
|
||||
- What problem does this solve?
|
||||
- What are the inputs, outputs, and data sources?
|
||||
**Humans describe what they do, not what they need.** "I pull sales data and make a report" hides a dozen implicit requirements: What decisions does the report drive? Who reads it? What format? What happens when data is missing? What constitutes a good report vs. a bad one? The human knows the answers to these questions but won't think to tell you. Your job is to uncover them from the material itself.
|
||||
|
||||
**Clarity principles** (self-guided, no external dependency):
|
||||
|
||||
1. **Read everything before concluding anything.** Do not start forming the spec after the first paragraph. Consume all material — every link, every file, every page — then synthesize.
|
||||
2. **Challenge the surface description.** The human's words are a starting point, not a specification. Look for what's missing, what's implied, what's contradictory. If someone says "generate a report," ask yourself: report for whom? In what format? With what data? At what frequency? Triggered by what?
|
||||
3. **Extract implicit requirements.** Error handling, data validation, edge cases, output formats, failure modes — the human assumed these were obvious. They aren't. Make them explicit in your spec.
|
||||
4. **Identify the real output.** The human says "report" but means "a PDF my VP can read in 2 minutes that shows whether we're hitting targets." The human says "clean the data" but means "deduplicate, normalize dates, flag outliers, and log what was changed." Dig past the label to the substance.
|
||||
5. **Generate a spec that surpasses the human's understanding.** Your specification should contain requirements the human would say "yes, exactly" to — but could never have articulated themselves. That is the standard.
|
||||
|
||||
Then produce your internal specification — a complete implementation contract structured as a linear walkthrough:
|
||||
|
||||
- What problem does this *actually* solve (not what the human said — what they meant)?
|
||||
- What are the real inputs, outputs, and data sources?
|
||||
- What are the use cases (4-6, covering 80% of real usage)?
|
||||
- What methodology does each use case follow?
|
||||
- What APIs or libraries are needed?
|
||||
- What are the failure modes and edge cases?
|
||||
- What are the failure modes and edge cases the human didn't mention?
|
||||
|
||||
This specification is for you, not the user. It is your implementation contract. The quality of the skill depends entirely on the quality of this specification. Be thorough. Be precise. Anticipate the questions the user would not know to ask.
|
||||
This specification is for you, not the user. The quality of the skill depends entirely on the quality of this specification. Be thorough. Be precise. Be opinionated — you understand the material better than the human can articulate it.
|
||||
|
||||
### Stage 2: Build and Verify (Phases 3-5)
|
||||
|
||||
|
|
@ -131,13 +145,67 @@ Create all files in this order:
|
|||
3. Implement Python scripts (functional, no placeholders, no TODOs)
|
||||
4. Write references (detailed documentation the skill loads on demand)
|
||||
5. Write assets (templates, configs)
|
||||
6. Generate `install.sh` (cross-platform installer)
|
||||
6. Generate `install.sh` from `scripts/install-template.sh` (replace `{{SKILL_NAME}}` with actual name, `chmod +x`)
|
||||
7. Write `README.md` (multi-platform install instructions showing `git clone` for each platform)
|
||||
8. Run **validation** against the official spec
|
||||
9. Run **security scan** for hardcoded keys and injection patterns
|
||||
10. Report results to user
|
||||
10. **Auto-install on the current platform** (see below)
|
||||
11. Report results to user with clear next steps
|
||||
|
||||
The generated skill must be a self-contained package that anyone can install with `git clone` and invoke with `/skill-name` — the same way agent-skill-creator itself works.
|
||||
### Auto-Install After Creation
|
||||
|
||||
After the skill passes validation and security scan, install it immediately on the user's current platform. Do not ask the user to run `install.sh` manually — you are already running inside their environment and can detect their platform.
|
||||
|
||||
**Detection logic** (check in order):
|
||||
|
||||
```
|
||||
~/.claude/ exists → Claude Code
|
||||
.cursor/ exists → Cursor (project-level)
|
||||
~/.cursor/ exists → Cursor (user-level)
|
||||
.github/ exists → GitHub Copilot
|
||||
.windsurf/ exists → Windsurf
|
||||
.clinerules/ exists → Cline
|
||||
.codex/ exists → Codex CLI
|
||||
.gemini/ exists → Gemini CLI
|
||||
```
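
The ordered check above could be expressed as a first-match lookup. This is a sketch under stated assumptions: `detect_platform` and `DETECTION_ORDER` are hypothetical names, and the project and home directories are passed in explicitly so the logic can be exercised without touching a real environment.

```python
from pathlib import Path
from typing import Optional

# (marker, platform label) in the documented check order.
# A leading "~/" means the marker is looked up in the home directory;
# anything else is looked up in the project directory.
DETECTION_ORDER = [
    ("~/.claude", "Claude Code"),
    (".cursor", "Cursor (project-level)"),
    ("~/.cursor", "Cursor (user-level)"),
    (".github", "GitHub Copilot"),
    (".windsurf", "Windsurf"),
    (".clinerules", "Cline"),
    (".codex", "Codex CLI"),
    (".gemini", "Gemini CLI"),
]


def detect_platform(project_dir: Path, home_dir: Path) -> Optional[str]:
    """Return the first platform whose marker exists, else None."""
    for marker, label in DETECTION_ORDER:
        if marker.startswith("~/"):
            candidate = home_dir / marker[2:]
        else:
            candidate = project_dir / marker
        if candidate.exists():
            return label
    return None
```

Because the first match wins, a machine with `~/.claude/` present resolves to Claude Code even when the project also contains `.cursor/` or `.github/`.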
|
||||
|
||||
**Install action**: Copy or symlink the generated skill directory into the platform's skill path:
|
||||
|
||||
```bash
|
||||
# Example for Claude Code (user-level):
|
||||
cp -R ./sales-report-builder ~/.claude/skills/sales-report-builder
|
||||
|
||||
# Example for Cursor (project-level):
|
||||
cp -R ./sales-report-builder .cursor/rules/sales-report-builder
|
||||
```
|
||||
|
||||
**After installing, tell the user exactly what to do next:**
|
||||
|
||||
```
|
||||
Skill installed successfully.
|
||||
|
||||
To use it, open a new session and type:
|
||||
|
||||
/sales-report-builder Generate the weekly report for the West region
|
||||
|
||||
The skill is installed at: ~/.claude/skills/sales-report-builder
|
||||
```
|
||||
|
||||
If you cannot detect the platform, show the user how to run the install manually:
|
||||
|
||||
```
|
||||
I couldn't auto-detect your platform. To install, run:
|
||||
|
||||
./sales-report-builder/install.sh
|
||||
|
||||
Or specify your platform:
|
||||
|
||||
./sales-report-builder/install.sh --platform cursor
|
||||
```
|
||||
|
||||
The `install.sh` inside the skill handles auto-detection, platform-specific paths, project vs user level, dry-run mode, and post-install activation instructions. It is the fallback for users who receive the skill as a package (not created in their current session).
|
||||
|
||||
The generated skill must be a self-contained package that anyone can install with `git clone` or `./install.sh` and invoke with `/skill-name` — the same way agent-skill-creator itself works.
|
||||
|
||||
See `references/pipeline-phases.md` for detailed Phase 5 instructions.
|
||||
|
||||
|
|
@ -307,14 +375,14 @@ Examples: `stock-analyzer`, `csv-data-cleaner`, `weekly-report-generator`
|
|||
| File | Contents |
|
||||
|------|----------|
|
||||
| `references/pipeline-phases.md` | Detailed Phase 1-5 instructions |
|
||||
| `references/architecture-guide.md` | Simple vs Suite decision logic |
|
||||
| `references/architecture-guide.md` | Simple vs Suite decision, refactoring, cross-component communication, versioning |
|
||||
| `references/templates-guide.md` | Template-based creation |
|
||||
| `references/interactive-mode.md` | Interactive wizard docs |
|
||||
| `references/multi-agent-guide.md` | Batch/suite creation |
|
||||
| `references/multi-agent-guide.md` | Suite creation, orchestration patterns, routing logic |
|
||||
| `references/agentdb-integration.md` | AgentDB learning system |
|
||||
| `references/cross-platform-guide.md` | Platform compatibility matrix |
|
||||
| `references/export-guide.md` | Cross-platform export system |
|
||||
| `references/quality-standards.md` | Quality and code standards |
|
||||
| `references/quality-standards.md` | Quality standards, dependency management, testing strategy |
|
||||
| `references/phase1-discovery.md` | Phase 1 deep-dive |
|
||||
| `references/phase2-design.md` | Phase 2 deep-dive |
|
||||
| `references/phase3-architecture.md` | Phase 3 deep-dive |
|
||||
|
|
|
|||
|
|
@ -612,7 +612,284 @@ Reference content from SKILL.md using `See references/filename.md for details.`
|
|||
|
||||
---
|
||||
|
||||
## 7. Architecture Checklist
|
||||
## 7. When to Refactor a Growing Skill
|
||||
|
||||
Skills evolve. A simple skill that started at 500 lines can grow to 5000+ as the team adds analyses, data sources, and edge case handling. Recognize the signs early and refactor before the skill becomes unmaintainable.
|
||||
|
||||
### 7.1 Signs It's Time to Refactor
|
||||
|
||||
| Signal | What It Means |
|--------|--------------|
| SKILL.md approaching 500 lines | Body is stuffed — move content to references |
| Total code exceeding 3000 lines | Single-domain skill is becoming unwieldy |
| 3+ unrelated workflows emerging | The skill is doing too many different jobs |
| Different people maintaining different parts | Ownership boundaries need to be explicit |
| Users invoking the skill for fundamentally different tasks | The skill should be split into focused components |
| New data sources that don't share the existing pipeline | Independent fetch/parse/analyze chains = independent skills |
|
||||
|
||||
### 7.2 Refactoring Patterns
|
||||
|
||||
**Pattern 1: Extract to References (lightest touch)**
|
||||
|
||||
When the skill body is too long but the code is fine:
|
||||
|
||||
```
|
||||
Before: SKILL.md at 480 lines with inline methodology docs
|
||||
After: SKILL.md at 250 lines, references/methodology.md with the detail
|
||||
```
|
||||
|
||||
This is not a structural refactor — just progressive disclosure. Do this first.
|
||||
|
||||
**Pattern 2: Extract Utility Module**
|
||||
|
||||
When multiple scripts duplicate logic:
|
||||
|
||||
```
|
||||
Before: fetch.py has cache logic, analyze.py has cache logic
|
||||
After: utils/cache.py extracted, both scripts import from it
|
||||
```
|
||||
|
||||
**Pattern 3: Split by Domain (simple → suite)**
|
||||
|
||||
When the skill covers multiple independent domains:
|
||||
|
||||
```
Before:
  financial-analyzer/
    scripts/
      stock_analysis.py        # 800 lines
      portfolio_tracking.py    # 600 lines
      tax_reporting.py         # 500 lines

After:
  financial-suite/
    skills/
      stock-analyzer/          # Independent skill
      portfolio-tracker/       # Independent skill
      tax-reporter/            # Independent skill
    shared/
      market_data_client.py    # Shared API connection
```
|
||||
|
||||
**Pattern 4: Extract Shared Resources**
|
||||
|
||||
When converting to a suite, identify code that multiple components need:
|
||||
|
||||
1. API client code → `shared/api_client.py`
|
||||
2. Common data models → `shared/models.py`
|
||||
3. Utility functions (date handling, formatting) → `shared/utils.py`
|
||||
4. Configuration → `shared/config.json`
|
||||
|
||||
### 7.3 Refactoring Decision Process
|
||||
|
||||
```
Is SKILL.md > 400 lines?
  → Yes: Extract to references (Pattern 1)
  → Still growing?
      ↓
Is total code > 3000 lines with 3+ unrelated workflows?
  → Yes: Split into suite (Pattern 3)
  → No, but code is duplicated across scripts?
      → Extract utilities (Pattern 2)
  → No: Keep as large simple skill — not everything needs to be a suite
```
|
||||
|
||||
**Critical rule**: Do not split prematurely. Three similar scripts in one domain is better than a suite with three trivially small components. Only split when the workflows are genuinely independent — different data sources, different users, different maintenance cadences.
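
The decision process above can be written down as a small function. This is an illustrative sketch only: `choose_refactoring` and its return strings are hypothetical, and the thresholds are the ones stated in the decision tree.

```python
from typing import List


def choose_refactoring(
    skill_md_lines: int,
    total_code_lines: int,
    unrelated_workflows: int,
    duplicated_logic: bool,
) -> List[str]:
    """Return the refactoring steps suggested by the decision tree."""
    steps = []
    # Step 1: a long SKILL.md body always gets the lightest-touch fix first.
    if skill_md_lines > 400:
        steps.append("Pattern 1: extract to references")
    # Step 2: split only when the skill is both large and multi-domain.
    if total_code_lines > 3000 and unrelated_workflows >= 3:
        steps.append("Pattern 3: split into suite")
    elif duplicated_logic:
        steps.append("Pattern 2: extract utilities")
    # Nothing triggered: not everything needs to be a suite.
    if not steps:
        steps.append("Keep as a simple skill")
    return steps
```

Note that a large codebase with fewer than three unrelated workflows never triggers a split, which matches the rule against splitting prematurely.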
|
||||
|
||||
### 7.4 Refactoring Checklist
|
||||
|
||||
- [ ] Identified which pattern applies (1-4)
|
||||
- [ ] Each new component is independently functional
|
||||
- [ ] Shared resources extracted to `shared/` (not duplicated)
|
||||
- [ ] All SKILL.md files are <500 lines
|
||||
- [ ] All component names follow kebab-case naming
|
||||
- [ ] install.sh updated to handle new structure
|
||||
- [ ] README.md updated with new structure
|
||||
- [ ] Validation passes on all components
|
||||
|
||||
---
|
||||
|
||||
## 8. Cross-Component Communication in Suites
|
||||
|
||||
When a suite has multiple component skills, they need clear patterns for sharing code, data, and orchestration.
|
||||
|
||||
### 8.1 The shared/ Directory
|
||||
|
||||
The `shared/` directory contains code that multiple components use. It is **not** a component skill — it has no SKILL.md and is never invoked directly.
|
||||
|
||||
```
suite-name/
├── shared/
│   ├── api_client.py          # Shared API connection + authentication
│   ├── models.py              # Shared data classes and type definitions
│   ├── utils.py               # Common utilities (date formatting, etc.)
│   └── config.json            # Shared configuration
├── skills/
│   ├── component-a/
│   │   ├── SKILL.md
│   │   └── scripts/
│   │       └── analyze.py     # imports from ../../../shared/api_client.py
│   └── component-b/
│       ├── SKILL.md
│       └── scripts/
│           └── report.py      # imports from ../../../shared/api_client.py
```
|
||||
|
||||
### 8.2 Import Patterns
|
||||
|
||||
Components import from `shared/` using path manipulation:
|
||||
|
||||
```python
import sys
from pathlib import Path

# Add shared/ to path. For skills/<component>/scripts/<script>.py, the suite
# root is three levels above the scripts/ directory, i.e. parents[3].
_SUITE_ROOT = Path(__file__).resolve().parents[3]
_SHARED_DIR = _SUITE_ROOT / "shared"
if str(_SHARED_DIR) not in sys.path:
    sys.path.insert(0, str(_SHARED_DIR))

from api_client import SuiteAPIClient
from utils import format_date, parse_currency
```
|
||||
|
||||
**Rules:**
|
||||
- Always use `Path(__file__).resolve()` for reliable path resolution
|
||||
- Add `shared/` to `sys.path` — do not copy files into each component
|
||||
- Import specific names, not `from shared import *`
|
||||
- Each component must still work when `shared/` is unavailable: treat it as an enhancement, not a hard requirement (graceful degradation)
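
The graceful-degradation rule might look like the following in practice. This is a sketch under assumptions: `load_format_date` is a hypothetical helper, the suite root is passed in rather than derived from `__file__`, and the fallback formatter is a plain ISO 8601 stand-in for whatever `shared/utils.py` would provide.

```python
import sys
from datetime import date
from pathlib import Path


def load_format_date(suite_root: Path):
    """Return format_date from shared/utils.py, or a local fallback."""
    shared_dir = suite_root / "shared"
    if shared_dir.is_dir() and str(shared_dir) not in sys.path:
        sys.path.insert(0, str(shared_dir))
    try:
        # Preferred: the suite-wide helper, when shared/ is installed.
        from utils import format_date
    except ImportError:
        # Fallback so the component still works when installed alone.
        def format_date(value: date) -> str:
            return value.isoformat()
    return format_date
```

A component installed on its own would then call `load_format_date(...)` once at startup and use the returned function regardless of which implementation it received.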
|
||||
|
||||
### 8.3 Orchestration: Suite-Level SKILL.md
|
||||
|
||||
The suite-level SKILL.md is the router. When a user's query could match multiple components, the suite SKILL.md tells the agent how to decide:
|
||||
|
||||
```markdown
# /ecommerce-suite — E-commerce Intelligence

You are an e-commerce analytics coordinator. Route user queries
to the right component skill based on intent:

## Routing Logic

| User Intent | Route To | Example Queries |
|-------------|----------|-----------------|
| Revenue, orders, conversion | /sales-monitor | "What were last week's sales?" |
| Segments, cohorts, churn | /customer-analytics | "Show customer retention by cohort" |
| Stock levels, reorder | /inventory-tracker | "Which products need reordering?" |
| Executive summary, dashboard | /executive-reports | "Generate the weekly executive report" |

## Cross-Component Workflows

Some requests require multiple components:

### Full Store Report
When the user asks for a "full report" or "store overview":
1. Invoke /sales-monitor for revenue summary
2. Invoke /customer-analytics for retention metrics
3. Invoke /inventory-tracker for stock alerts
4. Invoke /executive-reports to compile everything into a single report

### Churn Impact Analysis
When the user asks about churn's revenue impact:
1. Invoke /customer-analytics for churn rate and segments
2. Invoke /sales-monitor for revenue by customer segment
3. Combine: revenue at risk = churned segment revenue × churn rate
```
|
||||
|
||||
### 8.4 Data Flow Between Components
|
||||
|
||||
Components do not call each other's functions directly. Instead, they communicate through:
|
||||
|
||||
1. **Shared data files**: Component A writes to `data/sales_summary.json`, Component B reads it
|
||||
2. **Shared API client**: Both components use the same `shared/api_client.py` to fetch data
|
||||
3. **Agent orchestration**: The agent (LLM) reads output from Component A and passes relevant parts to Component B
|
||||
|
||||
**Anti-patterns to avoid:**
|
||||
- Component A importing Component B's scripts directly (creates tight coupling)
|
||||
- Components writing to each other's directories
|
||||
- Circular dependencies between components
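
The shared-data-file option (the first mechanism listed above) could be sketched as below. The function names are hypothetical; the file name `sales_summary.json` follows the example in the list, and the data directory is assumed to sit outside either component's own directory.

```python
import json
from pathlib import Path
from typing import Optional

SUMMARY_FILE = "sales_summary.json"


def write_summary(data_dir: Path, summary: dict) -> Path:
    """Component A: publish its results to the shared data directory."""
    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / SUMMARY_FILE
    target.write_text(json.dumps(summary, indent=2), encoding="utf-8")
    return target


def read_summary(data_dir: Path) -> Optional[dict]:
    """Component B: read the summary, or None if A has not run yet."""
    source = data_dir / SUMMARY_FILE
    if not source.exists():
        return None
    return json.loads(source.read_text(encoding="utf-8"))
```

Neither function imports the other component's scripts, so the two components stay coupled only through the file format.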
|
||||
|
||||
### 8.5 Component Independence Rule
|
||||
|
||||
Each component must be independently functional. This means:
|
||||
|
||||
- A component extracted from the suite and installed alone must still work
|
||||
- `shared/` utilities enhance performance (avoid duplicate API calls, consistent formatting) but are not hard requirements
|
||||
- If a component absolutely requires `shared/`, document this in its README.md
|
||||
- The suite-level install.sh must install `shared/` alongside all components
|
||||
|
||||
---
|
||||
|
||||
## 9. Versioning Strategy
|
||||
|
||||
### 9.1 Semver for Skills
|
||||
|
||||
Skills follow [Semantic Versioning](https://semver.org/):
|
||||
|
||||
| Version Bump | Change Type | Examples |
|--------------|-------------|----------|
| **Patch** (x.y.Z) | Bug fixes, typo corrections, minor doc improvements | Fix API timeout handling, correct calculation formula |
| **Minor** (x.Y.0) | New analyses, new data sources, new output formats | Add trend analysis, support CSV export, add new API endpoint |
| **Major** (X.0.0) | Breaking changes to inputs, outputs, or invocation | Change script arguments, rename skill, restructure output format |
|
||||
|
||||
### 9.2 What Counts as Breaking
|
||||
|
||||
A change is breaking if existing users of the skill would get different behavior or errors:
|
||||
|
||||
| Breaking | Not Breaking |
|----------|-------------|
| Changing script CLI arguments | Adding new optional arguments |
| Changing output JSON structure | Adding new fields to output |
| Removing an analysis function | Adding new analysis functions |
| Renaming the skill | Updating the description keywords |
| Changing required environment variables | Adding optional environment variables |
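
Once a change has been classified as patch, minor, or major, the new version number follows mechanically. A minimal sketch, assuming plain `X.Y.Z` strings with no pre-release or build suffixes (`bump_version` is a hypothetical helper):

```python
def bump_version(version: str, change: str) -> str:
    """Return the next version for a 'patch', 'minor', or 'major' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        return f"{major + 1}.0.0"  # breaking change resets minor and patch
    if change == "minor":
        return f"{major}.{minor + 1}.0"  # new capability resets patch
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")
```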
|
||||
|
||||
### 9.3 Suite Versioning
|
||||
|
||||
Suite versions are independent of component versions:
|
||||
|
||||
```
ecommerce-suite/            version: 2.0.0  (added new component)
├── sales-monitor/          version: 1.3.0  (3 minor updates since suite v1)
├── customer-analytics/     version: 1.1.0  (1 minor update)
├── inventory-tracker/      version: 2.0.0  (breaking change in its own output)
└── executive-reports/      version: 1.0.0  (unchanged)
```
|
||||
|
||||
**Suite version bump rules:**
|
||||
|
||||
| Change | Suite Version Bump |
|--------|--------------------|
| Bug fix in one component | No suite bump (component patch only) |
| New capability in one component | No suite bump (component minor only) |
| Breaking change in one component | Suite minor bump (warn users) |
| Add new component to suite | Suite minor bump |
| Remove component from suite | Suite major bump |
| Restructure shared/ | Suite major bump |
|
||||
|
||||
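When a release contains several changes at once, the suite takes the largest bump any of them requires. A minimal sketch of that lookup follows; the change-kind keys and the `suite_bump` function are hypothetical names chosen for this example, and only the mapping itself comes from the table.

```python
# Suite-level bump required by each kind of change (None = no suite bump).
SUITE_BUMP_RULES = {
    "component_bug_fix": None,
    "component_new_capability": None,
    "component_breaking_change": "minor",
    "component_added": "minor",
    "component_removed": "major",
    "shared_restructured": "major",
}

_RANK = {None: 0, "minor": 1, "major": 2}


def suite_bump(changes):
    """Return the largest suite bump required by a list of change kinds."""
    bump = None
    for change in changes:
        required = SUITE_BUMP_RULES[change]
        if _RANK[required] > _RANK[bump]:
            bump = required
    return bump


print(suite_bump(["component_bug_fix", "component_added"]))  # minor
print(suite_bump(["component_added", "component_removed"]))  # major
```
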
### 9.4 Version in Practice

The version lives in SKILL.md frontmatter:

```yaml
metadata:
  version: 1.2.0
```

When publishing to the registry, `skill_registry.py` reads this version. Publishing the same name+version is rejected unless `--force` is used.

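Reading the version back out of a SKILL.md can be done with the standard library alone. The sketch below is not taken from `skill_registry.py`; `read_skill_version` is a hypothetical helper, and it assumes the frontmatter is delimited by `---` lines and that `version` is indented under a top-level `metadata:` key, as in the snippet above.

```python
import re
from typing import Optional


def read_skill_version(skill_md_text: str) -> Optional[str]:
    """Return metadata.version from SKILL.md frontmatter, or None if absent."""
    # Frontmatter is the block between the first two '---' lines.
    match = re.match(r"^---[ \t]*\n(.*?)\n---[ \t]*(?:\n|$)", skill_md_text, re.DOTALL)
    if match is None:
        return None
    in_metadata = False
    for line in match.group(1).splitlines():
        if not line.strip():
            continue
        if not line.startswith((" ", "\t")):
            # An unindented key opens or closes the metadata block.
            in_metadata = line.strip() == "metadata:"
            continue
        if in_metadata:
            key, _, value = line.strip().partition(":")
            if key == "version":
                return value.strip().strip("'\"")
    return None


sample = "---\nname: sales-monitor\nmetadata:\n  version: 1.2.0\n---\n# Sales Monitor\n"
print(read_skill_version(sample))  # 1.2.0
```
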
**When to create a new skill vs. version an existing one:**

| Situation | Action |
|-----------|--------|
| Same domain, improved implementation | Version bump (minor or major) |
| Same domain, fundamentally different approach | New skill (e.g., `stock-analyzer-v2`) |
| Different domain entirely | New skill |
| Extending to cover adjacent domain | If tightly coupled: version bump. If independent: new skill or convert to suite |

---

## 10. Architecture Checklist

Use this checklist before proceeding to implementation (Phase 5):

@ -620,6 +897,8 @@ Use this checklist before proceeding to implementation (Phase 5):

- [ ] Determined Simple Skill vs Complex Suite
- [ ] Justified the decision based on workflow count, code size, and domain scope
- [ ] If suite: identified shared resources and component boundaries
- [ ] If suite: designed orchestration logic (routing, cross-component workflows)

### Naming

@ -627,6 +906,7 @@ Use this checklist before proceeding to implementation (Phase 5):
- [ ] Name matches the parent directory
- [ ] No `-cskill` suffix
- [ ] Name is descriptive and includes the primary domain
- [ ] If suite: all component names are unique and follow kebab-case

### Structure

@ -638,6 +918,8 @@ Use this checklist before proceeding to implementation (Phase 5):
- [ ] `README.md` planned with multi-platform install instructions
- [ ] No `marketplace.json` for Simple Skills
- [ ] If Complex Suite with `marketplace.json`, only official fields used
- [ ] If suite: shared/ directory planned with import patterns documented
- [ ] If suite: each component is independently functional

### Performance

@ -646,6 +928,18 @@ Use this checklist before proceeding to implementation (Phase 5):
- [ ] Error handling approach defined (retries, backoff, fallbacks)
- [ ] SKILL.md size managed via progressive disclosure to `references/`

### Dependencies

- [ ] Dependency strategy decided (stdlib-only vs. third-party)
- [ ] requirements.txt planned if third-party packages needed
- [ ] No unnecessary heavy dependencies

### Versioning

- [ ] Initial version set (1.0.0)
- [ ] Version bump rules understood (patch/minor/major)
- [ ] If suite: component versions independent of suite version

### Documentation

- [ ] Architecture decisions documented

@ -195,6 +195,157 @@ climate-suite/
└── README.md
```

## Suite Orchestration Patterns

The suite-level SKILL.md is the most important file in a suite. It doesn't just list components; it tells the agent *how to think* about routing, sequencing, and combining them.

### Pattern 1: Intent-Based Routing

The simplest pattern. The suite SKILL.md maps user intent to component skills:

```markdown
# /ecommerce-suite: E-commerce Intelligence

You coordinate four specialized e-commerce skills. Route every user
query to the right component based on intent.

## Routing Table

| If the user asks about... | Invoke | Examples |
|---------------------------|--------|---------|
| Revenue, orders, sales, conversion, AOV | /sales-monitor | "What were last week's sales?" |
| Customers, segments, cohorts, churn, retention | /customer-analytics | "Show me customer retention" |
| Stock, inventory, reorder, supply, out-of-stock | /inventory-tracker | "Which SKUs need reordering?" |
| Summary, dashboard, executive, weekly report | /executive-reports | "Generate the weekly report" |

If the query doesn't clearly match one component, ask the user to clarify.
If the query spans multiple components, use the cross-component workflows below.
```

### Pattern 2: Sequential Pipeline

Some workflows require components in sequence, where the output of one feeds the next:

```markdown
## Cross-Component Workflows

### Weekly Executive Report
When user asks for "weekly report", "executive summary", or "full store overview":

1. Run /sales-monitor: Get revenue, orders, conversion for the past 7 days
2. Run /customer-analytics: Get new vs returning customer split, churn rate
3. Run /inventory-tracker: Get low-stock alerts and reorder recommendations
4. Run /executive-reports: Compile all three into a single PDF dashboard

The executive-reports component expects data from the other three.
Pass the outputs as context; do not ask the user to run each step manually.

### Churn Revenue Impact
When user asks about "revenue impact of churn" or "how much are we losing":

1. Run /customer-analytics: Get churned customer segments and churn rate
2. Run /sales-monitor: Get revenue breakdown by customer segment
3. Calculate: revenue_at_risk = churned_segment_revenue × churn_rate
4. Present combined analysis with both the churn data and revenue impact
```

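Step 3 of the Churn Revenue Impact workflow is simple arithmetic, and a worked example makes the expected result concrete. The segment names and figures below are invented for illustration, and `revenue_at_risk` is a hypothetical helper rather than a function of any component.

```python
def revenue_at_risk(segment_revenue, churn_rate):
    """Estimate revenue at risk per segment as segment revenue x churn rate."""
    return {
        segment: round(revenue * churn_rate[segment], 2)
        for segment, revenue in segment_revenue.items()
    }


# Hypothetical figures for illustration only.
revenue = {"new": 40_000.0, "loyal": 120_000.0}
churn = {"new": 0.25, "loyal": 0.05}

at_risk = revenue_at_risk(revenue, churn)
print(at_risk)                # {'new': 10000.0, 'loyal': 6000.0}
print(sum(at_risk.values()))  # 16000.0
```
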
### Pattern 3: Parallel Aggregation

When components can run independently and results are combined:

```markdown
### Store Health Check
When user asks for "store health" or "how's the business":

Run these in parallel (no dependencies between them):
- /sales-monitor → revenue trend (up/down/flat)
- /customer-analytics → retention rate
- /inventory-tracker → stock health score

Then synthesize:
- All green: "Store is healthy: revenue trending up, retention stable, stock well-managed"
- Mixed: Report which areas need attention
- All concerning: "Multiple areas need attention" + specific recommendations
```

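The synthesis step is a three-way decision over the component results. In a suite the agent performs this reasoning from the SKILL.md text, so the sketch below only makes the rule explicit; the `"green"`/`"concerning"` status labels and the `synthesize` function are assumptions of this example.

```python
def synthesize(statuses):
    """Summarize per-component health statuses into one message."""
    concerning = sorted(name for name, status in statuses.items() if status != "green")
    if not concerning:
        return "Store is healthy"
    if len(concerning) == len(statuses):
        return "Multiple areas need attention: " + ", ".join(concerning)
    return "Needs attention: " + ", ".join(concerning)


print(synthesize({"sales": "green", "customers": "green", "inventory": "green"}))
# Store is healthy
print(synthesize({"sales": "green", "customers": "concerning", "inventory": "green"}))
# Needs attention: customers
```
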
### Pattern 4: Conditional Routing

When the right component depends on data discovered during the conversation:

```markdown
## Conditional Workflows

### Deep Dive Analysis
When user asks to "deep dive" or "investigate" a metric:

1. Identify which metric they're asking about
2. Route to the appropriate component:
   - Revenue metric → /sales-monitor with detailed=true
   - Customer metric → /customer-analytics with detailed=true
   - Stock metric → /inventory-tracker with detailed=true
3. If the deep dive reveals a cross-domain issue (e.g., revenue dropped
   because of stockouts), invoke the relevant second component
4. Present the combined root-cause analysis
```

### Orchestration Anti-Patterns

| Anti-Pattern | Why It's Wrong | Do This Instead |
|-------------|---------------|-----------------|
| Suite SKILL.md just lists components without routing logic | Agent has to guess which component to use | Provide explicit routing table with example queries |
| Components call each other's scripts directly | Creates tight coupling, breaks independence | Agent orchestrates: it passes output from A as context to B |
| Suite SKILL.md duplicates component instructions | Maintenance nightmare, instructions drift apart | Reference component skills: "Invoke /sales-monitor for this" |
| No cross-component workflow examples | Agent doesn't know how to combine results | Document 2-3 concrete multi-component workflows |
| Every query goes through all components | Wastes tokens and time | Route to specific component; only aggregate when explicitly asked |

### Complete Suite SKILL.md Example

```markdown
# /financial-suite: Comprehensive Financial Analysis

You are a financial analysis coordinator managing three specialized skills.
Your job is to route queries to the right specialist and combine results
when needed.

## Component Skills

- **/stock-analyzer**: Real-time and historical price data, technical indicators
- **/portfolio-tracker**: Holdings, allocation, performance, rebalancing
- **/market-research**: Sector analysis, news sentiment, peer comparison

## Routing Logic

| User Intent | Route To |
|-------------|----------|
| Price, chart, technical indicator, RSI, MACD | /stock-analyzer |
| My portfolio, allocation, performance, rebalance | /portfolio-tracker |
| Sector trends, news, competitor, peer comparison | /market-research |

## Cross-Skill Workflows

### Portfolio Review with Market Context
When user asks for "portfolio review" or "how am I doing":
1. Invoke /portfolio-tracker for current holdings and performance
2. Invoke /market-research for sector trends affecting held positions
3. Synthesize: performance attribution + market context + recommendations

### Buy/Sell Analysis
When user asks "should I buy X" or "should I sell X":
1. Invoke /stock-analyzer for technical analysis of the specific stock
2. Invoke /market-research for sector sentiment and peer comparison
3. Invoke /portfolio-tracker to check current exposure and allocation impact
4. Synthesize: technical signal + market context + portfolio fit

## When to Combine vs. Route Directly

- Single-domain question → Route to one component
- "How am I doing" or "full analysis" → Combine all components
- If unsure → Ask the user: "Would you like a quick check on [X]
  or a comprehensive analysis across your portfolio?"
```

---

## Benefits of Suite Creation

### Time Efficiency

@ -807,6 +807,232 @@ def analyze_yoy(

---

## Dependency Management

### Decision Framework

Skills should minimize external dependencies. Every dependency is a maintenance burden, a security surface, and a compatibility risk. Use this decision tree:

```
Can stdlib do it?
  → Yes: Use stdlib. Done.
  → No: Is there a lightweight pure-Python package (<1MB)?
    → Yes: Use it. Add to requirements.txt.
    → No: Is there a well-maintained popular package?
      → Yes: Use it only if the domain requires it.
      → No: Implement it yourself or redesign the approach.
```

### Stdlib vs. Third-Party Decision Table

| Task | Stdlib Solution | When to Use Third-Party |
|------|----------------|------------------------|
| HTTP requests | `urllib.request` | Use `requests` when: complex auth, session management, multipart uploads, or retry logic would require 100+ lines of urllib code |
| JSON handling | `json` | Never: stdlib is sufficient |
| CSV parsing | `csv` | Use `pandas` only when: statistical analysis, complex transformations, or DataFrame operations are core to the skill |
| File paths | `pathlib` | Never: stdlib is sufficient |
| Date/time | `datetime` | Never: stdlib is sufficient |
| Regex | `re` | Never: stdlib is sufficient |
| Hashing | `hashlib` | Never: stdlib is sufficient |
| Caching | File-based (json + pathlib) | Never for skills: the FileCache pattern in architecture-guide.md is sufficient |
| Data analysis | Manual calculations | Use `pandas`/`numpy` when: skill is primarily analytical (10+ statistical operations, pivots, aggregations) |
| PDF generation | Not available | Use `reportlab` or `fpdf2` when PDF output is a core requirement |
| Web scraping | `urllib` + `html.parser` | Use `beautifulsoup4` when parsing complex/malformed HTML |
| CLI arguments | `argparse` | Never: stdlib is sufficient |
| YAML parsing | Manual (the `_parse_frontmatter` pattern) | Use `pyyaml` only if skill needs to parse arbitrary YAML files (not just SKILL.md frontmatter) |

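To give a sense of how far the stdlib row for CSV goes before `pandas` becomes worthwhile, here is a minimal sketch of a grouped average using only `csv` and `statistics`. The column names, sample data, and the `mean_by_group` helper are invented for this example.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def mean_by_group(csv_text, group_col, value_col):
    """Average value_col per distinct group_col value, stdlib only."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row[group_col]].append(float(row[value_col]))
    return {group: mean(values) for group, values in groups.items()}


SAMPLE = "region,revenue\nnorth,100\nsouth,50\nnorth,300\n"
print(mean_by_group(SAMPLE, "region", "revenue"))
# {'north': 200.0, 'south': 50.0}
```

A single aggregation like this stays short. Once a skill needs many such operations, pivots, or joins, the table's criterion for `pandas` applies.
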
### requirements.txt Rules

When third-party packages are needed:

```
# requirements.txt

# Set a minimum major.minor and cap below the next major version
requests>=2.31,<3.0  # HTTP client: auth, sessions, retries
pandas>=2.0,<3.0     # DataFrame-based analysis

# For stdlib-only skills, create a requirements.txt that contains only a comment:
# No external dependencies required; this skill uses Python stdlib only.
```

**Rules:**
- Always create `requirements.txt`, even if it lists no packages (this documents the stdlib-only decision)
- Set a minimum major.minor version and cap below the next major version to avoid breaking changes
- Never pin exact patch versions (this allows security updates)
- Never include dev dependencies (pytest, ruff); those are for contributors, not users
- List only direct dependencies, not transitive ones
- Include a comment explaining why each package is needed

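The version-bound rules are mechanical enough to check automatically. The following is a rough sketch of such a check, not an existing validator; `check_requirement` is a hypothetical name, and it deliberately handles only simple `name>=x,<y` style lines rather than the full requirement syntax.

```python
import re

_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def check_requirement(line):
    """Return a list of rule violations for one requirements.txt line."""
    spec = line.split("#", 1)[0].strip()  # drop trailing comment
    if not spec:
        return []  # blank line or comment-only line
    match = _NAME.match(spec)
    if match is None:
        return [f"cannot parse requirement: {spec!r}"]
    name = match.group(0)
    constraints = spec[match.end():]
    if "==" in constraints:
        return [f"{name}: exact pin blocks patch-level updates"]
    problems = []
    if ">=" not in constraints:
        problems.append(f"{name}: no lower bound (expected >=major.minor)")
    if "<" not in constraints:
        problems.append(f"{name}: no upper bound (expected <next major)")
    return problems


for line in ["requests>=2.31,<3.0", "pandas==2.1.4", "numpy"]:
    print(line, "->", check_requirement(line) or "ok")
```
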
### Common Dependency Patterns by Skill Type

| Skill Type | Typical Dependencies |
|-----------|---------------------|
| Data analysis (stocks, agriculture, climate) | `requests`, `pandas`, `numpy` |
| Report generation | `requests`, `fpdf2` or `reportlab` |
| Web scraping | `requests`, `beautifulsoup4` |
| API wrapper | `requests` (or stdlib `urllib`) |
| Text processing | Stdlib only (`re`, `json`, `csv`) |
| File format conversion | Stdlib only (or single specialized package) |
| Database interaction | Stdlib `sqlite3` (or `psycopg2`/`pymysql` for specific DBs) |

---

## Testing Strategy

### Why Test Generated Skills

Skills are opinionated software that teams rely on daily. A skill that produces wrong calculations, misparses API responses, or silently drops data is worse than no skill at all. Tests catch these issues before the skill reaches users.

### What to Test

Focus tests on the parts most likely to break or produce wrong results:

| Priority | What to Test | Why |
|----------|-------------|-----|
| **High** | Analysis/calculation functions | Wrong math = wrong decisions |
| **High** | Data parsing (API response → structured data) | APIs change formats, edge cases in real data |
| **High** | Input validation | Bad input should fail clearly, not silently produce garbage |
| **Medium** | Output formatting | Reports and summaries should be consistent |
| **Medium** | Error handling paths | Verify graceful degradation on API failures, missing data |
| **Low** | Cache logic | Only if custom caching is complex |
| **Low** | Config loading | Usually trivial |

### Test Directory Structure

```
skill-name/
├── scripts/
│   ├── analyze.py
│   ├── fetch.py
│   └── parse.py
├── tests/
│   ├── test_analyze.py                  # Unit tests for analysis functions
│   ├── test_parse.py                    # Unit tests for parsing logic
│   ├── fixtures/
│   │   ├── sample_api_response.json     # Real API response (anonymized)
│   │   └── sample_parsed_data.csv       # Expected parsed output
│   └── conftest.py                      # Shared pytest fixtures
```

### Test Patterns

**Pattern 1: Test analysis functions with known inputs/outputs**

```python
"""Tests for analyze.py: core calculation functions."""
import pytest
from scripts.analyze import analyze_yoy, calculate_trend


def test_yoy_increase():
    """YoY comparison should detect an increase."""
    result = analyze_yoy(
        current_value=150.0,
        previous_value=100.0,
    )
    assert result["change_percent"] == pytest.approx(50.0)
    assert result["interpretation"] == "significant_increase"


def test_yoy_stable():
    """Changes under 2% should be interpreted as stable."""
    result = analyze_yoy(current_value=101.0, previous_value=100.0)
    assert result["interpretation"] == "stable"


def test_yoy_zero_previous():
    """Division by zero should raise ValueError, not crash."""
    with pytest.raises(ValueError, match="previous value cannot be zero"):
        analyze_yoy(current_value=100.0, previous_value=0.0)
```

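For orientation, a function that would satisfy the three tests above might look like the sketch below. It is not the guide's own `analyze_yoy`; the 10% threshold for "significant" and the `moderate_*` and `significant_decrease` labels are assumptions, since the tests only fix the 2% "stable" band, the 50% case, and the error message.

```python
def analyze_yoy(current_value, previous_value):
    """Compare a value with the previous year's value (illustrative sketch)."""
    if previous_value == 0:
        raise ValueError("previous value cannot be zero")
    change_percent = (current_value - previous_value) / previous_value * 100
    direction = "increase" if change_percent > 0 else "decrease"
    if abs(change_percent) < 2:
        interpretation = "stable"
    elif abs(change_percent) >= 10:  # assumed threshold
        interpretation = f"significant_{direction}"
    else:
        interpretation = f"moderate_{direction}"  # assumed label
    return {"change_percent": change_percent, "interpretation": interpretation}


print(analyze_yoy(150.0, 100.0))
# {'change_percent': 50.0, 'interpretation': 'significant_increase'}
```
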
**Pattern 2: Test parsing with fixture data**

```python
"""Tests for parse.py: API response parsing."""
import json
from pathlib import Path
from scripts.parse import parse_api_response

FIXTURES = Path(__file__).parent / "fixtures"


def test_parse_normal_response():
    """Standard API response should parse to expected structure."""
    raw = json.loads((FIXTURES / "sample_api_response.json").read_text())
    result = parse_api_response(raw)
    assert len(result) > 0
    assert "year" in result[0]
    assert "value" in result[0]


def test_parse_empty_response():
    """Empty API response should return empty list, not crash."""
    result = parse_api_response({"data": []})
    assert result == []


def test_parse_malformed_values():
    """Values with commas (e.g., '15,300,000') should be cleaned."""
    raw = {"data": [{"value": "15,300,000", "year": "2023"}]}
    result = parse_api_response(raw)
    assert result[0]["value"] == 15300000.0
```

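A parser that passes the empty-response and comma-cleaning tests could be as small as the sketch below. It is an assumed shape, not the real `parse.py`: the `data`, `year`, and `value` keys come from the tests, while converting `year` to `int` is a choice made for this example.

```python
def parse_api_response(raw):
    """Turn a raw API payload into a list of {'year', 'value'} records."""
    records = []
    for item in raw.get("data", []):
        records.append({
            "year": int(item["year"]),  # assumed: years are stored as integers
            # Strip thousands separators such as '15,300,000' before converting.
            "value": float(str(item["value"]).replace(",", "")),
        })
    return records


print(parse_api_response({"data": [{"value": "15,300,000", "year": "2023"}]}))
# [{'year': 2023, 'value': 15300000.0}]
```
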
**Pattern 3: Mock external API calls**

```python
"""Tests for fetch.py: API interaction (mocked)."""
from unittest.mock import MagicMock, patch
from urllib.error import HTTPError

import pytest

from scripts.fetch import fetch_data


@patch("scripts.fetch.urllib.request.urlopen")
def test_fetch_success(mock_urlopen):
    """Successful API call should return parsed JSON."""
    mock_response = MagicMock()
    mock_response.read.return_value = b'{"data": [{"value": "100"}]}'
    # urlopen() is used as a context manager, so the mock must support `with`.
    mock_response.__enter__ = lambda s: s
    mock_response.__exit__ = MagicMock(return_value=False)
    mock_urlopen.return_value = mock_response

    result = fetch_data(commodity="CORN", year=2023)
    assert "data" in result


@patch("scripts.fetch.urllib.request.urlopen")
def test_fetch_rate_limited(mock_urlopen):
    """429 response should raise RateLimitError."""
    mock_urlopen.side_effect = HTTPError(
        url="", code=429, msg="Too Many Requests", hdrs={}, fp=None
    )
    with pytest.raises(Exception, match="[Rr]ate limit"):
        fetch_data(commodity="CORN", year=2023)
```

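The mocked tests imply a `fetch.py` that calls `urllib.request.urlopen` as a context manager and turns HTTP 429 into a rate-limit error. A minimal sketch under those assumptions follows; `API_URL` is a placeholder, and the `RateLimitError` class and query parameter names are hypothetical rather than taken from a real skill.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

API_URL = "https://api.example.com/v1/data"  # placeholder endpoint


class RateLimitError(Exception):
    """Raised when the API answers with HTTP 429."""


def fetch_data(commodity, year, timeout=10.0):
    """Fetch one commodity/year record set and return the decoded JSON."""
    query = urllib.parse.urlencode({"commodity": commodity, "year": year})
    try:
        with urllib.request.urlopen(f"{API_URL}?{query}", timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 429:
            raise RateLimitError("Rate limit exceeded; retry later") from err
        raise  # other HTTP errors propagate unchanged
```

Because the function looks up `urllib.request.urlopen` at call time, patching that attribute in tests replaces the network call without touching the parsing logic.
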
### Running Tests

```bash
# Run all tests
cd skill-name/
python -m pytest tests/ -v

# Run with coverage
python -m pytest tests/ --cov=scripts --cov-report=term-missing
```

### When to Generate Tests

Tests are generated during Phase 5 (Implementation) after the scripts are written:

1. Write all scripts first (steps 1-5 of Phase 5)
2. Create `tests/` directory with test files for core functions
3. Create `tests/fixtures/` with sample data
4. Run tests to verify they pass
5. Include test instructions in README.md

**Note:** Tests are recommended but not mandatory for v1.0 of a skill. The validation and security scan gates are always mandatory. Tests become critical when:
- The skill performs financial calculations (wrong math = real cost)
- The skill processes sensitive data (parsing errors = data loss)
- Multiple people will maintain the skill (tests prevent regressions)
- The skill is being published to the team registry (quality expectation is higher)

---

## Anti-Patterns

### Anti-Pattern 1: Partial Implementation