# Mandatory Quality Standards ## Fundamental Principles **Production-Ready, Not Prototype** - Code must work without modifications - Doesn't need "now implement X" - Can be used immediately **Functional, Not Placeholder** - Complete code in all functions - No TODO, pass, NotImplementedError - Robust error handling **Useful, Not Generic** - Specific and detailed content - Concrete examples, not abstract - Not just external links **Current, Not Stale** - Include `metadata.created` and `metadata.last_reviewed` dates in frontmatter - Set `metadata.review_interval_days` (default: 90 days) - Declare external dependencies in `metadata.dependencies` so health can be checked - Declare expected API response shapes in `metadata.schema_expectations` for drift detection - Run `python3 scripts/staleness_check.py path/to/skill/` periodically to detect stale skills - When publishing to a registry, use `python3 scripts/skill_registry.py stale` to audit all skills --- ## Standards by File Type ### Python Scripts #### ✅ MANDATORY **1. Complete structure**: ```python #!/usr/bin/env python3 """Module docstring""" # Imports import ... # Constants CONST = value # Classes/Functions class/def ... # Main def main(): ... if __name__ == "__main__": main() ``` **2. Docstrings**: - Module docstring: 3-5 lines - Class docstring: Description + Example - Method docstring: Args, Returns, Raises, Example **3. Type hints**: ```python def function(param1: str, param2: int = 10) -> Dict[str, Any]: ... ``` **4. Error handling**: ```python try: result = risky_operation() except SpecificError as e: # Handle specifically log_error(e) raise CustomError(f"Context: {e}") ``` **5. Validations**: ```python def process(data: Dict) -> pd.DataFrame: # Validate input if not data: raise ValueError("Data cannot be empty") if 'required_field' not in data: raise ValueError("Missing required field") # Process ... # Validate output assert len(result) > 0, "Result cannot be empty" assert result['value'].notna().all(), "No null values allowed" return result ``` **6. Appropriate logging**: ```python import logging logger = logging.getLogger(__name__) def fetch_data(): logger.info("Fetching data from API...") # ... logger.debug(f"Received {len(data)} records") # ... logger.error(f"API error: {e}") ``` #### ❌ FORBIDDEN ```python # ❌ DON'T DO THIS: def analyze(): # TODO: implement analysis pass def process(data): # ❌ No type hints # ❌ No docstring result = data # ❌ No real logic return result # ❌ No validation def fetch_api(url): response = requests.get(url) # ❌ No timeout return response.json() # ❌ No error handling ``` #### ✅ DO THIS: ```python def analyze_yoy(df: pd.DataFrame, commodity: str, year1: int, year2: int) -> Dict: """ Perform year-over-year analysis Args: df: DataFrame with parsed data commodity: Commodity name (e.g., "CORN") year1: Current year year2: Previous year Returns: Dict with keys: - production_current: float - production_previous: float - change_percent: float - interpretation: str Raises: ValueError: If data not found for specified years DataQualityError: If data fails validation Example: >>> analyze_yoy(df, "CORN", 2023, 2022) {'production_current': 15.3, 'change_percent': 11.7, ...} """ # Validate inputs if commodity not in df['commodity'].unique(): raise ValueError(f"Commodity {commodity} not found in data") # Filter data df1 = df[(df['commodity'] == commodity) & (df['year'] == year1)] df2 = df[(df['commodity'] == commodity) & (df['year'] == year2)] if len(df1) == 0 or len(df2) == 0: raise ValueError(f"Data not found for {commodity} in {year1} or {year2}") # Extract values prod1 = df1['production'].iloc[0] prod2 = df2['production'].iloc[0] # Calculate change = prod1 - prod2 change_pct = (change / prod2) * 100 # Interpret if abs(change_pct) < 2: interpretation = "stable" elif change_pct > 10: interpretation = "significant_increase" elif change_pct > 2: interpretation = "moderate_increase" elif change_pct < -10: interpretation = "significant_decrease" else: interpretation = "moderate_decrease" # Return return { "commodity": commodity, "production_current": round(prod1, 1), "production_previous": round(prod2, 1), "change_absolute": round(change, 1), "change_percent": round(change_pct, 1), "interpretation": interpretation } ``` --- ### SKILL.md #### ✅ MANDATORY **1. Valid frontmatter**: ```yaml --- name: agent-name description: [150-250 words with keywords] --- ``` **2. Size**: 5000-7000 words **3. Mandatory sections**: - When to use (specific triggers) - Data source (detailed API) - Workflows (complete step-by-step) - Scripts (each one explained) - Analyses (methodologies) - Errors (complete handling) - Validations (mandatory) - Keywords (complete list) - Examples (5+ complete) **4. Detailed workflows**: ✅ **GOOD**: ```markdown ### Workflow: YoY Comparison 1. **Identify question parameters** - Commodity: [extract from question] - Years: Current vs previous (or specified) 2. **Fetch data** ```bash python scripts/fetch_nass.py \ --commodity CORN \ --years 2023,2022 \ --output data/raw/corn_2023_2022.json ``` 3. **Parse** ```bash python scripts/parse_nass.py \ --input data/raw/corn_2023_2022.json \ --output data/processed/corn.csv ``` 4. **Analyze** ```bash python scripts/analyze_nass.py \ --input data/processed/corn.csv \ --analysis yoy \ --commodity CORN \ --year1 2023 \ --year2 2022 \ --output data/analysis/corn_yoy.json ``` 5. **Interpret results** File `data/analysis/corn_yoy.json` contains: ```json { "production_current": 15.3, "change_percent": 11.7, "interpretation": "significant_increase" } ``` Respond to user: "Corn production grew 11.7% in 2023..." ``` ❌ **BAD**: ```markdown ### Workflow: Comparison 1. Get data 2. Compare 3. Return result ``` **5. Complete examples**: ✅ **GOOD**: ```markdown ### Example 1: YoY Comparison **Question**: "How's corn production compared to last year?" **Executed flow**: [Specific commands with outputs] **Generated answer**: "Corn production in 2023 is 15.3 billion bushels, growth of 11.7% vs 2022 (13.7 billion). Growth comes mainly from area increase (+8%) with stable yield." ``` ❌ **BAD**: ```markdown ### Example: Comparison User asks about comparison. Agent compares and responds. ``` #### ❌ FORBIDDEN - Empty sections - "See documentation" - Workflows without specific commands - Generic examples --- ### References #### ✅ MANDATORY **1. Useful and self-contained content**: ✅ **GOOD** (references/api-guide.md): ```markdown ## Endpoint: Get Production Data **URL**: `GET https://quickstats.nass.usda.gov/api/api_GET/` **Parameters**: - `commodity_desc`: Commodity name - Example: "CORN", "SOYBEANS" - Case-sensitive - `year`: Desired year - Example: 2023 - Range: 1866-present **Complete request example**: ```bash curl -H "X-Api-Key: YOUR_KEY" \ "https://quickstats.nass.usda.gov/api/api_GET/?commodity_desc=CORN&year=2023&format=JSON" ``` **Expected response**: ```json { "data": [ { "year": 2023, "commodity_desc": "CORN", "value": "15,300,000,000", "unit_desc": "BU" } ] } ``` **Important fields**: - `value`: Comes as STRING with commas - Solution: `value.replace(',', '')` - Convert to float after ``` ❌ **BAD**: ```markdown ## API Endpoint For details on how to use the API, consult the official documentation at: https://quickstats.nass.usda.gov/api [End of file] ``` **2. Adequate size**: - API guide: 1500-2000 words - Analysis methods: 2000-3000 words - Troubleshooting: 1000-1500 words **3. Concrete examples**: - Always include examples with real values - Executable code blocks - Expected outputs #### ❌ FORBIDDEN - "For more information, see [link]" - Sections with only 2-3 lines - Lists without details - Circular references ("see other doc that sees other doc") --- ### Assets (Configs) #### ✅ MANDATORY **1. Syntactically valid JSON**: ```bash # ALWAYS validate: python -c "import json; json.load(open('config.json'))" ``` **2. Real values**: ✅ **GOOD**: ```json { "api": { "base_url": "https://quickstats.nass.usda.gov/api", "api_key_env": "NASS_API_KEY", "_instructions": "Get free API key from: https://quickstats.nass.usda.gov/api#registration", "rate_limit_per_day": 1000, "timeout_seconds": 30 } } ``` ❌ **BAD**: ```json { "api": { "base_url": "YOUR_API_URL_HERE", "api_key": "YOUR_KEY_HERE" } } ``` **3. Inline comments** (using `_comment` or `_note`): ```json { "_comment": "Differentiated TTL by data type", "cache": { "ttl_historical_days": 365, "_note_historical": "Historical data doesn't change", "ttl_current_days": 7, "_note_current": "Current year data may be revised" } } ``` --- ### README.md #### ✅ MANDATORY **1. Complete installation instructions**: ✅ **GOOD**: ```markdown ## Installation ### 1. Get API Key (Free) 1. Access https://quickstats.nass.usda.gov/api#registration 2. Fill form: - Name: [your name] - Email: [your email] - Purpose: "Personal research" 3. Click "Submit" 4. You'll receive email with API key in ~1 minute 5. Key format: `A1B2C3D4-E5F6-G7H8-I9J0-K1L2M3N4O5P6` ### 2. Configure Environment **Option A - Export** (temporary): ```bash export NASS_API_KEY="your_key_here" ``` **Option B - .bashrc/.zshrc** (permanent): ```bash echo 'export NASS_API_KEY="your_key_here"' >> ~/.bashrc source ~/.bashrc ``` **Option C - .env file** (per project): ```bash echo "NASS_API_KEY=your_key_here" > .env ``` ### 3. Install Dependencies ```bash cd nass-usda-agriculture pip install -r requirements.txt ``` Requirements: - requests - pandas - numpy ``` ❌ **BAD**: ```markdown ## Installation 1. Get API key from the official website 2. Configure environment 3. Install dependencies 4. Done! ``` **2. Concrete usage examples**: ✅ **GOOD**: ```markdown ## Examples ### Example 1: Current Production ``` You: "What's US corn production in 2023?" Claude: "Corn production in 2023 was 15.3 billion bushels (389 million metric tons)..." ``` ### Example 2: YoY Comparison ``` You: "Compare soybeans this year vs last year" Claude: "Soybean production in 2023 is 2.6% below 2022: - 2023: 4.165 billion bushels - 2022: 4.276 billion bushels - Drop from area (-4.5%), yield improved (+0.8%)" ``` [3-5 more examples] ``` ❌ **BAD**: ```markdown ## Usage Ask questions about agriculture and the agent will respond. ``` **3. Specific troubleshooting**: ✅ **GOOD**: ```markdown ### Error: "NASS_API_KEY environment variable not found" **Cause**: API key not configured **Step-by-step solution**: 1. Verify key was obtained: https://... 2. Configure environment: ```bash export NASS_API_KEY="your_key_here" ``` 3. Verify: ```bash echo $NASS_API_KEY ``` 4. Should show your key 5. If doesn't work, restart terminal **Still not working?** - Check for extra spaces in key - Verify key hasn't expired (validity: 1 year) - Re-generate key if needed ``` --- ## Quality Checklist ### Per Python Script - [ ] Shebang: `#!/usr/bin/env python3` - [ ] Module docstring (3-5 lines) - [ ] Organized imports (stdlib, 3rd party, local) - [ ] Constants at top (if applicable) - [ ] Type hints in all public functions - [ ] Docstrings in classes (description + attributes + example) - [ ] Docstrings in methods (Args, Returns, Raises, Example) - [ ] Error handling for risky operations - [ ] Input validations - [ ] Output validations - [ ] Appropriate logging - [ ] Main function with argparse - [ ] if __name__ == "__main__" - [ ] Functional code (no TODO/pass) - [ ] Valid syntax (test: `python -m py_compile script.py`) ### Per SKILL.md - [ ] Frontmatter with name and description - [ ] Description 150-250 characters with keywords - [ ] Size 5000+ words - [ ] "When to Use" section with specific triggers - [ ] "Data Source" section detailed - [ ] Step-by-step workflows with commands - [ ] Scripts explained individually - [ ] Analyses documented (objective, methodology) - [ ] Errors handled (all expected) - [ ] Validations listed - [ ] Performance/cache explained - [ ] Complete keywords - [ ] Complete examples (5+) ### Per Reference File - [ ] 1000+ words - [ ] Useful content (not just links) - [ ] Concrete examples with real values - [ ] Executable code blocks - [ ] Well structured (headings, lists) - [ ] No empty sections - [ ] No "TODO: write" ### Per Asset (Config) - [ ] Syntactically valid JSON (validate!) - [ ] Real values (not "YOUR_X_HERE" without context) - [ ] Inline comments (_comment, _note) - [ ] Instructions for values user must fill - [ ] Logical and organized structure ### Per README.md - [ ] Step-by-step installation - [ ] How to get API key (detailed) - [ ] How to configure (3 options) - [ ] How to install dependencies - [ ] How to install in Claude Code - [ ] Usage examples (5+) - [ ] Troubleshooting (10+ problems) - [ ] License - [ ] Contact/contribution (if applicable) ### Complete Agent - [ ] DECISIONS.md documents all choices - [ ] **VERSION** file created (e.g. 1.0.0) - [ ] **CHANGELOG.md** created with complete v1.0.0 entry - [ ] **INSTALACAO.md** with complete didactic tutorial - [ ] **comprehensive_{domain}_report()** implemented - [ ] SKILL.md with version in frontmatter metadata - [ ] 18+ files created - [ ] ~1500+ lines of Python code - [ ] ~10,000+ words of documentation - [ ] 2+ configs - [ ] requirements.txt - [ ] .gitignore (if needed) - [ ] No placeholder/TODO - [ ] Valid syntax (Python, JSON, YAML) - [ ] Ready to use (production-ready) --- ## Quality Examples ### Example: Error Handling ❌ **BAD**: ```python def fetch(url): return requests.get(url).json() ``` ✅ **GOOD**: ```python def fetch(url: str, timeout: int = 30) -> Dict: """ Fetch data from URL with error handling Args: url: URL to fetch timeout: Timeout in seconds Returns: JSON response as dict Raises: NetworkError: If connection fails TimeoutError: If request times out APIError: If API returns error """ try: response = requests.get(url, timeout=timeout) response.raise_for_status() data = response.json() if 'error' in data: raise APIError(f"API error: {data['error']}") return data except requests.Timeout: raise TimeoutError(f"Request timed out after {timeout}s") except requests.ConnectionError as e: raise NetworkError(f"Connection failed: {e}") except requests.HTTPError as e: if e.response.status_code == 429: raise RateLimitError("Rate limit exceeded") else: raise APIError(f"HTTP {e.response.status_code}: {e}") ``` ### Example: Validations ❌ **BAD**: ```python def parse(data): df = pd.DataFrame(data) return df ``` ✅ **GOOD**: ```python def parse(data: List[Dict]) -> pd.DataFrame: """Parse and validate data""" # Validate input if not data: raise ValueError("Data cannot be empty") if not isinstance(data, list): raise TypeError(f"Expected list, got {type(data)}") # Parse df = pd.DataFrame(data) # Validate schema required_cols = ['year', 'commodity', 'value'] missing = set(required_cols) - set(df.columns) if missing: raise ValueError(f"Missing required columns: {missing}") # Validate types df['year'] = pd.to_numeric(df['year'], errors='raise') df['value'] = pd.to_numeric(df['value'], errors='raise') # Validate ranges current_year = datetime.now().year if (df['year'] > current_year).any(): raise ValueError(f"Future years found (max allowed: {current_year})") if (df['value'] < 0).any(): raise ValueError("Negative values found") # Validate no duplicates if df.duplicated(subset=['year', 'commodity']).any(): raise ValueError("Duplicate records found") return df ``` ### Example: Docstrings ❌ **BAD**: ```python def analyze(df, commodity): """Analyze data""" # ... ``` ✅ **GOOD**: ```python def analyze_yoy( df: pd.DataFrame, commodity: str, year1: int, year2: int ) -> Dict[str, Any]: """ Perform year-over-year comparison analysis Compares production, area, and yield between two years and decomposes growth into area vs yield contributions. Args: df: DataFrame with columns ['year', 'commodity', 'production', 'area', 'yield'] commodity: Commodity name (e.g., "CORN", "SOYBEANS") year1: Current year to compare year2: Previous year to compare against Returns: Dict containing: - production_current (float): Production in year1 (million units) - production_previous (float): Production in year2 - change_absolute (float): Absolute change - change_percent (float): Percent change - decomposition (dict): Area vs yield contribution - interpretation (str): "increase", "decrease", or "stable" Raises: ValueError: If commodity not found in data ValueError: If either year not found in data DataQualityError: If production != area * yield (tolerance > 1%) Example: >>> df = pd.DataFrame([ ... {'year': 2023, 'commodity': 'CORN', 'production': 15.3, 'area': 94.6, 'yield': 177}, ... {'year': 2022, 'commodity': 'CORN', 'production': 13.7, 'area': 89.2, 'yield': 173} ... ]) >>> result = analyze_yoy(df, "CORN", 2023, 2022) >>> result['change_percent'] 11.7 """ # [Complete implementation] ``` --- ## Dependency Management ### Decision Framework Skills should minimize external dependencies. Every dependency is a maintenance burden, a security surface, and a compatibility risk. Use this decision tree: ``` Can stdlib do it? → Yes: Use stdlib. Done. → No: Is there a lightweight pure-Python package (<1MB)? → Yes: Use it. Add to requirements.txt. → No: Is there a well-maintained popular package? → Yes: Use it only if the domain requires it. → No: Implement it yourself or redesign the approach. ``` ### Stdlib vs. Third-Party Decision Table | Task | Stdlib Solution | When to Use Third-Party | |------|----------------|------------------------| | HTTP requests | `urllib.request` | Use `requests` when: complex auth, session management, multipart uploads, or retry logic would require 100+ lines of urllib code | | JSON handling | `json` | Never — stdlib is sufficient | | CSV parsing | `csv` | Use `pandas` only when: statistical analysis, complex transformations, or DataFrame operations are core to the skill | | File paths | `pathlib` | Never — stdlib is sufficient | | Date/time | `datetime` | Never — stdlib is sufficient | | Regex | `re` | Never — stdlib is sufficient | | Hashing | `hashlib` | Never — stdlib is sufficient | | Caching | File-based (json + pathlib) | Never for skills — the FileCache pattern in architecture-guide.md is sufficient | | Data analysis | Manual calculations | Use `pandas`/`numpy` when: skill is primarily analytical (10+ statistical operations, pivots, aggregations) | | PDF generation | Not available | Use `reportlab` or `fpdf2` when PDF output is a core requirement | | Web scraping | `urllib` + `html.parser` | Use `beautifulsoup4` when parsing complex/malformed HTML | | CLI arguments | `argparse` | Never — stdlib is sufficient | | YAML parsing | Manual (the `_parse_frontmatter` pattern) | Use `pyyaml` only if skill needs to parse arbitrary YAML files (not just SKILL.md frontmatter) | ### requirements.txt Rules When third-party packages are needed: ``` # requirements.txt # Pin major.minor, allow patch updates requests>=2.31,<3.0 pandas>=2.0,<3.0 # For stdlib-only skills, create an empty requirements.txt with a comment: # No external dependencies required — this skill uses Python stdlib only. ``` **Rules:** - Always create `requirements.txt` even if empty (document the stdlib-only decision) - Pin major.minor version to avoid breaking changes - Never pin exact patch versions (allows security updates) - Never include dev dependencies (pytest, ruff) — those are for contributors, not users - List only direct dependencies, not transitive ones - Include a comment explaining why each package is needed ### Common Dependency Patterns by Skill Type | Skill Type | Typical Dependencies | |-----------|---------------------| | Data analysis (stocks, agriculture, climate) | `requests`, `pandas`, `numpy` | | Report generation | `requests`, `fpdf2` or `reportlab` | | Web scraping | `requests`, `beautifulsoup4` | | API wrapper | `requests` (or stdlib `urllib`) | | Text processing | Stdlib only (`re`, `json`, `csv`) | | File format conversion | Stdlib only (or single specialized package) | | Database interaction | Stdlib `sqlite3` (or `psycopg2`/`pymysql` for specific DBs) | --- ## Testing Strategy ### Why Test Generated Skills Skills are opinionated software that teams rely on daily. A skill that produces wrong calculations, misparses API responses, or silently drops data is worse than no skill at all. Tests catch these issues before the skill reaches users. ### What to Test Focus tests on the parts most likely to break or produce wrong results: | Priority | What to Test | Why | |----------|-------------|-----| | **High** | Analysis/calculation functions | Wrong math = wrong decisions | | **High** | Data parsing (API response → structured data) | APIs change formats, edge cases in real data | | **High** | Input validation | Bad input should fail clearly, not silently produce garbage | | **Medium** | Output formatting | Reports and summaries should be consistent | | **Medium** | Error handling paths | Verify graceful degradation on API failures, missing data | | **Low** | Cache logic | Only if custom caching is complex | | **Low** | Config loading | Usually trivial | ### Test Directory Structure ``` skill-name/ ├── scripts/ │ ├── analyze.py │ ├── fetch.py │ └── parse.py ├── tests/ │ ├── test_analyze.py # Unit tests for analysis functions │ ├── test_parse.py # Unit tests for parsing logic │ ├── fixtures/ │ │ ├── sample_api_response.json # Real API response (anonymized) │ │ └── sample_parsed_data.csv # Expected parsed output │ └── conftest.py # Shared pytest fixtures ``` ### Test Patterns **Pattern 1: Test analysis functions with known inputs/outputs** ```python """Tests for analyze.py — core calculation functions.""" import pytest from scripts.analyze import analyze_yoy, calculate_trend def test_yoy_increase(): """YoY comparison should detect an increase.""" result = analyze_yoy( current_value=150.0, previous_value=100.0, ) assert result["change_percent"] == pytest.approx(50.0) assert result["interpretation"] == "significant_increase" def test_yoy_stable(): """Changes under 2% should be interpreted as stable.""" result = analyze_yoy(current_value=101.0, previous_value=100.0) assert result["interpretation"] == "stable" def test_yoy_zero_previous(): """Division by zero should raise ValueError, not crash.""" with pytest.raises(ValueError, match="previous value cannot be zero"): analyze_yoy(current_value=100.0, previous_value=0.0) ``` **Pattern 2: Test parsing with fixture data** ```python """Tests for parse.py — API response parsing.""" import json from pathlib import Path from scripts.parse import parse_api_response FIXTURES = Path(__file__).parent / "fixtures" def test_parse_normal_response(): """Standard API response should parse to expected structure.""" raw = json.loads((FIXTURES / "sample_api_response.json").read_text()) result = parse_api_response(raw) assert len(result) > 0 assert "year" in result[0] assert "value" in result[0] def test_parse_empty_response(): """Empty API response should return empty list, not crash.""" result = parse_api_response({"data": []}) assert result == [] def test_parse_malformed_values(): """Values with commas (e.g., '15,300,000') should be cleaned.""" raw = {"data": [{"value": "15,300,000", "year": "2023"}]} result = parse_api_response(raw) assert result[0]["value"] == 15300000.0 ``` **Pattern 3: Mock external API calls** ```python """Tests for fetch.py — API interaction (mocked).""" from unittest.mock import patch, MagicMock from scripts.fetch import fetch_data @patch("scripts.fetch.urllib.request.urlopen") def test_fetch_success(mock_urlopen): """Successful API call should return parsed JSON.""" mock_response = MagicMock() mock_response.read.return_value = b'{"data": [{"value": "100"}]}' mock_response.__enter__ = lambda s: s mock_response.__exit__ = MagicMock(return_value=False) mock_urlopen.return_value = mock_response result = fetch_data(commodity="CORN", year=2023) assert "data" in result @patch("scripts.fetch.urllib.request.urlopen") def test_fetch_rate_limited(mock_urlopen): """429 response should raise RateLimitError.""" from urllib.error import HTTPError mock_urlopen.side_effect = HTTPError( url="", code=429, msg="Too Many Requests", hdrs={}, fp=None ) with pytest.raises(Exception, match="[Rr]ate limit"): fetch_data(commodity="CORN", year=2023) ``` ### Running Tests ```bash # Run all tests cd skill-name/ python -m pytest tests/ -v # Run with coverage python -m pytest tests/ --cov=scripts --cov-report=term-missing ``` ### When to Generate Tests Tests are generated during Phase 5 (Implementation) after the scripts are written: 1. Write all scripts first (steps 1-5 of Phase 5) 2. Create `tests/` directory with test files for core functions 3. Create `tests/fixtures/` with sample data 4. Run tests to verify they pass 5. Include test instructions in README.md **Note:** Tests are recommended but not mandatory for v1.0 of a skill. The validation and security scan gates are always mandatory. Tests become critical when: - The skill performs financial calculations (wrong math = real cost) - The skill processes sensitive data (parsing errors = data loss) - Multiple people will maintain the skill (tests prevent regressions) - The skill is being published to the team registry (quality expectation is higher) --- ## Anti-Patterns ### Anti-Pattern 1: Partial Implementation ❌ **NO**: ```python def yoy_comparison(df, commodity, year1, year2): # Implement YoY comparison pass def state_ranking(df, commodity): # TODO: implement ranking raise NotImplementedError() ``` ✅ **YES**: ```python # [Complete and functional code for BOTH functions] ``` ### Anti-Pattern 2: Empty References ❌ **NO**: ```markdown # Analysis Methods ## YoY Comparison This method compares two years. ## Ranking This method ranks states. ``` ✅ **YES**: ```markdown # Analysis Methods ## YoY Comparison ### Objective Compare metrics between current and previous year... ### Detailed Methodology **Formulas**: ``` Δ X = X(t) - X(t-1) Δ X% = (Δ X / X(t-1)) × 100 ``` **Decomposition** (for production): [Complete mathematics] **Interpretation**: - |Δ| < 2%: Stable - Δ > 10%: Significant increase [...] ### Validations [List] ### Complete Numerical Example [With real values] ``` ### Anti-Pattern 3: Useless Configs ❌ **NO**: ```json { "api_url": "INSERT_URL", "api_key": "INSERT_KEY" } ``` ✅ **YES**: ```json { "_comment": "Configuration for NASS USDA Agent", "api": { "base_url": "https://quickstats.nass.usda.gov/api", "_note": "This is the official USDA NASS API base URL", "api_key_env": "NASS_API_KEY", "_key_instructions": "Get free API key from: https://quickstats.nass.usda.gov/api#registration" } } ``` --- ## Final Validation Before delivering to user, verify: ### Sanity Test ```bash # 1. Python syntax find scripts -name "*.py" -exec python -m py_compile {} \; # 2. JSON syntax python -c "import json; json.load(open('assets/config.json'))" # 3. Imports make sense grep -r "^import\|^from" scripts/*.py | sort | uniq # Verify all libs are: stdlib, requests, pandas, numpy # No imports of uninstalled libs # 4. SKILL.md has frontmatter head -5 SKILL.md | grep "^---$" # 5. SKILL.md size wc -w SKILL.md # Should be > 5000 words ``` ### Final Checklist - [ ] Syntax check passed (Python, JSON) - [ ] No import of non-existent lib - [ ] No TODO or pass - [ ] SKILL.md > 5000 words - [ ] References with content - [ ] README with complete instructions - [ ] DECISIONS.md created - [ ] requirements.txt created