refactor: remove optimize_loop.py, replace with agent-driven optimization

optimize_loop.py was framework-only: it printed instructions but still needed an external LLM API to run the loop. Optimization is now an auxiliary function in SKILL.md, driven by the already-running agent. All references are updated across README, CLAUDE.md, diagnose.py, and writing-config.example.yaml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 19:58:20 +08:00
commit 5fb20083af
parent c7d618a0d1
5 changed files with 94 additions and 164 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,75 @@
 # CLAUDE.md
 This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 ## Project Overview
 WeWrite is a WeChat public account (公众号) content generation AI skill. It automates the full workflow from trending topic discovery to draft box publishing. It works as both a Claude Code skill (via SKILL.md) and an OpenClaw-compatible skill (via `dist/openclaw/`).
 The core pipeline is defined in SKILL.md (Steps 1-8): environment check → topic selection → framework + material collection → writing → SEO/anti-AI verification → visual AI → formatting/publishing → wrap-up.
 ## Commands
 ```bash
 # Install dependencies
 pip install -r requirements.txt
 # Toolkit CLI
 python3 toolkit/cli.py preview article.md --theme sspai       # Preview as HTML
 python3 toolkit/cli.py publish article.md --cover cover.png --title "标题"  # Publish to WeChat
 python3 toolkit/cli.py gallery                                  # Browse all 16 themes
 python3 toolkit/cli.py themes                                   # List theme names
 python3 toolkit/cli.py image-post img1.jpg img2.jpg -t "标题"   # Image post (carousel)
 # Data collection scripts
 python3 scripts/fetch_hotspots.py --limit 20                   # Trending topics
 python3 scripts/seo_keywords.py --json "关键词1" "关键词2"      # SEO keyword analysis
 python3 scripts/fetch_stats.py <article_id>                     # WeChat article stats
 python3 scripts/humanness_score.py article.md --verbose         # AI detection scoring (11 checks)
 # Build OpenClaw-compatible skill (also runs in CI on push to main)
 python3 scripts/build_openclaw.py
 ```
 No formal test suite exists. CI only rebuilds the OpenClaw version on push to main.
 ## Architecture
 ### Dual Nature: Skill + Toolkit
 - **As a skill** (SKILL.md): An agent-orchestrated 8-step pipeline. The LLM reads SKILL.md and executes steps, calling Python scripts as tools. Reference docs in `references/` are loaded on-demand by the agent at specific steps.
 - **As a standalone toolkit** (`toolkit/cli.py`): A Python CLI for Markdown→WeChat HTML conversion and publishing, usable independently of the skill.
 ### Key Directories
 - `scripts/` — Data collection utilities (hotspots, SEO, stats) and build tools. Called by the agent during pipeline execution.
 - `toolkit/` — Markdown→WeChat HTML converter, theme engine, WeChat API client, image generation. The CLI entry point is `toolkit/cli.py`.
 - `personas/` — 5 YAML writing personality presets controlling tone, data presentation, emotional arc. Loaded in Step 4b.
 - `references/` — Agent-loaded docs (writing rules, frameworks, SEO, topic scoring). These are NOT code — they are instruction sets the LLM reads and follows.
 - `toolkit/themes/` — 16 YAML theme definitions. Parsed by `toolkit/theme.py`, applied as inline CSS by `toolkit/converter.py`.
 ### Formatting Pipeline (toolkit)
 `converter.py` is the core: Markdown → HTML with inline styles + WeChat compatibility fixes (CJK spacing, bold punctuation, list→section conversion, external links→footnotes, dark mode attributes). WeChat strips `<style>` tags, so all CSS must be inlined. Themes are YAML files defining colors and base CSS; `theme.py` parses them, `converter.py` applies them.
 ### OpenClaw Compatibility
 `scripts/build_openclaw.py` transforms SKILL.md for OpenClaw: replaces `{skill_dir}` with `{baseDir}`, renames tool references (WebSearch→web_search, etc.), copies referenced files. CI runs this on push to main and commits to `dist/openclaw/`.
 ### Configuration Files
 - `config.yaml` (from `config.example.yaml`) — WeChat API credentials + image API key. Missing → graceful degradation (skip_publish, skip_image_gen).
 - `style.yaml` (from `style.example.yaml`) — User's writing profile (name, topics, tone, persona, theme). Auto-created via onboard flow on first run.
 - `writing-config.yaml` (from `writing-config.example.yaml`) — Writing parameters (sentence variance, idiom density, etc.). Optimized per-user via the "优化参数" auxiliary function in SKILL.md.
 All three are .gitignored — each user generates their own.
 ### Graceful Degradation
 The pipeline never hard-fails. Missing config → skip_publish/skip_image_gen flags. Script failures → WebSearch or LLM fallback. Image gen fails → output prompts only. These flags are set in Step 1 and automatically respected by later steps.
 ## Language & Conventions
 - All code is Python 3.11+. No type checking or linter configured.
 - Commit messages are in Chinese, format: `type: description` (e.g., `fix: ...`, `chore: ...`).
 - The project language (README, SKILL.md, comments, references) is Chinese.
--- a/README.md
+++ b/README.md
@@ -168,7 +168,7 @@ wewrite/
 ├── SKILL.md                  # Main pipeline (273 lines, Steps 1-8)
 ├── config.example.yaml       # API config template
 ├── style.example.yaml        # Style config template
-├── writing-config.example.yaml # Writing-parameter template (tunable with the optimize loop)
+├── writing-config.example.yaml # Writing-parameter template (say "优化参数" to auto-tune)
 ├── requirements.txt
 │
 ├── dist/openclaw/            # OpenClaw-compatible build (built automatically by CI)
@@ -179,8 +179,7 @@ wewrite/
 │   ├── fetch_stats.py          # Backfills WeChat article stats
 │   ├── build_playbook.py       # Generates the Playbook from past articles
 │   ├── learn_edits.py          # Learns from manual edits
-│   ├── humanness_score.py      # Article "humanness" scorer (objective checklist + LLM judge)
+│   ├── humanness_score.py      # Article "humanness" scorer (11 checks + parameter mapping)
-│   ├── optimize_loop.py        # autoresearch-style iterative optimization framework
 │   └── build_openclaw.py       # SKILL.md → OpenClaw format conversion
 │
 ├── toolkit/                  # Markdown → WeChat toolchain
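@@ -235,19 +234,25 @@ Step 8  Write to history → reply to the user (with editing suggestions + flywheel tips)
 Fully automatic by default. Say "交互模式" (interactive mode) to pause for confirmation at topic selection, framework, and illustrations.
-## Optimization Loop (experimental)
+## Writing Parameter Optimization
-Borrowing the change→score→keep/rollback pattern from [autoresearch](https://github.com/karpathy/autoresearch), WeWrite provides a framework for automatically tuning writing parameters:
+Say 「优化写作参数」 or 「优化参数」 in the conversation, and the Agent will iteratively tune your `writing-config.yaml`:
+1. Write a short test article with the current parameters
+2. Score it with `humanness_score.py` (11 checks, continuous scores from 0 to 1)
+3. Find the lowest-scoring dimension and adjust the corresponding parameter
+4. Repeat for N rounds (3 by default)
+5. Keep the best-scoring parameter combination
 ```bash
-# Score an article (objective checklist + subjective LLM judge)
+# Standalone scoring (no Agent required)
 python3 scripts/humanness_score.py article.md --verbose
-# Iteratively optimize writing parameters
+# JSON output (per-check scores + parameter mapping)
-python3 scripts/optimize_loop.py --topic "AI Agent" --iterations 10
+python3 scripts/humanness_score.py article.md --json
 ```
-The framework is open source, but the optimized `writing-config.yaml` is not committed to git: each user arrives at their own best parameters.
+The optimized `writing-config.yaml` is not committed to git: each user arrives at their own best parameters.
 ## Toolkit Standalone Usage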
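The five README steps in the hunk above (write, score, find the weakest dimension, adjust, keep the best) can be sketched as a small loop. This is only an illustration under stated assumptions: in the real flow the agent writes the test article and `humanness_score.py` does the scoring, whereas here a stand-in scorer replaces both, and the dimension names, the dimension-to-parameter mapping, and the step sizes are all hypothetical.

```python
import copy

# Hypothetical mapping: scored dimension -> (parameter to adjust, step size).
PARAM_FOR_DIMENSION = {
    "sentence_variance": ("sentence_variance", 0.1),
    "idiom_usage": ("idiom_density", 0.05),
}


def fake_write_and_score(config):
    """Stand-in for 'write a test article, then score it' (per-dimension 0-1 scores)."""
    return {
        "sentence_variance": config["sentence_variance"],
        "idiom_usage": min(1.0, config["idiom_density"] * 4),
    }


def optimize(config, write_and_score, rounds=3):
    """Tune a copy of `config` for `rounds` rounds; return the best config and its score."""
    def overall(scores):
        return sum(scores.values()) / len(scores)

    current = copy.deepcopy(config)
    scores = write_and_score(current)
    best_config, best_score = copy.deepcopy(current), overall(scores)
    for _ in range(rounds):
        weakest = min(scores, key=scores.get)          # lowest-scoring dimension
        param, step = PARAM_FOR_DIMENSION[weakest]
        current[param] = round(min(1.0, current[param] + step), 4)
        scores = write_and_score(current)
        if overall(scores) > best_score:               # higher is better in this sketch
            best_config, best_score = copy.deepcopy(current), overall(scores)
    return best_config, best_score


if __name__ == "__main__":
    start = {"sentence_variance": 0.5, "idiom_density": 0.15}
    print(optimize(start, fake_write_and_score))
```

Keeping the best configuration separately from the current one means a round that makes things worse never overwrites a better earlier result.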
--- a/scripts/diagnose.py
+++ b/scripts/diagnose.py
@@ -157,7 +157,7 @@ def check_enhancements():
    else:
        checks.append(make_check(
            "enhancement", "writing_config", "warn",
-            "not found → using defaults (run optimize_loop.py to tune)",
+            "not found → using defaults (say '优化参数' to tune)",
        ))
    # playbook.md
@@ -240,7 +240,7 @@ def compute_summary(checks):
        elif name == "playbook":
            recs.append('Edit a generated article, then say "学习我的修改" to build playbook.md')
        elif name == "writing_config":
-            recs.append('Run: python3 scripts/optimize_loop.py --topic "your topic" --iterations 10')
+            recs.append('Say "优化参数" to run the optimization loop')
        elif name == "history_articles":
            recs.append("Generate your first article to start building history")
        elif name == "dimension_variance":
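The two diagnose.py hunks above show a check being recorded with `make_check(...)` and later turned into a recommendation string. A table-driven sketch of that flow follows; it is not the script's actual code (which uses an `if`/`elif` chain), and the body of `make_check` plus the `recommend` helper are hypothetical reconstructions based only on the call shape visible in the diff.

```python
def make_check(category, name, status, message):
    """Hypothetical reconstruction: bundle one diagnostic result as a dict."""
    return {"category": category, "name": name, "status": status, "message": message}


# Recommendation strings copied from the diff; the lookup table itself is an assumption.
RECOMMENDATIONS = {
    "playbook": 'Edit a generated article, then say "学习我的修改" to build playbook.md',
    "writing_config": 'Say "优化参数" to run the optimization loop',
    "history_articles": "Generate your first article to start building history",
}


def recommend(checks):
    """Return one recommendation per warning that has a known remedy."""
    return [
        RECOMMENDATIONS[c["name"]]
        for c in checks
        if c["status"] == "warn" and c["name"] in RECOMMENDATIONS
    ]


if __name__ == "__main__":
    checks = [
        make_check("enhancement", "writing_config", "warn",
                   "not found → using defaults (say '优化参数' to tune)"),
    ]
    print(recommend(checks))
```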
--- a/scripts/optimize_loop.py
+++ b/scripts/optimize_loop.py
@@ -1,149 +0,0 @@
 #!/usr/bin/env python3
 """
 WeWrite Optimization Loop — autoresearch-style iterative improvement.
 Inspired by Karpathy's autoresearch: change → score → keep/rollback → repeat.
 But instead of optimizing ML training code, we optimize WRITING RULES to
 produce articles that pass AI detection while maintaining quality.
 The mutable surface: writing-config.yaml (style parameters + prompt rules)
 The fixed evaluation: humanness_score.py (objective checklist + subjective feel)
 The metric: composite_score (lower = more human, like val_bpb)
 Usage:
    python3 optimize_loop.py --topic "AI Agent" --iterations 10
    python3 optimize_loop.py --topic "AI Agent" --iterations 5 --verbose
 Architecture:
    1. Load current writing-config.yaml
    2. Generate article with current config
    3. Score with humanness_score.py
    4. LLM proposes a change to writing-config.yaml
    5. Generate article with new config
    6. Score again
    7. If improved → keep (commit). If not → rollback.
    8. Log to results.tsv
    9. Repeat.
 Requirements:
    - ANTHROPIC_API_KEY in environment (for article generation + LLM judge)
    - writing-config.yaml in skill root (created on first run with defaults)
 """
 import argparse
 import json
 import os
 import subprocess
 import sys
 from datetime import datetime
 from pathlib import Path
 import yaml
 SKILL_DIR = Path(__file__).parent.parent
 CONFIG_PATH = SKILL_DIR / "writing-config.yaml"
 RESULTS_PATH = SKILL_DIR / "optimization-results.tsv"
 DEFAULT_CONFIG = {
    "persona": "科技媒体资深编辑，写了八年公众号，对AI行业有深度认知",
    "sentence_variance": 0.7,
    "broken_sentence_rate": 0.04,
    "idiom_density": 0.15,
    "filler_style": "mixed",  # literary / casual / mixed / minimal
    "paragraph_rhythm": "chaotic",  # structured / chaotic / wave
    "self_correction_rate": 0.02,
    "tangent_frequency": "every_800_chars",  # never / every_500 / every_800 / every_1200
    "real_data_density": "high",  # low / medium / high
    "word_temperature_bias": "warm",  # cold / warm / hot / balanced
    "emotional_arc": "restrained_to_burst",  # flat / gradual / restrained_to_burst / volatile
    "opening_style": "scene",  # scene / data / question / anecdote / cold_open
    "closing_style": "open_question",  # summary / open_question / image / abrupt
    "structure_linearity": 0.3,  # 0=fully non-linear, 1=fully linear
 }
 def ensure_config():
    """Create default writing-config.yaml if it doesn't exist."""
    if not CONFIG_PATH.exists():
        with open(CONFIG_PATH, "w", encoding="utf-8") as f:
            yaml.dump(DEFAULT_CONFIG, f, allow_unicode=True, default_flow_style=False)
        print(f"Created default config: {CONFIG_PATH}")
    return yaml.safe_load(CONFIG_PATH.read_text(encoding="utf-8"))
 def score_article(article_path: str) -> dict:
    """Run humanness_score.py on an article. Returns parsed result."""
    result = subprocess.run(
        ["python3", str(SKILL_DIR / "scripts" / "humanness_score.py"), article_path, "--json"],
        capture_output=True, text=True
    )
    if result.returncode != 0:
        print(f"Scoring failed: {result.stderr}", file=sys.stderr)
        return {"composite_score": 100.0, "error": result.stderr}
    return json.loads(result.stdout)
 def log_result(iteration: int, composite: float, config_summary: str, status: str, description: str):
    """Append result to TSV log."""
    header_needed = not RESULTS_PATH.exists()
    with open(RESULTS_PATH, "a", encoding="utf-8") as f:
        if header_needed:
            f.write("iteration\ttimestamp\tcomposite\tstatus\tdescription\tconfig_change\n")
        ts = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        f.write(f"{iteration}\t{ts}\t{composite:.2f}\t{status}\t{description}\t{config_summary}\n")
 def print_banner(iteration: int, total: int):
    print(f"\n{'='*60}")
    print(f"  OPTIMIZATION LOOP — Iteration {iteration}/{total}")
    print(f"{'='*60}")
 def main():
    parser = argparse.ArgumentParser(description="WeWrite optimization loop")
    parser.add_argument("--topic", required=True, help="Article topic for testing")
    parser.add_argument("--iterations", type=int, default=10, help="Number of iterations")
    parser.add_argument("--verbose", "-v", action="store_true")
    args = parser.parse_args()
    print(f"""
 ╔══════════════════════════════════════════════════════╗
 ║  WeWrite Optimization Loop                          ║
 ║  Topic: {args.topic:<44s}║
 ║  Iterations: {args.iterations:<39d}║
 ║                                                      ║
 ║  Pattern: change config → generate → score →         ║
 ║           keep if better, rollback if worse           ║
 ╚══════════════════════════════════════════════════════╝
 """)
    config = ensure_config()
    print("This script provides the FRAMEWORK for optimization.")
    print("To run the full loop, you need:")
    print("  1. An article generation function (Claude API)")
    print("  2. A scoring function (humanness_score.py — included)")
    print("  3. An LLM to propose config changes (Claude API)")
    print()
    print("Current config:")
    print(yaml.dump(config, allow_unicode=True, default_flow_style=False))
    print()
    print("Run this loop via Claude Code / OpenClaw agent:")
    print()
    print("  Agent reads writing-config.yaml")
    print("  → generates article with those rules")
    print("  → scores with: python3 scripts/humanness_score.py article.md --json")
    print("  → proposes a config change")
    print("  → generates new article")
    print("  → scores again")
    print("  → if composite_score decreased → commit config change")
    print("  → if composite_score same/worse → rollback")
    print("  → logs to optimization-results.tsv")
    print("  → repeats")
    print()
    print("To test scoring on an existing article:")
    print(f"  python3 scripts/humanness_score.py <article.md> --verbose")
 if __name__ == "__main__":
    main()
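The removed script described a keep/rollback step ("if composite_score decreased → commit, otherwise rollback") but only printed it as instructions. A minimal sketch of that step, assuming, as the script's docstring states, that a lower composite score is better; the function name and the in-memory log are hypothetical, and `score` stands in for a call to `humanness_score.py`:

```python
def run_rounds(baseline, proposals, score):
    """Apply keep/rollback over candidate configs; the lowest composite score wins."""
    best_config, best_score = baseline, score(baseline)
    log = []
    for iteration, candidate in enumerate(proposals, start=1):
        composite = score(candidate)
        if composite < best_score:
            # Improvement: the candidate becomes the new baseline ("keep").
            best_config, best_score = candidate, composite
            status = "keep"
        else:
            # Same or worse: discard the candidate ("rollback").
            status = "rollback"
        log.append((iteration, composite, status))
    return best_config, best_score, log


if __name__ == "__main__":
    # Toy scorer: the config's "x" value doubles as its composite score.
    result = run_rounds({"x": 5}, [{"x": 6}, {"x": 3}, {"x": 3}], lambda c: c["x"])
    print(result)
```

Note that a tie counts as a rollback here, matching the "same/worse → rollback" wording in the removed script.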
--- a/writing-config.example.yaml
+++ b/writing-config.example.yaml
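@@ -1,10 +1,9 @@
 # WeWrite writing parameters (tunable)
-# Copy to writing-config.yaml, then tune iteratively with the optimize loop
+# Copy to writing-config.yaml, then say "优化参数" in the conversation to have the Agent tune it iteratively
-# or adjust manually and watch the Zhuque (朱雀) detection results
+# or adjust manually and evaluate with humanness_score.py
 #
 # This file is a starting point, not an optimum.
-# Run: python3 scripts/optimize_loop.py --topic "your topic" --iterations 10
+# Say "优化参数" in the conversation to auto-tune; each round adjusts the lowest-scoring parameter.
 # Each iteration modifies the parameters in writing-config.yaml and keeps the better-scoring version.
 #
 # The parameters are split into three layers, matching the anti-detection structure in writing-guide.md.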