feat: structured edit learning with typed patterns and confidence scoring

learn_edits.py:
- patterns now have type/key/description/rule fields
- confidence is auto-computed from occurrences + recency, with 30-day decay
- --summarize --json outputs aggregated patterns sorted by confidence

learn-edits.md:
- playbook.md format changed from free text to structured YAML rules with confidence levels
- rules with confidence ≥ 5 become hard constraints in Step 4, < 5 are soft references, < 2 get pruned

SKILL.md Step 4: playbook priority is now confidence-gated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 20:23:33 +08:00 · 2026-03-30 20:23:33 +08:00 · 344f7509f1
commit 344f7509f1
parent 2d3d8e5f54
3 changed files with 245 additions and 115 deletions
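
For orientation, the confidence gating described in the commit message can be sketched as below. This is a minimal illustration that assumes the thresholds stated above (≥ 5 hard, < 5 soft, < 2 pruned). The function name `gate_rules`, the constants, and the sample rules are hypothetical and are not part of the commit.

```python
HARD_THRESHOLD = 5.0   # assumption from the commit message: >= 5 is a hard constraint
PRUNE_THRESHOLD = 2.0  # assumption from the commit message: < 2 gets pruned


def gate_rules(rules):
    """Split playbook rules into hard, soft, and pruned tiers by confidence."""
    tiers = {"hard": [], "soft": [], "pruned": []}
    for rule in rules:
        conf = float(rule.get("confidence", 0.0))
        if conf >= HARD_THRESHOLD:
            tiers["hard"].append(rule["key"])
        elif conf >= PRUNE_THRESHOLD:
            tiers["soft"].append(rule["key"])
        else:
            tiers["pruned"].append(rule["key"])
    return tiers


# Hypothetical sample rules, for illustration only.
sample = [
    {"key": "shorter_paragraphs", "confidence": 7.0},
    {"key": "avoid_jiangzhen", "confidence": 5.0},
    {"key": "one_off_title_tweak", "confidence": 3.0},
    {"key": "stale_rule", "confidence": 1.0},
]
tiers = gate_rules(sample)
print(tiers)
```

Keeping the two thresholds as named constants makes it easy to adjust the tiers if the documented cut-offs change.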
--- a/SKILL.md
+++ b/SKILL.md
@@ -198,7 +198,7 @@ WebSearch: "{选题关键词} 数据 报告 2025 2026"

 人格文件定义了：语气浓度、数据呈现方式、情绪弧线、段落节奏、不确定性表达模板等。作为 Step 4c 的硬性约束执行。

-**优先级**：playbook.md > persona > writing-guide.md。writing-guide 是底线（禁用词等），persona 在此基础上特化风格参数，playbook 是用户个性化的最终覆盖。
+**优先级**：playbook.md（confidence ≥ 5 的规则）> persona > writing-guide.md。writing-guide 是底线（禁用词等），persona 在此基础上特化风格参数，playbook 中高置信度规则是用户个性化的最终覆盖。playbook 中 confidence < 5 的规则作为软性参考。

 **4c. 写文章**：
 - H1 标题（20-28 字） + H2 结构，1500-2500 字
--- a/references/learn-edits.md
+++ b/references/learn-edits.md
@@ -1,15 +1,15 @@
 # 学习人工修改（核心飞轮）

-这是 WeWrite 最重要的长期价值。每次用户编辑文章后让系统学习，下一次的初稿就会更接近用户的风格，需要的编辑量越来越少。
+这是 WeWrite 最重要的长期价值。每次用户编辑文章后让系统学习，下一次的初稿就会更接近用户的风格，需要的编辑量越来越少。

-**飞轮效应**：初稿需要改 30% → 学习 5 次后只需改 15% → 学习 20 次后只需改 5%
+**飞轮效应**：初稿需要改 30% → 学习 5 次后只需改 15% → 学习 20 次后只需改 5%

 **触发**：用户说"我改了，学习一下"、"学习我的修改"

 ## 1. 获取 draft 和 final

- **draft**：`output/` 下最新的 .md 文件（按修改时间排序，`ls -t output/*.md | head -1`）
- **final**：用户提供修改后的版本。主动引导用户："请把你改好的文章全文粘贴给我，或者告诉我文件路径。如果你是在微信后台编辑器里改的，可以全选复制后直接粘贴到这里。"
+- **draft**：`output/` 下最新的 .md 文件（按修改时间排序，`ls -t output/*.md | head -1`）
+- **final**：用户提供修改后的版本。主动引导用户："请把你改好的文章全文粘贴给我，或者告诉我文件路径。如果你是在微信后台编辑器里改的，可以全选复制后直接粘贴到这里。"

 ## 2. 运行 diff 分析

@@ -17,35 +17,72 @@
 python3 {skill_dir}/scripts/learn_edits.py --draft {draft_path} --final {final_path}
 ```

-## 3. 分析并记录
+## 3. 分析并记录 pattern

-读取脚本输出的 diff 数据，对每个有意义的修改分类：
+读取脚本输出的 diff 数据和 INSTRUCTIONS FOR AGENT，对每个有意义的修改写入 pattern。

- **用词替换**：AI 用了"讲真"，人工改成"坦白说"
- **段落删除**：人工觉得某段多余
- **段落新增**：人工补充了 AI 没写的内容
- **结构调整**：H2 顺序或分段方式的变化
- **标题修改**：标题风格偏好
- **语气调整**：整体语气的偏移方向
+**每个 pattern 必须包含**：
+- `type`：`word_sub` / `para_delete` / `para_add` / `structure` / `title` / `tone` / `expression`
+- `key`：短唯一标识（英文，如 `avoid_jiangzhen`、`shorter_paragraphs`、`more_negative_emotion`）
+- `description`：这次修改是什么（如"把'讲真'替换为'坦白说'"）
+- `rule`：可执行的写作指令（**必须是祈使句，不是描述句**）

-将分类结果写入 `lessons/` 下的 diff YAML 文件的 edits 和 patterns 字段。
+**key 的复用**：如果这次的修改和之前某个 lesson 里的 pattern 是同一种偏好（比如又一次把段落改短了），使用**相同的 key**。这样 `--summarize` 时 occurrences 会累加，confidence 自动提升。

-## 4. 自动触发 Playbook 更新
+编辑 lesson YAML 文件中的 `patterns` 列表，写入分类结果。

-每积累 5 次 lessons，自动触发 playbook 更新：
+## 4. Playbook 更新
+
+每积累 5 次 lessons，触发 playbook 更新：

 ```bash
-python3 {skill_dir}/scripts/learn_edits.py --summarize
+python3 {skill_dir}/scripts/learn_edits.py --summarize --json
 ```

-脚本输出所有 lessons 的汇总数据。**Agent 必须执行以下步骤完成闭环**：
+读取 JSON 输出，按以下规则更新 `{skill_dir}/playbook.md`：

-1. 读取 summarize 输出，找出反复出现的 pattern（≥2 次）
-2. 读取当前 `{skill_dir}/playbook.md`（如果不存在则从零创建）
-3. **将 pattern 转化为可执行的写作规则**写入 playbook.md：
-   - 不要写"用户偏好简短段落"（描述性，不可执行）
-   - 要写"段落不超过 80 字，长段必须在 3 句内换行"（指令性，可执行）
-   - 每条规则必须是写作时能直接遵循的具体指令
-4. 保存 playbook.md
+### playbook.md 格式

-**验证闭环**：playbook.md 更新后，下次写作时"Playbook 优先"规则会自动加载新 pattern，初稿会反映用户偏好。
+playbook.md 是 YAML 格式，每条规则带 confidence 和元数据：
+
+```yaml
+# WeWrite Playbook — 从用户编辑中学习的写作规则
+# 由 Agent 自动维护，不要手动编辑
+# confidence ≥ 5 的规则在 Step 4 写作时作为硬性约束执行
+# confidence < 5 的规则作为软性参考
+
+rules:
+  - key: "shorter_paragraphs"
+    type: "expression"
+    rule: "段落不超过 80 字，长段必须在 3 句内换行"
+    confidence: 7.0
+    occurrences: 4
+    last_seen: "2026-03-28"
+
+  - key: "avoid_jiangzhen"
+    type: "word_sub"
+    rule: "不要使用'讲真'，用'坦白说'代替"
+    confidence: 5.0
+    occurrences: 2
+    last_seen: "2026-03-30"
+```
+
+### 更新规则
+
+1. **新增**：summarize 中出现了 playbook 里没有的 key → 直接添加
+2. **更新**：summarize 中的 confidence/occurrences/rule 比 playbook 里的新 → 用新值覆盖
+3. **保留**：playbook 中有但 summarize 中没有的规则 → 保留不动（可能是早期学到的，仍然有效）
+4. **衰减淘汰**：confidence < 2 的规则 → 删除（太旧或不再相关）
+
+## 5. Step 4 如何使用 playbook
+
+Step 4 写作时读取 playbook.md：
+
+- **confidence ≥ 5 的规则**：作为硬性约束执行（和 persona 同级）
+- **confidence ≥ 3 且 < 5 的规则**：作为软性参考（倾向遵循但不强制）
+- **confidence < 3 的规则**：忽略（可能已过时）
+
+这确保：
+- 用户反复确认的偏好（高 confidence）被严格执行
+- 只出现过一次的偏好（低 confidence）不会过度影响
+- 用户风格变化时，旧规则自然衰减退出
--- a/scripts/learn_edits.py
+++ b/scripts/learn_edits.py
@@ -3,16 +3,21 @@
 Learn from human edits by diffing AI draft vs published final.

 Compares the original AI-generated article with the human-edited version,
-categorizes the changes, and saves lessons to lessons/.
+computes structured diffs, and saves typed lessons to lessons/.

-When 5+ lessons accumulate, outputs a prompt for the Agent to update playbook.md.
+Each aggregated pattern has:
+  - type: word_sub / para_delete / para_add / structure / title / tone / expression
+  - occurrences: how many times this pattern has been seen across all lessons
+  - first_seen / last_seen: timestamps for confidence decay
+  - confidence: auto-computed from occurrences + recency
+
+When summarizing, outputs all patterns with aggregated confidence scores.
+The Agent uses this to write structured playbook.md rules.

 Usage:
    python3 learn_edits.py --draft path/to/draft.md --final path/to/final.md
-    python3 learn_edits.py --summarize   # summarize all lessons
-
-The script does structural analysis; the Agent (LLM) interprets the diffs
-and writes the lesson YAML + playbook updates.
+    python3 learn_edits.py --summarize          # all lessons with confidence
+    python3 learn_edits.py --summarize --json    # JSON output for agent
 """

 import argparse
@@ -20,13 +25,24 @@ import difflib
 import json
 import re
 import sys
-from datetime import datetime
+from datetime import datetime, timedelta
 from pathlib import Path

 import yaml

 SKILL_DIR = Path(__file__).parent.parent

+# Pattern types with descriptions
+PATTERN_TYPES = {
+    "word_sub": "用词替换",
+    "para_delete": "段落删除",
+    "para_add": "段落新增",
+    "structure": "结构调整",
+    "title": "标题修改",
+    "tone": "语气调整",
+    "expression": "表达偏好",
+}
+

 def load_text(path: str) -> str:
    return Path(path).read_text(encoding="utf-8")
@@ -36,7 +52,6 @@ def split_sections(text: str) -> list[dict]:
    """Split markdown into sections by H2 headers."""
    sections = []
    current = {"header": "(intro)", "lines": []}
-
    for line in text.split("\n"):
        if line.strip().startswith("## "):
            if current["lines"] or current["header"] != "(intro)":
@@ -44,7 +59,6 @@ def split_sections(text: str) -> list[dict]:
            current = {"header": line.strip(), "lines": []}
        else:
            current["lines"].append(line)
-
    sections.append(current)
    return sections

@@ -61,44 +75,30 @@ def compute_diff(draft: str, final: str) -> dict:
    draft_lines = draft.split("\n")
    final_lines = final.split("\n")

-    # Line-level diff
    differ = difflib.unified_diff(draft_lines, final_lines, lineterm="")
    diff_lines = list(differ)

-    # Categorize changes
-    additions = []
-    deletions = []
-    for line in diff_lines:
-        if line.startswith("+") and not line.startswith("+++"):
-            additions.append(line[1:].strip())
-        elif line.startswith("-") and not line.startswith("---"):
-            deletions.append(line[1:].strip())
+    additions = [l[1:].strip() for l in diff_lines
+                 if l.startswith("+") and not l.startswith("+++") and l[1:].strip()]
+    deletions = [l[1:].strip() for l in diff_lines
+                 if l.startswith("-") and not l.startswith("---") and l[1:].strip()]

-    # Filter empty lines
-    additions = [l for l in additions if l]
-    deletions = [l for l in deletions if l]
-
-    # Title change
    draft_title = extract_title(draft)
    final_title = extract_title(final)
-    title_changed = draft_title != final_title

-    # Section-level analysis
    draft_sections = split_sections(draft)
    final_sections = split_sections(final)
    draft_h2s = [s["header"] for s in draft_sections if s["header"] != "(intro)"]
    final_h2s = [s["header"] for s in final_sections if s["header"] != "(intro)"]
-    structure_changed = draft_h2s != final_h2s

-    # Word count change
    draft_chars = len(draft.replace("\n", "").replace(" ", ""))
    final_chars = len(final.replace("\n", "").replace(" ", ""))

    return {
-        "title_changed": title_changed,
+        "title_changed": draft_title != final_title,
        "draft_title": draft_title,
        "final_title": final_title,
-        "structure_changed": structure_changed,
+        "structure_changed": draft_h2s != final_h2s,
        "draft_h2s": draft_h2s,
        "final_h2s": final_h2s,
        "lines_added": len(additions),
@@ -111,22 +111,22 @@ def compute_diff(draft: str, final: str) -> dict:
    }


-def save_diff_for_analysis(diff_result: dict, draft_path: str, final_path: str):
-    """Save diff data for Agent to analyze and write lessons."""
+def save_lesson(diff_result: dict, draft_path: str, final_path: str) -> Path:
+    """Save structured lesson data for Agent to analyze."""
    lessons_dir = SKILL_DIR / "lessons"
    lessons_dir.mkdir(parents=True, exist_ok=True)

    date_str = datetime.now().strftime("%Y-%m-%d")
-    diff_file = lessons_dir / f"{date_str}-diff.yaml"
+    lesson_file = lessons_dir / f"{date_str}-diff.yaml"

-    # If file exists, append a counter
    counter = 1
-    while diff_file.exists():
-        diff_file = lessons_dir / f"{date_str}-diff-{counter}.yaml"
+    while lesson_file.exists():
+        lesson_file = lessons_dir / f"{date_str}-diff-{counter}.yaml"
        counter += 1

    data = {
        "date": date_str,
+        "timestamp": datetime.now().isoformat(),
        "draft_file": str(draft_path),
        "final_file": str(final_path),
        "diff_summary": {
@@ -138,45 +138,138 @@ def save_diff_for_analysis(diff_result: dict, draft_path: str, final_path: str):
            "lines_deleted": diff_result["lines_deleted"],
            "char_diff": diff_result["char_diff"],
        },
-        "edits": [],  # Agent fills this after analysis
-        "patterns": [],  # Agent fills this after analysis
+        # Agent fills these after analyzing the draft and final:
+        "patterns": [],
+        # Pattern format (Agent writes):
+        # - type: "word_sub"        # one of PATTERN_TYPES keys
+        #   key: "avoid_jiangzhen"  # short unique identifier
+        #   description: "把'讲真'替换为'坦白说'"
+        #   rule: "不要使用'讲真'，用'坦白说'代替"  # imperative, executable
    }

-    with open(diff_file, "w", encoding="utf-8") as f:
+    with open(lesson_file, "w", encoding="utf-8") as f:
        yaml.dump(data, f, allow_unicode=True, default_flow_style=False)

-    return diff_file
+    return lesson_file


-def count_lessons() -> int:
-    """Count existing lesson files."""
+def load_all_lessons() -> list[dict]:
+    """Load all lesson files."""
    lessons_dir = SKILL_DIR / "lessons"
    if not lessons_dir.exists():
-        return 0
-    return len(list(lessons_dir.glob("*-diff*.yaml")))
-
-
-def summarize_lessons():
-    """Load all lessons and output for Agent to update playbook."""
-    lessons_dir = SKILL_DIR / "lessons"
-    if not lessons_dir.exists():
-        print("No lessons directory found.")
-        return
-
-    lesson_files = sorted(lessons_dir.glob("*-diff*.yaml"))
-    if not lesson_files:
-        print("No lessons found.")
-        return
-
-    all_lessons = []
-    for f in lesson_files:
+        return []
+    lessons = []
+    for f in sorted(lessons_dir.glob("*-diff*.yaml")):
        with open(f, "r", encoding="utf-8") as fh:
            data = yaml.safe_load(fh)
            if data:
-                all_lessons.append(data)
+                lessons.append(data)
+    return lessons

-    print(f"Total lessons: {len(all_lessons)}")
-    print(json.dumps(all_lessons, ensure_ascii=False, indent=2))
+
+def compute_confidence(occurrences: int, first_seen: str, last_seen: str) -> float:
+    """Compute confidence score from frequency and recency.
+
+    Confidence = base_from_occurrences + recency_bonus - age_decay.
+
+    - 1 occurrence = 3 (low, might be one-off)
+    - 2 occurrences = 5 (moderate, likely a preference)
+    - 3+ occurrences = 7+ (high, confirmed preference)
+    - Recency bonus: +1 if last_seen within 7 days
+    - Age decay: -1 per 30 days since last_seen (user style evolves)
+    - Clamped to 1-10
+    """
+    base = min(8, 1 + occurrences * 2)  # 1 -> 3, 2 -> 5, 3 -> 7, 4+ -> 8, as documented above
+
+    try:
+        last = datetime.fromisoformat(last_seen)
+        days_since = (datetime.now() - last).days
+    except (ValueError, TypeError):
+        days_since = 0
+
+    recency_bonus = 1.0 if days_since <= 7 else 0.0
+    age_decay = max(0, days_since // 30)
+
+    return max(1.0, min(10.0, base + recency_bonus - age_decay))
+
+
+def aggregate_patterns(lessons: list[dict]) -> list[dict]:
+    """Aggregate patterns across all lessons. Returns sorted by confidence."""
+    pattern_map = {}  # key → aggregated data
+
+    for lesson in lessons:
+        date = lesson.get("date", "")
+        timestamp = lesson.get("timestamp", date)
+        for p in lesson.get("patterns", []):
+            key = p.get("key", "")
+            if not key:
+                continue
+            if key not in pattern_map:
+                pattern_map[key] = {
+                    "key": key,
+                    "type": p.get("type", "expression"),
+                    "description": p.get("description", ""),
+                    "rule": p.get("rule", ""),
+                    "occurrences": 0,
+                    "first_seen": timestamp,
+                    "last_seen": timestamp,
+                }
+            entry = pattern_map[key]
+            entry["occurrences"] += 1
+            # Keep the most recent description/rule (may evolve)
+            if p.get("description"):
+                entry["description"] = p["description"]
+            if p.get("rule"):
+                entry["rule"] = p["rule"]
+            # Update timestamps
+            if timestamp < entry["first_seen"]:
+                entry["first_seen"] = timestamp
+            if timestamp > entry["last_seen"]:
+                entry["last_seen"] = timestamp
+
+    # Compute confidence for each
+    results = []
+    for entry in pattern_map.values():
+        entry["confidence"] = round(compute_confidence(
+            entry["occurrences"], entry["first_seen"], entry["last_seen"]
+        ), 1)
+        results.append(entry)
+
+    # Sort by confidence descending
+    results.sort(key=lambda x: x["confidence"], reverse=True)
+    return results
+
+
+def summarize_lessons(as_json: bool = False):
+    """Load all lessons, aggregate patterns, output with confidence scores."""
+    lessons = load_all_lessons()
+    if not lessons:
+        print("No lessons found.")
+        return
+
+    patterns = aggregate_patterns(lessons)
+
+    if as_json:
+        print(json.dumps({
+            "total_lessons": len(lessons),
+            "total_patterns": len(patterns),
+            "patterns": patterns,
+        }, ensure_ascii=False, indent=2))
+        return
+
+    print(f"Total lessons: {len(lessons)}")
+    print(f"Unique patterns: {len(patterns)}")
+    print()
+
+    for p in patterns:
+        type_label = PATTERN_TYPES.get(p["type"], p["type"])
+        conf_bar = "█" * int(p["confidence"]) + "░" * (10 - int(p["confidence"]))
+        print(f"  {conf_bar} {p['confidence']:4.1f}  [{type_label}] {p['key']}")
+        print(f"         {p['description']}")
+        if p["rule"]:
+            print(f"         → {p['rule']}")
+        print(f"         seen {p['occurrences']}x, first {p['first_seen'][:10]}, last {p['last_seen'][:10]}")
+        print()


 def main():
@@ -184,21 +277,19 @@ def main():
    parser.add_argument("--draft", help="Path to AI draft")
    parser.add_argument("--final", help="Path to human-edited final")
    parser.add_argument("--summarize", action="store_true", help="Summarize all lessons")
+    parser.add_argument("--json", action="store_true", help="JSON output (with --summarize)")
    args = parser.parse_args()

    if args.summarize:
-        summarize_lessons()
+        summarize_lessons(as_json=args.json)
        return

    if not args.draft or not args.final:
        print("Error: --draft and --final required", file=sys.stderr)
        sys.exit(1)

-    # Load texts
    draft = load_text(args.draft)
    final = load_text(args.final)
-
-    # Compute diff
    diff_result = compute_diff(draft, final)

    # Print summary
@@ -230,43 +321,45 @@ def main():
        for line in diff_result["additions_sample"][:10]:
            print(f"  + {line[:80]}")

-    # Save for Agent analysis
-    diff_file = save_diff_for_analysis(diff_result, args.draft, args.final)
-    print(f"\nDiff saved to: {diff_file}")
+    # Save lesson
+    lesson_file = save_lesson(diff_result, args.draft, args.final)
+    print(f"\nLesson saved to: {lesson_file}")

-    # Check if playbook update should be triggered
-    lesson_count = count_lessons()
+    lesson_count = len(load_all_lessons())
    print(f"Total lessons: {lesson_count}")

    if lesson_count >= 5 and lesson_count % 5 == 0:
        print(f"\n{'=' * 60}")
        print("PLAYBOOK UPDATE TRIGGERED")
        print(f"{'=' * 60}")
-        print(f"{lesson_count} lessons accumulated. Agent should:")
-        print(f"1. Read all lessons: python3 learn_edits.py --summarize")
-        print(f"2. Read current playbook: playbook.md")
-        print(f"3. Update playbook with recurring patterns from lessons")
+        print(f"{lesson_count} lessons. Agent should run:")
+        print(f"  python3 scripts/learn_edits.py --summarize --json")
+        print(f"Then update playbook.md with high-confidence patterns.")

-    # Output instructions for Agent
+    # Instructions for Agent
    print(f"""
 {'=' * 60}
 INSTRUCTIONS FOR AGENT
 {'=' * 60}

-Read the draft and final versions, then analyze the edits:
+Read the draft and final versions, then for each meaningful edit:

 1. Read: {args.draft}
 2. Read: {args.final}
-3. For each meaningful edit, classify it:
-   - type: "用词替换" / "段落删除" / "段落新增" / "结构调整" / "标题修改" / "语气调整"
-   - before: (original text)
-   - after: (edited text)
-   - pattern: (what this tells us about the user's preference)
+3. For each edit, add a pattern entry to {lesson_file}:

-4. Update {diff_file} with the edits and patterns lists.
+   patterns:
+     - type: "word_sub"           # {' / '.join(PATTERN_TYPES.keys())}
+       key: "short_unique_id"     # e.g. "avoid_jiangzhen", "shorter_paragraphs"
+       description: "把'讲真'替换为'坦白说'"
+       rule: "不要使用'讲真'，用'坦白说'代替"  # imperative, executable

-5. If this is a recurring pattern (seen in previous lessons too),
-   consider updating playbook.md.
+4. Rules must be imperative (可执行的指令), not descriptive.
+   BAD:  "用户偏好简短段落"
+   GOOD: "段落不超过 80 字，长段必须在 3 句内换行"
+
+5. If pattern already exists in previous lessons (same key),
+   confidence will auto-increase on next --summarize.
 """)
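
The four playbook update rules in learn-edits.md (add, update, keep, prune) could be applied to the `--summarize --json` output roughly as follows. This is only a sketch under stated assumptions: `merge_playbook` is a hypothetical helper that does not exist in the commit, "newer" is interpreted as a `last_seen` date that is the same or later, the prune threshold of 2 is taken from the documented rule, and reading or writing playbook.md is left out.

```python
PRUNE_THRESHOLD = 2.0  # assumption: rules below this confidence are deleted


def merge_playbook(existing, summarized):
    """Merge summarized patterns into existing playbook rules.

    Both arguments are lists of dicts with at least "key", "rule",
    "confidence", "occurrences", and "last_seen" (ISO date or timestamp).
    """
    # Keep: every existing rule survives unless updated or pruned below.
    merged = {r["key"]: dict(r) for r in existing}
    for pattern in summarized:
        current = merged.get(pattern["key"])
        if current is None:
            # Add: key not present in the playbook yet.
            merged[pattern["key"]] = dict(pattern)
        elif pattern["last_seen"][:10] >= current["last_seen"][:10]:
            # Update: the summarized data is at least as recent.
            current.update(pattern)
    # Prune: drop rules whose confidence has decayed below the threshold.
    kept = [r for r in merged.values() if r["confidence"] >= PRUNE_THRESHOLD]
    return sorted(kept, key=lambda r: r["confidence"], reverse=True)


# Hypothetical data, for illustration only.
existing = [
    {"key": "shorter_paragraphs", "rule": "段落不超过 80 字，长段必须在 3 句内换行",
     "confidence": 7.0, "occurrences": 4, "last_seen": "2026-03-28"},
    {"key": "early_rule", "rule": "标题不使用问号",
     "confidence": 4.0, "occurrences": 2, "last_seen": "2026-01-15"},
    {"key": "old_habit", "rule": "每段以数据开头",
     "confidence": 1.0, "occurrences": 1, "last_seen": "2025-11-01"},
]
summarized = [
    {"key": "shorter_paragraphs", "rule": "段落不超过 80 字，长段必须在 3 句内换行",
     "confidence": 8.0, "occurrences": 5, "last_seen": "2026-03-30T20:23:33"},
    {"key": "avoid_jiangzhen", "rule": "不要使用'讲真'，用'坦白说'代替",
     "confidence": 5.0, "occurrences": 2, "last_seen": "2026-03-30T20:23:33"},
]
result = merge_playbook(existing, summarized)
print([r["key"] for r in result])
```

Comparing only the first ten characters of `last_seen` lets plain dates and full ISO timestamps be compared as strings without parsing either.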