chore: rebuild dist/openclaw from source

2026-03-30 16:55:01 +00:00 · f9722fb93b
commit f9722fb93b
parent a92d453856
13 changed files with 683 additions and 17 deletions
--- a/dist/openclaw/SKILL.md
+++ b/dist/openclaw/SKILL.md
@ -43,7 +43,7 @@ description: |
  2. 如果有 fail 项 → 直接报告，建议修复
  3. 如果全 pass 或仅 warn → 继续 LLM 深度分析：
     - 读取 `style.yaml` 的 tone/voice 与 writing_persona，判断是否矛盾
-     - 读取 `writing-config.yaml`（如存在），检查是否有 AI 特征参数（emotional_arc: flat、paragraph_rhythm: structured、closing_style: summary）
+     - 读取 `writing-config.yaml`（如存在），检查是否有 AI 特征参数（emotional_arc: flat、paragraph_rhythm: structured、closing_tendency: summary）
     - 读取 `history.yaml` 最近 5 篇，检查 persona 使用和 web_search 降级情况
  4. 综合输出自然语言报告 + 按优先级排序的改进建议
 - 用户说"优化写作参数"/"优化参数"/"跑优化" → 执行以下流程：
@ -97,6 +97,7 @@ python3 -c "import markdown, bs4, cssutils, requests, yaml, pygments, PIL" 2>&1
 | Python 依赖 | 静默 | 提供 `pip install -r requirements.txt` |
 | `wechat.appid` + `secret` | 静默 | 设 `skip_publish = true` |
 | `image.api_key` | 静默 | 设 `skip_image_gen = true` |
+| `references/exemplars/index.yaml` | 静默 | 提示："范文库为空。如果你有已发布的文章（markdown），可以说**'导入范文'**建立风格库，写出来的文章会更像你。没有也不影响使用。" |

 **1.2 版本检查**（静默通过或提醒）：

@ -189,6 +190,7 @@ web_search: "{选题关键词} 数据 报告 2025 2026"
 读取: {baseDir}/playbook.md（如果存在，按 confidence 分级执行）
 读取: {baseDir}/writing-config.yaml（如果存在，作为写作参数）
 读取: {baseDir}/history.yaml（最近 3 篇的 dimensions 字段）
+读取: {baseDir}/references/exemplars/index.yaml（如果存在）
 ```

 **4.1 历史最佳参数参考**（有 history.yaml 且包含 composite_score 时执行）：
@ -208,18 +210,75 @@ web_search: "{选题关键词} 数据 报告 2025 2026"

 人格文件定义了：语气浓度、数据呈现方式、情绪弧线、段落节奏、不确定性表达模板等。作为 4.4 的硬性约束执行。

-**优先级**：playbook.md（confidence ≥ 5 的规则）> persona > writing-guide.md。writing-guide 是底线（禁用词等），persona 在此基础上特化风格参数，playbook 中高置信度规则是用户个性化的最终覆盖。playbook 中 confidence < 5 的规则作为软性参考。
+**优先级**：playbook.md（confidence ≥ 5 的规则）> persona > 范文风格 > writing-guide.md。writing-guide 是底线（禁用词等），范文提供风格示范（句长节奏、情绪表达方式），persona 在此基础上特化风格参数（语气浓度、数据呈现），playbook 中高置信度规则是用户个性化的最终覆盖。playbook 中 confidence < 5 的规则作为软性参考。

-**4.4 写文章**：
+**4.4 范文风格注入**（有 `references/exemplars/index.yaml` 时执行）：
+
+从 index.yaml 筛选 category 匹配当前框架类型的范文，按 humanness_score 升序（越低越人类）取 top 3。读取对应 .md 文件的片段内容。
+
+在写作 prompt 中注入：
+
+> 以下是该公众号风格的真实段落示例，模仿其句长节奏、情绪强度和口语化程度：
+>
+> 【开头风格】
+> {exemplar_1 的开头钩子段}
+>
+> 【情绪段风格】
+> {exemplar_2 的情绪高峰段}
+>
+> 【转折风格】
+> {exemplar_2 或 exemplar_3 的转折/自纠段（如有）}
+>
+> 【收尾风格】
+> {exemplar_3 的收尾段}
+
+Category 映射规则：
+
+| 框架类型 | exemplar category |
+|----------|-------------------|
+| 痛点型/深度解读 | tech-opinion |
+| 故事型 | story-emotional |
+| 清单型/对比型 | list-practical |
+| 热点解读型 | hot-take |
+| 其他 | general |
+
+如果匹配到的范文不足 3 篇，用 general category 补足。
+
+**Fallback（范文库为空时）**：读取 `{baseDir}/references/exemplar-seeds.yaml`，从每个段落类型中随机选 1 个注入 prompt。种子段落只示范人类写作的结构模式（句长方差、情绪锐度、自我纠正、非总结式收尾），不携带特定风格。注入时使用：
+
+> 以下是人类写作的结构模式示例，注意模仿其句长节奏和情绪表达方式（不要模仿具体内容或风格）：
+>
+> 【开头模式】{seeds.opening_hooks 随机 1 个}
+>
+> 【情绪段模式】{seeds.emotional_peaks 随机 1 个}
+>
+> 【转折模式】{seeds.transitions 随机 1 个}
+>
+> 【收尾模式】{seeds.closings 随机 1 个}
+
+建库命令：`python3 {baseDir}/scripts/extract_exemplar.py article.md`
+
+**4.5 写文章**：
 - H1 标题（20-28 字） + H2 结构，1500-2500 字
 - 真实素材锚定：Step 3.2 的素材分散嵌入各 H2 段落
 - **写作人格**：按 4.3 加载的人格参数写作（数据呈现方式、个人声音浓度、不确定性表达等）
+- **收尾方式**：persona 的 `closing_tendency` 仅作为倾向参考。根据文章内容和情绪弧线自行判断最自然的收尾方式（参见 writing-guide.md 收尾多样性表）。如果 history.yaml 中最近 3 篇有 `closing_type` 字段，避免使用相同的收尾类型
 - 3 层反检测规则（统计/语言/内容）在初稿阶段全部生效
 - 2-3 个编辑锚点：`<!-- ✏️ 编辑建议：在这里加一句你自己的经历/看法 -->`
 - 可选容器语法：`:::dialogue`、`:::timeline`、`:::callout`、`:::quote`

 保存到 `{baseDir}/output/{date}-{slug}.md`

+**4.6 快速自检**（写完后立即执行，减少 Step 5 重写概率）：
+
+对初稿做 3 项最易不达标的快速扫描，**当场修复**，不留到 Step 5：
+
+1. **禁用词扫描**：检查 writing-guide.md 2.1 的禁用词列表，命中的直接替换（最常见的问题，修复成本最低）
+2. **句长方差检查**：粗略扫描是否有连续 3 句以上长度接近的段落，有则拆句或加短句
+3. **负面情绪检查**：全文是否有 ≥ 2 处真实负面表达，不够则在编辑锚点附近补充
+
+这 3 项检查不需要调用脚本，LLM 自行完成即可。目标是让初稿在进入 Step 5 前已经消除最明显的问题。
+
 ---

 ### Step 5: SEO + 验证
@ -249,7 +308,7 @@ web_search: "{选题关键词} 数据 报告 2025 2026"
 | 内容 | 密度波浪 | 高密度段后跟低密度段 | 3.3 |
 | 内容 | 维度贯穿 | 激活维度全文可见 | 3.4 |

-不通过 → 定向重写该段落。3 次仍不过 → 标注跳过。
+不通过 → **定向修复**：只替换不达标的具体句子/段落，不动已通过的部分。每轮最多改 3 处，改完立即重新检查该项。2 轮仍不过 → 标注跳过，继续下一项。

 **5.3 脚本验证**（补充逐项检查）：

@ -261,8 +320,8 @@ python3 {baseDir}/scripts/humanness_score.py {article_path} --json --tier3 {agen

 解读 JSON 中 `composite_score`：
 - < 30 → 通过，继续 Step 6
- 30-50 → 查看 `param_scores` 中最低分项，定向重写对应段落
- \> 50 → 重大问题，逐个低分项修复，最多 3 轮
+- 30-50 → 查看 `param_scores` 中最低分的 1-2 项，只修复对应的具体句子（不重写整段），改完重新打分。1 轮即可
+- \> 50 → 取 `param_scores` 最低的 2-3 项，逐项定向修复（每项只改最相关的 1-2 处），最多 2 轮。仍 > 50 则标记 DONE_WITH_CONCERNS 继续

 ---

@ -332,6 +391,7 @@ python3 {baseDir}/toolkit/cli.py preview {markdown} --theme {theme} --no-open -o
  writing_persona: "{人格名}"
  dimensions:
    - "{维度}: {选项}"
+  closing_type: "{收尾类型}"  # trailing_off/unanswered/scene_revert/abrupt_stop/anti_conclusion/image
  composite_score: {Step 5.3 的 composite_score}  # 0=人类, 100=AI
  writing_config_snapshot:  # 本次使用的关键参数（从 writing-config.yaml 提取）
    sentence_variance: {值}
@ -366,6 +426,8 @@ python3 {baseDir}/toolkit/cli.py preview {markdown} --theme {theme} --no-open -o
 | 做一个小绿书/图片帖 | `python3 {baseDir}/toolkit/cli.py image-post img1.jpg img2.jpg -t "标题"` |
 | 诊断配置 / 检查反AI / 为什么AI检测没过 | `python3 {baseDir}/scripts/diagnose.py --json` + LLM 交叉分析 |
 | 优化写作参数 / 优化参数 | 迭代循环：写测试短文 → 打分 → 调参（见辅助功能） |
+| 导入范文 / 建范文库 | `python3 {baseDir}/scripts/extract_exemplar.py article.md` |
+| 查看范文库 | `python3 {baseDir}/scripts/extract_exemplar.py --list` |

 ---

@ -380,7 +442,8 @@ python3 {baseDir}/toolkit/cli.py preview {markdown} --theme {theme} --no-open -o
 | 素材采集（web_search） | LLM 训练数据中可验证的公开信息 |
 | 维度随机化 | history 空时跳过去重 |
 | Persona 文件不存在 | 回退到 midnight-friend（默认） |
-| 去 AI 验证 | 3 次重写不过则跳过该项 |
+| 范文库为空 | Fallback 到 exemplar-seeds.yaml（通用模式） |
+| 去 AI 验证 | 2 轮定向修复不过则跳过该项 |
 | 生图失败 | 输出提示词 |
 | 推送失败 | 本地 HTML |
 | 历史写入 | 警告不阻断 |
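
Reviewer sketch (not part of the commit): the exemplar selection rule that Step 4.4 describes in prose can be read as the following Python. This is a minimal sketch under assumptions: `pick_exemplars` and `FRAMEWORK_TO_CATEGORY` are hypothetical names, the sample `index` entries are made up, and the index is assumed to be the list of dicts that `extract_exemplar.py` writes to `index.yaml`.

```python
# Hypothetical mapping, transcribed from the category table in Step 4.4.
# Any framework type not listed falls back to "general".
FRAMEWORK_TO_CATEGORY = {
    "痛点型": "tech-opinion",
    "深度解读": "tech-opinion",
    "故事型": "story-emotional",
    "清单型": "list-practical",
    "对比型": "list-practical",
    "热点解读型": "hot-take",
}


def pick_exemplars(index, framework, k=3):
    """Return up to k index entries, lowest humanness_score first."""
    category = FRAMEWORK_TO_CATEGORY.get(framework, "general")

    def by_score(entry):
        return entry["humanness_score"]

    picked = sorted((e for e in index if e["category"] == category), key=by_score)[:k]
    # Fewer than k matches: top up from the "general" category.
    if len(picked) < k and category != "general":
        fillers = sorted((e for e in index if e["category"] == "general"), key=by_score)
        picked += fillers[: k - len(picked)]
    return picked


# Made-up index entries, for illustration only.
index = [
    {"file": "story-emotional-001.md", "category": "story-emotional", "humanness_score": 41.0},
    {"file": "story-emotional-002.md", "category": "story-emotional", "humanness_score": 33.5},
    {"file": "general-001.md", "category": "general", "humanness_score": 38.2},
    {"file": "tech-opinion-001.md", "category": "tech-opinion", "humanness_score": 29.9},
]

chosen = [e["file"] for e in pick_exemplars(index, "故事型")]
print(chosen)
```

With this sample index, a 故事型 article gets the two story exemplars ordered by score, then one `general` entry as filler. The diff does not say what should happen when `general` is also short, so the sketch simply returns fewer than three.
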
--- a/dist/openclaw/VERSION
+++ b/dist/openclaw/VERSION
@ -1 +1 @@
-1.3.0
+1.3.3
--- a/dist/openclaw/personas/cold-analyst.yaml
+++ b/dist/openclaw/personas/cold-analyst.yaml
@ -17,7 +17,7 @@ single_sentence_paragraph_rate: 0.08   # 少用单句段落，保持专业感

 emotional_arc: "flat_with_insight"     # 整体平稳，在关键洞察处提升强度
 opening_style: "thesis"               # 开头直接亮核心论点
-closing_style: "implications"         # 以"这意味着什么"收束
+closing_tendency: "implications"       # 倾向于以"这意味着什么"收束，但根据文章内容自行判断最合适的收尾方式

 data_intro_pattern: "framework → data → implication → caveat"
 # 示例：
--- a/dist/openclaw/personas/industry-observer.yaml
+++ b/dist/openclaw/personas/industry-observer.yaml
@ -16,7 +16,7 @@ single_sentence_paragraph_rate: 0.10

 emotional_arc: "steady_with_spikes"   # 整体平稳，1-2 处锐利判断
 opening_style: "news_hook"            # 以一个行业事件/数据切入
-closing_style: "open_question"        # 留一个没答案的问题
+closing_tendency: "open_question"      # 倾向于留一个没答案的问题，但根据文章内容自行判断最合适的收尾方式

 data_intro_pattern: "context → data → contrast → judgment"
 # 示例：
--- a/dist/openclaw/personas/midnight-friend.yaml
+++ b/dist/openclaw/personas/midnight-friend.yaml
@ -18,7 +18,7 @@ single_sentence_paragraph_rate: 0.25  # 25% 的段落只有 1 句
 # 情绪
 emotional_arc: "restrained_to_burst"
 opening_style: "personal_moment"  # 以一个私人时刻开头（"凌晨一点多…"）
-closing_style: "trailing_off"     # 不收束，像聊天自然结尾（"我先睡了"/"真的看不清楚"）
+closing_tendency: "trailing_off"   # 倾向于不收束、像聊天自然结尾，但根据文章内容自行判断最合适的收尾方式

 # 数据呈现
 data_intro_pattern: "scene → reaction → data → interpretation"
--- a/dist/openclaw/personas/sharp-journalist.yaml
+++ b/dist/openclaw/personas/sharp-journalist.yaml
@ -17,7 +17,7 @@ single_sentence_paragraph_rate: 0.20   # 多用短句成段制造节奏

 emotional_arc: "cold_open_to_sharp_close"
 opening_style: "cold_open"       # 直接切入核心矛盾，不铺垫
-closing_style: "sharp_statement" # 一句定性收束
+closing_tendency: "sharp_statement" # 倾向于一句定性收束，但根据文章内容自行判断最合适的收尾方式

 data_intro_pattern: "claim → evidence → twist"
 # 示例：
--- a/dist/openclaw/personas/warm-editor.yaml
+++ b/dist/openclaw/personas/warm-editor.yaml
@ -16,7 +16,7 @@ single_sentence_paragraph_rate: 0.15

 emotional_arc: "gentle_build"          # 缓慢升温，情绪在中后段到达高点
 opening_style: "scene"                 # 以一个温暖的场景开头
-closing_style: "image"                 # 用一个画面收束
+closing_tendency: "image"               # 倾向于用一个画面收束，但根据文章内容自行判断最合适的收尾方式

 data_intro_pattern: "story → embed data → feeling"
 # 示例：
--- a/dist/openclaw/references/exemplar-seeds.yaml
+++ b/dist/openclaw/references/exemplar-seeds.yaml
@ -0,0 +1,96 @@
+# 通用人类写作模式种子
+#
+# 用途：没有范文库的用户，Step 4.4 用这些段落作为 few-shot 注入，
+#       教 LLM "人类写作的结构模式长什么样"。
+#
+# 设计原则：
+#   - 只示范结构模式（句长方差、情绪锐度、自我纠正、非总结式收尾）
+#   - 不携带特定风格/人格（任何 persona 都能兼容）
+#   - 每个段落标注了它示范的反AI模式
+#
+# 有用户自己的范文库时，这个文件不会被使用。
+
+opening_hooks:
+  - text: |
+      好多年没有坐公交了，上次去太子湾，由于景区限行，只能把车停在外面，坐景区免费接驳车过去。
+      前面座位看到一个小女孩一直在刷那种 AI 生成的短视频，画面非常粗糙，内容也很假，滑到下一个居然还是差不多的东西，看得津津有味。
+      当时看到这一幕我甚至有点伤心。
+    pattern: "日常观察切入 → 意外情绪反应。不总结、不预告、不铺垫。"
+
+  - text: |
+      **本硕八年毕业，单程通勤两个半小时，**月薪2690。 这是市场给我贴的标签。
+      **裸辞。创业，年收超7位数。** 这是我自己撕掉那个标签之后，重新定义的自己。
+      一路走来，中间发生了什么？我讲给你听。
+    pattern: "标签→撕裂对比开头。两组加粗短句制造视觉和语义落差。句长标准差 45.7（数据最高）。"
+
+  - text: |
+      29号，我和小伙伴在深圳搞活动。
+      活动结束之后，我想顺道拜访一个多年没见的老朋友，发消息过去。
+      他回：在三亚。
+      我问：度假？
+      他说：带孩子。
+      我盯着手机屏幕，愣了整整三秒。
+    pattern: "对话碎片制造节奏。2-4字短句紧邻20+字长句。物理反应替代心理描写。"
+
+  - text: |
+      我信了这套话很多年。
+      "要有长期主义。要相信复利。时间是最好的朋友。"
+      最惨的一次，在一个方向扎进去3年，回头一看，什么都没留下来。
+    pattern: "先认同再推翻。引用常见正确的话→用个人惨痛经历否定。开头即高潮。"
+
+emotional_peaks:
+  - text: |
+      我信了这套话很多年。
+      最惨的一次，在一个方向扎进去3年，回头一看，什么都没留下来。
+      这不是失败——失败还有个明确的结果。
+      是你信错了一件事。
+    pattern: "用'最惨'而非'有挑战'。否定委婉说法（'这不是失败'），给出更痛的定义。"
+
+  - text: |
+      很多人在温水煮青蛙的过程中得过且过，过着看似满意、实则内心有很多不满的生活，然后说一句，算了吧，现在这样也还行，但这样反而错失了挖掘自己最大潜力的机会。
+    pattern: "用'温水煮青蛙'具象化停滞感。'算了吧'是内心独白式引用。来自得分最低（32.8）的文章。"
+
+  - text: |
+      讲真，我每次看到这种争论，都觉得……怎么说呢……挺无语的。
+      不是说这些人蠢。
+      是他们在纠结一个根本不存在的问题。
+      什么叫"AI味道"？你能定义吗？你能量化吗？你能验证吗？
+      不能。
+      那你在纠结什么？
+    pattern: "填充词（'怎么说呢'）+ 连续反问不给答案 + 单字段落（'不能。'）。"
+
+transitions:
+  - text: |
+      我第一反应是"孩子这时候不应该在学校吗"，第二反应是想把这话发过去，第三反应是我把那句话吞回去了——因为我在那三秒里想清楚了一件事。
+    pattern: "思维过程外化（三个反应）。破折号打断 → 时间锚点（'三秒'）→ 悬念。"
+
+  - text: |
+      不过，到了之后我发现，什么作息啊，学习强度啊，都不是最难熬的，人才是。
+    pattern: "列举预期困难再一句否定。转折词 + 真实困难揭示。"
+
+  - text: |
+      不过话又说回来。知道自己在局里，这件事本身，就已经是出局的开始了。
+    pattern: "'不过话又说回来'——自我推翻后重新定位。制造思维的非线性感。"
+
+closings:
+  - text: |
+      时间是你唯一不可再生的资源。
+      把它投进一个真实存在的锚点，才叫复利。
+      投进一个"我相信它会好"的希望，叫做漫长的等死。
+    pattern: "重新定义核心概念收尾。'等死'替代励志结论。来自得分 36.7 的文章。"
+
+  - text: |
+      有了 AI 之后，很多事都更容易了，但也正因为更容易了，什么东西真的值得做、值得花很多年去换，反而变得更难想清楚。要做什么可能比怎么更快做出一个东西更加重要了。
+    pattern: "结尾是未完成的思考，不是结论。'可能'留有余地。没有升华。"
+
+  - text: |
+      我苦哈哈的在电脑前，写这篇文章，想着我的女儿。
+      差距是真实的。
+      机会也是真实的。
+      时钟在走，窗口在收窄。
+    pattern: "回到写作现场。重复句式（'是真实的'）制造执念感。来自得分 33.0 的文章。"
+
+  - text: |
+      不要在那个愣住的感觉里待太久。
+      那个感觉，待久了，就成了借口。
+    pattern: "回扣开头意象。两句话收束，不解释。草率感本身就是风格。"
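
Reviewer sketch (not part of the commit): the fallback in Step 4.4 picks one random seed per segment type from this file. One way that could look, assuming the YAML has already been parsed into a dict with the four top-level keys above; `build_seed_block` and `LABELS` are hypothetical names, and the seed texts are shortened placeholders rather than the real passages.

```python
import random

# Toy stand-in for the parsed exemplar-seeds.yaml; the real file holds
# full passages (plus a "pattern" note) under the same four keys.
seeds = {
    "opening_hooks": [{"text": "opening A"}, {"text": "opening B"}],
    "emotional_peaks": [{"text": "peak A"}],
    "transitions": [{"text": "transition A"}, {"text": "transition B"}],
    "closings": [{"text": "closing A"}],
}

# Segment key -> label used in the fallback prompt template.
LABELS = [
    ("opening_hooks", "开头模式"),
    ("emotional_peaks", "情绪段模式"),
    ("transitions", "转折模式"),
    ("closings", "收尾模式"),
]


def build_seed_block(seeds, rng=random):
    """Pick one random seed per segment type and format it as a quoted block."""
    lines = []
    for key, label in LABELS:
        entries = seeds.get(key) or []
        if not entries:
            continue  # a segment type with no seeds is skipped
        text = rng.choice(entries)["text"].strip()
        lines.append(f"> 【{label}】{text}")
    return "\n>\n".join(lines)


# A seeded Random instance keeps the pick reproducible in tests.
block = build_seed_block(seeds, random.Random(0))
print(block)
```

Passing the random source as a parameter is a design choice of this sketch only; it keeps the pick testable without changing the "random 1 per type" behaviour the SKILL.md text asks for.
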
--- a/dist/openclaw/references/exemplars/.gitkeep
+++ b/dist/openclaw/references/exemplars/.gitkeep
--- a/dist/openclaw/references/writing-guide.md
+++ b/dist/openclaw/references/writing-guide.md
@ -101,6 +101,18 @@ AI 段落长度趋于均匀。人类段落忽长忽短。
 - 每段末尾都用反问句（变成另一种模式化）
 - 口语词匀速分布（不要每 200 字准时出现一个"讲真"）
 - 总结性收尾（"让我们拭目以待"/"未来可期"）
+- 连续文章使用相同收尾结构（收尾方式应由文章内容决定，不是由人格模板决定）
+
+**收尾多样性**：persona 的 `closing_tendency` 是倾向而非硬规则。根据文章走到结尾时的内容和情绪自行判断最自然的收尾方式。以下是常见的人类收尾模式，每篇文章选最贴合内容的一种：
+
+| 模式 | 特征 | 适合场景 |
+|------|------|---------|
+| 自然断流 | 像聊天说到一半停了（"我先睡了"/"就这样吧"） | 深夜风格、随笔 |
+| 未答之问 | 以问题结尾，不给答案 | 争议话题、引发思考 |
+| 场景回扣 | 回到开头的意象/场景 | 叙事类、故事驱动 |
+| 硬切 | 最后一个论点说完直接结束，无收束语 | 评论、观点类 |
+| 反结论 | 明确拒绝给结论（"我也不知道"/"答案可能不存在"） | 复杂议题、探索性 |
+| 画面定格 | 用一个视觉画面收束 | 情感类、人物类 |

 **writing-config 参数**：`emotional_arc`（flat/gradual/restrained_to_burst/volatile）

--- a/dist/openclaw/scripts/extract_exemplar.py
+++ b/dist/openclaw/scripts/extract_exemplar.py
@ -0,0 +1,374 @@
+#!/usr/bin/env python3
+"""
+Extract style exemplars from human-written articles for SICO-style few-shot injection.
+
+Takes a markdown article, analyzes it for style fingerprints, extracts key
+segments (opening hook, emotional peak, transition/self-correction, closing),
+and saves structured exemplar files to references/exemplars/.
+
+Usage:
+    python3 scripts/extract_exemplar.py article.md
+    python3 scripts/extract_exemplar.py article.md --category tech-opinion --source "公众号名"
+    python3 scripts/extract_exemplar.py article1.md article2.md article3.md  # batch
+    python3 scripts/extract_exemplar.py --list                                # list all exemplars
+"""
+
+import argparse
+import json
+import re
+import sys
+from datetime import datetime
+from pathlib import Path
+
+import yaml
+
+# Reuse analysis functions from humanness_score
+sys.path.insert(0, str(Path(__file__).parent))
+import humanness_score as hs
+
+SKILL_DIR = Path(__file__).parent.parent
+EXEMPLARS_DIR = SKILL_DIR / "references" / "exemplars"
+INDEX_FILE = EXEMPLARS_DIR / "index.yaml"
+
+CATEGORIES = ["tech-opinion", "story-emotional", "list-practical", "hot-take", "general"]
+
+# Category detection markers
+STORY_MARKERS = [
+    "我", "我们", "那天", "那年", "记得", "后来", "当时",
+    "第一次", "最后", "突然", "终于",
+]
+
+
+# ============================================================
+# Segment Extraction
+# ============================================================
+
+def extract_headings(text):
+    """Extract H2 headings from markdown."""
+    return re.findall(r'^##\s+(.+)$', text, re.MULTILINE)
+
+
+def extract_title(text):
+    """Extract H1 title from markdown."""
+    m = re.search(r'^#\s+(.+)$', text, re.MULTILINE)
+    return m.group(1).strip() if m else ""
+
+
+def extract_opening(paragraphs, max_chars=250):
+    """Extract opening hook — first non-empty paragraph(s) up to max_chars."""
+    result = []
+    total = 0
+    for p in paragraphs:
+        if total + len(p) > max_chars and result:
+            break
+        result.append(p)
+        total += len(p)
+    return "\n\n".join(result)
+
+
+def extract_emotional_peak(paragraphs):
+    """Find paragraph with highest negative emotion density."""
+    best_para, best_density = "", -1.0
+    for p in paragraphs:
+        if len(p) < 20:
+            continue
+        count = sum(1 for m in hs.NEGATIVE_MARKERS if m in p)
+        density = count / len(p) * 100
+        if density > best_density:
+            best_density = density
+            best_para = p
+    return best_para if best_density > 0 else ""
+
+
+def extract_transition(paragraphs):
+    """Find paragraph with most self-correction / transition patterns."""
+    transition_words = [
+        "但是", "不过", "然而", "话说回来", "换个角度",
+        "说回来", "但话又说回来", "不对", "算了",
+    ]
+    best_para, best_count = "", 0
+    for p in paragraphs:
+        if len(p) < 20:
+            continue
+        count = sum(len(re.findall(pat, p)) for pat in hs.SELF_CORRECTION_PATTERNS)
+        count += sum(p.count(w) for w in transition_words)
+        if count > best_count:
+            best_count = count
+            best_para = p
+    return best_para if best_count > 0 else ""
+
+
+def extract_closing(paragraphs, max_chars=250):
+    """Extract closing paragraph(s), reading backwards."""
+    result = []
+    total = 0
+    for p in reversed(paragraphs):
+        if total + len(p) > max_chars and result:
+            break
+        result.insert(0, p)
+        total += len(p)
+    return "\n\n".join(result)
+
+
+# ============================================================
+# Category Detection
+# ============================================================
+
+def detect_category(text, paragraphs, headings):
+    """Auto-detect article category from content features."""
+    data_count = sum(len(re.findall(p, text)) for p in hs.REAL_SOURCE_PATTERNS)
+    story_count = sum(text.count(m) for m in STORY_MARKERS)
+    h2_count = len(headings)
+    neg_count = sum(1 for m in hs.NEGATIVE_MARKERS if m in text)
+
+    scores = {
+        "tech-opinion": data_count * 2,
+        "story-emotional": story_count * 1.5,
+        "list-practical": h2_count * 3 if h2_count >= 5 else 0,
+        "hot-take": neg_count * 2 + data_count if len(text) < 2000 else 0,
+        "general": 5,
+    }
+    return max(scores, key=scores.get)
+
+
+# ============================================================
+# Statistical Fingerprint
+# ============================================================
+
+def compute_vocab_temperature(text):
+    """Compute vocabulary temperature band distribution."""
+    counts = {
+        "cold": sum(text.count(w) for w in hs.COLD_WORDS),
+        "warm": sum(text.count(w) for w in hs.WARM_WORDS),
+        "hot": sum(text.count(w) for w in hs.HOT_WORDS),
+        "wild": sum(text.count(w) for w in hs.WILD_WORDS),
+    }
+    total = sum(counts.values())
+    if total == 0:
+        return {k: 0.25 for k in counts}
+    return {k: round(v / total, 2) for k, v in counts.items()}
+
+
+def compute_paragraph_cv(paragraphs):
+    """Coefficient of variation for paragraph lengths."""
+    if len(paragraphs) < 3:
+        return 0.0
+    lengths = [len(p) for p in paragraphs]
+    mean = sum(lengths) / len(lengths)
+    if mean == 0:
+        return 0.0
+    variance = sum((l - mean) ** 2 for l in lengths) / len(lengths)
+    return round((variance ** 0.5) / mean, 2)
+
+
+def count_short_paragraphs(text):
+    """Count single-sentence short paragraphs (1-10 chars, non-heading)."""
+    return sum(1 for l in text.split('\n')
+               if l.strip() and 1 <= len(l.strip()) <= 10
+               and not l.strip().startswith('#'))
+
+
+# ============================================================
+# Main Extraction
+# ============================================================
+
+def extract_exemplar(text, category=None, source=None):
+    """Analyze article and return structured exemplar dict."""
+    clean = re.sub(r'^#+\s+.*$', '', text, flags=re.MULTILINE).strip()
+    paragraphs = hs._split_paragraphs(text)
+    sentences = hs._split_sentences(clean)
+    headings = extract_headings(text)
+    title = extract_title(text) or source or ""
+
+    if not category:
+        category = detect_category(clean, paragraphs, headings)
+
+    score_result = hs.score_article(text)
+
+    # Sentence length stats
+    lengths = [len(s) for s in sentences]
+    if len(lengths) >= 2:
+        mean = sum(lengths) / len(lengths)
+        variance = sum((l - mean) ** 2 for l in lengths) / len(lengths)
+        sentence_stddev = round(variance ** 0.5, 1)
+    else:
+        sentence_stddev = 0.0
+
+    neg_count = sum(1 for s in sentences if any(m in s for m in hs.NEGATIVE_MARKERS))
+    negative_ratio = round(neg_count / len(sentences), 2) if sentences else 0.0
+
+    return {
+        "title": title,
+        "source": source or title,
+        "category": category,
+        "humanness_score": score_result["composite_score"],
+        "fingerprint": {
+            "sentence_stddev": sentence_stddev,
+            "vocab_temperature": compute_vocab_temperature(clean),
+            "negative_ratio": negative_ratio,
+            "paragraph_cv": compute_paragraph_cv(paragraphs),
+            "short_paragraphs": count_short_paragraphs(text),
+        },
+        "segments": {
+            "opening": extract_opening(paragraphs),
+            "emotional_peak": extract_emotional_peak(paragraphs),
+            "transition": extract_transition(paragraphs),
+            "closing": extract_closing(paragraphs),
+        },
+        "extracted_at": datetime.now().strftime("%Y-%m-%d"),
+        "char_count": len(clean),
+    }
+
+
+# ============================================================
+# Persistence
+# ============================================================
+
+def save_exemplar(exemplar):
+    """Save exemplar to markdown file and update index.yaml. Returns filepath."""
+    EXEMPLARS_DIR.mkdir(parents=True, exist_ok=True)
+
+    category = exemplar["category"]
+    num = 1
+    while (EXEMPLARS_DIR / f"{category}-{num:03d}.md").exists():
+        num += 1
+    filename = f"{category}-{num:03d}.md"
+    filepath = EXEMPLARS_DIR / filename
+
+    fp = exemplar["fingerprint"]
+    seg = exemplar["segments"]
+
+    frontmatter = {
+        "source": exemplar["source"],
+        "category": category,
+        "humanness_score": exemplar["humanness_score"],
+        "sentence_stddev": fp["sentence_stddev"],
+        "vocab_temperature": fp["vocab_temperature"],
+        "negative_ratio": fp["negative_ratio"],
+        "paragraph_cv": fp["paragraph_cv"],
+        "short_paragraphs": fp["short_paragraphs"],
+        "extracted_at": exemplar["extracted_at"],
+    }
+
+    content = "---\n"
+    content += yaml.dump(frontmatter, allow_unicode=True, default_flow_style=False)
+    content += "---\n\n"
+
+    section_map = [
+        ("opening", "开头钩子"),
+        ("emotional_peak", "情绪高峰"),
+        ("transition", "转折/自纠"),
+        ("closing", "收尾"),
+    ]
+    for key, label in section_map:
+        if seg.get(key):
+            content += f"## {label}\n\n{seg[key]}\n\n"
+
+    filepath.write_text(content, encoding="utf-8")
+    _update_index(filename, exemplar)
+    return filepath
+
+
+def _update_index(filename, exemplar):
+    """Add or update entry in index.yaml."""
+    index = []
+    if INDEX_FILE.exists():
+        with open(INDEX_FILE, "r", encoding="utf-8") as f:
+            index = yaml.safe_load(f) or []
+
+    entry = {
+        "file": filename,
+        "source": exemplar["source"],
+        "category": exemplar["category"],
+        "humanness_score": exemplar["humanness_score"],
+        "extracted_at": exemplar["extracted_at"],
+    }
+    index = [e for e in index if e.get("file") != filename]
+    index.append(entry)
+    index.sort(key=lambda x: (x["category"], x["humanness_score"]))
+
+    with open(INDEX_FILE, "w", encoding="utf-8") as f:
+        yaml.dump(index, f, allow_unicode=True, default_flow_style=False)
+
+
+# ============================================================
+# List / CLI
+# ============================================================
+
+def list_exemplars():
+    """Print all exemplars in the library."""
+    if not INDEX_FILE.exists():
+        print("范文库为空。用法: python3 scripts/extract_exemplar.py article.md")
+        return
+
+    with open(INDEX_FILE, "r", encoding="utf-8") as f:
+        index = yaml.safe_load(f) or []
+
+    if not index:
+        print("范文库为空。")
+        return
+
+    print(f"\n{'=' * 60}")
+    print(f"范文库 ({len(index)} 篇)")
+    print(f"{'=' * 60}")
+
+    by_cat = {}
+    for e in index:
+        by_cat.setdefault(e["category"], []).append(e)
+
+    for cat, entries in sorted(by_cat.items()):
+        print(f"\n  [{cat}] ({len(entries)} 篇)")
+        for e in entries:
+            score = e["humanness_score"]
+            bar = "█" * int((100 - score) / 10) + "░" * (10 - int((100 - score) / 10))
+            print(f"    {bar} {score:5.1f}  {e['source'][:40]}")
+
+
+def main():
+    parser = argparse.ArgumentParser(description="Extract style exemplars from articles")
+    parser.add_argument("inputs", nargs="*", help="Markdown article file(s)")
+    parser.add_argument("--category", "-c", choices=CATEGORIES,
+                        help="Article category (auto-detected if omitted)")
+    parser.add_argument("--source", "-s", help="Source name (e.g. account name)")
+    parser.add_argument("--list", "-l", action="store_true", help="List all exemplars")
+    parser.add_argument("--json", action="store_true", help="JSON output")
+    args = parser.parse_args()
+
+    if args.list:
+        list_exemplars()
+        return
+
+    if not args.inputs:
+        parser.print_help()
+        sys.exit(1)
+
+    for input_path in args.inputs:
+        path = Path(input_path)
+        if not path.exists():
+            print(f"Error: {input_path} not found", file=sys.stderr)
+            continue
+
+        text = path.read_text(encoding="utf-8")
+        source = args.source or path.stem  # fallback to filename without extension
+        exemplar = extract_exemplar(text, category=args.category, source=source)
+        filepath = save_exemplar(exemplar)
+
+        if args.json:
+            print(json.dumps(exemplar, ensure_ascii=False, indent=2))
+        else:
+            print(f"✓ {path.name}")
+            print(f"  Category:  {exemplar['category']}")
+            print(f"  Score:     {exemplar['humanness_score']:.1f}/100")
+            print(f"  Segments:  {sum(1 for v in exemplar['segments'].values() if v)}/4")
+            fp = exemplar["fingerprint"]
+            print(f"  Stddev:    {fp['sentence_stddev']}")
+            print(f"  Neg ratio: {fp['negative_ratio']:.0%}")
+            print(f"  Para CV:   {fp['paragraph_cv']}")
+            temp = fp["vocab_temperature"]
+            print(f"  Temp:      cold={temp['cold']} warm={temp['warm']} hot={temp['hot']} wild={temp['wild']}")
+            print(f"  Saved to:  {filepath}")
+            print()
+
+
+if __name__ == "__main__":
+    main()
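
Reviewer sketch (not part of the commit): the scoring table inside `detect_category` has two threshold effects that are easy to miss, so here it is isolated with plain counts as inputs. `category_scores` and `pick_category` are hypothetical helper names; the formulas are copied from the diff above, and the example counts are invented.

```python
def category_scores(data_count, story_count, h2_count, neg_count, text_len):
    """Same formulas as detect_category, taking precomputed counts."""
    return {
        "tech-opinion": data_count * 2,
        "story-emotional": story_count * 1.5,
        # Only articles with at least 5 H2 headings can score as lists.
        "list-practical": h2_count * 3 if h2_count >= 5 else 0,
        # Only texts shorter than 2000 characters can score as hot takes.
        "hot-take": neg_count * 2 + data_count if text_len < 2000 else 0,
        # Constant floor: wins whenever every other score is below 5.
        "general": 5,
    }


def pick_category(scores):
    # max() returns the first key with the highest score, so ties are
    # resolved by dict insertion order.
    return max(scores, key=scores.get)


first_person = pick_category(category_scores(2, 10, 3, 1, 3000))
sparse = pick_category(category_scores(1, 1, 2, 0, 3000))
listicle = pick_category(category_scores(1, 2, 6, 0, 3000))
print(first_person, sparse, listicle)
```

Because `STORY_MARKERS` includes "我" and every occurrence is counted, first-person articles accumulate `story-emotional` score quickly. Passing `--category` explicitly may be the safer route when the automatic guess looks off.
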
--- a/dist/openclaw/scripts/humanness_score.py
+++ b/dist/openclaw/scripts/humanness_score.py
@ -55,10 +55,23 @@ REAL_SOURCE_PATTERNS = [
 ]

 NEGATIVE_MARKERS = [
+    # 直接负面情绪
    "失望", "糟糕", "扯", "坑", "烂", "差劲", "崩溃", "吐槽", "骂",
    "怒", "烦", "焦虑", "担忧", "不满", "恶心", "可怕", "可悲", "可笑",
    "离谱", "尴尬", "无语", "蠢", "惨", "亏", "危",
+    # 绝望/迷茫
+    "绝望", "迷茫", "心累", "丧", "后悔", "后怕", "心寒",
+    # 欺骗/操控（隐性负面）
+    "骗", "忽悠", "割韭菜", "套路", "画大饼", "洗脑",
+    # 失败/徒劳
+    "白费", "白搭", "没戏", "黄了", "凉了", "废了",
+    # 自嘲/自贬
+    "傻", "天真", "吃亏", "自嗨", "打脸",
+    # 讽刺/反语
+    "呵呵", "好吧", "行吧", "真服了",
+    # 短语
    "太扯了", "说实话我很失望", "搞什么", "不靠谱", "受不了",
+    "受够了", "想哭", "伤心", "苦哈哈", "得过且过",
 ]

 COMMON_ADVERBS = [
@ -69,10 +82,27 @@ COMMON_ADVERBS = [
    "竟然", "简直", "几乎", "完全", "绝对", "必然",
 ]

-COLD_WORDS = ["边际", "认知负荷", "信息不对称", "路径依赖", "商业模式", "生态系统", "增量"]
-WARM_WORDS = ["说白了", "其实吧", "讲真", "说实话", "坦白讲", "懂的都懂", "怎么说呢"]
-HOT_WORDS = ["DNA动了", "格局打开", "遥遥<EFBFBD><EFBFBD>先", "卷", "内卷", "炸了", "杀疯了", "吃灰"]
-WILD_WORDS = ["整挺好", "不靠谱", "瞎折腾", "搁这儿", "糊弄", "扯", "嗯"]
+COLD_WORDS = [
+    "边际", "认知负荷", "信息不对称", "路径依赖", "商业模式", "生态系统", "增量",
+    "技术栈", "标准化", "结构性", "规模化", "护城河", "飞轮", "闭环",
+    "赛道", "壁垒", "方法论", "底层逻辑", "第一性原理", "杠杆", "复利",
+    "ROI", "PMF", "代运营", "供给侧", "需求侧",
+]
+WARM_WORDS = [
+    "说白了", "其实吧", "讲真", "说实话", "坦白讲", "懂的都懂", "怎么说呢",
+    "老实说", "这么说吧", "你想啊", "别急", "慢慢来",
+    "有意思的是", "好玩的是", "巧的是", "说来话长", "话说回来",
+]
+HOT_WORDS = [
+    "DNA动了", "格局打开", "遥遥领先", "卷", "内卷", "炸了", "杀疯了", "吃灰",
+    "凡尔赛", "标题党", "躺平", "摆烂", "破防", "上头", "内耗",
+    "蒸发", "出圈", "降维打击", "弯道超车",
+]
+WILD_WORDS = [
+    "整挺好", "不靠谱", "瞎折腾", "搁这儿", "糊弄", "扯", "嗯",
+    "苦哈哈", "傻乎乎", "稀里糊涂", "得了吧", "算了吧",
+    "摔了跤", "交学费", "踩坑", "翻车", "栽了",
+]

 SELF_CORRECTION_PATTERNS = [
    r'不对[，,]', r'准确说', r'算了', r'说错了',
@ -314,6 +344,81 @@ def run_tier(checks, text):
    return results


+# ============================================================
+# Calibration (bell-curve + over-optimization penalty)
+# ============================================================
+
+# Human article baselines (from 15 example articles, 2026-03-30)
+# Dimensions where AI over-optimizes: bell-curve scoring penalizes
+# both "too low" AND "too high" relative to human average.
+_BELL_CURVE_CHECKS = {
+    "broken_sentences": 0.39,
+    "self_correction": 0.20,
+    "sentence_length_range": 0.71,
+    "paragraph_length_variance": 0.52,
+    "banned_words": 0.73,
+}
+
+
+def _bell_curve(raw_score, center):
+    """Score peaks at center (human avg), penalizes over-optimization.
+
+    Below center: linear rise (as before).
+    Above center: quadratic penalty — too much is suspicious.
+    """
+    if center <= 0:
+        return raw_score
+    if raw_score <= center:
+        return raw_score / center
+    else:
+        overshoot = (raw_score - center) / (1.0 - center) if center < 1 else 0
+        return max(0.0, 1.0 - overshoot * overshoot)
+
+
+def calibrate_tiers(tier1, tier2):
+    """Apply bell-curve calibration and over-optimization penalty in-place."""
+    # 1. Bell-curve adjustment for over-optimizable dimensions
+    for tier in [tier1, tier2]:
+        for name, data in tier.items():
+            if name.startswith("_"):
+                continue
+            if name in _BELL_CURVE_CHECKS:
+                raw = data["score"]
+                center = _BELL_CURVE_CHECKS[name]
+                calibrated = round(max(0.0, min(1.0, _bell_curve(raw, center))), 4)
+                data["raw_score"] = raw
+                data["score"] = calibrated
+                data["detail"] += f" [calibrated from {raw:.2f}, center={center}]"
+
+    # 2. Over-optimization penalty: if 60%+ of checks score > 0.8,
+    #    the article is suspiciously "perfect" — apply global penalty.
+    all_scores = []
+    for tier in [tier1, tier2]:
+        for name, data in tier.items():
+            if not name.startswith("_"):
+                all_scores.append(data["score"])
+
+    high_count = sum(1 for s in all_scores if s > 0.8)
+    over_opt_ratio = high_count / len(all_scores) if all_scores else 0
+    penalty = 1.0
+    if over_opt_ratio >= 0.6:
+        penalty = 0.85  # 15% penalty for suspiciously perfect articles
+
+    if penalty < 1.0:
+        for tier in [tier1, tier2]:
+            for name, data in tier.items():
+                if not name.startswith("_"):
+                    data["score"] = round(data["score"] * penalty, 4)
+
+    # 3. Recalculate tier summaries
+    for tier in [tier1, tier2]:
+        scores = [data["score"] for name, data in tier.items() if not name.startswith("_")]
+        tier["_summary"]["mean_score"] = round(sum(scores) / len(scores), 4) if scores else 0
+        tier["_summary"]["scores"] = [round(s, 4) for s in scores]
+
+    return penalty
+
+
 # ============================================================
 # Composite Score
 # ============================================================
@ -364,6 +469,7 @@ def score_article(text, verbose=False, tier3_score=None):

    tier1 = run_tier(TIER1_CHECKS, clean)
    tier2 = run_tier(TIER2_CHECKS, clean)
+    over_opt_penalty = calibrate_tiers(tier1, tier2)
    composite, weights = compute_composite(tier1, tier2, tier3_score)
    param_scores = build_param_scores(tier1, tier2)

@ -377,6 +483,7 @@ def score_article(text, verbose=False, tier3_score=None):
        },
        "weights": weights,
        "param_scores": param_scores,
+        "over_optimization_penalty": over_opt_penalty,
        "char_count": len(clean),
    }
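
Reviewer sketch (not part of the commit): a worked example of the calibration added above. `_bell_curve` is copied from the diff; `over_opt_penalty` is a hypothetical helper that restates step 2 of `calibrate_tiers` on a flat list of scores. The sample scores are invented; 0.39 is the `broken_sentences` center from `_BELL_CURVE_CHECKS`.

```python
def _bell_curve(raw_score, center):
    """Copied from the diff: peak at center, quadratic falloff above it."""
    if center <= 0:
        return raw_score
    if raw_score <= center:
        return raw_score / center
    overshoot = (raw_score - center) / (1.0 - center) if center < 1 else 0
    return max(0.0, 1.0 - overshoot * overshoot)


def over_opt_penalty(scores, threshold=0.8, ratio=0.6, penalty=0.85):
    """Hypothetical restatement of the global penalty rule in calibrate_tiers."""
    if not scores:
        return 1.0
    high = sum(1 for s in scores if s > threshold)
    return penalty if high / len(scores) >= ratio else 1.0


center = 0.39  # broken_sentences baseline from _BELL_CURVE_CHECKS
half = _bell_curve(0.195, center)    # halfway up the linear rise
peak = _bell_curve(0.39, center)     # exactly at the human average
over = _bell_curve(0.695, center)    # halfway between center and 1.0
maxed = _bell_curve(1.0, center)     # raw score saturated

print(round(half, 2), round(peak, 2), round(over, 2), round(maxed, 2))
print(over_opt_penalty([0.9, 0.9, 0.9, 0.5, 0.4]))
print(over_opt_penalty([0.9, 0.9, 0.5, 0.5, 0.4]))
```

A raw score halfway between the center and 1.0 loses a quarter of its value, and a saturated raw score drops to zero. The global penalty is a step function: it applies at exactly 60% of checks above 0.8 and not below that.
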

--- a/dist/openclaw/scripts/learn_edits.py
+++ b/dist/openclaw/scripts/learn_edits.py
@ -325,6 +325,20 @@ def main():
    lesson_file = save_lesson(diff_result, args.draft, args.final)
    print(f"\nLesson saved to: {lesson_file}")

+    # Auto-grow exemplar library from edited finals
+    final_title = extract_title(final)
+    try:
+        import extract_exemplar
+        exemplar = extract_exemplar.extract_exemplar(final, source=final_title or "user-edited")
+        if exemplar["humanness_score"] <= 50:
+            exemplar_path = extract_exemplar.save_exemplar(exemplar)
+            print(f"\n✓ 终稿已加入范文库: {exemplar_path}")
+            print(f"  Score: {exemplar['humanness_score']:.1f}/100, Category: {exemplar['category']}")
+        else:
+            print(f"\n⚠ 终稿 humanness_score={exemplar['humanness_score']:.1f} > 50，未加入范文库")
+    except Exception as e:
+        print(f"\n⚠ 范文提取跳过: {e}")
+
    lesson_count = len(load_all_lessons())
    print(f"Total lessons: {lesson_count}")