Commit graph

26 commits

Author SHA1 Message Date
wangzhuc
8077635f25 feat: add Camoufox anti-detection browser and fix visibility:hidden bug
Add Camoufox as Level 2 fetcher to bypass WeChat bot verification.
Fix #js_content visibility:hidden style causing empty markdown output.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 14:22:32 +00:00
wangzhuc
25d6a44082 feat: add article content extraction with anti-scraping fallback
- New `scripts/fetch_article.py`: extract WeChat article content as Markdown
  with three-level fetch strategy (requests → Playwright → manual HTML)
- Refactor `learn_theme.py` to reuse `fetch_article.fetch_html()`, removing
  duplicate fetch logic
- Update SKILL.md: add "学习这篇文章/导入范文" ("learn this article / import exemplar") auxiliary function
- Update README.md: add article extraction to feature table and directory tree

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 16:34:13 +00:00
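The three-level fetch strategy named in 25d6a44082 (requests → Playwright → manual HTML) reads as a fallback chain. A minimal sketch, assuming each level is a callable that returns HTML or raises; `fetch_with_fallback` and the stub fetchers are hypothetical names for illustration, not the actual `fetch_article.py` API:

```python
from typing import Callable, Iterable, Tuple

Fetcher = Callable[[str], str]


def fetch_with_fallback(url: str, fetchers: Iterable[Tuple[str, Fetcher]]) -> Tuple[str, str]:
    """Try each fetcher in order; return (level_name, html) from the first success."""
    errors = []
    for name, fetch in fetchers:
        try:
            html = fetch(url)
        except Exception as exc:  # a failed level falls through to the next one
            errors.append(f"{name}: {exc}")
            continue
        if html and html.strip():
            return name, html
        errors.append(f"{name}: empty response")
    raise RuntimeError("all fetch levels failed: " + "; ".join(errors))


# Stub fetchers standing in for the real levels (hypothetical).
def via_requests(url: str) -> str:
    raise ConnectionError("blocked by bot verification")


def via_browser(url: str) -> str:
    return "<div id='js_content'>article body</div>"


level, html = fetch_with_fallback(
    "https://example.com/article",
    [("requests", via_requests), ("browser", via_browser)],
)
```

Keeping the levels as an ordered list means a further level (such as the Camoufox fetcher added in 8077635f25) could be slotted in without touching the loop.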
wangzhuc
d6900fe85d feat(learn-theme): add CLI with argparse and terminal report
Replace the smoke-test main() with a proper argparse CLI that accepts
a URL and --name, validates the name, fetches + extracts + analyzes the
article, calls generate_theme_yaml(), and writes the YAML to
toolkit/themes/. Prints a human-readable theme report with color values
and typography. Adds `learn-theme` subcommand to toolkit/cli.py
(delegates to subprocess call of scripts/learn_theme.py).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 12:37:47 +08:00
wangzhuc
1cd9b4409f feat(learn-theme): add theme YAML generation from analyzed styles
Add `generate_theme_yaml()` that builds a complete theme YAML by loading
the professional-clean template CSS, substituting extracted colors and
typography, and deriving a dark-mode palette via `derive_darkmode()`.
Adds `import yaml`, `import argparse`, `from pathlib import Path`, and
module-level constants `TEMPLATE_THEME` / `THEMES_DIR`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 12:37:47 +08:00
wangzhuc
77e76077d8 fix(learn-theme): HTTP error handling, DRY title extraction, text_light fix
- fetch_article: catch RequestException, add raise_for_status()
- Extract _attach_title() shared by fetch_article and _load_from_file
- text_light: only search foreground colors, not background values

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 12:37:47 +08:00
wangzhuc
95ba69fd5a feat: learn_theme — add analyze_styles, DEFAULTS, most_common_value, derive_darkmode integration
Excludes dominant text color from accent candidates; blockquote-first
quote_bg heuristic avoids picking up decorative divider colors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 12:37:47 +08:00
wangzhuc
e457b4463b feat: learn_theme — add HTML fetch/extract layer (fetch_article, extract_styles, parse_inline_style)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 12:37:47 +08:00
wangzhuc
1168768618 feat: add learn_theme.py — color utilities (rgb_to_hex, lightness, is_gray, adjust_lightness, derive_darkmode)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 12:37:47 +08:00
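The color utilities listed in 1168768618 could look roughly like the following. This is a sketch only: the function names come from the commit title, but the signatures, the HSL-based definition of lightness, and the gray `tolerance` are assumptions, and `hex_to_rgb` is a hypothetical helper. `derive_darkmode` is not modeled here.

```python
import colorsys


def rgb_to_hex(r: int, g: int, b: int) -> str:
    """Format 0-255 channel values as a lowercase #rrggbb string."""
    return f"#{r:02x}{g:02x}{b:02x}"


def hex_to_rgb(value: str) -> tuple:
    """Hypothetical helper: parse #rrggbb into a (r, g, b) tuple of ints."""
    value = value.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def lightness(hex_color: str) -> float:
    """HSL lightness in [0, 1] (assumed definition)."""
    r, g, b = (c / 255 for c in hex_to_rgb(hex_color))
    return colorsys.rgb_to_hls(r, g, b)[1]


def is_gray(hex_color: str, tolerance: int = 10) -> bool:
    """Treat a color as gray when its channels differ by at most `tolerance`."""
    r, g, b = hex_to_rgb(hex_color)
    return max(r, g, b) - min(r, g, b) <= tolerance


def adjust_lightness(hex_color: str, new_lightness: float) -> str:
    """Keep hue and saturation, replace the HSL lightness (clamped to [0, 1])."""
    r, g, b = (c / 255 for c in hex_to_rgb(hex_color))
    h, _, s = colorsys.rgb_to_hls(r, g, b)
    clamped = max(0.0, min(1.0, new_lightness))
    r2, g2, b2 = colorsys.hls_to_rgb(h, clamped, s)
    return rgb_to_hex(round(r2 * 255), round(g2 * 255), round(b2 * 255))
```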
wangzhuc
73a67fffc7 fix: improve local file matching for WeChat draft sync
- learn_edits.py: prioritize output_file field from history.yaml,
  fall back to title slug matching, then largest file
- SKILL.md: add output_file field to history.yaml schema
- Fixes wrong file match when multiple articles share the same date

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 16:20:17 +08:00
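The matching order described in 73a67fffc7 (recorded `output_file`, then title slug, then largest file) can be sketched as below. The function names, the shape of the history entry, and the slug rule are assumptions for illustration, not the actual `learn_edits.py` code.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional


def slugify(title: str) -> str:
    """Hypothetical slug rule: lowercase, non-word runs become hyphens."""
    return re.sub(r"[^\w]+", "-", title.lower()).strip("-")


def match_local_file(entry: dict, candidates: list) -> Optional[Path]:
    """Pick the local file for a history entry, in priority order."""
    # 1. Explicit output_file recorded in the history entry.
    recorded = entry.get("output_file")
    if recorded:
        for path in candidates:
            if path.name == Path(recorded).name:
                return path
    # 2. Title slug contained in the file name.
    slug = slugify(entry.get("title", ""))
    if slug:
        for path in candidates:
            if slug in path.stem.lower():
                return path
    # 3. Fall back to the largest file (None when there are no candidates).
    return max(candidates, key=lambda p: p.stat().st_size, default=None)


# Two articles sharing the same date, the case the commit says was mismatched.
with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp) / "2026-03-31-first-post.md"
    b = Path(tmp) / "2026-03-31-second-post.md"
    a.write_text("short")
    b.write_text("a much longer article body")
    by_field = match_local_file({"output_file": "2026-03-31-first-post.md"}, [a, b])
    by_slug = match_local_file({"title": "Second Post"}, [a, b])
    by_size = match_local_file({"title": "Unrelated"}, [a, b])
    picked = (by_field.name, by_slug.name, by_size.name)
```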
wangzhuc
2773a8bb9b feat: sync edits from WeChat draft box for learn-edits
- publisher.py: add get_draft() to fetch draft content by media_id,
  add html_to_plaintext() for HTML→text conversion
- learn_edits.py: add --from-wechat flag that auto-fetches latest draft
  from WeChat, converts both sides to plaintext, and diffs
- learn_edits.py: add markdown_to_plaintext() for local file conversion
- SKILL.md: update edit workflow — both local and WeChat edits supported

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 16:20:17 +08:00
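For the HTML→text conversion mentioned in 2773a8bb9b, one possible approach uses only the standard library's `html.parser`. The function name `html_to_plaintext` is from the commit; the class `_TextExtractor`, the set of block tags, and the whitespace handling are assumptions, and the real `publisher.py` may work differently.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, inserting line breaks around block-level tags."""

    BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "section"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_plaintext(html: str) -> str:
    """Return the text content with one non-empty line per block."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)
```

Reducing both the local Markdown and the WeChat draft to plain text before diffing keeps formatting differences out of the comparison, which is the motivation the commit gives for converting both sides.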
wangzhuc
02f5e6d93b fix: calibrate humanness_score with bell-curve and over-optimization penalty
Problem: AI articles scored MORE human (avg 26.2) than actual human
articles (avg 44.0) — opposite of 朱雀's judgment. AI was gaming the
linear scoring by over-optimizing broken sentences, self-correction,
paragraph variance, etc.

Fix: Two calibration layers added after raw scoring:

1. Bell-curve scoring for 5 over-optimizable dimensions (broken_sentences,
   self_correction, sentence_length_range, paragraph_length_variance,
   banned_words). Score peaks at human article average, penalizes both
   too-low AND too-high values.

2. Over-optimization penalty: 15% global penalty when 60%+ of checks
   score above 0.8, indicating suspiciously "perfect" articles.

Results:
  Before: Human avg=44.0, AI avg=26.2 (WRONG direction)
  After:  Human avg=42.5, AI avg=44.0 (CORRECT direction)
  A/B test now agrees with 朱雀 (exemplar version scores better)

Baselines derived from 15 human articles tested on 2026-03-30.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 00:09:14 +08:00
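The two calibration layers in 02f5e6d93b can be illustrated as follows. This is a sketch under stated assumptions: the Gaussian shape and the `spread` parameter are one way to get a curve that "peaks at human article average", and the penalty is applied here to the mean of per-check scores in [0, 1], where higher means more human-like. How the result maps onto the final 0-100 composite is not modeled.

```python
import math


def bell_curve_score(value: float, human_mean: float, spread: float) -> float:
    """Score in (0, 1] that peaks when `value` equals the human baseline.

    Values far below AND far above the baseline both score low.
    """
    return math.exp(-((value - human_mean) ** 2) / (2 * spread ** 2))


def penalized_quality(check_scores: list) -> float:
    """Mean check score, cut by 15% when 60% or more of checks exceed 0.8."""
    if not check_scores:
        return 0.0
    mean = sum(check_scores) / len(check_scores)
    high = sum(1 for s in check_scores if s > 0.8)
    # Integer comparison avoids float rounding at the exact 60% boundary.
    if high * 100 >= 60 * len(check_scores):
        return mean * 0.85
    return mean
```

With a linear score, pushing a metric such as broken-sentence count ever higher keeps paying off; with the bell curve, overshooting the human average costs as much as undershooting it.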
wangzhuc
f7fe44c152 fix: expand negative markers and vocabulary temperature word lists
NEGATIVE_MARKERS: 26 → 51 words
  Added: despair (绝望/迷茫/心累), deception (骗/忽悠/割韭菜/套路),
  failure (白费/黄了/凉了), self-deprecation (傻/天真/自嗨),
  sarcasm (呵呵/行吧/真服了), complaint (受够了/苦哈哈)

COLD_WORDS: 7 → 25 (技术栈/标准化/护城河/飞轮/底层逻辑/PMF/ROI...)
WARM_WORDS: 7 → 15 (老实说/这么说吧/你想啊/有意思的是...)
HOT_WORDS: 8 → 19 (凡尔赛/标题党/躺平/摆烂/破防/上头/内耗...)
WILD_WORDS: 7 → 17 (苦哈哈/傻乎乎/交学费/踩坑/翻车...)

Impact on 15 exemplar articles:
  neg score avg: 0.15 → 0.27 (+80%)
  temp_mix: still low on short segments, but full articles now
  score 0.33-1.00 vs previously 0.00

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 23:23:51 +08:00
wangzhuc
83c963527c fix: use filename as fallback source when article has no H1 title
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 22:42:23 +08:00
wangzhuc
885cae8e7d feat: add SICO-style exemplar extraction system for few-shot writing
- New script: scripts/extract_exemplar.py
  Extracts style fingerprints from human-written articles (opening hook,
  emotional peak, transition/self-correction, closing) with statistical
  analysis (sentence stddev, vocab temperature, negative ratio, paragraph CV).
  Auto-detects category, supports batch import.

- SKILL.md: Add Step 4.4 exemplar injection
  Loads matching exemplars by category before writing, injects segments
  as few-shot style examples in the prompt.

- learn_edits.py: Auto-grow exemplar library
  After user edits, auto-extracts the final version into the exemplar
  library if humanness_score <= 50.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 22:32:02 +08:00
wangzhuc
344f7509f1 feat: structured edit learning with typed patterns and confidence scoring
learn_edits.py: patterns now have type/key/description/rule fields,
confidence auto-computed from occurrences + recency with 30-day decay.
--summarize --json outputs aggregated patterns sorted by confidence.

learn-edits.md: playbook.md format changed from free text to structured
YAML rules with confidence levels. Rules with confidence ≥ 5 become
hard constraints in Step 4, < 5 are soft references, < 2 get pruned.

SKILL.md Step 4: playbook priority now confidence-gated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 20:23:34 +08:00
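The confidence rule in 344f7509f1 (occurrences plus recency with a 30-day decay, gated at 5 and 2) might be computed along these lines. The commit does not state the formula, so the exponential half-life form below is an assumption, as are the function names.

```python
from datetime import date, timedelta


def confidence(occurrence_dates: list, today: date, half_life_days: float = 30.0) -> float:
    """Sum of per-occurrence weights; each weight halves every `half_life_days`."""
    return sum(0.5 ** ((today - d).days / half_life_days) for d in occurrence_dates)


def classify(conf: float) -> str:
    """Map a confidence value onto the three bands named in the commit."""
    if conf >= 5:
        return "hard"   # hard constraint in Step 4
    if conf >= 2:
        return "soft"   # soft reference
    return "prune"      # dropped from the playbook


today = date(2026, 3, 30)
recent = confidence([today] * 5, today)                        # five edits today
aging = confidence([today - timedelta(days=30)] * 4, today)    # four edits a month ago
stale = confidence([today - timedelta(days=60)], today)        # one edit two months ago
```

Under this form, a pattern the user stops repeating drifts from hard constraint to soft reference to pruned without any manual cleanup.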
wangzhuc
5fb20083af refactor: remove optimize_loop.py, replace with agent-driven optimization
optimize_loop.py was framework-only (needed external LLM API). The
optimization is now an auxiliary function in SKILL.md driven by the
already-running agent. All references updated across README, CLAUDE.md,
diagnose.py, and writing-config.example.yaml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 19:58:20 +08:00
wangzhuc
df72e51ea1 feat: rewrite humanness_score.py with continuous scoring and param mapping
- 11 checks across 2 tiers (6 statistical + 5 pattern), up from 6
- Continuous 0-1 scores instead of pass/fail booleans
- Each check maps to a writing-config parameter via param field
- New checks: negative emotion ratio, adverb density, vocabulary richness,
  sentence length range, self-correction patterns
- New --tier3 flag for agent to pass LLM structural analysis score
- param_scores in JSON output: flat param→score map for optimization
- Standalone mode redistributes weights (T1=62.5%, T2=37.5%)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 19:54:11 +08:00
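The standalone split in df72e51ea1 (T1=62.5%, T2=37.5%) is what proportional renormalization produces when a third tier is absent. The full-mode weights below (50/30/20) are hypothetical: they are chosen only because they reproduce the stated split, and the commit does not list them.

```python
def redistribute(weights: dict, available: set) -> dict:
    """Renormalize the weights of the available tiers so they sum to 1."""
    kept = {tier: w for tier, w in weights.items() if tier in available}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no available tier carries any weight")
    return {tier: w / total for tier, w in kept.items()}


# Hypothetical full-mode weights; T3 is the score passed in via --tier3.
full_mode = {"T1": 50, "T2": 30, "T3": 20}
standalone = redistribute(full_mode, {"T1", "T2"})
```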
wangzhuc
f1e9c084d9 feat: add version tracking and update mechanism
- Add VERSION file (1.2.0)
- SKILL.md Step 1: auto-check for updates on each run
- SKILL.md: add "更新" ("update") auxiliary function (git pull)
- README: install via git clone instead of cp/ln
- build_openclaw.py: include VERSION in dist

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 15:20:44 +08:00
wangzhuc
5fcceb7c72 feat: add anti-AI diagnostic command (scripts/diagnose.py)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-30 14:43:55 +08:00
wangzhuc
ca502cf6d3 fix: build_openclaw excludes __pycache__; clean up runtime leftovers in dist
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 13:35:29 +08:00
wangzhuc
e1a0d6ef47 Add OpenClaw compatibility: build script + CI + first build artifacts
- scripts/build_openclaw.py: converts SKILL.md ({skill_dir}→{baseDir}, WebSearch→web_search, removes allowed-tools)
- .github/workflows/build-openclaw.yml: automatically builds dist/openclaw/ on push to main
- dist/openclaw/: first build artifacts committed to the repo so OpenClaw users can use them directly

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 13:00:07 +08:00
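The three SKILL.md rewrites listed in e1a0d6ef47 amount to simple text substitutions. A rough sketch, assuming `allowed-tools` is a single-line front-matter key; `convert_skill_md` is a hypothetical name and the real `build_openclaw.py` may handle more cases:

```python
import re


def convert_skill_md(text: str) -> str:
    """Apply the three conversions named in the commit message."""
    # {skill_dir} → {baseDir}
    text = text.replace("{skill_dir}", "{baseDir}")
    # WebSearch → web_search (whole word only)
    text = re.sub(r"\bWebSearch\b", "web_search", text)
    # Remove the allowed-tools line (assumed to fit on one line).
    text = re.sub(r"^allowed-tools:.*\n", "", text, flags=re.MULTILINE)
    return text


source = (
    "---\n"
    "name: wewrite\n"
    "allowed-tools: WebSearch, Bash\n"
    "---\n"
    "Run {skill_dir}/scripts/diagnose.py, then use WebSearch.\n"
)
converted = convert_skill_md(source)
```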
ystyleb
039a6caa9d fix: normalize hotspot scores across platforms for fair sorting
Previously, hotspots were sorted by raw hot values directly, but different
platforms use vastly different scales (Toutiao ~10M, Weibo ~1M, Baidu ~100K),
causing Toutiao to dominate all results while Weibo and Baidu entries were
always truncated.

Now uses rank-based normalization (0-100) within each source before merging,
so cross-platform sorting gives equal weight to each platform's top stories.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 00:13:35 +08:00
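The rank-based normalization described in 039a6caa9d can be sketched like this. The item shape (`title`, `hot`) and the function names are assumptions; the linear mapping of rank onto 0-100 is one straightforward reading of "rank-based normalization (0-100) within each source".

```python
def normalize_by_rank(items: list) -> list:
    """Within one source, map raw hot values to 0-100 by rank (top item = 100)."""
    ranked = sorted(items, key=lambda it: it["hot"], reverse=True)
    n = len(ranked)
    out = []
    for rank, item in enumerate(ranked):
        score = 100.0 if n == 1 else 100.0 * (n - 1 - rank) / (n - 1)
        out.append({**item, "score": score})
    return out


def merge_sources(sources: dict) -> list:
    """Normalize each source separately, then sort the merged list by score."""
    merged = []
    for name, items in sources.items():
        for item in normalize_by_rank(items):
            merged.append({**item, "source": name})
    return sorted(merged, key=lambda it: it["score"], reverse=True)


merged = merge_sources({
    "toutiao": [{"title": "a", "hot": 10_000_000}, {"title": "b", "hot": 9_000_000}],
    "weibo": [{"title": "c", "hot": 1_000_000}, {"title": "d", "hot": 900_000}],
})
```

Sorting by raw `hot` would place both Toutiao items ahead of every Weibo item; after normalization each platform's top story ties at 100.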
wangzhuc
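8e16c70ead Add optimization loop framework: humanness_score.py + optimize_loop.py
Borrows the change→score→keep/rollback pattern from Karpathy's autoresearch:
- humanness_score.py: fixed scorer with two scoring tiers (objective checklist + subjective reader impression)
  6 objective checks: banned words / real citations / broken sentences / sentence-length variance / paragraph-length variance / vocabulary temperature
  1 subjective LLM judge (stub, requires API configuration)
  Composite score 0-100 (lower is more human-like)
- optimize_loop.py: iteration framework that modifies writing-config.yaml parameters,
  automatically generating an article → scoring → keeping or rolling back → logging to results.tsv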

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 23:18:55 +08:00
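The change→score→keep/rollback pattern from 8e16c70ead reduces to a small greedy loop. In this sketch the scorer and the mutation are toy stand-ins (hypothetical) for "generate an article and run humanness_score.py" and "tweak a writing-config.yaml parameter"; lower scores are treated as better, matching the scorer's stated direction.

```python
import random


def optimize(params: dict, score_fn, mutate_fn, iterations: int = 20) -> tuple:
    """Greedy loop: keep a change only when it lowers the score, else roll back."""
    best = dict(params)
    best_score = score_fn(best)
    log = []  # stand-in for results.tsv rows
    for step in range(iterations):
        candidate = mutate_fn(dict(best))   # change (on a copy)
        candidate_score = score_fn(candidate)  # score
        kept = candidate_score < best_score
        if kept:
            best, best_score = candidate, candidate_score  # keep
        # On rollback, `best` is simply left untouched.
        log.append((step, candidate_score, "keep" if kept else "rollback"))
    return best, best_score, log


rng = random.Random(0)  # seeded so the demo is repeatable
start = {"x": 10.0}
best, best_score, log = optimize(
    start,
    score_fn=lambda p: abs(p["x"] - 3.0),
    mutate_fn=lambda p: {**p, "x": p["x"] + rng.uniform(-1.0, 1.0)},
    iterations=50,
)
```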
wangzhuc
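dd1de0d1e9 Refactor to single-user mode: remove multi-client architecture + add Onboard/environment check + fix 10 issues
Architecture change: from a multi-client mode for managed account operations to an open-source single-user mode.
- Remove the clients/ directory; flatten style.yaml/history.yaml into the skill root
- Simplify Step 1 (no longer extracts the client name; reads style.yaml directly)
- Add Step 0 environment check (config/dependencies/API configuration; degradation flags are passed on to later steps)
- Onboard becomes a first-time setup flow (interactive Q&A + supports "just write with the defaults")
- Remove the --client parameter from 3 scripts; flatten paths
- Fix 10 workflow issues (degradation propagation, history writes, wechat-constraints references, etc.)
- Update evals to 3 scenarios for single-user mode
- Add style.example.yaml as the default template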

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 22:36:36 +08:00
wangzhuc
ec4a646359 Rename media-agent → WeWrite
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:18:38 +08:00
wangzhuc
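1ab34fa450 Initial release: full-pipeline AI Skill for WeChat Official Account articles
Hotspot fetching → topic selection → outline → writing → SEO → visual AI → layout → WeChat draft box,
with the whole pipeline triggered by a single sentence. Targets the Claude Code skill format.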

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:16:18 +08:00