chore: untrack docs/superpowers (already in .gitignore)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
ae37aa49e1
commit
f969dec5eb
2 changed files with 0 additions and 1009 deletions
|
|
@ -1,854 +0,0 @@
|
|||
# Learn Theme Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Add `learn-theme` command that extracts a reusable theme YAML from any WeChat article URL.
|
||||
|
||||
**Architecture:** New script `scripts/learn_theme.py` handles fetch → extract → analyze → generate. CLI adds a `learn-theme` subcommand that calls the script. SKILL.md gets a new trigger word. Zero changes to existing theme/converter code.
|
||||
|
||||
**Tech Stack:** Python 3.11+, requests, BeautifulSoup4, PyYAML, colorsys (stdlib)
|
||||
|
||||
**Spec:** `docs/superpowers/specs/2026-04-01-learn-theme-design.md`
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Color utility helpers
|
||||
|
||||
**Files:**
|
||||
- Create: `scripts/learn_theme.py`
|
||||
|
||||
These pure functions have no dependencies and are used by all later tasks.
|
||||
|
||||
- [ ] **Step 1: Create `scripts/learn_theme.py` with color utilities**
|
||||
|
||||
```python
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Learn a WeChat formatting theme from a public article URL.
|
||||
|
||||
Usage:
|
||||
python3 scripts/learn_theme.py <url> --name <theme-name>
|
||||
"""
|
||||
|
||||
import colorsys
|
||||
import re
|
||||
|
||||
|
||||
def rgb_to_hex(rgb_str: str) -> str:
|
||||
"""Convert 'rgb(r, g, b)' or 'rgba(r,g,b,a)' to '#rrggbb'.
|
||||
|
||||
Returns the original string if it doesn't match rgb/rgba format.
|
||||
If already a hex color, returns it lowercased.
|
||||
"""
|
||||
if rgb_str.startswith("#"):
|
||||
return rgb_str.lower()
|
||||
m = re.match(r"rgba?\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)", rgb_str)
|
||||
if m:
|
||||
return "#{:02x}{:02x}{:02x}".format(
|
||||
int(m.group(1)), int(m.group(2)), int(m.group(3))
|
||||
)
|
||||
return rgb_str
|
||||
|
||||
|
||||
def lightness(hex_color: str) -> float:
|
||||
"""Return HLS lightness (0.0–1.0) for a hex color."""
|
||||
if not hex_color.startswith("#") or len(hex_color) != 7:
|
||||
return 0.5
|
||||
r = int(hex_color[1:3], 16) / 255
|
||||
g = int(hex_color[3:5], 16) / 255
|
||||
b = int(hex_color[5:7], 16) / 255
|
||||
return colorsys.rgb_to_hls(r, g, b)[1]
|
||||
|
||||
|
||||
def is_gray(hex_color: str, threshold: int = 30) -> bool:
|
||||
"""Check if a color is grayscale (R/G/B within `threshold` of each other)."""
|
||||
if not hex_color.startswith("#") or len(hex_color) != 7:
|
||||
return False
|
||||
r = int(hex_color[1:3], 16)
|
||||
g = int(hex_color[3:5], 16)
|
||||
b = int(hex_color[5:7], 16)
|
||||
return max(r, g, b) - min(r, g, b) < threshold
|
||||
|
||||
|
||||
def adjust_lightness(hex_color: str, target_l: float) -> str:
|
||||
"""Return a new hex color with lightness set to `target_l` (0.0–1.0)."""
|
||||
if not hex_color.startswith("#") or len(hex_color) != 7:
|
||||
return hex_color
|
||||
r = int(hex_color[1:3], 16) / 255
|
||||
g = int(hex_color[3:5], 16) / 255
|
||||
b = int(hex_color[5:7], 16) / 255
|
||||
h, _l, s = colorsys.rgb_to_hls(r, g, b)
|
||||
nr, ng, nb = colorsys.hls_to_rgb(h, max(0, min(1, target_l)), s)
|
||||
return "#{:02x}{:02x}{:02x}".format(
|
||||
int(nr * 255), int(ng * 255), int(nb * 255)
|
||||
)
|
||||
|
||||
|
||||
def derive_darkmode(colors: dict) -> dict:
|
||||
"""Derive dark mode colors from light mode colors."""
|
||||
dm = {}
|
||||
dm["background"] = "#1e1e1e"
|
||||
dm["text"] = adjust_lightness(colors.get("text", "#333333"), 0.8)
|
||||
dm["text_light"] = adjust_lightness(colors.get("text_light", "#666666"), 0.6)
|
||||
primary = colors.get("primary", "#2563eb")
|
||||
dm["primary"] = adjust_lightness(primary, min(lightness(primary) + 0.15, 0.85))
|
||||
dm["code_bg"] = "#2d2d2d"
|
||||
dm["code_color"] = "#d4d4d4"
|
||||
dm["quote_bg"] = "#252525"
|
||||
dm["quote_border"] = dm["primary"]
|
||||
return dm
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Verify file runs without errors**
|
||||
|
||||
Run: `python3 scripts/learn_theme.py` (will exit with no output since no CLI yet — just checks import/syntax)
|
||||
|
||||
Expected: No traceback. May print nothing or the usage error we'll add later.
|
||||
|
||||
- [ ] **Step 3: Commit**
|
||||
|
||||
```bash
|
||||
git add scripts/learn_theme.py
|
||||
git commit -m "feat(learn-theme): add color utility helpers"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 2: HTML fetch and style extraction
|
||||
|
||||
**Files:**
|
||||
- Modify: `scripts/learn_theme.py`
|
||||
|
||||
Add the functions to fetch a WeChat article and extract inline styles by element type.
|
||||
|
||||
- [ ] **Step 1: Add `parse_inline_style` function**
|
||||
|
||||
Append after the existing code in `scripts/learn_theme.py`:
|
||||
|
||||
```python
|
||||
def parse_inline_style(style_str: str) -> dict[str, str]:
|
||||
"""Parse an inline style string into a {property: value} dict."""
|
||||
props = {}
|
||||
for part in style_str.split(";"):
|
||||
part = part.strip()
|
||||
if ":" in part:
|
||||
k, v = part.split(":", 1)
|
||||
props[k.strip().lower()] = v.strip()
|
||||
return props
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Add `fetch_article` function**
|
||||
|
||||
Append:
|
||||
|
||||
```python
|
||||
import requests
|
||||
from bs4 import BeautifulSoup
|
||||
|
||||
|
||||
def fetch_article(url: str) -> BeautifulSoup:
|
||||
"""Fetch a WeChat article and return the #js_content element as soup.
|
||||
|
||||
Raises SystemExit on fetch failure or missing content element.
|
||||
"""
|
||||
headers = {
|
||||
"User-Agent": (
|
||||
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
|
||||
"AppleWebKit/537.36 (KHTML, like Gecko) "
|
||||
"Chrome/120.0.0.0 Safari/537.36"
|
||||
),
|
||||
"Accept": "text/html,application/xhtml+xml",
|
||||
"Accept-Language": "zh-CN,zh;q=0.9",
|
||||
}
|
||||
resp = requests.get(url, headers=headers, timeout=15)
|
||||
resp.encoding = "utf-8"
|
||||
soup = BeautifulSoup(resp.text, "html.parser")
|
||||
|
||||
content = soup.find(id="js_content")
|
||||
if not content:
|
||||
print(f"Error: #js_content not found — the URL may require verification.", file=__import__("sys").stderr)
|
||||
raise SystemExit(1)
|
||||
|
||||
# Also try to grab the article title
|
||||
title_el = soup.find("h1", class_="rich_media_title") or soup.find("h1", id="activity-name")
|
||||
title = title_el.get_text(strip=True) if title_el else ""
|
||||
content._wewrite_title = title # stash on the element for later use
|
||||
|
||||
return content
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Add `extract_styles` function**
|
||||
|
||||
Append:
|
||||
|
||||
```python
|
||||
from collections import Counter
|
||||
|
||||
|
||||
def extract_styles(content: BeautifulSoup) -> dict:
|
||||
"""Extract inline styles from #js_content grouped by element type.
|
||||
|
||||
Returns a dict like:
|
||||
{
|
||||
"p": [{"color": "rgb(0,0,0)", "font-size": "16px", ...}, ...],
|
||||
"section": [...],
|
||||
...
|
||||
}
|
||||
"""
|
||||
TARGET_TAGS = ("p", "section", "span", "strong", "em",
|
||||
"h1", "h2", "h3", "h4",
|
||||
"blockquote", "code", "pre", "img", "a")
|
||||
|
||||
grouped: dict[str, list[dict]] = {tag: [] for tag in TARGET_TAGS}
|
||||
|
||||
for el in content.find_all(True, recursive=True):
|
||||
tag = el.name
|
||||
if tag not in grouped:
|
||||
continue
|
||||
style = el.get("style", "")
|
||||
if not style:
|
||||
continue
|
||||
props = parse_inline_style(style)
|
||||
if props:
|
||||
grouped[tag].append(props)
|
||||
|
||||
return grouped
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Quick smoke test with the saved HTML from earlier exploration**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
python3 -c "
|
||||
from scripts.learn_theme import parse_inline_style, extract_styles
|
||||
from bs4 import BeautifulSoup
|
||||
with open('/tmp/wechat_article.html', 'r') as f:
|
||||
soup = BeautifulSoup(f.read(), 'html.parser')
|
||||
content = soup.find(id='js_content')
|
||||
styles = extract_styles(content)
|
||||
print(f'p: {len(styles[\"p\"])} elements')
|
||||
print(f'section: {len(styles[\"section\"])} elements')
|
||||
print(f'span: {len(styles[\"span\"])} elements')
|
||||
print(f'strong: {len(styles[\"strong\"])} elements')
|
||||
"
|
||||
```
|
||||
|
||||
Expected: Counts matching our earlier exploration (p: ~70, section: ~60, span: ~87, strong: ~16).
|
||||
|
||||
- [ ] **Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add scripts/learn_theme.py
|
||||
git commit -m "feat(learn-theme): add HTML fetch and style extraction"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Style analysis — infer semantic color roles
|
||||
|
||||
**Files:**
|
||||
- Modify: `scripts/learn_theme.py`
|
||||
|
||||
This is the core logic: take raw extracted styles and infer the theme's semantic color/typography properties.
|
||||
|
||||
- [ ] **Step 1: Add `most_common_value` helper**
|
||||
|
||||
Append to `scripts/learn_theme.py`:
|
||||
|
||||
```python
|
||||
def most_common_value(style_list: list[dict], prop: str) -> str | None:
|
||||
"""Return the most common value of `prop` across a list of parsed style dicts."""
|
||||
counter = Counter()
|
||||
for props in style_list:
|
||||
if prop in props:
|
||||
counter[props[prop]] += 1
|
||||
if not counter:
|
||||
return None
|
||||
return counter.most_common(1)[0][0]
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Add `analyze_styles` function**
|
||||
|
||||
Append:
|
||||
|
||||
```python
|
||||
# Defaults from professional-clean, used when extraction finds nothing
|
||||
DEFAULTS = {
|
||||
"primary": "#2563eb",
|
||||
"secondary": "#3b82f6",
|
||||
"text": "#333333",
|
||||
"text_light": "#666666",
|
||||
"background": "#ffffff",
|
||||
"code_bg": "#1e293b",
|
||||
"code_color": "#e2e8f0",
|
||||
"quote_border": "#2563eb",
|
||||
"quote_bg": "#eff6ff",
|
||||
"border_radius": "8px",
|
||||
"font_size": "16px",
|
||||
"line_height": "1.8",
|
||||
"letter_spacing": "0",
|
||||
"font_family": (
|
||||
'-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, '
|
||||
'"Helvetica Neue", Arial, "PingFang SC", "Hiragino Sans GB", '
|
||||
'"Microsoft YaHei", sans-serif'
|
||||
),
|
||||
"p_margin": "12px 0",
|
||||
}
|
||||
|
||||
|
||||
def analyze_styles(grouped: dict) -> dict:
|
||||
"""Infer semantic theme properties from grouped inline styles.
|
||||
|
||||
Returns a flat dict with keys: primary, secondary, text, text_light,
|
||||
background, code_bg, code_color, quote_border, quote_bg, border_radius,
|
||||
font_size, line_height, letter_spacing, font_family, p_margin.
|
||||
"""
|
||||
result = dict(DEFAULTS)
|
||||
|
||||
# --- Layer 1: Colors + Typography ---
|
||||
|
||||
# text: most common color on <p>
|
||||
p_color = most_common_value(grouped.get("p", []), "color")
|
||||
if p_color:
|
||||
result["text"] = rgb_to_hex(p_color)
|
||||
|
||||
# text_light: gray color with lightness between 0.15 and 0.85, excluding text
|
||||
all_colors = Counter()
|
||||
for tag_styles in grouped.values():
|
||||
for props in tag_styles:
|
||||
if "color" in props:
|
||||
h = rgb_to_hex(props["color"])
|
||||
all_colors[h] += 1
|
||||
gray_candidates = [
|
||||
(c, n) for c, n in all_colors.most_common(30)
|
||||
if is_gray(c) and c != result["text"]
|
||||
and 0.15 < lightness(c) < 0.85
|
||||
]
|
||||
if gray_candidates:
|
||||
# Pick the one with highest lightness (= the "lighter" text)
|
||||
gray_candidates.sort(key=lambda x: lightness(x[0]), reverse=True)
|
||||
result["text_light"] = gray_candidates[0][0]
|
||||
|
||||
# primary: non-gray color on strong/section, boost large font-size headings
|
||||
accent_counter = Counter()
|
||||
for tag in ("strong", "section", "h1", "h2", "h3", "span"):
|
||||
for props in grouped.get(tag, []):
|
||||
if "color" not in props:
|
||||
continue
|
||||
h = rgb_to_hex(props["color"])
|
||||
if is_gray(h) or h == result["text"]:
|
||||
continue
|
||||
# Boost heading-sized elements
|
||||
fs = props.get("font-size", "")
|
||||
fs_match = re.search(r"(\d+)", fs)
|
||||
if fs_match and int(fs_match.group(1)) >= 20:
|
||||
accent_counter[h] += 5
|
||||
else:
|
||||
accent_counter[h] += 1
|
||||
if accent_counter:
|
||||
result["primary"] = accent_counter.most_common(1)[0][0]
|
||||
# secondary: second most common accent, or derive from primary
|
||||
rest = [(c, n) for c, n in accent_counter.most_common(5) if c != result["primary"]]
|
||||
if rest:
|
||||
result["secondary"] = rest[0][0]
|
||||
else:
|
||||
result["secondary"] = adjust_lightness(
|
||||
result["primary"],
|
||||
min(lightness(result["primary"]) + 0.10, 0.90),
|
||||
)
|
||||
|
||||
# background
|
||||
content_bg = None
|
||||
for tag in ("section",):
|
||||
# Check top-level sections only (first few)
|
||||
for props in grouped.get(tag, [])[:5]:
|
||||
for k in ("background", "background-color"):
|
||||
if k in props:
|
||||
v = props[k]
|
||||
if "url" not in v and "transparent" not in v and "100%" not in v:
|
||||
candidate = rgb_to_hex(v)
|
||||
if lightness(candidate) > 0.85:
|
||||
content_bg = candidate
|
||||
break
|
||||
if content_bg:
|
||||
result["background"] = content_bg
|
||||
|
||||
# Typography from <p>
|
||||
p_styles = grouped.get("p", [])
|
||||
for prop, key in [
|
||||
("font-size", "font_size"),
|
||||
("line-height", "line_height"),
|
||||
("letter-spacing", "letter_spacing"),
|
||||
("margin", "p_margin"),
|
||||
]:
|
||||
val = most_common_value(p_styles, prop)
|
||||
if val:
|
||||
result[key] = val
|
||||
|
||||
# Font family from <span> (WeChat wraps text in spans with font-family)
|
||||
span_font = most_common_value(grouped.get("span", []), "font-family")
|
||||
if span_font:
|
||||
result["font_family"] = span_font
|
||||
|
||||
# --- Layer 2: Decorative ---
|
||||
|
||||
# quote: look for border-left on blockquote or section
|
||||
for tag in ("blockquote", "section"):
|
||||
for props in grouped.get(tag, []):
|
||||
bl = props.get("border-left", "")
|
||||
if bl:
|
||||
color_match = re.search(r"(#[0-9a-fA-F]{3,6}|rgb\([^)]+\))", bl)
|
||||
if color_match:
|
||||
result["quote_border"] = rgb_to_hex(color_match.group(1))
|
||||
bg = props.get("background", props.get("background-color", ""))
|
||||
if bg and "transparent" not in bg and "url" not in bg:
|
||||
result["quote_bg"] = rgb_to_hex(bg)
|
||||
break
|
||||
# If no explicit quote found, derive from primary
|
||||
if result["quote_border"] == DEFAULTS["quote_border"] and result["primary"] != DEFAULTS["primary"]:
|
||||
result["quote_border"] = result["primary"]
|
||||
# Derive a light tint of primary for quote_bg
|
||||
result["quote_bg"] = adjust_lightness(result["primary"], 0.95)
|
||||
|
||||
# code blocks
|
||||
for tag in ("pre", "code"):
|
||||
for props in grouped.get(tag, []):
|
||||
bg = props.get("background", props.get("background-color", ""))
|
||||
if bg and "none" not in bg and "transparent" not in bg:
|
||||
result["code_bg"] = rgb_to_hex(bg)
|
||||
if "color" in props:
|
||||
result["code_color"] = rgb_to_hex(props["color"])
|
||||
break
|
||||
|
||||
# border-radius: mode across all elements
|
||||
radius_counter = Counter()
|
||||
for tag_styles in grouped.values():
|
||||
for props in tag_styles:
|
||||
if "border-radius" in props:
|
||||
radius_counter[props["border-radius"]] += 1
|
||||
if radius_counter:
|
||||
result["border_radius"] = radius_counter.most_common(1)[0][0]
|
||||
|
||||
return result
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Smoke test analysis against saved HTML**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
python3 -c "
|
||||
from scripts.learn_theme import extract_styles, analyze_styles
|
||||
from bs4 import BeautifulSoup
|
||||
with open('/tmp/wechat_article.html', 'r') as f:
|
||||
soup = BeautifulSoup(f.read(), 'html.parser')
|
||||
content = soup.find(id='js_content')
|
||||
grouped = extract_styles(content)
|
||||
result = analyze_styles(grouped)
|
||||
for k, v in sorted(result.items()):
|
||||
print(f' {k:20s} {v}')
|
||||
"
|
||||
```
|
||||
|
||||
Expected: `primary` ≈ `#2d71d6`, `text` ≈ `#000000`, `font_size` = `16px`, `line_height` = `1.75`, `letter_spacing` = `1px`, `font_family` contains `Optima-Regular`.
|
||||
|
||||
- [ ] **Step 4: Commit**
|
||||
|
||||
```bash
|
||||
git add scripts/learn_theme.py
|
||||
git commit -m "feat(learn-theme): add style analysis with semantic role inference"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 4: Theme YAML generation
|
||||
|
||||
**Files:**
|
||||
- Modify: `scripts/learn_theme.py`
|
||||
|
||||
Generate the final YAML theme file using `professional-clean.yaml` as the CSS template.
|
||||
|
||||
- [ ] **Step 1: Add `generate_theme_yaml` function**
|
||||
|
||||
Append to `scripts/learn_theme.py`:
|
||||
|
||||
```python
|
||||
from pathlib import Path
|
||||
import yaml
|
||||
|
||||
|
||||
TEMPLATE_THEME = "professional-clean"
|
||||
THEMES_DIR = Path(__file__).resolve().parent.parent / "toolkit" / "themes"
|
||||
|
||||
|
||||
def _load_template_css() -> str:
|
||||
"""Load the base_css from professional-clean as a template."""
|
||||
template_path = THEMES_DIR / f"{TEMPLATE_THEME}.yaml"
|
||||
with open(template_path, "r", encoding="utf-8") as f:
|
||||
data = yaml.safe_load(f)
|
||||
return data["base_css"]
|
||||
|
||||
|
||||
def generate_theme_yaml(name: str, title: str, analyzed: dict) -> str:
|
||||
"""Generate a complete theme YAML string from analyzed style properties.
|
||||
|
||||
Uses professional-clean's base_css as template, replacing color values
|
||||
and typography properties with the extracted ones.
|
||||
"""
|
||||
css = _load_template_css()
|
||||
|
||||
colors = {
|
||||
"primary": analyzed["primary"],
|
||||
"secondary": analyzed["secondary"],
|
||||
"text": analyzed["text"],
|
||||
"text_light": analyzed["text_light"],
|
||||
"background": analyzed["background"],
|
||||
"code_bg": analyzed["code_bg"],
|
||||
"code_color": analyzed["code_color"],
|
||||
"quote_border": analyzed["quote_border"],
|
||||
"quote_bg": analyzed["quote_bg"],
|
||||
"border_radius": analyzed["border_radius"],
|
||||
}
|
||||
|
||||
# Replace colors in CSS template
|
||||
# Map template default colors -> extracted colors
|
||||
replacements = {
|
||||
"#2563eb": analyzed["primary"], # primary
|
||||
"#3b82f6": analyzed["secondary"], # secondary
|
||||
"#333333": analyzed["text"], # text
|
||||
"#666666": analyzed["text_light"], # text_light
|
||||
"#1e293b": analyzed["code_bg"], # code_bg
|
||||
"#e2e8f0": analyzed["code_color"], # code_color
|
||||
"#eff6ff": analyzed["quote_bg"], # quote_bg
|
||||
}
|
||||
for old, new in replacements.items():
|
||||
css = css.replace(old, new)
|
||||
|
||||
# Replace typography
|
||||
# font-size on body
|
||||
css = re.sub(
|
||||
r"(body\s*\{[^}]*font-size:\s*)\d+px",
|
||||
rf"\g<1>{analyzed['font_size']}",
|
||||
css,
|
||||
)
|
||||
# line-height on body
|
||||
css = re.sub(
|
||||
r"(body\s*\{[^}]*line-height:\s*)[\d.]+",
|
||||
rf"\g<1>{analyzed['line_height']}",
|
||||
css,
|
||||
)
|
||||
# font-family on body
|
||||
css = re.sub(
|
||||
r'(body\s*\{[^}]*font-family:\s*)[^;]+',
|
||||
rf'\g<1>{analyzed["font_family"]}',
|
||||
css,
|
||||
)
|
||||
# p line-height
|
||||
css = re.sub(
|
||||
r"(p\s*\{[^}]*line-height:\s*)[\d.]+",
|
||||
rf"\g<1>{analyzed['line_height']}",
|
||||
css,
|
||||
)
|
||||
# p margin
|
||||
css = re.sub(
|
||||
r"(p\s*\{[^}]*margin:\s*)[^;]+",
|
||||
rf"\g<1>{analyzed['p_margin']}",
|
||||
css,
|
||||
)
|
||||
# li line-height to match p
|
||||
css = re.sub(
|
||||
r"(li\s*\{[^}]*line-height:\s*)[\d.]+",
|
||||
rf"\g<1>{analyzed['line_height']}",
|
||||
css,
|
||||
)
|
||||
# border-radius on relevant selectors (pre, blockquote, img, code)
|
||||
css = css.replace("border-radius: 8px", f"border-radius: {analyzed['border_radius']}")
|
||||
css = css.replace("border-radius: 4px", f"border-radius: {analyzed['border_radius']}")
|
||||
|
||||
dm = derive_darkmode(colors)
|
||||
|
||||
desc = f"从「{title}」学习的排版主题" if title else f"Learned theme: {name}"
|
||||
|
||||
theme_data = {
|
||||
"name": name,
|
||||
"description": desc,
|
||||
"colors": {**colors, "darkmode": dm},
|
||||
"base_css": css,
|
||||
}
|
||||
|
||||
return yaml.dump(theme_data, allow_unicode=True, default_flow_style=False, sort_keys=False)
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Test YAML generation**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
python3 -c "
|
||||
from scripts.learn_theme import extract_styles, analyze_styles, generate_theme_yaml
|
||||
from bs4 import BeautifulSoup
|
||||
with open('/tmp/wechat_article.html', 'r') as f:
|
||||
soup = BeautifulSoup(f.read(), 'html.parser')
|
||||
content = soup.find(id='js_content')
|
||||
grouped = extract_styles(content)
|
||||
analyzed = analyze_styles(grouped)
|
||||
output = generate_theme_yaml('test-learned', 'AI短剧测试', analyzed)
|
||||
print(output[:500])
|
||||
print('...')
|
||||
# Verify it's valid YAML
|
||||
import yaml
|
||||
data = yaml.safe_load(output)
|
||||
assert data['name'] == 'test-learned'
|
||||
assert 'primary' in data['colors']
|
||||
assert 'darkmode' in data['colors']
|
||||
assert 'base_css' in data
|
||||
print('YAML validation passed')
|
||||
"
|
||||
```
|
||||
|
||||
Expected: Valid YAML output with name, colors (including darkmode), and base_css. No traceback.
|
||||
|
||||
- [ ] **Step 3: Commit**
|
||||
|
||||
```bash
|
||||
git add scripts/learn_theme.py
|
||||
git commit -m "feat(learn-theme): add theme YAML generation from analyzed styles"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 5: CLI integration and terminal report
|
||||
|
||||
**Files:**
|
||||
- Modify: `scripts/learn_theme.py` (add `main()` and CLI arg parsing)
|
||||
- Modify: `toolkit/cli.py` (add `learn-theme` subcommand)
|
||||
|
||||
- [ ] **Step 1: Add `main()` with argparse to `scripts/learn_theme.py`**
|
||||
|
||||
Append to `scripts/learn_theme.py`:
|
||||
|
||||
```python
|
||||
import argparse
|
||||
import sys
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Learn a WeChat formatting theme from an article URL.",
|
||||
)
|
||||
parser.add_argument("url", help="WeChat article URL (https://mp.weixin.qq.com/s/...)")
|
||||
parser.add_argument("--name", required=True, help="Theme name (used as filename and reference)")
|
||||
parser.add_argument("--output-dir", default=None, help="Output directory (default: toolkit/themes/)")
|
||||
args = parser.parse_args()
|
||||
|
||||
# Validate name
|
||||
if not re.match(r"^[a-zA-Z0-9_-]+$", args.name):
|
||||
print("Error: --name must contain only letters, digits, hyphens, and underscores.", file=sys.stderr)
|
||||
raise SystemExit(1)
|
||||
|
||||
output_dir = Path(args.output_dir) if args.output_dir else THEMES_DIR
|
||||
output_path = output_dir / f"{args.name}.yaml"
|
||||
|
||||
if output_path.exists():
|
||||
print(f"Warning: {output_path} already exists, will be overwritten.", file=sys.stderr)
|
||||
|
||||
# Step 1: Fetch
|
||||
print(f"Fetching article...")
|
||||
content = fetch_article(args.url)
|
||||
title = getattr(content, "_wewrite_title", "")
|
||||
if title:
|
||||
print(f"Title: {title}")
|
||||
|
||||
# Step 2: Extract
|
||||
grouped = extract_styles(content)
|
||||
styled_count = sum(len(v) for v in grouped.values())
|
||||
print(f"Extracted {styled_count} styled elements.")
|
||||
|
||||
# Step 3: Analyze
|
||||
analyzed = analyze_styles(grouped)
|
||||
|
||||
# Step 4: Generate & write
|
||||
theme_yaml = generate_theme_yaml(args.name, title, analyzed)
|
||||
output_dir.mkdir(parents=True, exist_ok=True)
|
||||
output_path.write_text(theme_yaml, encoding="utf-8")
|
||||
|
||||
# Step 5: Report
|
||||
print()
|
||||
print(f"Learned theme from: {title or args.url}")
|
||||
print(f" text: {analyzed['text']}")
|
||||
print(f" text_light: {analyzed['text_light']}")
|
||||
print(f" primary: {analyzed['primary']}")
|
||||
print(f" secondary: {analyzed['secondary']}")
|
||||
print(f" background: {analyzed['background']}")
|
||||
print(f" font: {analyzed['font_family'][:50]}")
|
||||
print(f" size: {analyzed['font_size']} / line-height {analyzed['line_height']} / spacing {analyzed['letter_spacing']}")
|
||||
print()
|
||||
print(f"Theme saved → {output_path}")
|
||||
print(f"Use it: python3 toolkit/cli.py preview article.md --theme {args.name}")
|
||||
print(f"Or set: theme: {args.name} in style.yaml")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Move imports to top of file**
|
||||
|
||||
Reorganize the imports at the top of `scripts/learn_theme.py` so all imports are at the top (PEP 8):
|
||||
|
||||
```python
|
||||
import argparse
|
||||
import colorsys
|
||||
import re
|
||||
import sys
|
||||
from collections import Counter
|
||||
from pathlib import Path
|
||||
|
||||
import requests
|
||||
import yaml
|
||||
from bs4 import BeautifulSoup
|
||||
```
|
||||
|
||||
Remove the inline `import` statements from `fetch_article` (`__import__("sys").stderr` → `sys.stderr`), `extract_styles` (Counter already imported), and `generate_theme_yaml` (Path, yaml already imported).
|
||||
|
||||
- [ ] **Step 3: Test full CLI flow against saved HTML**
|
||||
|
||||
Since we may not be able to reliably fetch from WeChat during automated testing, test with the already-saved HTML first by temporarily using a local path. But first, verify the script's `--help` works:
|
||||
|
||||
Run:
|
||||
```bash
|
||||
python3 scripts/learn_theme.py --help
|
||||
```
|
||||
|
||||
Expected: Usage message showing `url` positional arg and `--name` required arg.
|
||||
|
||||
- [ ] **Step 4: Add `learn-theme` subcommand to `toolkit/cli.py`**
|
||||
|
||||
In `toolkit/cli.py`, add the subcommand registration and handler.
|
||||
|
||||
Add after the `cmd_gallery` function (before `_gallery_sample_markdown`):
|
||||
|
||||
```python
|
||||
def cmd_learn_theme(args):
|
||||
"""Learn a theme from a WeChat article URL."""
|
||||
import subprocess
|
||||
script = Path(__file__).parent.parent / "scripts" / "learn_theme.py"
|
||||
cmd = [sys.executable, str(script), args.url, "--name", args.name]
|
||||
result = subprocess.run(cmd)
|
||||
sys.exit(result.returncode)
|
||||
```
|
||||
|
||||
Add subparser registration after the `p_gallery` block (before `args = parser.parse_args()`):
|
||||
|
||||
```python
|
||||
# learn-theme
|
||||
p_learn = sub.add_parser("learn-theme", help="Learn formatting theme from a WeChat article URL")
|
||||
p_learn.add_argument("url", help="WeChat article URL")
|
||||
p_learn.add_argument("--name", required=True, help="Theme name")
|
||||
```
|
||||
|
||||
Add handler in the `try` block, after the `elif args.command == "gallery":` case:
|
||||
|
||||
```python
|
||||
elif args.command == "learn-theme":
|
||||
cmd_learn_theme(args)
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Test CLI integration**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
python3 toolkit/cli.py learn-theme --help
|
||||
```
|
||||
|
||||
Expected: Shows usage for `learn-theme` subcommand with `url` and `--name` args.
|
||||
|
||||
- [ ] **Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git add scripts/learn_theme.py toolkit/cli.py
|
||||
git commit -m "feat(learn-theme): add CLI with terminal report"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 6: End-to-end test with real URL
|
||||
|
||||
**Files:** None (manual verification)
|
||||
|
||||
- [ ] **Step 1: Run full pipeline against the example article**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
python3 scripts/learn_theme.py https://mp.weixin.qq.com/s/WRBHkWP6wdQ6Qznq-KHIYg --name test-learned
|
||||
```
|
||||
|
||||
Expected output:
|
||||
```
|
||||
Fetching article...
|
||||
Title: <article title>
|
||||
Extracted N styled elements.
|
||||
|
||||
Learned theme from: <title>
|
||||
text: #000000
|
||||
primary: #2d71d6
|
||||
...
|
||||
Theme saved → .../toolkit/themes/test-learned.yaml
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Verify the generated theme works with preview**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
python3 toolkit/cli.py preview toolkit/../docs/superpowers/specs/2026-04-01-learn-theme-design.md --theme test-learned --no-open -o /tmp/test-learned-preview.html
|
||||
```
|
||||
|
||||
Expected: HTML file generated without errors. Open manually to inspect if desired.
|
||||
|
||||
- [ ] **Step 3: Verify theme appears in theme list**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
python3 toolkit/cli.py themes | grep test-learned
|
||||
```
|
||||
|
||||
Expected: Shows `test-learned` with its description.
|
||||
|
||||
- [ ] **Step 4: Clean up test theme**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
rm toolkit/themes/test-learned.yaml
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Commit (no code changes — this is a verification task)**
|
||||
|
||||
No commit needed.
|
||||
|
||||
---
|
||||
|
||||
### Task 7: SKILL.md integration
|
||||
|
||||
**Files:**
|
||||
- Modify: `SKILL.md`
|
||||
|
||||
- [ ] **Step 1: Add "学习排版" trigger to auxiliary functions section**
|
||||
|
||||
In `SKILL.md`, locate the auxiliary functions block (line ~46). Add a new entry after the "学习我的修改" line:
|
||||
|
||||
```markdown
|
||||
- 用户说"学习排版"/"学排版" → `python3 {skill_dir}/scripts/learn_theme.py <url> --name <name>`,用户需提供一个公众号文章 URL 和主题名称。提取完成后提示用户设置 `style.yaml` 的 `theme` 字段。
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Add to the Step 8.3 trigger table**
|
||||
|
||||
In the trigger-word table near the end of SKILL.md, add a row:
|
||||
|
||||
```markdown
|
||||
| 学习排版 / 学排版 | `python3 {skill_dir}/scripts/learn_theme.py <url> --name <name>` |
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Commit**
|
||||
|
||||
```bash
|
||||
git add SKILL.md
|
||||
git commit -m "feat(learn-theme): add trigger words to SKILL.md"
|
||||
```
|
||||
|
|
@ -1,155 +0,0 @@
|
|||
# Learn Theme — 从公众号文章 URL 提取排版主题
|
||||
|
||||
**日期**: 2026-04-01
|
||||
**状态**: 设计完成,待实现
|
||||
|
||||
## 概述
|
||||
|
||||
新增 `learn-theme` 功能:用户提供一个微信公众号文章 URL,脚本自动抓取 HTML、提取 inline style,生成与现有 16 个主题格式一致的 YAML 主题文件,立即可用于排版。
|
||||
|
||||
## 用户接口
|
||||
|
||||
```bash
|
||||
python3 toolkit/cli.py learn-theme https://mp.weixin.qq.com/s/xxxx --name my-style
|
||||
```
|
||||
|
||||
- `url`(必填):微信公众号文章链接
|
||||
- `--name`(必填):主题名称,用于文件名和后续引用
|
||||
- 输出:`toolkit/themes/{name}.yaml`
|
||||
|
||||
成功后终端打印提取摘要(主色、字号、行高等)并提示:
|
||||
```
|
||||
Theme saved to toolkit/themes/my-style.yaml
|
||||
Use it: python3 toolkit/cli.py preview article.md --theme my-style
|
||||
Or set in style.yaml: theme: my-style
|
||||
```
|
||||
|
||||
## 核心流程
|
||||
|
||||
```
|
||||
URL → fetch HTML → parse #js_content → 按元素类型提取 inline style
|
||||
→ 频率统计 + 语义角色推断 → 生成 theme YAML → 写入 toolkit/themes/
|
||||
```
|
||||
|
||||
### Step 1: Fetch
|
||||
|
||||
- `requests.get(url)` + 浏览器 User-Agent header
|
||||
- 强制 UTF-8 解码(微信 API 不声明 charset)
|
||||
- 验证返回的 HTML 包含 `#js_content`,否则报错退出
|
||||
|
||||
### Step 2: Extract
|
||||
|
||||
用 BeautifulSoup 解析 `#js_content` 内所有带 inline style 的元素,按标签类型分组收集 CSS 属性值。
|
||||
|
||||
目标元素:`p`, `section`, `span`, `strong`, `h1`-`h4`, `blockquote`, `code`, `pre`, `img`
|
||||
|
||||
每个元素提取的属性:`color`, `font-size`, `line-height`, `letter-spacing`, `font-family`, `background`, `background-color`, `border-left`, `border-bottom`, `border-radius`, `margin`, `padding`
|
||||
|
||||
### Step 3: Analyze — 语义角色推断
|
||||
|
||||
**层 1 — 配色 + 字号体系(占观感 ~70%):**
|
||||
|
||||
| 目标属性 | 数据来源 | 推断逻辑 |
|
||||
|---------|---------|---------|
|
||||
| `text` | `<p>` 的 `color` | 最高频颜色 |
|
||||
| `text_light` | 所有元素的灰色系 `color` | 排除亮度 >0.85 和 <0.15 的值,取亮度最高的灰色 |
|
||||
| `primary` | `<strong>`, `<section>`(font-size≥20px) 的非灰色 `color` | 大字号标题色权重 ×5,取最高频 |
|
||||
| `secondary` | 同上 | 第二高频非灰色,无则从 primary 调亮 10% 派生 |
|
||||
| `background` | `#js_content` 或顶层 `section` 的 `background` | 直接取,无则默认 `#ffffff` |
|
||||
| 正文字号 | `<p>` 的 `font-size` | 众数 |
|
||||
| 行高 | `<p>` 的 `line-height` | 众数 |
|
||||
| 字间距 | `<p>` 的 `letter-spacing` | 众数,无则不设 |
|
||||
| 字体 | `<span>` 的 `font-family` | 最高频 |
|
||||
|
||||
**层 2 — 装饰细节(占观感 ~20%):**
|
||||
|
||||
| 目标属性 | 数据来源 | 推断逻辑 |
|
||||
|---------|---------|---------|
|
||||
| `quote_border` | 含 `border-left` 的 `blockquote`/`section` | 直接取 border-left-color,无则用 primary |
|
||||
| `quote_bg` | 同上的 `background` | 直接取,无则从 primary 派生浅底色 |
|
||||
| `code_bg` | `<code>`, `<pre>` 的 `background` | 直接取,无则默认深色 `#1e293b` |
|
||||
| `code_color` | `<code>`, `<pre>` 的 `color` | 直接取,无则默认 `#e2e8f0` |
|
||||
| `border_radius` | 所有元素 `border-radius` | 众数,无则默认 `6px` |
|
||||
| 标题装饰 | `<h1>`-`<h3>` 的 border/padding | 直接映射到 CSS |
|
||||
| 段落间距 | `<p>` 的 `margin` | 众数,映射到 body margin 和 p margin |
|
||||
|
||||
### Step 4: Generate
|
||||
|
||||
基于提取结果,以 `professional-clean.yaml` 为模板(确保所有 CSS selector 覆盖完整),替换颜色值和排版参数,生成完整的 theme YAML。
|
||||
|
||||
结构:
|
||||
```yaml
|
||||
name: "{name}"
|
||||
description: "从 {article_title} 学习的排版主题"
|
||||
colors:
|
||||
primary: "{extracted}"
|
||||
secondary: "{extracted}"
|
||||
text: "{extracted}"
|
||||
text_light: "{extracted}"
|
||||
background: "{extracted}"
|
||||
code_bg: "{extracted}"
|
||||
code_color: "{extracted}"
|
||||
quote_border: "{extracted}"
|
||||
quote_bg: "{extracted}"
|
||||
border_radius: "{extracted}"
|
||||
darkmode:
|
||||
# 从 light mode 颜色自动派生
|
||||
base_css: |
|
||||
# 基于 professional-clean 模板,替换提取到的值
|
||||
```
|
||||
|
||||
**Darkmode 派生规则:**
|
||||
- `background` → 取反亮度,钳位到 `#1a1a1a`-`#2a2a2a`
|
||||
- `text` → 亮度提升到 0.8,如 `#c8c8c8`
|
||||
- `primary` → 饱和度不变,亮度提升 15%
|
||||
- 其余属性同理微调
|
||||
|
||||
### Step 5: Report
|
||||
|
||||
终端输出提取摘要:
|
||||
```
|
||||
Learned theme from: 短剧行业的AI重构
|
||||
text: #000000
|
||||
primary: #2d71d6 (blue)
|
||||
secondary: #5f9cef
|
||||
font: Optima-Regular, PingFangTC-light
|
||||
size: 16px / line-height 1.75 / spacing 1px
|
||||
|
||||
Theme saved → toolkit/themes/my-style.yaml
|
||||
```
|
||||
|
||||
## Fallback 策略
|
||||
|
||||
每个属性提取失败时的默认值继承自 `professional-clean` 主题,确保输出始终是一个完整可用的主题文件。不会因为文章结构简单就生成残缺主题。
|
||||
|
||||
## 文件结构
|
||||
|
||||
```
|
||||
scripts/learn_theme.py # 核心逻辑:fetch + extract + analyze + generate
|
||||
toolkit/cli.py # 新增 learn-theme 子命令,调用 scripts/learn_theme.py
|
||||
```
|
||||
|
||||
**不修改的文件:** `theme.py`, `converter.py`, 现有主题文件 — 零侵入。
|
||||
|
||||
## SKILL.md 集成
|
||||
|
||||
在辅助功能列表中新增触发词"学习排版"/"学排版",调用:
|
||||
```bash
|
||||
python3 scripts/learn_theme.py <url> --name <name>
|
||||
```
|
||||
|
||||
## 依赖
|
||||
|
||||
- `requests`(已有)
|
||||
- `beautifulsoup4`(已有)
|
||||
- `pyyaml`(已有)
|
||||
- `colorsys`(标准库)
|
||||
|
||||
无新依赖。
|
||||
|
||||
## 不做的事
|
||||
|
||||
- 不学习结构布局(卡片嵌套、多栏、SVG 分割线)
|
||||
- 不支持本地 HTML 文件输入(仅 URL)
|
||||
- 不自动处理微信登录/验证码(抓取失败直接报错)
|
||||
- 不修改现有主题系统代码
|
||||
Loading…
Reference in a new issue