Commit graph

2315 commits

Author SHA1 Message Date
777genius
d0c64fabb8 fix: export notifyTeamWatchScopeChanged so the committed build resolves
TeamProvisioningService imports notifyTeamWatchScopeChanged (added with the
setAliveRunId/deleteAliveRunId helpers) but the export was missing, so a clean
checkout of the branch failed to typecheck. Add the export plus a test; the
call-site wiring stays as in-progress work.
2026-05-30 12:33:57 +03:00
777genius
ccea3e015d perf: avoid opening bootstrap transcripts on cache hits
readRecentBootstrapTranscriptOutcome opened the session file and stat'd it
through the handle BEFORE checking the per-(file,mtime,size) outcome cache, so
every cache hit still paid a full open()+close(). During a tracked launch the
per-member lookup cache is bypassed, so the project-dir scan re-runs every poll
across every recent session file x every member, turning each cache hit into a
wasted open() syscall. The native sample of a 6-member mixed launch put __open
at ~51-54k; correcting the earlier attribution, this open-before-cache-check
(NOT the watcher rebuilds addressed in the previous commit) is the dominant
source -- removing the per-rebuild watcher churn left __open essentially flat.

Stat the path with fs.promises.stat (no fd) for the cache check and return the
cached outcome without opening. getParsedBootstrapTranscriptTail now opens the
file itself, lazily, only when its shared parse cache also misses (the file
genuinely changed since last parse), so a hit on either the per-member outcome
cache or the shared tail-parse cache avoids the open entirely. The tail read is
wrapped in try/finally to close the handle. Parsing/scan logic is byte-for-byte
unchanged; only the redundant open is removed. Bootstrap-transcript tests pass.
2026-05-30 12:23:48 +03:00
777genius
8b3cec8013 perf: update team watcher incrementally instead of full rebuild
A team launch repeatedly changes the watched target set (new dirs appear), and each
change tore down the chokidar watcher and recreated it over the full target set.
On macOS chokidar uses kqueue with one fd per watched file, so every rebuild
re-opened an fd for EVERY watched file (the large always-watched inbox set plus
scoped dirs). Profiling a 6-member mixed launch showed ~54k open() syscalls dominated
by these rebuilds.

Keep one persistent watcher and apply target-set changes with add()/unwatch() on the
delta only, so a reconcile opens fds for just the newly added dirs. The initial
watcher still uses ignoreInitial for a silent startup baseline, and
emitExistingFilesForNewTargets still backfills files already present in newly added
dirs, so the emitted event surface is unchanged. Because the watcher is no longer
recreated per reconcile, the stale-old-generation and close-throws-during-rebuild
failure modes are gone; their tests are replaced with incremental add/unwatch and
persistent-watcher coverage. All 69 watcher tests pass.
2026-05-30 12:05:23 +03:00
777genius
b3565a0d29 perf: skip stale session files when scanning for bootstrap transcripts
readBootstrapTranscriptOutcomesInProjectRoot iterated every .jsonl in the project
dir, opening + tail-reading each per member per bootstrap poll. A real project dir
(e.g. ~280 session files) made this the dominant file-open churn during launch (the
native sample showed ~56k open() syscalls).

A transcript last modified before the lookup window cannot contain a bootstrap line
at/after sinceMs (append-only logs: a line's timestamp <= its write time <= the file
mtime), so readRecentBootstrapTranscriptOutcome returns null for it. Skip those with
a cheap stat instead of opening them; a 5s slack absorbs clock skew between the line
timestamp source and the filesystem mtime. Behavior is unchanged (only files that
would have returned null are skipped); bootstrap-transcript detection tests still pass.
2026-05-30 11:16:18 +03:00
777genius
a4b9512c7c perf: keep positive transcript team-affinity cached as files grow
fileBelongsToTeam streams a transcript's head lines to decide if it belongs to a
team, cached by (mtime,size). During launch the team's own session transcripts
grow on every poll, invalidating the cache and forcing a re-stream + head re-parse
each time (profiled at ~7-8% main-thread JS after the earlier fixes).

A positive affinity is decided by early head lines that persist as an append-only
transcript grows, so a true result stays valid while the file only grows. Reuse a
cached true when size has not shrunk; a false result is still re-checked on any
change (a short file may grow head lines mentioning the team) and a shrink forces a
re-scan. Existing resolver tests still pass.
2026-05-30 10:15:07 +03:00
777genius
1b4838d422 perf: normalize bootstrap transcript lines once across members
isBootstrapTranscriptContextText and getBootstrapTranscriptSuccessSource each ran
text.replace(/\s+/g,' ').trim().toLowerCase() internally. During launch the
bootstrap scan checks every transcript line against every context member for every
member's poll, so the same line was re-normalized up to (members x contextMembers)
times per cycle. Profiling a 6-member mixed launch showed isBootstrapTranscriptContextText
at ~11% main-thread JS even after the shared-parse cache.

Precompute the normalized form once per parsed line (already cached) and pass it to
both detection helpers via a new optional precomputedNormalizedText parameter. The
value is identical to what the helpers computed internally, so detection is byte-for-byte
unchanged; the helpers stay backward compatible for callers that omit it.
2026-05-30 10:06:47 +03:00
777genius
f79ea145d7 perf: batch per-member task-interval resume into one locked pass
During launch the live-status loop resumes every alive member every audit cycle.
resumeActiveIntervalsForMember runs a synchronous file-lock + full read of every
task file, so for an N-member team with M task files it did N locked passes x M
readFileSync per cycle (e.g. 6 members x 20 task files), blocking the main event
loop. Profiling a 6-member mixed launch showed mutateTeamTasks/withFileLockSync as
a top main-thread cost (~14%).

Add resumeActiveIntervalsForMembers that applies the identical per-member resume
logic against a member set in a single locked pass, and use it in the live-status
loop. Same mutations, but one lock + task read per cycle instead of one per member.
Adds a test covering multi-member resume in one pass.
2026-05-30 10:02:01 +03:00
777genius
aa9a1bba8c perf: debounce team watcher rebuilds during dir-event bursts
A team launch creates many directories/files in quick succession (worktrees,
inboxes, session logs), and each addDir/unlinkDir event triggered a full
TeamTaskWatchRegistry reconcile that tore down and recreated the entire chokidar
watcher (re-opening a kqueue fd per watched file on macOS). Profiling a 6-member
mixed-team launch showed kqueue churn (kevent) as a top native cost and watcher
rebuild as the top remaining main-thread JS cost after the transcript fix.

Debounce the event-driven reconcile (250ms) so a burst collapses into one rebuild.
collectTargets re-reads the current directory state and emitExistingFilesForNewTargets
backfills files created before the rebuild, so no change is missed; requestReconcile,
startup, and the periodic 30s reconcile stay immediate. Adds a test asserting a
burst of addDir events yields a single rebuild.
2026-05-30 09:46:16 +03:00
777genius
e1475deede optimize landing robot images
downscale oversized robot webp assets to fit-in-480px and recompress at q82.
they were shipping at 1024x1536 but render at under 240px, so this cuts the
total robot payload from ~1.08MB to ~163KB without changing aspect ratios,
layout or transparency.
2026-05-30 09:46:12 +03:00
777genius
35b76f1354 perf: share bootstrap transcript tail parse across members
During launch, the bootstrap-wait loop polls each member and, per member, re-read
and re-JSON.parsed the same growing transcript tail (readRecentBootstrapTranscriptOutcome
was the top main-thread JS hotspot at ~21% during bootstrap, ~40% with its helpers).
The same file was parsed once per member per poll.

Memoize the parsed tail by (filePath, mtime, size) in a shared cache so the file is
read + parsed once per change and reused across all members. The per-member filter
and failure/success scan is byte-for-byte the same logic; only the redundant read +
JSON.parse is removed. Cache is bounded (LRU, same cap as the outcome cache) and
invalidated on mtime/size change, matching the existing outcome cache semantics.

Adds a test asserting the tail is parsed once and shared while per-member outcome
detection is unchanged.
2026-05-30 01:05:54 +03:00
777genius
5d63ecfe32 perf: scope team file watching to active and engaged teams
The main process watched every team directory under ~/.claude/teams (one shallow
chokidar target per team root, per team inboxes, and per task dir). On macOS this
falls back to kqueue, which needs one fd per watched file, so a workspace with
many teams kept ~1600 descriptors open and made startup and reconcile work scale
with the number of teams on disk.

Scope the team-root and task watching to teams that are running or currently
engaged in the UI. The teams root and every team's inboxes are still watched for
all teams, so cross-team message delivery, the lead inbox->stdin relay, and
notifications are unchanged. Idle teams are static, so dropping their team-root/
task watches is safe; opening a team (getData) or launching it re-adds it via an
immediate watch-scope refresh. The provider falls back to watching every team
when unset, and the EMFILE polling fallback is intentionally left unscoped so a
scope change can never look like a deletion.

Measured on a 162-team workspace: open team fds 1600 -> 730, with team-root
watching restored the moment a team is opened or goes live.
2026-05-30 00:25:55 +03:00
777genius
d06ea7f265 perf: cache runtime process table to stop ps spawn storm
listRuntimeProcesses spawned a full `ps -ax` on every call with no caching.
It is invoked very frequently while a team runs: runtime liveness/telemetry
snapshots are rebuilt whenever a team file changes (invalidateRuntimeSnapshotCaches
fires from ~25 sites), so the main process ended up forking ps dozens of times
per second, which is expensive from the large Electron main process and pegged
its CPU.

Add a 1s TTL cache plus in-flight coalescing on the single ps spawn point.
Liveness/telemetry callers already tolerate ~2s staleness via their own snapshot
caches and the OS process table changes negligibly within a second, so this caps
ps to <=1/s without affecting liveness correctness. Measured posix_spawn dropped
from ~146/s to ~11/s with a team running.
2026-05-29 23:46:22 +03:00
777genius
c0b9b4ec5d perf: isolate task detail dialog state 2026-05-29 17:02:00 +03:00
777genius
b830a3c53d perf: preload task detail dialog 2026-05-29 16:39:38 +03:00
777genius
5898fbeaa9 perf: slow live runtime polling 2026-05-29 16:15:54 +03:00
777genius
322c63ea8b perf: skip offline runtime polling 2026-05-29 16:11:16 +03:00
777genius
e4483a3ad6 perf: reduce idle team refresh polling 2026-05-29 15:57:12 +03:00
777genius
889e4cc374 perf: avoid sidebar timeline virtualization churn 2026-05-29 15:54:49 +03:00
777genius
0555a5e3be perf: trim kanban resize handles 2026-05-29 15:51:40 +03:00
777genius
b4b9175287 perf: reuse team summary for comment notification init 2026-05-29 15:43:24 +03:00
777genius
e1e5b68e8a perf: simplify kanban task card chrome 2026-05-29 15:43:03 +03:00
777genius
32b7bc3386 perf: unmount hidden team list tab 2026-05-29 15:37:12 +03:00
777genius
0c6bf5dd33 perf: page large team sections 2026-05-29 15:30:27 +03:00
777genius
2793dfc5b2 fix: keep team card text measurable 2026-05-29 15:25:01 +03:00
777genius
7ab0a41728 perf: simplify compact thought previews 2026-05-29 15:20:03 +03:00
777genius
f05aefe097 perf: trim team page render overhead 2026-05-29 15:15:01 +03:00
777genius
ef129f8931 perf: skip unchanged lead log renders 2026-05-29 14:46:54 +03:00
777genius
2303346e9c perf: avoid duplicate bootstrap transcript parsing 2026-05-29 14:42:42 +03:00
777genius
f0514a7d17 perf: skip unverifiable runtime process scans 2026-05-29 14:38:01 +03:00
777genius
9be096f864 perf: cache persisted bootstrap outcome lookups 2026-05-29 14:19:04 +03:00
777genius
d0c6fdd28c perf: extend persisted runtime probe cache 2026-05-29 14:04:28 +03:00
777genius
dd71420314 perf: extend persisted spawn status cache 2026-05-29 13:10:41 +03:00
777genius
35a9b05637 perf: cache persisted spawn status reads 2026-05-29 13:06:26 +03:00
777genius
fa242d9ff6 perf: cache bootstrap transcript outcomes 2026-05-29 12:58:15 +03:00
777genius
0b97985474 perf: cache team transcript affinity checks 2026-05-29 12:46:29 +03:00
777genius
169ac8bb68 perf: include process table usage metrics 2026-05-29 12:34:13 +03:00
777genius
3b0c2ed24b perf: cache runtime usage telemetry 2026-05-29 12:29:37 +03:00
777genius
7d21f9bd76 perf: avoid stale runtime pid sampling 2026-05-29 12:26:15 +03:00
777genius
906942cb7a perf: isolate messages panel logic exports 2026-05-29 12:26:09 +03:00
777genius
3c37b22379 perf: debounce messages scroll persistence 2026-05-29 12:26:04 +03:00
777genius
fa3f8ce85c perf: defer message composer suggestion data 2026-05-29 12:25:53 +03:00
777genius
4b85433afb perf: stop composer orbit idle animation 2026-05-29 12:23:45 +03:00
777genius
4458ec1fd7 fix(opencode): wire junction diagnostics on dev 2026-05-28 13:12:02 +03:00
777genius
8abf4ea7dd fix(opencode): harden Windows junction retry 2026-05-28 13:08:55 +03:00
ComradeSwarog
b12106d8f4 fix(test): use expect.any(String) for junction error message assertions
The failure.message passed to ensureOpenCodeProfileNodeModulesJunction
comes from normalizeCommandFailure which may produce a JSON-escaped
string when the error contains structured JSON in stdout. Using the
raw runtimeMessage literal causes a mismatch in CI. Switch to
expect.any(String) to accept any string value for the errorMessage
parameter while still verifying the call happens.
2026-05-28 13:08:55 +03:00
ComradeSwarog
cc3c9f7dc7 fix(opencode): address code review feedback — extract paths from error message, fix test imports
- Extract symlink source/target paths directly from the error message
  instead of reconstructing them from process.env (Codex P2 review)
- Add extractSymlinkSourcePath and extractSymlinkTargetPath functions
- Update ensureOpenCodeProfileNodeModulesJunction to accept optional
  errorMessage parameter and use extracted paths from it
- Fix unused imports in test (remove 'os', replace 'beforeEach' with
  'afterEach' per CodeRabbit review)
- Widen fs.statSync mock signatures to use Parameters<typeof fs.statSync>
  per CodeRabbit review
- Add tests for new extraction functions
- Pass errorMessage to ensureOpenCodeProfileNodeModulesJunction calls
  in CLI client tests
2026-05-28 13:08:55 +03:00
ComradeSwarog
597c690dbc fix(opencode): add Windows junction fallback for node_modules EPERM symlink error (#187)
On Windows 10 without Developer Mode, the OpenCode runtime fails to create
a symlink from shared-cache/config-node_modules to the profile's
node_modules directory. The EPERM error blocks the entire OpenCode provider
catalog, leaving it unavailable.

Changes:
- New openCodeWindowsNodeModulesJunction module that pre-creates a Windows
  directory junction (no Developer Mode required) before the runtime call
  when an EPERM symlink error is detected
- On Windows, loadView and loadProviderDirectory now detect EPERM symlink
  errors, extract the profile ID, create the junction, and retry the
  runtime command once before falling back to the error response
- Updated diagnostic hints to accurately reflect that the runtime does not
  yet include junction fallback, and that the next runtime update will
  include it
- Added unit tests for the junction module and retry behavior
2026-05-28 13:08:54 +03:00
777genius
1126b1ee38 fix(ci): restore dev validation 2026-05-28 01:47:43 +03:00
infiniti
fa36d7f3c0
fix(opencode): extend summary status timeout 2026-05-28 00:39:53 +03:00
777genius
6fbba5feb9 fix(runtime): default provider panel to providers 2026-05-28 00:27:54 +03:00