agent-ecosystem

Author	SHA1	Message	Date
777genius	2dcd5cb6a8	perf: align ps process-table cache TTL with the 2s consumer window The runtime process table is read through a module-level executor cache (1s TTL) but its consumers rebuild on a 2s snapshot-cache cadence and fire constantly during a launch (every team file change invalidates a per-team snapshot). A 1s table TTL is shorter than the 2s read cadence, so 'ps -ax' was re-spawned on essentially every consumer rebuild for no freshness benefit -- a top main-thread cost in the warm launch profile (~5-8%). Raise the table TTL to 2s to match the consumer window so the spawn coalesces to at most once per consumer cycle. Safe: liveness is identity-matched (team+agent+command), not bare-PID, and OpenCode host cleanup re-validates each PID against live state before killing, so a ~2s-stale table cannot cause a wrong liveness verdict or an unsafe kill -- it only widens an already-tolerated display-staleness window by ~1s. No test asserts the TTL value.	2026-05-30 14:29:21 +03:00
777genius	776298b0e3	perf: reuse caller stat in fileBelongsToTeam, drop duplicate fs.stat On the live resolution path collectRootJsonlSessionIds already stat()s each root jsonl for its mtime-window filter, then fileBelongsToTeam stat()ed the very same file again for its cache validation -- two fs.stat syscalls (plus two Stats allocations) per file, every poll. fileBelongsToTeam now takes an optional precomputed stat and the mtime-filter caller passes the stat it already has, so the file is statted once. Measured 20 files -> 20 stat calls on the mtime path (was ~40). Using a single stat snapshot is also slightly more consistent than two reads that could straddle a concurrent write. The other call site (subagent scan) passes no stat and is unchanged (fileBelongsToTeam stats it itself). Adds a regression test that a caller-supplied stat is the one recorded in the affinity cache.	2026-05-30 14:11:58 +03:00
777genius	635321dd9a	perf: drop per-acquire existsSync stat from file lock fast path tryAcquire() ran fs.existsSync(dir) on every lock acquisition to lazily create the lock directory. The dir almost always already exists (team dirs are created up front), so that stat was wasted on every acquire — and the file-lock path churns heavily during a launch (inbox writes, journals, task mutations all lock per write, showing up as unlink/open/stat syscall load in the launch profile). Attempt the openSync('wx') directly; only on ENOENT create the directory and retry the acquire in the same call. Same locking semantics, same files, same first-acquire latency for a missing dir — just one fewer stat per acquire in the common case.	2026-05-30 13:55:43 +03:00
777genius	c8d40be460	perf: cache negative team-affinity verdicts from a full head window fileBelongsToTeam only cached POSITIVE affinity durably; a negative verdict was re-decided on any change, so during a launch every non-matching transcript in the project dir that grew (mtime+size change from an active session) was re-streamed (createReadStream+readline) and re-parsed (up to 40 head lines) on every bootstrap poll. A live atlas-hq-5 launch profile put this whole subsystem (readline streaming + fileBelongsToTeam + line/team matching) at ~31% of main-thread JS, the single largest launch cost. A team's first 40 head lines are immutable for an append-only transcript, so a `false` decided from a FULL inspected window (>= TEAM_AFFINITY_SCAN_LINES) stays valid while the file only grows. Track headWindowFull on the cache entry and short-circuit such negatives the same way positives are short-circuited (size >= cached). Short files (partial window) are still re-scanned on growth, so a team mention that later lands inside the head window is still detected. A shrink/rewrite (size < cached) forces a re-scan, identical to the positive path. Behavior-preserving for affinity correctness (no new false negatives); only removes redundant re-streams. Adds regression tests for both the durable-negative and the short-file-flips-to-true cases.	2026-05-30 13:17:49 +03:00
777genius	126a485477	wip: team messages panel updates and runtime usage cache refinements Checkpoint of in-progress work: - renderer: team messages panel/composer, messagesPanelLogic, teamSlice, AnimatedHeightReveal plus their tests - main: runtime process usage-stats caching (ignoreCachedMisses, bounded eviction), alive-run-id helpers, team watch-scope notify wiring Note: the getTeamAgentRuntimeSnapshot rssBytes expectation in TeamAgentLaunchMatrix.safe-e2e is environment-dependent and still red.	2026-05-30 12:54:11 +03:00
777genius	d0c64fabb8	fix: export notifyTeamWatchScopeChanged so the committed build resolves TeamProvisioningService imports notifyTeamWatchScopeChanged (added with the setAliveRunId/deleteAliveRunId helpers) but the export was missing, so a clean checkout of the branch failed to typecheck. Add the export plus a test; the call-site wiring stays as in-progress work.	2026-05-30 12:33:57 +03:00
777genius	ccea3e015d	perf: avoid opening bootstrap transcripts on cache hits readRecentBootstrapTranscriptOutcome opened the session file and stat'd it through the handle BEFORE checking the per-(file,mtime,size) outcome cache, so every cache hit still paid a full open()+close(). During a tracked launch the per-member lookup cache is bypassed, so the project-dir scan re-runs every poll across every recent session file x every member, turning each cache hit into a wasted open() syscall. The native sample of a 6-member mixed launch put __open at ~51-54k; correcting the earlier attribution, this open-before-cache-check (NOT the watcher rebuilds addressed in the previous commit) is the dominant source -- removing the per-rebuild watcher churn left __open essentially flat. Stat the path with fs.promises.stat (no fd) for the cache check and return the cached outcome without opening. getParsedBootstrapTranscriptTail now opens the file itself, lazily, only when its shared parse cache also misses (the file genuinely changed since last parse), so a hit on either the per-member outcome cache or the shared tail-parse cache avoids the open entirely. The tail read is wrapped in try/finally to close the handle. Parsing/scan logic is byte-for-byte unchanged; only the redundant open is removed. Bootstrap-transcript tests pass.	2026-05-30 12:23:48 +03:00
777genius	8b3cec8013	perf: update team watcher incrementally instead of full rebuild A team launch repeatedly changes the watched target set (new dirs appear), and each change tore down the chokidar watcher and recreated it over the full target set. On macOS chokidar uses kqueue with one fd per watched file, so every rebuild re-opened an fd for EVERY watched file (the large always-watched inbox set plus scoped dirs). Profiling a 6-member mixed launch showed ~54k open() syscalls dominated by these rebuilds. Keep one persistent watcher and apply target-set changes with add()/unwatch() on the delta only, so a reconcile opens fds for just the newly added dirs. The initial watcher still uses ignoreInitial for a silent startup baseline, and emitExistingFilesForNewTargets still backfills files already present in newly added dirs, so the emitted event surface is unchanged. Because the watcher is no longer recreated per reconcile, the stale-old-generation and close-throws-during-rebuild failure modes are gone; their tests are replaced with incremental add/unwatch and persistent-watcher coverage. All 69 watcher tests pass.	2026-05-30 12:05:23 +03:00
777genius	b3565a0d29	perf: skip stale session files when scanning for bootstrap transcripts readBootstrapTranscriptOutcomesInProjectRoot iterated every .jsonl in the project dir, opening + tail-reading each per member per bootstrap poll. A real project dir (e.g. ~280 session files) made this the dominant file-open churn during launch (the native sample showed ~56k open() syscalls). A transcript last modified before the lookup window cannot contain a bootstrap line at/after sinceMs (append-only logs: a line's timestamp <= its write time <= the file mtime), so readRecentBootstrapTranscriptOutcome returns null for it. Skip those with a cheap stat instead of opening them; a 5s slack absorbs clock skew between the line timestamp source and the filesystem mtime. Behavior is unchanged (only files that would have returned null are skipped); bootstrap-transcript detection tests still pass.	2026-05-30 11:16:18 +03:00
777genius	a4b9512c7c	perf: keep positive transcript team-affinity cached as files grow fileBelongsToTeam streams a transcript's head lines to decide if it belongs to a team, cached by (mtime,size). During launch the team's own session transcripts grow on every poll, invalidating the cache and forcing a re-stream + head re-parse each time (profiled at ~7-8% main-thread JS after the earlier fixes). A positive affinity is decided by early head lines that persist as an append-only transcript grows, so a true result stays valid while the file only grows. Reuse a cached true when size has not shrunk; a false result is still re-checked on any change (a short file may grow head lines mentioning the team) and a shrink forces a re-scan. Existing resolver tests still pass.	2026-05-30 10:15:07 +03:00
777genius	1b4838d422	perf: normalize bootstrap transcript lines once across members isBootstrapTranscriptContextText and getBootstrapTranscriptSuccessSource each ran text.replace(/\s+/g,' ').trim().toLowerCase() internally. During launch the bootstrap scan checks every transcript line against every context member for every member's poll, so the same line was re-normalized up to (members x contextMembers) times per cycle. Profiling a 6-member mixed launch showed isBootstrapTranscriptContextText at ~11% main-thread JS even after the shared-parse cache. Precompute the normalized form once per parsed line (already cached) and pass it to both detection helpers via a new optional precomputedNormalizedText parameter. The value is identical to what the helpers computed internally, so detection is byte-for-byte unchanged; the helpers stay backward compatible for callers that omit it.	2026-05-30 10:06:47 +03:00
777genius	f79ea145d7	perf: batch per-member task-interval resume into one locked pass During launch the live-status loop resumes every alive member every audit cycle. resumeActiveIntervalsForMember runs a synchronous file-lock + full read of every task file, so for an N-member team with M task files it did N locked passes x M readFileSync per cycle (e.g. 6 members x 20 task files), blocking the main event loop. Profiling a 6-member mixed launch showed mutateTeamTasks/withFileLockSync as a top main-thread cost (~14%). Add resumeActiveIntervalsForMembers that applies the identical per-member resume logic against a member set in a single locked pass, and use it in the live-status loop. Same mutations, but one lock + task read per cycle instead of one per member. Adds a test covering multi-member resume in one pass.	2026-05-30 10:02:01 +03:00
777genius	aa9a1bba8c	perf: debounce team watcher rebuilds during dir-event bursts A team launch creates many directories/files in quick succession (worktrees, inboxes, session logs), and each addDir/unlinkDir event triggered a full TeamTaskWatchRegistry reconcile that tore down and recreated the entire chokidar watcher (re-opening a kqueue fd per watched file on macOS). Profiling a 6-member mixed-team launch showed kqueue churn (kevent) as a top native cost and watcher rebuild as the top remaining main-thread JS cost after the transcript fix. Debounce the event-driven reconcile (250ms) so a burst collapses into one rebuild. collectTargets re-reads the current directory state and emitExistingFilesForNewTargets backfills files created before the rebuild, so no change is missed; requestReconcile, startup, and the periodic 30s reconcile stay immediate. Adds a test asserting a burst of addDir events yields a single rebuild.	2026-05-30 09:46:16 +03:00
777genius	e1475deede	optimize landing robot images downscale oversized robot webp assets to fit-in-480px and recompress at q82. they were shipping at 1024x1536 but render at under 240px, so this cuts the total robot payload from ~1.08MB to ~163KB without changing aspect ratios, layout or transparency.	2026-05-30 09:46:12 +03:00
777genius	35b76f1354	perf: share bootstrap transcript tail parse across members During launch, the bootstrap-wait loop polls each member and, per member, re-read and re-JSON.parsed the same growing transcript tail (readRecentBootstrapTranscriptOutcome was the top main-thread JS hotspot at ~21% during bootstrap, ~40% with its helpers). The same file was parsed once per member per poll. Memoize the parsed tail by (filePath, mtime, size) in a shared cache so the file is read + parsed once per change and reused across all members. The per-member filter and failure/success scan is byte-for-byte the same logic; only the redundant read + JSON.parse is removed. Cache is bounded (LRU, same cap as the outcome cache) and invalidated on mtime/size change, matching the existing outcome cache semantics. Adds a test asserting the tail is parsed once and shared while per-member outcome detection is unchanged.	2026-05-30 01:05:54 +03:00
777genius	5d63ecfe32	perf: scope team file watching to active and engaged teams The main process watched every team directory under ~/.claude/teams (one shallow chokidar target per team root, per team inboxes, and per task dir). On macOS this falls back to kqueue, which needs one fd per watched file, so a workspace with many teams kept ~1600 descriptors open and made startup and reconcile work scale with the number of teams on disk. Scope the team-root and task watching to teams that are running or currently engaged in the UI. The teams root and every team's inboxes are still watched for all teams, so cross-team message delivery, the lead inbox->stdin relay, and notifications are unchanged. Idle teams are static, so dropping their team-root/task watches is safe; opening a team (getData) or launching it re-adds it via an immediate watch-scope refresh. The provider falls back to watching every team when unset, and the EMFILE polling fallback is intentionally left unscoped so a scope change can never look like a deletion. Measured on a 162-team workspace: open team fds 1600 -> 730, with team-root watching restored the moment a team is opened or goes live.	2026-05-30 00:25:55 +03:00
777genius	d06ea7f265	perf: cache runtime process table to stop ps spawn storm listRuntimeProcesses spawned a full `ps -ax` on every call with no caching. It is invoked very frequently while a team runs: runtime liveness/telemetry snapshots are rebuilt whenever a team file changes (invalidateRuntimeSnapshotCaches fires from ~25 sites), so the main process ended up forking ps dozens of times per second, which is expensive from the large Electron main process and pegged its CPU. Add a 1s TTL cache plus in-flight coalescing on the single ps spawn point. Liveness/telemetry callers already tolerate ~2s staleness via their own snapshot caches and the OS process table changes negligibly within a second, so this caps ps to <=1/s without affecting liveness correctness. Measured posix_spawn dropped from ~146/s to ~11/s with a team running.	2026-05-29 23:46:22 +03:00
777genius	c0b9b4ec5d	perf: isolate task detail dialog state	2026-05-29 17:02:00 +03:00
777genius	b830a3c53d	perf: preload task detail dialog	2026-05-29 16:39:38 +03:00
777genius	5898fbeaa9	perf: slow live runtime polling	2026-05-29 16:15:54 +03:00
777genius	322c63ea8b	perf: skip offline runtime polling	2026-05-29 16:11:16 +03:00
777genius	e4483a3ad6	perf: reduce idle team refresh polling	2026-05-29 15:57:12 +03:00
777genius	889e4cc374	perf: avoid sidebar timeline virtualization churn	2026-05-29 15:54:49 +03:00
777genius	0555a5e3be	perf: trim kanban resize handles	2026-05-29 15:51:40 +03:00
777genius	b4b9175287	perf: reuse team summary for comment notification init	2026-05-29 15:43:24 +03:00
777genius	e1e5b68e8a	perf: simplify kanban task card chrome	2026-05-29 15:43:03 +03:00
777genius	32b7bc3386	perf: unmount hidden team list tab	2026-05-29 15:37:12 +03:00
777genius	0c6bf5dd33	perf: page large team sections	2026-05-29 15:30:27 +03:00
777genius	2793dfc5b2	fix: keep team card text measurable	2026-05-29 15:25:01 +03:00
777genius	7ab0a41728	perf: simplify compact thought previews	2026-05-29 15:20:03 +03:00
777genius	f05aefe097	perf: trim team page render overhead	2026-05-29 15:15:01 +03:00
777genius	ef129f8931	perf: skip unchanged lead log renders	2026-05-29 14:46:54 +03:00
777genius	2303346e9c	perf: avoid duplicate bootstrap transcript parsing	2026-05-29 14:42:42 +03:00
777genius	f0514a7d17	perf: skip unverifiable runtime process scans	2026-05-29 14:38:01 +03:00
777genius	9be096f864	perf: cache persisted bootstrap outcome lookups	2026-05-29 14:19:04 +03:00
777genius	d0c6fdd28c	perf: extend persisted runtime probe cache	2026-05-29 14:04:28 +03:00
777genius	dd71420314	perf: extend persisted spawn status cache	2026-05-29 13:10:41 +03:00
777genius	35a9b05637	perf: cache persisted spawn status reads	2026-05-29 13:06:26 +03:00
777genius	fa242d9ff6	perf: cache bootstrap transcript outcomes	2026-05-29 12:58:15 +03:00
777genius	0b97985474	perf: cache team transcript affinity checks	2026-05-29 12:46:29 +03:00
777genius	169ac8bb68	perf: include process table usage metrics	2026-05-29 12:34:13 +03:00
777genius	3b0c2ed24b	perf: cache runtime usage telemetry	2026-05-29 12:29:37 +03:00
777genius	7d21f9bd76	perf: avoid stale runtime pid sampling	2026-05-29 12:26:15 +03:00
777genius	906942cb7a	perf: isolate messages panel logic exports	2026-05-29 12:26:09 +03:00
777genius	3c37b22379	perf: debounce messages scroll persistence	2026-05-29 12:26:04 +03:00
777genius	fa3f8ce85c	perf: defer message composer suggestion data	2026-05-29 12:25:53 +03:00
777genius	4b85433afb	perf: stop composer orbit idle animation	2026-05-29 12:23:45 +03:00
777genius	4458ec1fd7	fix(opencode): wire junction diagnostics on dev	2026-05-28 13:12:02 +03:00
777genius	8abf4ea7dd	fix(opencode): harden Windows junction retry	2026-05-28 13:08:55 +03:00
ComradeSwarog	b12106d8f4	fix(test): use expect.any(String) for junction error message assertions The failure.message passed to ensureOpenCodeProfileNodeModulesJunction comes from normalizeCommandFailure which may produce a JSON-escaped string when the error contains structured JSON in stdout. Using the raw runtimeMessage literal causes a mismatch in CI. Switch to expect.any(String) to accept any string value for the errorMessage parameter while still verifying the call happens.	2026-05-28 13:08:55 +03:00
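
The process-table commits above (d06ea7f265 and 2dcd5cb6a8) describe a TTL cache with in-flight coalescing in front of a single `ps` spawn point. The repository's actual code is not shown in this log, so the following is only a minimal sketch of that pattern. `createTtlCache`, `load`, and the injectable `now` clock are hypothetical names chosen for illustration, and the loader stands in for whatever spawns `ps -ax`.

```typescript
// Hypothetical helper, not the project's implementation: caches one async
// result for ttlMs and shares a single in-flight load among concurrent callers.
type Clock = () => number;

function createTtlCache<T>(
  load: () => Promise<T>,
  ttlMs: number,
  now: Clock = Date.now,
): () => Promise<T> {
  let cached: { value: T; at: number } | undefined;
  let inFlight: Promise<T> | undefined;

  return function get(): Promise<T> {
    // Fresh enough: serve the cached value without calling load().
    if (cached !== undefined && now() - cached.at < ttlMs) {
      return Promise.resolve(cached.value);
    }
    // A load is already running: join it instead of starting another.
    if (inFlight !== undefined) {
      return inFlight;
    }
    const pending = load().then(
      (value) => {
        cached = { value, at: now() };
        inFlight = undefined;
        return value;
      },
      (err) => {
        // Clear the slot on failure so the next caller can retry.
        inFlight = undefined;
        throw err;
      },
    );
    inFlight = pending;
    return pending;
  };
}
```

In this sketch the 1s versus 2s choice discussed in 2dcd5cb6a8 corresponds to the `ttlMs` argument: a TTL shorter than the interval between consumer reads means nearly every read finds the entry expired and triggers a new load.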


2320 commits
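
Commit 635321dd9a describes attempting `openSync('wx')` directly and creating the lock directory only on ENOENT, then retrying within the same call. The sketch below illustrates that control flow under stated assumptions: the function name `tryAcquire` comes from the commit message, but its signature, the `lockPath` parameter, and the choice to return `null` on EEXIST are hypothetical and may differ from the project's code.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Illustrative only: returns an open file descriptor when the lock file was
// created, or null when the lock file already exists (assumed to mean "held").
function tryAcquire(lockPath: string): number | null {
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      // "wx" fails with EEXIST if the file exists, so creation is exclusive.
      return fs.openSync(lockPath, "wx");
    } catch (err) {
      const code = (err as NodeJS.ErrnoException).code;
      if (code === "EEXIST") {
        return null;
      }
      if (code === "ENOENT" && attempt === 0) {
        // Rare path: the parent directory is missing. Create it and retry once.
        fs.mkdirSync(path.dirname(lockPath), { recursive: true });
        continue;
      }
      throw err;
    }
  }
  return null;
}
```

In the common case where the directory already exists, this makes one `open` call and no separate existence check, which is the saving the commit message describes.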