agent-ecosystem

Author	SHA1	Message	Date
777genius	e7d7b3014e	perf(main): reduce runtime launch metadata work	2026-05-30 21:52:36 +03:00
777genius	9f8fc6895a	fix(renderer): use context menu open change signal	2026-05-30 21:02:51 +03:00
777genius	b8a53dcc09	perf(renderer): reduce member card render work	2026-05-30 20:58:31 +03:00
777genius	60d806135c	perf(renderer): lazy mount task context menus	2026-05-30 20:54:22 +03:00
777genius	2c13516d9f	perf(renderer): lazy render member hover cards	2026-05-30 20:50:17 +03:00
777genius	d59865f300	perf(team): reduce runtime telemetry polling	2026-05-30 20:03:45 +03:00
777genius	b13ee56359	perf(team): cap global task projections earlier	2026-05-30 18:57:45 +03:00
777genius	304d0a5ef1	perf(team): avoid redundant task cache clone	2026-05-30 18:48:43 +03:00
777genius	7be9158eb3	fix(team): prevent composer recipient overflow	2026-05-30 18:46:21 +03:00
777genius	a7606032fc	fix(runtime): preserve provider status during refresh failures	2026-05-30 18:44:31 +03:00
777genius	a06423a574	fix(team): preserve live overlay in worker message pages	2026-05-30 18:44:12 +03:00
777genius	180bdb7575	perf(team): cache transcript affinity verdicts	2026-05-30 18:39:16 +03:00
777genius	0a8fbc9801	fix: avoid render-time ref access in height reveal	2026-05-30 18:02:19 +03:00
777genius	1ebeba8f6e	fix: satisfy lint after performance fixes	2026-05-30 17:43:08 +03:00
777genius	9ed1988346	fix: refresh watch scope on provider fallback	2026-05-30 17:34:05 +03:00
777genius	06036460e9	fix: keep watch scope safe on provider errors	2026-05-30 17:11:31 +03:00
777genius	3b7b5dfd75	fix: preserve file lock acquire errors	2026-05-30 17:08:29 +03:00
777genius	a18009cc0f	fix: keep created teams in watch scope	2026-05-30 16:51:47 +03:00
777genius	f4155a6742	test: align idle watchdog expectation with stale window	2026-05-30 16:32:20 +03:00
777genius	1af8dd638b	docs: normalize jsonl reader comment punctuation	2026-05-30 16:07:44 +03:00
777genius	d43cd3a0db	test: sort team config reader imports	2026-05-30 16:06:01 +03:00
777genius	8cb44cd793	perf: replace readline with a chunked line generator in team JSONL readers readline.createInterface runs an expensive Unicode line-break regex + extra stream/string-decoder machinery per chunk. The main transcript parser (parseJsonlStream) already uses a buffer + manual newline split; these per-team readers still used readline. Add readJsonlLines(): an async generator that yields a JSONL file's lines via a chunked utf8 stream read + a plain '\n' split (drop-in for 'for await (const line of rl)'), so the consumers' loop bodies are unchanged. Stream is utf8-decoded before splitting, so multi-byte chars across chunk boundaries are safe; trailing CR (CRLF) is stripped; empty lines and a final newline-less line are yielded, matching readline; breaking out of the loop destroys the stream via the generator's finally. Adopt it in MemberStatsComputer, TaskBoundaryParser, and FileContentResolver (file-history scan). Behavior-identical (their existing tests pass: 18 + 6 + 12) plus 6 new tests for the generator (CRLF, empty lines, no-trailing-newline, early break, multi-byte chunk boundary). Note: session-browser readline paths (jsonl metadata extractor, metadataExtraction, SessionContentFilter) are off the launch path and left as-is for now.	2026-05-30 15:58:09 +03:00
777genius	92f1000a4f	test: cover transcript head metadata cache	2026-05-30 15:51:59 +03:00
777genius	127d31ba88	perf: cache unchanged team task reads	2026-05-30 15:51:15 +03:00
777genius	28a55416ca	test: isolate runtime usage stubs from process table	2026-05-30 15:46:27 +03:00
777genius	61e2678a5d	perf: cache team transcript head metadata	2026-05-30 15:26:09 +03:00
777genius	0a750a9fa8	perf: share a frozen team-summary snapshot instead of cloning on every listTeams listTeams() deep-cloned ALL team summaries via structuredClone on every call -- even cache hits and concurrent in-flight awaiters. A heap allocation sample of a launch put this (listTeams -> cloneTeamSummaries -> structuredClone) as the single largest memory allocator, driving heap churn + GC pressure during launch (this stand has ~158 teams, and listTeams is called constantly: startup, notification init, task projection, IPC polls, provisioning). Build ONE deep-frozen, independent snapshot per uncached load and hand the same reference to the cache entry, in-flight awaiters, and every later reader. The single cloneTeamSummaries keeps it independent of any cached config the loader returns; freezing lets all readers share it safely. Audited every listTeams consumer -- all iterate / map / filter / serialize, none mutate -- and the freeze turns any stray future mutation into a loud error rather than silent cross-caller corruption. TeamConfigReader 26/26 (added a frozen + same-reference regression test), and the listTeams consumers (TeamDataService 116, CrossTeamService 26) all pass under frozen summaries.	2026-05-30 15:25:26 +03:00
777genius	58dfac8377	fix: preserve runtime rss fallback when process table misses roots	2026-05-30 15:16:03 +03:00
777genius	f0797e2c12	perf: replace readline with a bounded chunked read in the affinity head scan fileBelongsToTeam streamed the head window via createReadStream + readline. readline's line iterator runs an expensive Unicode line-break regex and stream/string-decoder machinery per chunk, which showed up as a top main-thread cost during launch (the line-split regex alone was ~5.7% in the warm launch profile). Replace it with a bounded chunked fs.read + a plain '\n' split. JSONL is strictly newline-delimited and each line is trim()'d (so a trailing CR from CRLF is dropped), so a '\n' split is cheaper and more correct (it will not split on a bare CR or a Unicode line/paragraph separator inside a JSON string value, which readline would). A StringDecoder preserves multi-byte UTF-8 sequences that straddle a chunk boundary. Byte-identical semantics to the old loop: inspect up to TEAM_AFFINITY_SCAN_LINES non-empty lines, first match wins via early break, and a final line is honored even without a trailing newline. Reads in 64KB chunks so a team decided in its first lines is not penalized by a huge file. Adds tests for CRLF endings + no-trailing-newline, a multi-byte char straddling the 64KB boundary, and the 40-line window bound (21 pass).	2026-05-30 14:54:58 +03:00
777genius	5c1d2e8d92	perf: memoize bootstrap failure-reason per parsed line across members readRecentBootstrapTranscriptOutcome ran extractBootstrapFailureReason (regex + optional JSON.parse + ~40 substring scans per call) for every candidate line, once per team member: an anonymous transcript line (no agentName) passes the per-member filter for ALL members, so an N-member team re-extracted the same line's failure reason N times per scan -- a top main-thread cost in the warm launch profile (~5.4% with isBootstrapInstructionPrompt). Memoize the result on the shared parsed-tail line (parsedBootstrapTranscriptTailCache already reuses the same line objects across members and re-scans while the file is unchanged). The compute stays LAZY inside the newest-first candidate loop, so each line is extracted at most once across all members AND lines past the first match are never extracted -- preserving the early-break that an eager precompute would lose. The memo is keyed implicitly by the line's cache entry (filePath + mtime + size); a file change re-parses into fresh line objects, so it cannot drift from the line text. Pure-function memoization: byte-identical input (candidate.text === parsedLine.text) -> identical output, so failure/success/null outcomes and reasons are unchanged. Bootstrap-transcript outcome tests pass (306), no behavior change.	2026-05-30 14:34:57 +03:00
777genius	2dcd5cb6a8	perf: align ps process-table cache TTL with the 2s consumer window The runtime process table is read through a module-level executor cache (1s TTL) but its consumers rebuild on a 2s snapshot-cache cadence and fire constantly during a launch (every team file change invalidates a per-team snapshot). A 1s table TTL is shorter than the 2s read cadence, so 'ps -ax' was re-spawned on essentially every consumer rebuild for no freshness benefit -- a top main-thread cost in the warm launch profile (~5-8%). Raise the table TTL to 2s to match the consumer window so the spawn coalesces to at most once per consumer cycle. Safe: liveness is identity-matched (team+agent+command), not bare-PID, and OpenCode host cleanup re-validates each PID against live state before killing, so a ~2s-stale table cannot cause a wrong liveness verdict or an unsafe kill -- it only widens an already-tolerated display-staleness window by ~1s. No test asserts the TTL value.	2026-05-30 14:29:21 +03:00
777genius	776298b0e3	perf: reuse caller stat in fileBelongsToTeam, drop duplicate fs.stat On the live resolution path collectRootJsonlSessionIds already stat()s each root jsonl for its mtime-window filter, then fileBelongsToTeam stat()ed the very same file again for its cache validation -- two fs.stat syscalls (plus two Stats allocations) per file, every poll. fileBelongsToTeam now takes an optional precomputed stat and the mtime-filter caller passes the stat it already has, so the file is statted once. Measured 20 files -> 20 stat calls on the mtime path (was ~40). Using a single stat snapshot is also slightly more consistent than two reads that could straddle a concurrent write. The other call site (subagent scan) passes no stat and is unchanged (fileBelongsToTeam stats it itself). Adds a regression test that a caller-supplied stat is the one recorded in the affinity cache.	2026-05-30 14:11:58 +03:00
777genius	635321dd9a	perf: drop per-acquire existsSync stat from file lock fast path tryAcquire() ran fs.existsSync(dir) on every lock acquisition to lazily create the lock directory. The dir almost always already exists (team dirs are created up front), so that stat was wasted on every acquire — and the file-lock path churns heavily during a launch (inbox writes, journals, task mutations all lock per write, showing up as unlink/open/stat syscall load in the launch profile). Attempt the openSync('wx') directly; only on ENOENT create the directory and retry the acquire in the same call. Same locking semantics, same files, same first-acquire latency for a missing dir — just one fewer stat per acquire in the common case.	2026-05-30 13:55:43 +03:00
777genius	c8d40be460	perf: cache negative team-affinity verdicts from a full head window fileBelongsToTeam only cached POSITIVE affinity durably; a negative verdict was re-decided on any change, so during a launch every non-matching transcript in the project dir that grew (mtime+size change from an active session) was re-streamed (createReadStream+readline) and re-parsed (up to 40 head lines) on every bootstrap poll. A live atlas-hq-5 launch profile put this whole subsystem (readline streaming + fileBelongsToTeam + line/team matching) at ~31% of main-thread JS, the single largest launch cost. A team's first 40 head lines are immutable for an append-only transcript, so a `false` decided from a FULL inspected window (>= TEAM_AFFINITY_SCAN_LINES) stays valid while the file only grows. Track headWindowFull on the cache entry and short-circuit such negatives the same way positives are short-circuited (size >= cached). Short files (partial window) are still re-scanned on growth, so a team mention that later lands inside the head window is still detected. A shrink/rewrite (size < cached) forces a re-scan, identical to the positive path. Behavior-preserving for affinity correctness (no new false negatives); only removes redundant re-streams. Adds regression tests for both the durable-negative and the short-file-flips-to-true cases.	2026-05-30 13:17:49 +03:00
777genius	126a485477	wip: team messages panel updates and runtime usage cache refinements Checkpoint of in-progress work: - renderer: team messages panel/composer, messagesPanelLogic, teamSlice, AnimatedHeightReveal plus their tests - main: runtime process usage-stats caching (ignoreCachedMisses, bounded eviction), alive-run-id helpers, team watch-scope notify wiring Note: the getTeamAgentRuntimeSnapshot rssBytes expectation in TeamAgentLaunchMatrix.safe-e2e is environment-dependent and still red.	2026-05-30 12:54:11 +03:00
777genius	d0c64fabb8	fix: export notifyTeamWatchScopeChanged so the committed build resolves TeamProvisioningService imports notifyTeamWatchScopeChanged (added with the setAliveRunId/deleteAliveRunId helpers) but the export was missing, so a clean checkout of the branch failed to typecheck. Add the export plus a test; the call-site wiring stays as in-progress work.	2026-05-30 12:33:57 +03:00
777genius	ccea3e015d	perf: avoid opening bootstrap transcripts on cache hits readRecentBootstrapTranscriptOutcome opened the session file and stat'd it through the handle BEFORE checking the per-(file,mtime,size) outcome cache, so every cache hit still paid a full open()+close(). During a tracked launch the per-member lookup cache is bypassed, so the project-dir scan re-runs every poll across every recent session file x every member, turning each cache hit into a wasted open() syscall. The native sample of a 6-member mixed launch put __open at ~51-54k; correcting the earlier attribution, this open-before-cache-check (NOT the watcher rebuilds addressed in the previous commit) is the dominant source -- removing the per-rebuild watcher churn left __open essentially flat. Stat the path with fs.promises.stat (no fd) for the cache check and return the cached outcome without opening. getParsedBootstrapTranscriptTail now opens the file itself, lazily, only when its shared parse cache also misses (the file genuinely changed since last parse), so a hit on either the per-member outcome cache or the shared tail-parse cache avoids the open entirely. The tail read is wrapped in try/finally to close the handle. Parsing/scan logic is byte-for-byte unchanged; only the redundant open is removed. Bootstrap-transcript tests pass.	2026-05-30 12:23:48 +03:00
777genius	8b3cec8013	perf: update team watcher incrementally instead of full rebuild A team launch repeatedly changes the watched target set (new dirs appear), and each change tore down the chokidar watcher and recreated it over the full target set. On macOS chokidar uses kqueue with one fd per watched file, so every rebuild re-opened an fd for EVERY watched file (the large always-watched inbox set plus scoped dirs). Profiling a 6-member mixed launch showed ~54k open() syscalls dominated by these rebuilds. Keep one persistent watcher and apply target-set changes with add()/unwatch() on the delta only, so a reconcile opens fds for just the newly added dirs. The initial watcher still uses ignoreInitial for a silent startup baseline, and emitExistingFilesForNewTargets still backfills files already present in newly added dirs, so the emitted event surface is unchanged. Because the watcher is no longer recreated per reconcile, the stale-old-generation and close-throws-during-rebuild failure modes are gone; their tests are replaced with incremental add/unwatch and persistent-watcher coverage. All 69 watcher tests pass.	2026-05-30 12:05:23 +03:00
777genius	b3565a0d29	perf: skip stale session files when scanning for bootstrap transcripts readBootstrapTranscriptOutcomesInProjectRoot iterated every .jsonl in the project dir, opening + tail-reading each per member per bootstrap poll. A real project dir (e.g. ~280 session files) made this the dominant file-open churn during launch (the native sample showed ~56k open() syscalls). A transcript last modified before the lookup window cannot contain a bootstrap line at/after sinceMs (append-only logs: a line's timestamp <= its write time <= the file mtime), so readRecentBootstrapTranscriptOutcome returns null for it. Skip those with a cheap stat instead of opening them; a 5s slack absorbs clock skew between the line timestamp source and the filesystem mtime. Behavior is unchanged (only files that would have returned null are skipped); bootstrap-transcript detection tests still pass.	2026-05-30 11:16:18 +03:00
777genius	a4b9512c7c	perf: keep positive transcript team-affinity cached as files grow fileBelongsToTeam streams a transcript's head lines to decide if it belongs to a team, cached by (mtime,size). During launch the team's own session transcripts grow on every poll, invalidating the cache and forcing a re-stream + head re-parse each time (profiled at ~7-8% main-thread JS after the earlier fixes). A positive affinity is decided by early head lines that persist as an append-only transcript grows, so a true result stays valid while the file only grows. Reuse a cached true when size has not shrunk; a false result is still re-checked on any change (a short file may grow head lines mentioning the team) and a shrink forces a re-scan. Existing resolver tests still pass.	2026-05-30 10:15:07 +03:00
777genius	1b4838d422	perf: normalize bootstrap transcript lines once across members isBootstrapTranscriptContextText and getBootstrapTranscriptSuccessSource each ran text.replace(/\s+/g,' ').trim().toLowerCase() internally. During launch the bootstrap scan checks every transcript line against every context member for every member's poll, so the same line was re-normalized up to (members x contextMembers) times per cycle. Profiling a 6-member mixed launch showed isBootstrapTranscriptContextText at ~11% main-thread JS even after the shared-parse cache. Precompute the normalized form once per parsed line (already cached) and pass it to both detection helpers via a new optional precomputedNormalizedText parameter. The value is identical to what the helpers computed internally, so detection is byte-for-byte unchanged; the helpers stay backward compatible for callers that omit it.	2026-05-30 10:06:47 +03:00
777genius	f79ea145d7	perf: batch per-member task-interval resume into one locked pass During launch the live-status loop resumes every alive member every audit cycle. resumeActiveIntervalsForMember runs a synchronous file-lock + full read of every task file, so for an N-member team with M task files it did N locked passes x M readFileSync per cycle (e.g. 6 members x 20 task files), blocking the main event loop. Profiling a 6-member mixed launch showed mutateTeamTasks/withFileLockSync as a top main-thread cost (~14%). Add resumeActiveIntervalsForMembers that applies the identical per-member resume logic against a member set in a single locked pass, and use it in the live-status loop. Same mutations, but one lock + task read per cycle instead of one per member. Adds a test covering multi-member resume in one pass.	2026-05-30 10:02:01 +03:00
777genius	aa9a1bba8c	perf: debounce team watcher rebuilds during dir-event bursts A team launch creates many directories/files in quick succession (worktrees, inboxes, session logs), and each addDir/unlinkDir event triggered a full TeamTaskWatchRegistry reconcile that tore down and recreated the entire chokidar watcher (re-opening a kqueue fd per watched file on macOS). Profiling a 6-member mixed-team launch showed kqueue churn (kevent) as a top native cost and watcher rebuild as the top remaining main-thread JS cost after the transcript fix. Debounce the event-driven reconcile (250ms) so a burst collapses into one rebuild. collectTargets re-reads the current directory state and emitExistingFilesForNewTargets backfills files created before the rebuild, so no change is missed; requestReconcile, startup, and the periodic 30s reconcile stay immediate. Adds a test asserting a burst of addDir events yields a single rebuild.	2026-05-30 09:46:16 +03:00
777genius	e1475deede	optimize landing robot images downscale oversized robot webp assets to fit-in-480px and recompress at q82. they were shipping at 1024x1536 but render at under 240px, so this cuts the total robot payload from ~1.08MB to ~163KB without changing aspect ratios, layout or transparency.	2026-05-30 09:46:12 +03:00
777genius	35b76f1354	perf: share bootstrap transcript tail parse across members During launch, the bootstrap-wait loop polls each member and, per member, re-read and re-JSON.parsed the same growing transcript tail (readRecentBootstrapTranscriptOutcome was the top main-thread JS hotspot at ~21% during bootstrap, ~40% with its helpers). The same file was parsed once per member per poll. Memoize the parsed tail by (filePath, mtime, size) in a shared cache so the file is read + parsed once per change and reused across all members. The per-member filter and failure/success scan is byte-for-byte the same logic; only the redundant read + JSON.parse is removed. Cache is bounded (LRU, same cap as the outcome cache) and invalidated on mtime/size change, matching the existing outcome cache semantics. Adds a test asserting the tail is parsed once and shared while per-member outcome detection is unchanged.	2026-05-30 01:05:54 +03:00
777genius	5d63ecfe32	perf: scope team file watching to active and engaged teams The main process watched every team directory under ~/.claude/teams (one shallow chokidar target per team root, per team inboxes, and per task dir). On macOS this falls back to kqueue, which needs one fd per watched file, so a workspace with many teams kept ~1600 descriptors open and made startup and reconcile work scale with the number of teams on disk. Scope the team-root and task watching to teams that are running or currently engaged in the UI. The teams root and every team's inboxes are still watched for all teams, so cross-team message delivery, the lead inbox->stdin relay, and notifications are unchanged. Idle teams are static, so dropping their team-root/task watches is safe; opening a team (getData) or launching it re-adds it via an immediate watch-scope refresh. The provider falls back to watching every team when unset, and the EMFILE polling fallback is intentionally left unscoped so a scope change can never look like a deletion. Measured on a 162-team workspace: open team fds 1600 -> 730, with team-root watching restored the moment a team is opened or goes live.	2026-05-30 00:25:55 +03:00
777genius	d06ea7f265	perf: cache runtime process table to stop ps spawn storm listRuntimeProcesses spawned a full `ps -ax` on every call with no caching. It is invoked very frequently while a team runs: runtime liveness/telemetry snapshots are rebuilt whenever a team file changes (invalidateRuntimeSnapshotCaches fires from ~25 sites), so the main process ended up forking ps dozens of times per second, which is expensive from the large Electron main process and pegged its CPU. Add a 1s TTL cache plus in-flight coalescing on the single ps spawn point. Liveness/telemetry callers already tolerate ~2s staleness via their own snapshot caches and the OS process table changes negligibly within a second, so this caps ps to <=1/s without affecting liveness correctness. Measured posix_spawn dropped from ~146/s to ~11/s with a team running.	2026-05-29 23:46:22 +03:00
777genius	c0b9b4ec5d	perf: isolate task detail dialog state	2026-05-29 17:02:00 +03:00
777genius	b830a3c53d	perf: preload task detail dialog	2026-05-29 16:39:38 +03:00
777genius	5898fbeaa9	perf: slow live runtime polling	2026-05-29 16:15:54 +03:00
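
The chunked line reader described in commit 8cb44cd793 might look roughly like the sketch below. The name readJsonlLines comes from the commit message; its body, the LineSplitter class, and the stripTrailingCr helper are assumptions made for illustration, not the repository's actual code. The sketch relies on the utf8-decoded stream to keep multi-byte characters intact across chunk boundaries, splits on a plain '\n', strips one trailing CR, and yields a final line that has no trailing newline.

```typescript
import { createReadStream } from "node:fs";

// Hypothetical helper: drops a single trailing '\r' left over from CRLF endings.
function stripTrailingCr(line: string): string {
  return line.endsWith("\r") ? line.slice(0, -1) : line;
}

// Hypothetical helper: accumulates decoded text and hands back complete lines.
export class LineSplitter {
  private pending = "";

  // Returns every line completed by this chunk; the unfinished tail is kept.
  push(chunk: string): string[] {
    const parts = (this.pending + chunk).split("\n");
    // The last part is an incomplete line, or "" if the chunk ended on '\n'.
    this.pending = parts.pop() ?? "";
    return parts.map(stripTrailingCr);
  }

  // Returns the final newline-less line, if any, once the input is exhausted.
  flush(): string[] {
    const rest = this.pending;
    this.pending = "";
    return rest.length > 0 ? [stripTrailingCr(rest)] : [];
  }
}

// Sketch of an async generator usable as `for await (const line of ...)`.
export async function* readJsonlLines(filePath: string): AsyncGenerator<string> {
  // With an encoding set, the stream emits decoded strings, so multi-byte
  // characters that straddle a chunk boundary are not split.
  const stream = createReadStream(filePath, { encoding: "utf8" });
  const splitter = new LineSplitter();
  try {
    for await (const chunk of stream) {
      yield* splitter.push(chunk as string);
    }
    yield* splitter.flush();
  } finally {
    // Runs on normal completion and when the consumer breaks out early.
    stream.destroy();
  }
}
```

A consumer would iterate it like the readline interface it replaces, for example `for await (const line of readJsonlLines(path)) { ... }`. Keeping the splitting logic in a small synchronous class makes the CRLF and chunk-boundary cases easy to test without touching the filesystem.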


2350 commits
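
Commits d06ea7f265 and 2dcd5cb6a8 describe a TTL cache with in-flight coalescing in front of the single `ps -ax` spawn point. A minimal sketch of that pattern follows. The class name TtlCoalescingCache, its constructor parameters, and the choice to measure the TTL from the moment the load starts are all assumptions for illustration; the commits do not show the real implementation.

```typescript
// Hypothetical sketch: one shared promise serves both concurrent callers
// (in-flight coalescing) and later callers inside the TTL window.
export class TtlCoalescingCache<T> {
  private entry: { promise: Promise<T>; startedAt: number } | null = null;
  private readonly load: () => Promise<T>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(load: () => Promise<T>, ttlMs: number, now: () => number = Date.now) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(): Promise<T> {
    const current = this.entry;
    if (current !== null && this.now() - current.startedAt < this.ttlMs) {
      // Fresh enough: reuse the pending or settled promise, no new load.
      return current.promise;
    }
    const promise = this.load();
    const fresh = { promise, startedAt: this.now() };
    this.entry = fresh;
    // Do not keep a failed load cached; the next caller retries.
    promise.catch(() => {
      if (this.entry === fresh) this.entry = null;
    });
    return promise;
  }
}
```

With `ttlMs` set to 2000, calls arriving within two seconds of a load share one result, which corresponds to the "at most once per consumer cycle" behavior the later commit aims for. The optional `now` parameter exists only so a test can control the clock.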