Commit graph

794 commits

Author SHA1 Message Date
777genius
d0c64fabb8 fix: export notifyTeamWatchScopeChanged so the committed build resolves
TeamProvisioningService imports notifyTeamWatchScopeChanged (added with the
setAliveRunId/deleteAliveRunId helpers) but the export was missing, so a clean
checkout of the branch failed to typecheck. Add the export plus a test; the
call-site wiring stays as in-progress work.
2026-05-30 12:33:57 +03:00
777genius
8b3cec8013 perf: update team watcher incrementally instead of full rebuild
A team launch repeatedly changes the watched target set (new dirs appear), and each
change tore down the chokidar watcher and recreated it over the full target set.
On macOS chokidar uses kqueue with one fd per watched file, so every rebuild
re-opened an fd for EVERY watched file (the large always-watched inbox set plus
scoped dirs). Profiling a 6-member mixed launch showed ~54k open() syscalls dominated
by these rebuilds.

Keep one persistent watcher and apply target-set changes with add()/unwatch() on the
delta only, so a reconcile opens fds for just the newly added dirs. The initial
watcher still uses ignoreInitial for a silent startup baseline, and
emitExistingFilesForNewTargets still backfills files already present in newly added
dirs, so the emitted event surface is unchanged. Because the watcher is no longer
recreated per reconcile, the stale-old-generation and close-throws-during-rebuild
failure modes are gone; their tests are replaced with incremental add/unwatch and
persistent-watcher coverage. All 69 watcher tests pass.
2026-05-30 12:05:23 +03:00
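The delta-only reconcile described in this commit could be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's code: `PathWatcher`, `diffTargets`, and `applyDelta` are hypothetical names, and the interface only mirrors the `add()`/`unwatch()` calls the commit message mentions.

```typescript
// Hypothetical minimal surface of a persistent watcher. The commit message
// only states that add()/unwatch() are applied to the delta.
interface PathWatcher {
  add(paths: string[]): void;
  unwatch(paths: string[]): void;
}

interface TargetDelta {
  added: string[];
  removed: string[];
}

// Work out which targets are new and which are no longer wanted.
function diffTargets(
  current: ReadonlySet<string>,
  next: ReadonlySet<string>
): TargetDelta {
  const added = Array.from(next).filter((p) => !current.has(p));
  const removed = Array.from(current).filter((p) => !next.has(p));
  return { added, removed };
}

// Apply only the delta to the long-lived watcher and return the new watched set.
// Targets present in both sets are never touched, so no fd is re-opened for them.
function applyDelta(
  watcher: PathWatcher,
  current: ReadonlySet<string>,
  next: ReadonlySet<string>
): Set<string> {
  const { added, removed } = diffTargets(current, next);
  if (removed.length > 0) watcher.unwatch(removed);
  if (added.length > 0) watcher.add(added);
  return new Set(next);
}
```

With a watched set of `a, b` and a new target set of `b, c`, this sketch calls `unwatch(["a"])` and `add(["c"])` and leaves `b` alone.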
777genius
f79ea145d7 perf: batch per-member task-interval resume into one locked pass
During launch the live-status loop resumes every alive member every audit cycle.
resumeActiveIntervalsForMember runs a synchronous file-lock + full read of every
task file, so for an N-member team with M task files it did N locked passes x M
readFileSync per cycle (e.g. 6 members x 20 task files), blocking the main event
loop. Profiling a 6-member mixed launch showed mutateTeamTasks/withFileLockSync as
a top main-thread cost (~14%).

Add resumeActiveIntervalsForMembers that applies the identical per-member resume
logic against a member set in a single locked pass, and use it in the live-status
loop. Same mutations, but one lock + task read per cycle instead of one per member.
Adds a test covering multi-member resume in one pass.
2026-05-30 10:02:01 +03:00
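The difference between one locked pass per member and one locked pass per cycle can be illustrated with a small in-memory model. Everything here is an assumption made for the sketch: the `Task` shape, the `intervalRunning` flag, and `withLock` are hypothetical stand-ins for the real task files and file lock, and the resume rule is deliberately simplified.

```typescript
// Hypothetical task shape; the real task files are not shown in the commit.
interface Task {
  id: string;
  owner: string;
  intervalRunning: boolean;
}

// In-memory stand-in for "lock the task directory and read every task file".
interface TaskStore {
  tasks: Task[];
  lockedPasses: number;
}

function withLock<T>(store: TaskStore, fn: (tasks: Task[]) => T): T {
  store.lockedPasses += 1; // counted so the cost difference is observable
  return fn(store.tasks);
}

// Per-member resume logic, unchanged between the two variants.
function resumeForMember(tasks: Task[], member: string): number {
  let resumed = 0;
  for (const task of tasks) {
    if (task.owner === member && !task.intervalRunning) {
      task.intervalRunning = true;
      resumed += 1;
    }
  }
  return resumed;
}

// Batched variant: one lock and one task read for the whole member set.
function resumeForMembers(store: TaskStore, members: readonly string[]): number {
  return withLock(store, (tasks) => {
    let resumed = 0;
    for (const member of members) {
      resumed += resumeForMember(tasks, member);
    }
    return resumed;
  });
}
```

The mutations are the same as calling the per-member helper in a loop; only the number of lock acquisitions and task reads per cycle changes.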
777genius
aa9a1bba8c perf: debounce team watcher rebuilds during dir-event bursts
A team launch creates many directories/files in quick succession (worktrees,
inboxes, session logs), and each addDir/unlinkDir event triggered a full
TeamTaskWatchRegistry reconcile that tore down and recreated the entire chokidar
watcher (re-opening a kqueue fd per watched file on macOS). Profiling a 6-member
mixed-team launch showed kqueue churn (kevent) as a top native cost and watcher
rebuild as the top remaining main-thread JS cost after the transcript fix.

Debounce the event-driven reconcile (250ms) so a burst collapses into one rebuild.
collectTargets re-reads the current directory state and emitExistingFilesForNewTargets
backfills files created before the rebuild, so no change is missed; requestReconcile,
startup, and the periodic 30s reconcile stay immediate. Adds a test asserting a
burst of addDir events yields a single rebuild.
2026-05-30 09:46:16 +03:00
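A trailing-edge debounce of the kind described here might look like the sketch below. The names `createDebouncedReconcile`, `onDirEvent`, and `Schedule` are hypothetical, the scheduler is injectable only so the behaviour can be exercised without real timers, and cancelling a pending run when an immediate reconcile happens is an assumption of this sketch rather than something the commit states.

```typescript
type Cancel = () => void;
type Schedule = (fn: () => void, delayMs: number) => Cancel;

// Default scheduler backed by real timers.
const timeoutSchedule: Schedule = (fn, delayMs) => {
  const handle = setTimeout(fn, delayMs);
  return () => clearTimeout(handle);
};

function createDebouncedReconcile(
  reconcile: () => void,
  delayMs: number = 250,
  schedule: Schedule = timeoutSchedule
) {
  let cancelPending: Cancel | null = null;

  return {
    // Event-driven path: every event in a burst restarts the timer,
    // so the burst collapses into one reconcile after it goes quiet.
    onDirEvent(): void {
      if (cancelPending) cancelPending();
      cancelPending = schedule(() => {
        cancelPending = null;
        reconcile();
      }, delayMs);
    },

    // Explicit path: run immediately. Dropping the pending debounced run
    // is an assumption made here to avoid a redundant second reconcile.
    requestReconcile(): void {
      if (cancelPending) {
        cancelPending();
        cancelPending = null;
      }
      reconcile();
    },
  };
}
```

Three `onDirEvent()` calls in quick succession schedule a single reconcile, while `requestReconcile()` still runs at once.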
777genius
35b76f1354 perf: share bootstrap transcript tail parse across members
During launch, the bootstrap-wait loop polls each member and, per member, re-read
and re-JSON.parsed the same growing transcript tail (readRecentBootstrapTranscriptOutcome
was the top main-thread JS hotspot at ~21% during bootstrap, ~40% with its helpers).
The same file was parsed once per member per poll.

Memoize the parsed tail by (filePath, mtime, size) in a shared cache so the file is
read + parsed once per change and reused across all members. The per-member filter
and failure/success scan is byte-for-byte the same logic; only the redundant read +
JSON.parse is removed. Cache is bounded (LRU, same cap as the outcome cache) and
invalidated on mtime/size change, matching the existing outcome cache semantics.

Adds a test asserting the tail is parsed once and shared while per-member outcome
detection is unchanged.
2026-05-30 01:05:54 +03:00
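The memoization described above, keyed by file path and invalidated when mtime or size changes, could be sketched like this. `ParsedTailCache` and `FileStamp` are hypothetical names; the caller is assumed to supply the stamp (for example from an `fs.statSync` result) and a `parse` callback, and the eviction policy is a plain least-recently-used scheme built on `Map` insertion order.

```typescript
interface FileStamp {
  mtimeMs: number;
  size: number;
}

interface CacheEntry<T> {
  stamp: FileStamp;
  value: T;
}

class ParsedTailCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  // Return the cached parse result if the file is unchanged, else re-parse.
  get(filePath: string, stamp: FileStamp, parse: () => T): T {
    const hit = this.entries.get(filePath);
    if (hit && hit.stamp.mtimeMs === stamp.mtimeMs && hit.stamp.size === stamp.size) {
      // Map keeps insertion order, so re-inserting marks the entry as recent.
      this.entries.delete(filePath);
      this.entries.set(filePath, hit);
      return hit.value;
    }

    const value = parse();
    this.entries.delete(filePath);
    this.entries.set(filePath, { stamp, value });

    // Evict the least recently used entry once the cap is exceeded.
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    return value;
  }
}
```

Each member would then run its own filter over the shared parsed value, so only the redundant read and `JSON.parse` are avoided.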
777genius
5d63ecfe32 perf: scope team file watching to active and engaged teams
The main process watched every team directory under ~/.claude/teams (one shallow
chokidar target per team root, per team inboxes, and per task dir). On macOS this
falls back to kqueue, which needs one fd per watched file, so a workspace with
many teams kept ~1600 descriptors open and made startup and reconcile work scale
with the number of teams on disk.

Scope the team-root and task watching to teams that are running or currently
engaged in the UI. The teams root and every team's inboxes are still watched for
all teams, so cross-team message delivery, the lead inbox->stdin relay, and
notifications are unchanged. Idle teams are static, so dropping their team-root/
task watches is safe; opening a team (getData) or launching it re-adds it via an
immediate watch-scope refresh. The provider falls back to watching every team
when unset, and the EMFILE polling fallback is intentionally left unscoped so a
scope change can never look like a deletion.

Measured on a 162-team workspace: open team fds 1600 -> 730, with team-root
watching restored the moment a team is opened or goes live.
2026-05-30 00:25:55 +03:00
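The scoping rule can be illustrated with a small target-collection function. This is only a sketch: `collectScopedTargets` is a hypothetical name, and the directory layout (`<teamsRoot>/<team>/inboxes` and `<teamsRoot>/<team>/tasks`) is assumed for illustration and may not match the real on-disk structure.

```typescript
interface WatchTargets {
  always: string[]; // watched for every team
  scoped: string[]; // watched only for running or engaged teams
}

function collectScopedTargets(
  teamsRoot: string,
  teamNames: readonly string[],
  scope?: ReadonlySet<string>
): WatchTargets {
  const always: string[] = [teamsRoot];
  const scoped: string[] = [];

  for (const team of teamNames) {
    const teamDir = `${teamsRoot}/${team}`;

    // Inboxes stay watched for all teams so cross-team delivery keeps working.
    always.push(`${teamDir}/inboxes`); // assumed layout

    // With no scope provided, fall back to watching every team.
    if (scope === undefined || scope.has(team)) {
      scoped.push(teamDir, `${teamDir}/tasks`); // assumed layout
    }
  }
  return { always, scoped };
}
```

Opening or launching a team would add its name to `scope` and trigger a refresh, which brings its team-root and task targets back.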
777genius
b4b9175287 perf: reuse team summary for comment notification init 2026-05-29 15:43:24 +03:00
777genius
f0514a7d17 perf: skip unverifiable runtime process scans 2026-05-29 14:38:01 +03:00
777genius
9be096f864 perf: cache persisted bootstrap outcome lookups 2026-05-29 14:19:04 +03:00
777genius
d0c6fdd28c perf: extend persisted runtime probe cache 2026-05-29 14:04:28 +03:00
777genius
35a9b05637 perf: cache persisted spawn status reads 2026-05-29 13:06:26 +03:00
777genius
fa242d9ff6 perf: cache bootstrap transcript outcomes 2026-05-29 12:58:15 +03:00
777genius
0b97985474 perf: cache team transcript affinity checks 2026-05-29 12:46:29 +03:00
777genius
169ac8bb68 perf: include process table usage metrics 2026-05-29 12:34:13 +03:00
777genius
3b0c2ed24b perf: cache runtime usage telemetry 2026-05-29 12:29:37 +03:00
777genius
7d21f9bd76 perf: avoid stale runtime pid sampling 2026-05-29 12:26:15 +03:00
777genius
4458ec1fd7 fix(opencode): wire junction diagnostics on dev 2026-05-28 13:12:02 +03:00
777genius
8abf4ea7dd fix(opencode): harden Windows junction retry 2026-05-28 13:08:55 +03:00
ComradeSwarog
b12106d8f4 fix(test): use expect.any(String) for junction error message assertions
The failure.message passed to ensureOpenCodeProfileNodeModulesJunction
comes from normalizeCommandFailure which may produce a JSON-escaped
string when the error contains structured JSON in stdout. Using the
raw runtimeMessage literal causes a mismatch in CI. Switch to
expect.any(String) to accept any string value for the errorMessage
parameter while still verifying the call happens.
2026-05-28 13:08:55 +03:00
ComradeSwarog
cc3c9f7dc7 fix(opencode): address code review feedback — extract paths from error message, fix test imports
- Extract symlink source/target paths directly from the error message
  instead of reconstructing them from process.env (Codex P2 review)
- Add extractSymlinkSourcePath and extractSymlinkTargetPath functions
- Update ensureOpenCodeProfileNodeModulesJunction to accept optional
  errorMessage parameter and use extracted paths from it
- Fix unused imports in test (remove 'os', replace 'beforeEach' with
  'afterEach' per CodeRabbit review)
- Widen fs.statSync mock signatures to use Parameters<typeof fs.statSync>
  per CodeRabbit review
- Add tests for new extraction functions
- Pass errorMessage to ensureOpenCodeProfileNodeModulesJunction calls
  in CLI client tests
2026-05-28 13:08:55 +03:00
ComradeSwarog
597c690dbc fix(opencode): add Windows junction fallback for node_modules EPERM symlink error (#187)
On Windows 10 without Developer Mode, the OpenCode runtime fails to create
a symlink from shared-cache/config-node_modules to the profile's
node_modules directory. The EPERM error blocks the entire OpenCode provider
catalog, leaving it unavailable.

Changes:
- New openCodeWindowsNodeModulesJunction module that pre-creates a Windows
  directory junction (no Developer Mode required) before the runtime call
  when an EPERM symlink error is detected
- On Windows, loadView and loadProviderDirectory now detect EPERM symlink
  errors, extract the profile ID, create the junction, and retry the
  runtime command once before falling back to the error response
- Updated diagnostic hints to accurately reflect that the runtime does not
  yet include junction fallback, and that the next runtime update will
  include it
- Added unit tests for the junction module and retry behavior
2026-05-28 13:08:54 +03:00
infiniti
fa36d7f3c0 fix(opencode): extend summary status timeout 2026-05-28 00:39:53 +03:00
infiniti
0cbba46083 fix(team): speed up provider runtime preflight 2026-05-27 23:54:10 +03:00
infiniti
e06c24a041 fix: add OpenCode status inventory fallback 2026-05-27 22:41:43 +03:00
iliya
21404894c2 fix: add Windows provider status fallback 2026-05-27 21:54:24 +03:00
777genius
77e08af03f fix(team): propagate managed runtime settings env 2026-05-27 18:56:24 +03:00
777genius
7cc1a59bbc fix(team): preserve mixed provider runtime settings 2026-05-27 18:22:10 +03:00
infiniti
ebcc0e717f fix(team): reconcile provisioned-but-not-alive bootstrap state 2026-05-27 12:16:41 +03:00
Илия
3849c01955 fix(provenance): classify synthetic user turns
* fix(provenance): classify synthetic user turns

* fix(provenance): keep assistant display rendering intact

* fix(provenance): preserve source tool result rows
2026-05-26 23:51:17 +03:00
777genius
ab6ab1fc4c test(team): cover provisioned runtime recovery 2026-05-26 23:44:40 +03:00
777genius
c79b7d4234 fix(team): suppress unverified relay state claims 2026-05-26 23:44:40 +03:00
777genius
f237318c29 fix(agent-teams): surface OpenCode runtime permissions 2026-05-26 19:46:24 +03:00
777genius
636beb5e42 fix(scripts): quote Windows shell invocations 2026-05-26 19:46:13 +03:00
777genius
58a0eb603d build(runtime): require Node 24 toolchain 2026-05-26 19:44:23 +03:00
777genius
4640e1eea4 fix(startup): ignore stale opencode probe results 2026-05-26 10:39:50 +03:00
777genius
a8ac52b6f3 perf(startup): dedupe opencode version probes 2026-05-26 10:32:46 +03:00
777genius
b88ca42fe3 fix(startup): serialize provider runtime checks 2026-05-26 09:12:05 +03:00
777genius
b5d7da1ea8 fix(attachments): support claude gif delivery 2026-05-25 23:43:29 +03:00
777genius
0d4e6f5047 perf(startup): avoid provider refresh version probe 2026-05-25 23:37:12 +03:00
777genius
33463d3479 perf(startup): skip deferred cli version probe 2026-05-25 23:25:21 +03:00
777genius
a6dd0061a8 perf(startup): defer heavy startup work 2026-05-25 23:14:59 +03:00
777genius
43afc9f907 fix(jsonl): align count-only baseline parsing 2026-05-25 22:58:07 +03:00
777genius
13b3ace4fd test(jsonl): format baseline count imports 2026-05-25 22:36:14 +03:00
777genius
b0b2fa2d13 fix(jsonl): count baseline entries without materializing messages 2026-05-25 22:32:09 +03:00
777genius
e64fff8af0 fix(watcher): baseline large existing jsonl files 2026-05-25 22:26:55 +03:00
777genius
e88d3a1e98 feat(team): open persisted attachments in editor 2026-05-25 21:30:56 +03:00
777genius
63e16d1043 fix(workspace-trust): canonicalize git worktree trust roots 2026-05-25 21:30:56 +03:00
777genius
c033a0cb87 fix(team): persist incomplete launch state before cleanup 2026-05-25 14:53:05 +03:00
infiniti
2cee9cabaf fix(opencode): harden local runtime bridge support 2026-05-25 14:31:57 +03:00
infiniti
2b3a184bef fix(opencode): recover empty bridge output sends
* fix(opencode): handle empty readiness bridge output

* fix(opencode): retry read-only bridge no-output

* fix(opencode): recover empty bridge output sends

---------

Co-authored-by: iliya <iliyazelenkog@gmail.com>
2026-05-25 00:41:54 +03:00