readline.createInterface runs an expensive Unicode line-break regex + extra
stream/string-decoder machinery per chunk. The main transcript parser (parseJsonlStream)
already uses a buffer + manual newline split; these per-team readers still used readline.
Add readJsonlLines(): an async generator that yields a JSONL file's lines via a chunked
utf8 stream read + a plain '\n' split (drop-in for 'for await (const line of rl)'), so the
consumers' loop bodies are unchanged. Stream is utf8-decoded before splitting, so multi-byte
chars across chunk boundaries are safe; trailing CR (CRLF) is stripped; empty lines and a
final newline-less line are yielded, matching readline; breaking out of the loop destroys
the stream via the generator's finally.
Adopt it in MemberStatsComputer, TaskBoundaryParser, and FileContentResolver (file-history
scan). Behavior-identical (their existing tests pass: 18 + 6 + 12) plus 6 new tests for the
generator (CRLF, empty lines, no-trailing-newline, early break, multi-byte chunk boundary).
Note: session-browser readline paths (jsonl metadata extractor, metadataExtraction,
SessionContentFilter) are off the launch path and left as-is for now.
listTeams() deep-cloned ALL team summaries via structuredClone on every call -- even
cache hits and concurrent in-flight awaiters. A heap allocation sample of a launch
put this (listTeams -> cloneTeamSummaries -> structuredClone) as the single largest
memory allocator, driving heap churn + GC pressure during launch (this stand has ~158
teams, and listTeams is called constantly: startup, notification init, task projection,
IPC polls, provisioning).
Build ONE deep-frozen, independent snapshot per uncached load and hand the same
reference to the cache entry, in-flight awaiters, and every later reader. The single
cloneTeamSummaries keeps it independent of any cached config the loader returns;
freezing lets all readers share it safely. Audited every listTeams consumer -- all
iterate / map / filter / serialize, none mutate -- and the freeze turns any stray
future mutation into a loud error rather than silent cross-caller corruption.
TeamConfigReader 26/26 (added a frozen + same-reference regression test), and the
listTeams consumers (TeamDataService 116, CrossTeamService 26) all pass under frozen
summaries.
fileBelongsToTeam streamed the head window via createReadStream + readline. readline's
line iterator runs an expensive Unicode line-break regex and stream/string-decoder
machinery per chunk, which showed up as a top main-thread cost during launch (the line-
split regex alone was ~5.7% in the warm launch profile).
Replace it with a bounded chunked fs.read + a plain '\n' split. JSONL is strictly
newline-delimited and each line is trim()'d (so a trailing CR from CRLF is dropped),
so a '\n' split is cheaper and more correct (it will not split on a bare CR or a
Unicode line/paragraph separator inside a JSON string value, which readline would). A
StringDecoder preserves multi-byte UTF-8 sequences that straddle a chunk boundary.
Byte-identical semantics to the old loop: inspect up to TEAM_AFFINITY_SCAN_LINES
non-empty lines, first match wins via early break, and a final line is honored even
without a trailing newline. Reads in 64KB chunks so a team decided in its first lines
is not penalized by a huge file. Adds tests for CRLF endings + no-trailing-newline,
a multi-byte char straddling the 64KB boundary, and the 40-line window bound (21 pass).
readRecentBootstrapTranscriptOutcome ran extractBootstrapFailureReason (regex +
optional JSON.parse + ~40 substring scans per call) for every candidate line, once
per team member: an anonymous transcript line (no agentName) passes the per-member
filter for ALL members, so an N-member team re-extracted the same line's failure
reason N times per scan -- a top main-thread cost in the warm launch profile (~5.4%
with isBootstrapInstructionPrompt).
Memoize the result on the shared parsed-tail line (parsedBootstrapTranscriptTailCache
already reuses the same line objects across members and re-scans while the file is
unchanged). The compute stays LAZY inside the newest-first candidate loop, so each
line is extracted at most once across all members AND lines past the first match are
never extracted -- preserving the early-break that an eager precompute would lose.
The memo is keyed implicitly by the line's cache entry (filePath + mtime + size); a
file change re-parses into fresh line objects, so it cannot drift from the line text.
Pure-function memoization: byte-identical input (candidate.text === parsedLine.text)
-> identical output, so failure/success/null outcomes and reasons are unchanged.
Bootstrap-transcript outcome tests pass (306), no behavior change.