diff --git a/docs/research/context-usage-audit.md b/docs/research/context-usage-audit.md
new file mode 100644
index 00000000..a6b34267
--- /dev/null
+++ b/docs/research/context-usage-audit.md
@@ -0,0 +1,496 @@
+# Context Usage Audit
+
+**Date**: 2026-04-18
+**Status**: Research
+**Goal**: check how the project currently computes context usage, compare that against the official docs and real logs, and record what must change for a clear and accurate UI
+
+## Executive Summary
+
+Main conclusion:
+
+- ✅ For **Anthropic prompt-side input**, the current base formula `input_tokens + cache_creation_input_tokens + cache_read_input_tokens` is correct.
+- ❌ For the **"percent of context used"**, the current UI mixes several different things:
+  - total prompt input
+  - visible/debuggable context
+  - full context used in the turn
+  - guessed context window
+- ❌ The button that opens the context panel on the team screen currently shows **not the percent of context used** but a mix of `visible context / total tokens`, while labeling it `of input`.
+- ❌ Live lead context usage in the team runtime **does not count `output_tokens`**, even though the Anthropic docs explicitly state that input and output components count toward the context window.
+- ⚠️ For **Codex**, the current local session logs often contain no usable input-side token telemetry at all: the `.jsonl` shows `output_tokens`, while `input_tokens/cache_*` stay at zero. So an "exact percent" for Codex cannot yet be obtained from the current source of truth.
+- ⚠️ For the **Anthropic context window size**, the `"[1m]"` suffix alone is not enough. Per the current docs/release notes, the window depends on the specific model: newer raw model ids such as `claude-opus-4-7`, `claude-opus-4-6`, `claude-sonnet-4-6` already have a native `1M`, while some legacy paths remain on `200k` or a temporary beta path.
+
+## 1.
What the code currently computes
+
+### 1.1 Live lead context in the team runtime
+
+Source:
+
+- `src/main/services/team/TeamProvisioningService.ts`
+
+Current formula:
+
+```ts
+currentTokens = input_tokens + cache_creation_input_tokens + cache_read_input_tokens
+percent = currentTokens / contextWindow
+```
+
+This value is emitted as `lead-context`.
+
+What matters:
+
+- this is **total prompt input**
+- this is **not full context used for the completed turn**
+- `output_tokens` are currently excluded
+
+### 1.2 Context button on the team screen
+
+Source:
+
+- `src/renderer/components/team/TeamDetailView.tsx`
+
+Current behavior:
+
+- `visibleContextTokens = sumContextInjectionTokens(allContextInjections)` is computed
+- then `visibleContextPercentLabel = formatPercentOfTotal(visibleContextTokens, lastAiGroupTotalTokens)` is computed
+- meanwhile `lastAiGroupTotalTokens` currently equals `input + cache_read + cache_creation + output`
+- but the helper `formatPercentOfTotal()` returns a string of the form `"X% of input"`
+
+Result:
+
+- the denominator is no longer **input**
+- the numerator is merely a **visible subset**
+- the label says **of input**
+- the button looks as if it shows **overall context usage**
+
+So there are 3 semantic mismatches at once.
+
+### 1.3 Session Context Panel / Token popover
+
+Sources:
+
+- `src/renderer/components/chat/SessionContextPanel/components/SessionContextHeader.tsx`
+- `src/renderer/components/common/TokenUsageDisplay.tsx`
+
+The project currently has 3 different percentages at once:
+
+1. `visible_estimated / total_input`
+2. `visible_estimated / (input + output + cache)`
+3. `prompt_input / context_window`
+
+Yet the UI names them almost identically in places.
+
+## 2.
What the official docs say
+
+### 2.1 Anthropic: what `input_tokens` means with caching
+
+Official docs:
+
+- [Anthropic prompt caching](https://docs.anthropic.com/ru/docs/build-with-claude/prompt-caching)
+
+Key facts:
+
+- `input_tokens` counts only the tokens **after the last cache breakpoint**
+- total prompt input is computed as:
+
+```text
+total_input_tokens = cache_read_input_tokens + cache_creation_input_tokens + input_tokens
+```
+
+Source:
+
+- docs lines 491-500, 493-500, 495:
+  - `input_tokens` represents only the tokens after the last cache breakpoint
+  - `total_input_tokens = cache_read_input_tokens + cache_creation_input_tokens + input_tokens`
+
+Conclusion:
+
+- the current base runtime formula for **Anthropic prompt input** is correct
+- the user's complaint about "input percent" is reasonable, because **`input_tokens` alone really does not equal the total prompt input**
+
+### 2.2 Anthropic: what counts toward the context window
+
+Official docs:
+
+- [Anthropic context windows](https://docs.anthropic.com/en/docs/build-with-claude/context-windows)
+
+Key facts:
+
+- context window refers to all text the model can reference, **including the response itself**
+- for tool use, the docs state directly:
+  - **all input and output components count toward the context window**
+
+Source:
+
+- lines 194-197
+- lines 215-220
+- lines 255-262
+
+Conclusion:
+
+- if the UI promises to show **"how much context is used"**, `output_tokens` cannot be ignored
+- the current live team formula under-reports occupied context for a completed turn
+
+### 2.3 Anthropic: thinking blocks
+
+Official docs:
+
+- [Anthropic context windows](https://docs.anthropic.com/en/docs/build-with-claude/context-windows)
+
+Key fact:
+
+- previous thinking blocks are automatically stripped from future context
+
+Source:
+
+- lines 225-239, especially 228 and 237
+
+Conclusion:
+
+- there is an important difference between:
+  - **full context used during current turn**
+  - **context that will carry into future
prompt**
+- usage fields alone do not give a perfectly exact "future carried context" without additional normalization of thinking blocks
+
+### 2.4 Anthropic: which models currently have a 1M context window
+
+Official docs:
+
+- [Anthropic models overview](https://platform.claude.com/docs/en/about-claude/models/overview)
+- [Anthropic release notes](https://platform.claude.com/docs/en/release-notes/overview)
+- [Anthropic context windows](https://platform.claude.com/docs/en/build-with-claude/context-windows)
+
+Key facts as of the check date:
+
+- the current models overview shows:
+  - `claude-opus-4-7` - `1M`
+  - `claude-sonnet-4-6` - `1M`
+  - `claude-haiku-4-5` - `200k`
+- the release notes separately record:
+  - since `2026-03-13`, `1M` is GA for `Claude Opus 4.6` and `Claude Sonnet 4.6`
+  - on `2026-03-30`, retirement of the beta path for `Claude Sonnet 4.5` and `Claude Sonnet 4` was announced for `2026-04-30`
+- the context windows page also indicates that the native long-context matrix no longer reduces to a single beta-header scenario
+
+Conclusion:
+
+- the Anthropic window size must be inferred from a **model matrix**, not from the `"[1m]"` suffix alone
+- the internal app alias `"[1m]"` is still useful as an explicit signal for the team UX, but it is no longer sufficient for raw session model ids
+
+## 3. What real local logs show
+
+Real `~/.claude/projects/*.jsonl` files were checked.
+
+### 3.1 Claude / Anthropic
+
+Typical real case:
+
+```json
+"usage": {
+  "input_tokens": 3,
+  "cache_creation_input_tokens": 9284,
+  "cache_read_input_tokens": 63347,
+  "output_tokens": 8
+}
+```
+
+This means:
+
+- `input_tokens = 3` does not mean "the prompt had 3 tokens"
+- the real total prompt input here is:
+
+```text
+3 + 9284 + 63347 = 72634
+```
+
+So a UI that visually hints at "input %" without explicitly explaining the caching breakdown will look buggy even when the arithmetic is partly correct.
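As a hedged illustration of the arithmetic in section 3.1, the TypeScript sketch below sums the three prompt-side fields of the logged `usage` object. The `AnthropicUsage` interface and the `totalPromptInputTokens` helper are hypothetical names introduced only for this example; they are not taken from the project's code.

```typescript
// Hypothetical shape: only the usage fields quoted in the log excerpt of section 3.1.
interface AnthropicUsage {
  input_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
  output_tokens: number;
}

// Total prompt-side input = uncached input + cache writes + cache reads.
// Missing cache fields are treated as zero (an assumption of this sketch).
function totalPromptInputTokens(usage: AnthropicUsage): number {
  return (
    usage.input_tokens +
    (usage.cache_creation_input_tokens ?? 0) +
    (usage.cache_read_input_tokens ?? 0)
  );
}

// The sample values are the ones from the real log entry quoted in section 3.1.
const sampleUsage: AnthropicUsage = {
  input_tokens: 3,
  cache_creation_input_tokens: 9284,
  cache_read_input_tokens: 63347,
  output_tokens: 8,
};

const promptInput: number = totalPromptInputTokens(sampleUsage);
console.log(promptInput); // 72634
```

Reading `input_tokens` alone would report 3 here, which is why a bare "input %" looks wrong to users even though the summed formula is sound.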
+
+### 3.2 Codex / OpenAI path in local session logs
+
+Real Codex entries in `~/.claude/projects/-Users-belief-dev-projects-claude-claude-team/**/*.jsonl` were checked.
+
+Typical case:
+
+```json
+"usage": {
+  "input_tokens": 0,
+  "cache_creation_input_tokens": 0,
+  "cache_read_input_tokens": 0,
+  "output_tokens": 650
+}
+```
+
+This repeats many times on `msg_codex_*`.
+
+Conclusion:
+
+- our current `.jsonl` source for Codex often provides no usable prompt-side usage
+- so an accurate Codex context percent cannot honestly be built from the **current session logs**
+- a new telemetry source or normalization of raw usage is needed first
+
+## 4. Codex: what the official OpenAI docs say
+
+### 4.1 Context windows
+
+Official docs:
+
+- [GPT-5-Codex model](https://developers.openai.com/api/docs/models/gpt-5-codex)
+- [codex-mini-latest model](https://developers.openai.com/api/docs/models/codex-mini-latest)
+
+Key facts as of the check date:
+
+- `GPT-5-Codex` - `400,000 context window`
+- `codex-mini-latest` - `200,000 context window`
+
+### 4.2 Cached prompt accounting
+
+Official docs:
+
+- [OpenAI prompt caching](https://developers.openai.com/api/docs/guides/prompt-caching)
+
+Key fact:
+
+- usage exposes `prompt_tokens_details.cached_tokens`
+
+This means:
+
+- at the OpenAI API level, the needed prompt-side telemetry does exist in principle
+- but our current local session source apparently does not store or normalize it
+
+## 5. Concrete problems in the current project
+
+### 5.1 Semantic mismatch: "visible context" vs "context used"
+
+Two different things currently live side by side:
+
+- **Visible Context** - what we can debug/reduce
+- **Context Used** - how much of the window is actually occupied
+
+They are not the same thing.
+
+Visible Context:
+
+- is a subset of prompt-side content
+- can be compared against total prompt input
+
+Context Used:
+
+- is usage against the context window
+- for an Anthropic completed turn, it is closer to `total_input + output`
+
+### 5.2 Wrong label on the context button
+
+The current button label on the team screen:
+
+- looks like overall context usage
+- but is actually the visible subset percent
+
+This is one of the main user-facing bugs.
+
+### 5.3 Inconsistent denominators
+
+The code currently uses different denominators:
+
+- `totalInputTokens`
+- `input + output + cache`
+- `contextWindow`
+
+Without explicitly renaming the metrics, the UI will always be confusing.
+
+### 5.4 Early-run guessed context window
+
+In `TeamProvisioningService`, the window size may initially be guessed:
+
+- `200K` for `limitContext=true`
+- otherwise by a model-specific matrix:
+  - internal Anthropic `"[1m]"` alias -> `1M`
+  - native long-context Anthropic raw ids (`claude-opus-4-7`, `claude-opus-4-6`, `claude-sonnet-4-6`) -> `1M`
+  - `GPT-5.4` / `GPT-5.4 pro` -> `1.05M`
+  - `codex-mini-latest` -> `200K`
+  - the remaining current GPT-5/Codex team models -> `400K`
+
+It is then updated from `modelUsage.contextWindow` if that field arrives.
+
+So:
+
+- the early live percent may be temporarily inaccurate
+
+### 5.5 Shared default drift
+
+Shared utils contain:
+
+```ts
+DEFAULT_CONTEXT_WINDOW = 200_000
+```
+
+But the team Anthropic UX assumes `1M` by default.
+
+This is not necessarily an immediate arithmetic bug, but it is a source of drift across screens and helpers.
+
+## 6. Recommended metric model
+
+To make the UI clear and accurate, **at least 3 different metrics** must be separated.
+
+### 6.1 Prompt Input Used
+
+For Anthropic:
+
+```text
+prompt_input_used =
+  input_tokens
+  + cache_creation_input_tokens
+  + cache_read_input_tokens
+```
+
+Purpose:
+
+- an honest size of the current prompt
+- a good base for Visible Context %
+
+### 6.2 Context Window Used
+
+For an Anthropic completed turn:
+
+```text
+context_window_used_approx =
+  prompt_input_used
+  + output_tokens
+```
+
+Why `approx`:
+
+- previous thinking blocks are auto-stripped from future turns
+- the exact future carried context cannot be derived perfectly from raw usage
+
+But if the UI promises "the window occupied right now / on this turn", this formula is closer to the docs than the current one.
+
+### 6.3 Visible Context Share
+
+```text
+visible_context_share = visible_context_estimated / prompt_input_used
+```
+
+Purpose:
+
+- a debug metric
+- explains which part of the prompt is understandable to and controllable by the user
+
+This is **not** the percent of occupied context window.
+
+## 7. Recommended UI language
+
+Instead of the single vague word `Context`, it is better to use distinct labels:
+
+- `Context Used` - percent of context window
+- `Prompt Input` - current prompt-side tokens
+- `Visible Context` - debuggable subset of prompt
+
+Then the user immediately sees:
+
+- how much is occupied in total
+- how much of that is prompt
+- how much of the prompt we actually understand from the breakdown
+
+## 8. Top 3 implementation options
+
+### 1. Separate the 3 metrics and rename the UI honestly
+
+`🎯 10 🛡️ 9 🧠 7`
+Roughly `180-260` lines of changes
+
+What to do:
+
+- the team button shows only `Context Used`
+- the panel header separately shows:
+  - `Visible Context`
+  - `Prompt Input`
+  - `Context Window Used`
+- `Visible Context` is always computed only as a share of prompt input
+
+Pros:
+
+- minimal semantic debt
+- almost all user complaints are resolved at once
+- easier to add Codex later
+
+Cons:
+
+- the UI must be carefully relabeled in several places
+
+### 2.
Keep one main percent, but compute it per the docs as `prompt + output`
+
+`🎯 8 🛡️ 8 🧠 6`
+Roughly `120-180` lines of changes
+
+What to do:
+
+- live team percent = `(input + cache_read + cache_creation + output) / contextWindow`
+- keep `Visible Context` only inside the sidebar/panel
+
+Pros:
+
+- one very clear main number
+- as close as possible to the official Anthropic context-window semantics
+
+Cons:
+
+- future carried context is still not perfectly exact because of thinking blocks
+- fallback wording is needed when usage is incomplete
+
+### 3. Minimal fix of labels and denominators only
+
+`🎯 6 🛡️ 6 🧠 3`
+Roughly `40-90` lines of changes
+
+What to do:
+
+- stop writing `of input` when the denominator is not input
+- rename the button to `Visible`
+- explicitly separate `Visible` and `Total` in the panel header
+
+Pros:
+
+- fast
+- cheap
+
+Cons:
+
+- does not resolve the core semantic debt
+- the live lead percent will still be under-reported
+
+## 9. Recommended next step
+
+I recommend going with **option 1**.
+
+Why:
+
+- it resolves the math, the naming, and the UX confusion
+- it is not tied to Anthropic alone
+- it gives a clean foundation for future Codex support
+
+### Practical plan
+
+1. Introduce explicit types/terms for the 3 metrics:
+   - `promptInputTokens`
+   - `contextWindowUsedTokens`
+   - `visibleContextTokens`
+2. Fix the live Anthropic runtime formula and wording.
+3. Stop using the `of input` label where the denominator is not `prompt input`.
+4. For Codex, temporarily show:
+   - window size, if the model is known
+   - `context usage unavailable` or `output only`
+   - until raw prompt telemetry becomes available
+
+## 10. Bottom line
+
+The main problem right now is not a single line of arithmetic, but that the project has mixed together:
+
+- **prompt input**
+- **visible debuggable context**
+- **full context window usage**
+
+In the Anthropic path, the base input formula is already broadly fine, but the UI on top of it conveys the wrong meaning.
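As a rough illustration of keeping those three quantities apart, the TypeScript sketch below derives them from one usage record using the formulas in section 6. All names here (`UsageSnapshot`, `SeparatedContextMetrics`, `separateContextMetrics`) are hypothetical and exist only for this example; the diff itself imports `deriveContextMetrics` from `@shared/utils/contextMetrics`, whose implementation is not shown here and may differ.

```typescript
// Hypothetical input shape: the usage fields discussed in this audit.
interface UsageSnapshot {
  input_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
  output_tokens: number;
}

// Hypothetical result shape; field names follow the terms proposed in the practical plan.
interface SeparatedContextMetrics {
  promptInputTokens: number;
  contextWindowUsedTokens: number;
  contextUsedPercent: number | null;
  visibleContextShare: number | null;
}

function separateContextMetrics(
  usage: UsageSnapshot,
  visibleContextTokens: number,
  contextWindowTokens: number | null
): SeparatedContextMetrics {
  // 6.1: prompt input = uncached input + cache writes + cache reads.
  const promptInputTokens =
    usage.input_tokens +
    usage.cache_creation_input_tokens +
    usage.cache_read_input_tokens;
  // 6.2: approximate window usage for a completed turn adds the output.
  const contextWindowUsedTokens = promptInputTokens + usage.output_tokens;
  // Percent is unknown (null) when the window size is unknown or non-positive.
  const contextUsedPercent =
    contextWindowTokens !== null && contextWindowTokens > 0
      ? Math.min(100, (contextWindowUsedTokens / contextWindowTokens) * 100)
      : null;
  // 6.3: visible share is measured against prompt input, never against the window.
  const visibleContextShare =
    promptInputTokens > 0 ? visibleContextTokens / promptInputTokens : null;
  return {
    promptInputTokens,
    contextWindowUsedTokens,
    contextUsedPercent,
    visibleContextShare,
  };
}

// Usage values from section 3.1; the visible-token count and window size are made up.
const metrics = separateContextMetrics(
  {
    input_tokens: 3,
    cache_creation_input_tokens: 9284,
    cache_read_input_tokens: 63347,
    output_tokens: 8,
  },
  36317,
  1_000_000
);
console.log(metrics);
```

Returning `null` rather than `0` when the denominator is missing is a design choice of this sketch: it lets a UI show wording such as `context usage unavailable` instead of a misleading `0%`.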
+
+In the Codex path the problem goes deeper:
+
+- the official API supports cached prompt accounting
+- but our current local session telemetry does not carry it through
+- so an "exact % of context used" for Codex cannot be promised yet without a new data source
diff --git a/eslint.config.js b/eslint.config.js index 5a191478..e207d1da 100644 --- a/eslint.config.js +++ b/eslint.config.js @@ -459,6 +459,14 @@ export default defineConfig([ }, }, + { + name: 'team-transcript-project-resolver-sonar-override', + files: ['src/main/services/team/TeamTranscriptProjectResolver.ts'], + rules: { + 'sonarjs/no-identical-functions': 'off', + }, + }, + // Preload script (Electron bridge) { name: 'electron-preload', diff --git a/src/features/agent-graph/renderer/adapters/TeamGraphAdapter.ts b/src/features/agent-graph/renderer/adapters/TeamGraphAdapter.ts index 5e9322b8..63002e5d 100644 --- a/src/features/agent-graph/renderer/adapters/TeamGraphAdapter.ts +++ b/src/features/agent-graph/renderer/adapters/TeamGraphAdapter.ts @@ -362,7 +362,7 @@ export class TeamGraphAdapter { toolHistory?: Record, isTeamProvisioning = false ): void { - const percent = leadContext?.percent; + const percent = leadContext?.contextUsedPercent; const leadMember = data.members.find((member) => member.name === leadName); const activeTool = TeamGraphAdapter.#selectVisibleTool( activeTools?.[leadName], diff --git a/src/features/agent-graph/renderer/ui/GraphNodePopover.tsx b/src/features/agent-graph/renderer/ui/GraphNodePopover.tsx index a341ec4a..a9f9d441 100644 --- a/src/features/agent-graph/renderer/ui/GraphNodePopover.tsx +++ b/src/features/agent-graph/renderer/ui/GraphNodePopover.tsx @@ -425,7 +425,7 @@ const MemberPopoverContent = ({ )} - {/* Context usage stays hidden for now because LeadContextUsage.percent is unreliable. */} + {/* Context usage stays hidden for now because lead context telemetry is still incomplete.
*/} {/* Current task indicator — reuses same pattern as MemberCard */} {node.currentTaskId && node.currentTaskSubject && ( diff --git a/src/main/services/team/TeamProvisioningService.ts b/src/main/services/team/TeamProvisioningService.ts index 7204faa5..78852a60 100644 --- a/src/main/services/team/TeamProvisioningService.ts +++ b/src/main/services/team/TeamProvisioningService.ts @@ -35,6 +35,7 @@ import { DEFAULT_TOOL_APPROVAL_SETTINGS } from '@shared/types/team'; import { resolveLanguageName } from '@shared/utils/agentLanguage'; import { getAnthropicDefaultTeamModel } from '@shared/utils/anthropicModelDefaults'; import { parseCliArgs } from '@shared/utils/cliArgsParser'; +import { deriveContextMetrics, inferContextWindowTokens } from '@shared/utils/contextMetrics'; import { isInboxNoiseMessage, isMeaningfulBootstrapCheckInMessage, @@ -82,6 +83,7 @@ import { resolveTeamProviderId } from '../runtime/providerRuntimeEnv'; import { buildActionModeProtocol } from './actionModeInstructions'; import { atomicWriteAsync } from './atomicWrite'; +import { peekAutoResumeService } from './AutoResumeService'; import { ClaudeBinaryResolver } from './ClaudeBinaryResolver'; import { withFileLock } from './fileLock'; import { @@ -112,7 +114,6 @@ import { TeamMembersMetaStore } from './TeamMembersMetaStore'; import { TeamMetaStore } from './TeamMetaStore'; import { TeamSentMessagesStore } from './TeamSentMessagesStore'; import { TeamTaskReader } from './TeamTaskReader'; -import { peekAutoResumeService } from './AutoResumeService'; /** * Kill a team CLI process using SIGKILL (uncatchable). @@ -649,8 +650,11 @@ interface ProvisioningRun { authRetryInProgress: boolean; /** Tracks lead process context window usage from stream-json usage data. 
*/ leadContextUsage: { - currentTokens: number; - contextWindow: number; + promptInputTokens: number | null; + outputTokens: number | null; + contextUsedTokens: number | null; + contextWindowTokens: number | null; + promptInputSource: LeadContextUsage['promptInputSource']; lastUsageMessageId: string | null; lastEmittedAt: number; } | null; @@ -3181,15 +3185,95 @@ export class TeamProvisioningService { if (!run?.leadContextUsage || run.processKilled || run.cancelRequested) { return { usage: null, runId: null }; } - const { currentTokens, contextWindow } = run.leadContextUsage; - const percentRaw = contextWindow > 0 ? Math.round((currentTokens / contextWindow) * 100) : 0; - const percent = Math.max(0, Math.min(100, percentRaw)); return { - usage: { currentTokens, contextWindow, percent, updatedAt: new Date().toISOString() }, + usage: this.buildLeadContextUsagePayload(run), runId, }; } + private getInitialLeadContextWindowTokens(run: ProvisioningRun): number | null { + const providerId = normalizeOptionalTeamProviderId(run.request.providerId); + const modelName = + typeof run.request.model === 'string' && run.request.model.trim().length > 0 + ? run.request.model.trim() + : providerId === 'anthropic' + ? getAnthropicDefaultTeamModel(run.request.limitContext === true) + : undefined; + + return inferContextWindowTokens({ + providerId, + modelName, + limitContext: run.request.limitContext === true, + }); + } + + private buildLeadContextUsagePayload(run: ProvisioningRun): LeadContextUsage { + const usage = run.leadContextUsage; + if (!usage) { + return { + promptInputTokens: null, + outputTokens: null, + contextUsedTokens: null, + contextWindowTokens: null, + contextUsedPercent: null, + promptInputSource: 'unavailable', + updatedAt: new Date().toISOString(), + }; + } + + const { contextUsedTokens, contextWindowTokens } = usage; + const percentRaw = + contextUsedTokens !== null && contextWindowTokens !== null && contextWindowTokens > 0 + ? 
Math.round((contextUsedTokens / contextWindowTokens) * 100) + : null; + + return { + promptInputTokens: usage.promptInputTokens, + outputTokens: usage.outputTokens, + contextUsedTokens: usage.contextUsedTokens, + contextWindowTokens: usage.contextWindowTokens, + contextUsedPercent: percentRaw === null ? null : Math.max(0, Math.min(100, percentRaw)), + promptInputSource: usage.promptInputSource, + updatedAt: new Date().toISOString(), + }; + } + + private updateLeadContextUsageFromUsage( + run: ProvisioningRun, + usage: Record, + modelName: string | undefined + ): void { + const existingContextWindowTokens = + run.leadContextUsage?.contextWindowTokens ?? this.getInitialLeadContextWindowTokens(run); + const metrics = deriveContextMetrics({ + usage, + providerId: normalizeOptionalTeamProviderId(run.request.providerId), + modelName, + contextWindowTokens: existingContextWindowTokens, + limitContext: run.request.limitContext === true, + }); + + if (!run.leadContextUsage) { + run.leadContextUsage = { + promptInputTokens: metrics.promptInputTokens, + outputTokens: metrics.outputTokens, + contextUsedTokens: metrics.contextUsedTokens, + contextWindowTokens: metrics.contextWindowTokens, + promptInputSource: metrics.promptInputSource, + lastUsageMessageId: null, + lastEmittedAt: 0, + }; + return; + } + + run.leadContextUsage.promptInputTokens = metrics.promptInputTokens; + run.leadContextUsage.outputTokens = metrics.outputTokens; + run.leadContextUsage.contextUsedTokens = metrics.contextUsedTokens; + run.leadContextUsage.contextWindowTokens = + metrics.contextWindowTokens ?? 
run.leadContextUsage.contextWindowTokens; + run.leadContextUsage.promptInputSource = metrics.promptInputSource; + } + private isCurrentTrackedRun(run: ProvisioningRun): boolean { return this.getTrackedRunId(run.teamName) === run.runId; } @@ -3711,15 +3795,7 @@ export class TeamProvisioningService { return; } run.leadContextUsage.lastEmittedAt = now; - const { currentTokens, contextWindow } = run.leadContextUsage; - const percentRaw = contextWindow > 0 ? Math.round((currentTokens / contextWindow) * 100) : 0; - const percent = Math.max(0, Math.min(100, percentRaw)); - const payload: LeadContextUsage = { - currentTokens, - contextWindow, - percent, - updatedAt: new Date().toISOString(), - }; + const payload = this.buildLeadContextUsagePayload(run); this.teamChangeEmitter?.({ type: 'lead-context', teamName: run.teamName, @@ -8430,36 +8506,12 @@ export class TeamProvisioningService { if (usage && typeof usage === 'object') { // Dedup: skip if same message.id (SDK bug: multi-block = same usage repeated) if (!msgId || run.leadContextUsage?.lastUsageMessageId !== msgId) { - const inputTokens = typeof usage.input_tokens === 'number' ? usage.input_tokens : 0; - const cacheCreation = - typeof usage.cache_creation_input_tokens === 'number' - ? usage.cache_creation_input_tokens - : 0; - const cacheRead = - typeof usage.cache_read_input_tokens === 'number' ? usage.cache_read_input_tokens : 0; - // Total context window usage = all three token categories - // input_tokens = tokens AFTER last cache breakpoint (small) - // cache_creation = tokens written to cache (first request) - // cache_read = tokens read from cache (subsequent requests) — these ARE in context window - const currentTokens = inputTokens + cacheCreation + cacheRead; - - if (!run.leadContextUsage) { - // Determine initial context window from model selection - // computeEffectiveTeamModel() defaults to 'opus[1m]' when no model selected - const modelStr = (run.request.model ?? 
'').toLowerCase(); - const isHaiku = modelStr.includes('haiku'); - const isLimitedContext = run.request.limitContext === true; - // limitContext=true → 200K, haiku → 200K, [1m] → 1M, default → 1M (opus[1m]) - const initialContextWindow = isLimitedContext || isHaiku ? 200_000 : 1_000_000; - - run.leadContextUsage = { - currentTokens, - contextWindow: initialContextWindow, - lastUsageMessageId: msgId, - lastEmittedAt: 0, - }; - } else { - run.leadContextUsage.currentTokens = currentTokens; + this.updateLeadContextUsageFromUsage( + run, + usage, + typeof messageObj.model === 'string' ? messageObj.model : undefined + ); + if (run.leadContextUsage) { run.leadContextUsage.lastUsageMessageId = msgId; } this.emitLeadContextUsage(run); @@ -8506,13 +8558,16 @@ export class TeamProvisioningService { ) { if (!run.leadContextUsage) { run.leadContextUsage = { - currentTokens: 0, - contextWindow: modelData.contextWindow, + promptInputTokens: null, + outputTokens: null, + contextUsedTokens: null, + contextWindowTokens: modelData.contextWindow, + promptInputSource: 'unavailable', lastUsageMessageId: null, lastEmittedAt: 0, }; } else { - run.leadContextUsage.contextWindow = modelData.contextWindow; + run.leadContextUsage.contextWindowTokens = modelData.contextWindow; run.leadContextUsage.lastEmittedAt = 0; // force re-emit } this.emitLeadContextUsage(run); @@ -8527,30 +8582,17 @@ export class TeamProvisioningService { | Record | undefined; if (resultUsage && typeof resultUsage === 'object') { - const inp = typeof resultUsage.input_tokens === 'number' ? resultUsage.input_tokens : 0; - const cc = - typeof resultUsage.cache_creation_input_tokens === 'number' - ? resultUsage.cache_creation_input_tokens - : 0; - const cr = - typeof resultUsage.cache_read_input_tokens === 'number' - ? 
resultUsage.cache_read_input_tokens - : 0; - const total = inp + cc + cr; - if (total > 0) { - if (!run.leadContextUsage) { - run.leadContextUsage = { - currentTokens: total, - contextWindow: 0, - lastUsageMessageId: null, - lastEmittedAt: 0, - }; - } else { - run.leadContextUsage.currentTokens = total; - run.leadContextUsage.lastEmittedAt = 0; - } - this.emitLeadContextUsage(run); + this.updateLeadContextUsageFromUsage( + run, + resultUsage, + typeof (msg.result as Record | undefined)?.model === 'string' + ? ((msg.result as Record).model as string) + : undefined + ); + if (run.leadContextUsage) { + run.leadContextUsage.lastEmittedAt = 0; } + this.emitLeadContextUsage(run); } if (run.provisioningComplete) { diff --git a/src/main/services/team/TeamTranscriptProjectResolver.ts b/src/main/services/team/TeamTranscriptProjectResolver.ts index 5bbe4edd..062f8ff0 100644 --- a/src/main/services/team/TeamTranscriptProjectResolver.ts +++ b/src/main/services/team/TeamTranscriptProjectResolver.ts @@ -695,37 +695,37 @@ export class TeamTranscriptProjectResolver { } } - private async listSessionDirIds(projectDir: string): Promise { + private async readProjectDirEntries(projectDir: string): Promise { try { - const dirEntries = await fs.readdir(projectDir, { withFileTypes: true }); - return dirEntries - .filter((entry) => entry.isDirectory() && isSessionDirectoryName(entry.name)) - .map((entry) => entry.name); + return await fs.readdir(projectDir, { withFileTypes: true }); } catch { logger.debug(`Cannot read transcript project dir: ${projectDir}`); - return []; + return null; } } - private async listTeamRootSessionIds(projectDir: string, teamName: string): Promise { - let dirEntries: Dirent[]; - try { - dirEntries = await fs.readdir(projectDir, { withFileTypes: true }); - } catch { - logger.debug(`Cannot read transcript project dir: ${projectDir}`); + private async listSessionDirIds(projectDir: string): Promise { + const dirEntries = await this.readProjectDirEntries(projectDir); + 
if (!dirEntries) { return []; } - const rootJsonlEntries = dirEntries.filter( - (entry) => entry.isFile() && entry.name.endsWith('.jsonl') - ); + return dirEntries + .filter((entry) => entry.isDirectory() && isSessionDirectoryName(entry.name)) + .map((entry) => entry.name); + } + + private async collectRootJsonlSessionIds( + rootJsonlEntries: Dirent[], + projectDir: string, + teamName: string + ): Promise { const discovered = new Set(); let nextIndex = 0; - const worker = async (): Promise => { + const scanNextRootEntry = async (): Promise => { while (nextIndex < rootJsonlEntries.length) { - const index = nextIndex++; - const entry = rootJsonlEntries[index]; + const entry = rootJsonlEntries[nextIndex++]; const filePath = path.join(projectDir, entry.name); if (!(await this.fileBelongsToTeam(filePath, teamName))) { continue; @@ -736,13 +736,25 @@ export class TeamTranscriptProjectResolver { await Promise.all( Array.from({ length: Math.min(ROOT_DISCOVERY_CONCURRENCY, rootJsonlEntries.length) }, () => - worker() + scanNextRootEntry() ) ); return [...discovered]; } + private async listTeamRootSessionIds(projectDir: string, teamName: string): Promise { + const dirEntries = await this.readProjectDirEntries(projectDir); + if (!dirEntries) { + return []; + } + + const rootJsonlEntries = dirEntries.filter( + (entry) => entry.isFile() && entry.name.endsWith('.jsonl') + ); + return this.collectRootJsonlSessionIds(rootJsonlEntries, projectDir, teamName); + } + private async fileBelongsToTeam(filePath: string, teamName: string): Promise { const stream = createReadStream(filePath, { encoding: 'utf8' }); const rl = readline.createInterface({ input: stream, crlfDelay: Infinity }); diff --git a/src/main/types/jsonl.ts b/src/main/types/jsonl.ts index 6435a707..bc5c2da3 100644 --- a/src/main/types/jsonl.ts +++ b/src/main/types/jsonl.ts @@ -82,6 +82,21 @@ export interface UsageMetadata { output_tokens: number; cache_read_input_tokens?: number; cache_creation_input_tokens?: number; + 
input_tokens_details?: { + cached_tokens?: number; + }; + output_tokens_details?: { + reasoning_tokens?: number; + }; + prompt_tokens?: number; + completion_tokens?: number; + total_tokens?: number; + prompt_tokens_details?: { + cached_tokens?: number; + }; + completion_tokens_details?: { + reasoning_tokens?: number; + }; } // ============================================================================= diff --git a/src/renderer/components/chat/ChatHistory.tsx b/src/renderer/components/chat/ChatHistory.tsx index ea523dbf..ea7d0e5e 100644 --- a/src/renderer/components/chat/ChatHistory.tsx +++ b/src/renderer/components/chat/ChatHistory.tsx @@ -14,17 +14,15 @@ import { SessionContextPanel } from './SessionContextPanel/index'; /** Pixels from bottom considered "near bottom" for scroll-button visibility and auto-scroll. */ const SCROLL_THRESHOLD = 300; -import { - computeRemainingContext, - formatPercentOfTotal, - sumContextInjectionTokens, -} from '@renderer/utils/contextMath'; +import { computeRemainingContext, sumContextInjectionTokens } from '@renderer/utils/contextMath'; +import { deriveContextMetrics } from '@shared/utils/contextMetrics'; import { ChatHistoryEmptyState } from './ChatHistoryEmptyState'; import { ChatHistoryItem } from './ChatHistoryItem'; import { ChatHistoryLoadingState } from './ChatHistoryLoadingState'; import type { ContextInjection } from '@renderer/types/contextInjection'; +import type { ContextUsageLike } from '@shared/utils/contextMetrics'; /** * Waits for two requestAnimationFrame cycles, allowing the virtualizer to render. 
@@ -129,6 +127,7 @@ export const ChatHistory = ({ tabId }: ChatHistoryProps): JSX.Element => { const pendingNavigation = thisTab?.pendingNavigation; const teamBySessionId = useStore(useShallow((s) => s.teamBySessionId)); + const leadContextByTeam = useStore(useShallow((s) => s.leadContextByTeam)); // Look up whether this session belongs to a team const sessionTeam = useMemo(() => { @@ -138,9 +137,13 @@ export const ChatHistory = ({ tabId }: ChatHistoryProps): JSX.Element => { }, [teamBySessionId, sessionDetail?.session?.id]); // Compute all accumulated context injections (phase-aware) - const { allContextInjections, lastAiGroupTotalTokens } = useMemo(() => { + const { allContextInjections, lastAssistantUsage, lastAssistantModelName } = useMemo(() => { if (!sessionContextStats || !conversation?.items.length) { - return { allContextInjections: [] as ContextInjection[], lastAiGroupTotalTokens: undefined }; + return { + allContextInjections: [] as ContextInjection[], + lastAssistantUsage: null as ContextUsageLike | null, + lastAssistantModelName: undefined as string | undefined, + }; } // Determine which phase to show @@ -161,7 +164,8 @@ export const ChatHistory = ({ tabId }: ChatHistoryProps): JSX.Element => { if (lastAiItem?.type !== 'ai') { return { allContextInjections: [] as ContextInjection[], - lastAiGroupTotalTokens: undefined, + lastAssistantUsage: null, + lastAssistantModelName: undefined, }; } targetAiGroupId = lastAiItem.group.id; @@ -170,9 +174,8 @@ export const ChatHistory = ({ tabId }: ChatHistoryProps): JSX.Element => { const stats = sessionContextStats.get(targetAiGroupId); const injections = stats?.accumulatedInjections ?? 
[]; - // Get total INPUT tokens from the target AI group (excluding output tokens, - // since visible context is part of input only) - let totalTokens: number | undefined; + let lastUsage: ContextUsageLike | null = null; + let lastModelName: string | undefined; const targetItem = conversation.items.find( (item) => item.type === 'ai' && item.group.id === targetAiGroupId ); @@ -181,27 +184,51 @@ export const ChatHistory = ({ tabId }: ChatHistoryProps): JSX.Element => { for (let i = responses.length - 1; i >= 0; i--) { const msg = responses[i]; if (msg.type === 'assistant' && msg.usage) { - const usage = msg.usage; - totalTokens = - (usage.input_tokens ?? 0) + - (usage.cache_read_input_tokens ?? 0) + - (usage.cache_creation_input_tokens ?? 0); + lastUsage = msg.usage; + lastModelName = msg.model; break; } } } - return { allContextInjections: injections, lastAiGroupTotalTokens: totalTokens }; + return { + allContextInjections: injections, + lastAssistantUsage: lastUsage, + lastAssistantModelName: lastModelName, + }; }, [sessionContextStats, conversation, selectedContextPhase, sessionPhaseInfo]); - - const visibleContextPercentLabel = useMemo(() => { - const visibleTokens = sumContextInjectionTokens(allContextInjections); - return formatPercentOfTotal(visibleTokens, lastAiGroupTotalTokens); - }, [allContextInjections, lastAiGroupTotalTokens]); + const visibleContextTokens = useMemo( + () => sumContextInjectionTokens(allContextInjections), + [allContextInjections] + ); + const sessionLeadContext = sessionTeam ? (leadContextByTeam[sessionTeam.teamName] ?? null) : null; + const contextMetrics = useMemo( + () => + deriveContextMetrics({ + usage: lastAssistantUsage, + modelName: lastAssistantModelName, + contextWindowTokens: sessionLeadContext?.contextWindowTokens ?? 
null, + visibleContextTokens, + }), + [ + lastAssistantModelName, + lastAssistantUsage, + sessionLeadContext?.contextWindowTokens, + visibleContextTokens, + ] + ); + const contextUsedPercentLabel = useMemo(() => { + const percent = contextMetrics.contextUsedPercentOfContextWindow; + return percent === null ? null : `${percent.toFixed(1)}%`; + }, [contextMetrics.contextUsedPercentOfContextWindow]); const remainingContext = useMemo( - () => computeRemainingContext(lastAiGroupTotalTokens), - [lastAiGroupTotalTokens] + () => + computeRemainingContext( + contextMetrics.contextUsedTokens ?? undefined, + contextMetrics.contextWindowTokens ?? undefined + ), + [contextMetrics.contextUsedTokens, contextMetrics.contextWindowTokens] ); // State for navigation highlight (blue, used for Turn navigation from CLAUDE.md panel) @@ -839,7 +866,7 @@ export const ChatHistory = ({ tabId }: ChatHistoryProps): JSX.Element => { onNavigateToTurn={handleNavigateToTurn} onNavigateToTool={handleNavigateToTool} onNavigateToUserGroup={handleNavigateToUserGroup} - totalSessionTokens={lastAiGroupTotalTokens} + contextMetrics={contextMetrics} sessionMetrics={sessionDetail?.metrics} subagentCostUsd={subagentCostUsd} onViewReport={effectiveTabId ? () => openSessionReport(effectiveTabId) : undefined} @@ -877,9 +904,9 @@ export const ChatHistory = ({ tabId }: ChatHistoryProps): JSX.Element => { : 'var(--color-text-secondary)', }} > - {visibleContextPercentLabel ? ( + {contextUsedPercentLabel ? 
( <> - {visibleContextPercentLabel} + {contextUsedPercentLabel} {remainingContext && remainingContext.urgency !== 'normal' && ( void; @@ -42,7 +42,7 @@ interface SessionContextHeaderProps { export const SessionContextHeader = ({ injectionCount, totalTokens, - totalSessionTokens, + contextMetrics, sessionMetrics, subagentCostUsd, onClose, @@ -53,6 +53,45 @@ export const SessionContextHeader = ({ viewMode, onViewModeChange, }: Readonly): React.ReactElement => { + const formatPercentLabel = (percent: number | null, suffix: string): string | null => { + if (percent === null) { + return null; + } + return `${percent.toFixed(1)}% ${suffix}`; + }; + + const renderMetricValue = ( + label: string, + tokens: number | null, + percentLabel: string | null, + options?: { + approximate?: boolean; + unavailableLabel?: string; + } + ): React.ReactElement => ( +
+ {label} +
+
+ {tokens === null + ? (options?.unavailableLabel ?? 'Unavailable') + : `${options?.approximate ? '~' : ''}${formatTokens(tokens)}`} +
+ {percentLabel && ( +
+ {percentLabel} +
+ )} +
+
+ ); + + const codexTelemetryUnavailable = + contextMetrics?.providerId === 'codex' && contextMetrics.promptInputSource === 'unavailable'; + return (
{/* Title row */} @@ -60,7 +99,7 @@ export const SessionContextHeader = ({

- Visible Context + Context

- {/* Token comparison stats */} + {/* Primary metrics */}
-
- {/* Visible Context tokens */} -
- Visible: - - ~{formatTokens(totalTokens)} - -
- {/* Total Input tokens (if provided) */} - {totalSessionTokens !== undefined && totalSessionTokens > 0 && ( -
- Input: - - {formatTokens(totalSessionTokens)} - -
- )} -
- {/* Percentage of total */} - {formatPercentOfTotal(totalTokens, totalSessionTokens) && ( - - {formatPercentOfTotal(totalTokens, totalSessionTokens)} - + {renderMetricValue( + 'Context Used', + contextMetrics?.contextUsedTokens ?? null, + formatPercentLabel( + contextMetrics?.contextUsedPercentOfContextWindow ?? null, + 'of context' + ) + )} + {renderMetricValue( + 'Prompt Input', + contextMetrics?.promptInputTokens ?? null, + formatPercentLabel( + contextMetrics?.promptInputPercentOfContextWindow ?? null, + 'of context' + ) + )} + {renderMetricValue( + 'Visible Context', + totalTokens, + formatPercentLabel( + contextMetrics?.visibleContextPercentOfPromptInput ?? null, + 'of prompt' + ), + { approximate: true } )}
+ {codexTelemetryUnavailable && ( +
+ Codex prompt-side usage is not exposed by the current runtime telemetry yet, so Prompt
+ Input and Context Used stay unavailable instead of showing a fake zero.
+ )} + {/* Session Metrics Breakdown */} {sessionMetrics && (
{
- {/* What is Visible Context */} + {/* Metric definitions */}
- What is Visible Context? + Context Used

- Tokens consumed by file reads, tool outputs, and configuration files (CLAUDE.md)
- that are injected into the conversation.
+ Prompt input plus output tokens currently occupying the model's context
+ window.

- {/* Difference with Total */}
- Total Context vs Visible Context -
-
-
- - Total: - - - Total tokens that are injected into the conversation - -
-
- - Visible: - - - Subset of tokens that you can optimize & debug - -
+ Prompt Input
+

+ Tokens sent to the model before generation. For Claude this includes `input_tokens +
+ cache_creation_input_tokens + cache_read_input_tokens`.
+

- {/* Tips */}
- Optimization Tips + Visible Context
-
    -
  • Shorten large CLAUDE.md files
  • -
  • Split large @-mentioned files
  • -
  • Adjust MCP tool output verbosity
  • -
+

+ The inspectable subset of prompt input: files, CLAUDE.md, tool outputs, user
+ messages, and similar injections that you can optimize directly.
+

+
+ +
+
+ Availability +
+

+ If a provider runtime does not expose prompt-side usage yet, the panel shows
+ metrics as unavailable instead of pretending they are zero.
+

, diff --git a/src/renderer/components/chat/SessionContextPanel/index.tsx b/src/renderer/components/chat/SessionContextPanel/index.tsx index 60754d7e..e44e25e3 100644 --- a/src/renderer/components/chat/SessionContextPanel/index.tsx +++ b/src/renderer/components/chat/SessionContextPanel/index.tsx @@ -48,7 +48,7 @@ export const SessionContextPanel = ({ onNavigateToTurn, onNavigateToTool, onNavigateToUserGroup, - totalSessionTokens, + contextMetrics, sessionMetrics, subagentCostUsd, onViewReport, @@ -193,7 +193,7 @@ export const SessionContextPanel = ({ void; /** Navigate to the user message group preceding the AI group at turnIndex */ onNavigateToUserGroup?: (turnIndex: number) => void; - /** Total session tokens (input + output + cache) for comparison */ - totalSessionTokens?: number; + /** Unified context metrics for the selected AI group */ + contextMetrics?: DerivedContextMetrics; /** Full session metrics (input, output, cache tokens, cost) */ sessionMetrics?: SessionMetrics; /** Combined cost of all subagent processes */ diff --git a/src/renderer/components/common/TokenUsageDisplay.tsx b/src/renderer/components/common/TokenUsageDisplay.tsx index 7bd72fbd..a1c52f23 100644 --- a/src/renderer/components/common/TokenUsageDisplay.tsx +++ b/src/renderer/components/common/TokenUsageDisplay.tsx @@ -48,8 +48,6 @@ interface TokenUsageDisplayProps { totalPhases?: number; /** Optional USD cost for this usage */ costUsd?: number; - /** Context window size (e.g., 200000 or 1000000). When provided, shows "X% context used" instead of "X% of input". 
*/ - contextWindowSize?: number; } /** @@ -59,27 +57,22 @@ interface TokenUsageDisplayProps { const SessionContextSection = ({ contextStats, totalInputTokens, - contextWindowSize, }: Readonly<{ contextStats: ContextStats; totalInputTokens: number; - contextWindowSize?: number; }>): React.JSX.Element => { const [expanded, setExpanded] = useState(false); const { tokensByCategory } = contextStats; // contextStats.totalEstimatedTokens already includes all categories (CLAUDE.md, @files, - // tool outputs, thinking+text, task coordination, user messages) — no manual adjustment needed. - // Show context window usage % when contextWindowSize is available (more useful), - // otherwise fall back to visible context / total input ratio. + // tool outputs, thinking+text, task coordination, user messages) - no manual adjustment needed. + // Visible Context is always shown as a share of prompt-side input tokens so this section + // stays aligned with the unified context contract instead of silently switching semantics. const contextPercent = - contextWindowSize && contextWindowSize > 0 - ? Math.min((totalInputTokens / contextWindowSize) * 100, 100).toFixed(1) - : totalInputTokens > 0 - ? Math.min((contextStats.totalEstimatedTokens / totalInputTokens) * 100, 100).toFixed(1) - : '0.0'; - const contextLabel = contextWindowSize ? 'of context' : 'of input'; + totalInputTokens > 0 + ? Math.min((contextStats.totalEstimatedTokens / totalInputTokens) * 100, 100).toFixed(1) + : '0.0'; // Count accumulated injections by category const claudeMdCount = contextStats.accumulatedInjections.filter( @@ -152,7 +145,7 @@ const SessionContextSection = ({ className="whitespace-nowrap text-[10px] tabular-nums" style={{ color: COLOR_TEXT_MUTED }} > - {formatTokens(contextStats.totalEstimatedTokens)} ({contextPercent}% {contextLabel}) + {formatTokens(contextStats.totalEstimatedTokens)} ({contextPercent}% of prompt input)
@@ -261,10 +254,9 @@ export const TokenUsageDisplay = ({ phaseNumber, totalPhases, costUsd, - contextWindowSize, }: Readonly): React.JSX.Element => { const totalTokens = inputTokens + cacheReadTokens + cacheCreationTokens + outputTokens; - // Total input tokens only (without output) — used as denominator for visible context % + // Total prompt-side tokens only (without output) - used as denominator for visible context % const totalInputTokens = inputTokens + cacheReadTokens + cacheCreationTokens; const formattedTotal = formatTokens(totalTokens); @@ -540,7 +532,6 @@ export const TokenUsageDisplay = ({ )} diff --git a/src/renderer/components/team/TeamDetailView.tsx b/src/renderer/components/team/TeamDetailView.tsx index c53f03c4..98582a4c 100644 --- a/src/renderer/components/team/TeamDetailView.tsx +++ b/src/renderer/components/team/TeamDetailView.tsx @@ -25,7 +25,7 @@ import { isTeamProvisioningActive, } from '@renderer/store/slices/teamSlice'; import { createChipFromSelection } from '@renderer/utils/chipUtils'; -import { formatPercentOfTotal, sumContextInjectionTokens } from '@renderer/utils/contextMath'; +import { sumContextInjectionTokens } from '@renderer/utils/contextMath'; import { formatProjectPath } from '@renderer/utils/pathDisplay'; import { buildTaskCountsByOwner, normalizePath } from '@renderer/utils/pathNormalize'; import { nameColorSet } from '@renderer/utils/projectColor'; @@ -35,6 +35,7 @@ import { type TaskChangeRequestOptions, } from '@renderer/utils/taskChangeRequest'; import { stripAgentBlocks } from '@shared/constants/agentBlocks'; +import { deriveContextMetrics } from '@shared/utils/contextMetrics'; import { isLeadAgentType, isLeadMember } from '@shared/utils/leadDetection'; import { createLogger } from '@shared/utils/logger'; import { deriveTaskDisplayId, formatTaskDisplayLabel } from '@shared/utils/taskIdentity'; @@ -114,6 +115,7 @@ import type { TeamTaskWithKanban, } from '@shared/types'; import type { EditorSelectionAction } from 
'@shared/types/editor'; +import type { ContextUsageLike } from '@shared/utils/contextMetrics'; interface TeamDetailViewProps { teamName: string; @@ -445,6 +447,7 @@ const LeadContextBridge = memo(function LeadContextBridge({ }: LeadContextBridgeProps): React.JSX.Element | null { const { leadTabData, + leadContextSnapshot, isContextPanelVisible, selectedContextPhase, setContextPanelVisibleForTab, @@ -453,6 +456,7 @@ const LeadContextBridge = memo(function LeadContextBridge({ } = useStore( useShallow((s) => ({ leadTabData: tabId ? (s.tabSessionData[tabId] ?? null) : null, + leadContextSnapshot: s.leadContextByTeam[teamName] ?? null, isContextPanelVisible: tabId ? (s.tabUIStates.get(tabId)?.showContextPanel ?? false) : false, selectedContextPhase: tabId ? (s.tabUIStates.get(tabId)?.selectedContextPhase ?? null) : null, setContextPanelVisibleForTab: s.setContextPanelVisibleForTab, @@ -491,9 +495,13 @@ const LeadContextBridge = memo(function LeadContextBridge({ const total = processes.reduce((sum, p) => sum + (p.metrics.costUsd ?? 0), 0); return total > 0 ? 
total : undefined; }, [leadSessionDetail?.processes]); - const { allContextInjections, lastAiGroupTotalTokens } = useMemo(() => { + const { allContextInjections, lastAssistantUsage, lastAssistantModelName } = useMemo(() => { if (!leadSessionLoaded || !leadSessionContextStats || !leadConversation?.items.length) { - return { allContextInjections: [] as ContextInjection[], lastAiGroupTotalTokens: undefined }; + return { + allContextInjections: [] as ContextInjection[], + lastAssistantUsage: null as ContextUsageLike | null, + lastAssistantModelName: undefined as string | undefined, + }; } const effectivePhase = selectedContextPhase; @@ -511,7 +519,8 @@ const LeadContextBridge = memo(function LeadContextBridge({ if (lastAiItem?.type !== 'ai') { return { allContextInjections: [] as ContextInjection[], - lastAiGroupTotalTokens: undefined, + lastAssistantUsage: null, + lastAssistantModelName: undefined, }; } targetAiGroupId = lastAiItem.group.id; @@ -520,7 +529,8 @@ const LeadContextBridge = memo(function LeadContextBridge({ const stats = leadSessionContextStats.get(targetAiGroupId); const injections = stats?.accumulatedInjections ?? []; - let totalTokens: number | undefined; + let lastUsage: ContextUsageLike | null = null; + let lastModelName: string | undefined; const targetItem = leadConversation.items.find( (item) => item.type === 'ai' && item.group.id === targetAiGroupId ); @@ -529,18 +539,18 @@ const LeadContextBridge = memo(function LeadContextBridge({ for (let i = responses.length - 1; i >= 0; i--) { const msg = responses[i]; if (msg.type === 'assistant' && msg.usage) { - const usage = msg.usage; - totalTokens = - (usage.input_tokens ?? 0) + - (usage.output_tokens ?? 0) + - (usage.cache_read_input_tokens ?? 0) + - (usage.cache_creation_input_tokens ?? 
0); + lastUsage = msg.usage; + lastModelName = msg.model; break; } } } - return { allContextInjections: injections, lastAiGroupTotalTokens: totalTokens }; + return { + allContextInjections: injections, + lastAssistantUsage: lastUsage, + lastAssistantModelName: lastModelName, + }; }, [ leadConversation, leadSessionContextStats, @@ -552,10 +562,26 @@ const LeadContextBridge = memo(function LeadContextBridge({ () => sumContextInjectionTokens(allContextInjections), [allContextInjections] ); - const visibleContextPercentLabel = useMemo( - () => formatPercentOfTotal(visibleContextTokens, lastAiGroupTotalTokens), - [visibleContextTokens, lastAiGroupTotalTokens] + const contextMetrics = useMemo( + () => + deriveContextMetrics({ + usage: lastAssistantUsage, + modelName: lastAssistantModelName, + contextWindowTokens: leadContextSnapshot?.contextWindowTokens ?? null, + visibleContextTokens, + }), + [ + lastAssistantModelName, + lastAssistantUsage, + leadContextSnapshot?.contextWindowTokens, + visibleContextTokens, + ] ); + const contextUsedPercentLabel = useMemo(() => { + const percent = + contextMetrics.contextUsedPercentOfContextWindow ?? leadContextSnapshot?.contextUsedPercent; + return percent === null || percent === undefined ? null : `${percent.toFixed(1)}%`; + }, [contextMetrics.contextUsedPercentOfContextWindow, leadContextSnapshot?.contextUsedPercent]); if (!leadSessionId) { return null; @@ -570,7 +596,7 @@ const LeadContextBridge = memo(function LeadContextBridge({ injections={allContextInjections} onClose={() => setContextPanelVisible(false)} projectRoot={leadSessionDetail?.session?.projectPath ?? fallbackProjectRoot} - totalSessionTokens={lastAiGroupTotalTokens} + contextMetrics={contextMetrics} sessionMetrics={leadSessionDetail?.metrics} subagentCostUsd={leadSubagentCostUsd} phaseInfo={leadSessionPhaseInfo ?? undefined} @@ -585,7 +611,7 @@ const LeadContextBridge = memo(function LeadContextBridge({ >
-

Visible Context

+

Context

{leadSessionLoading ? 'Loading…' : 'No session loaded'}

@@ -644,7 +670,7 @@ const LeadContextBridge = memo(function LeadContextBridge({ : leadSessionId } > - {visibleContextPercentLabel ?? 'Context'} + {contextUsedPercentLabel ?? 'Context'}
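Reviewer note: the `TeamDetailView.tsx` change above swaps the old "visible / total" label for a context-window percentage. The arithmetic behind that label can be sketched roughly as follows. This is a minimal illustration for Anthropic-style usage fields only, not the actual `deriveContextMetrics` implementation; `sketchContextMetrics`, `percentOf`, `formatContextButtonLabel`, and both interfaces are hypothetical names introduced here. The sample numbers are the ones used in the first `contextMetrics.test.ts` case.

```typescript
// Hypothetical sketch of the unified context metrics (Anthropic-style usage only).
interface UsageSketch {
  input_tokens?: number;
  output_tokens?: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

interface MetricsSketch {
  promptInputTokens: number;
  contextUsedTokens: number;
  contextUsedPercent: number | null;
  visiblePercentOfPrompt: number | null;
}

// Returns null when the denominator is unknown or non-positive; caps the result at 100.
function percentOf(tokens: number, total: number | null): number | null {
  if (total === null || total <= 0) return null;
  return Math.min((tokens / total) * 100, 100);
}

function sketchContextMetrics(
  usage: UsageSketch,
  contextWindowTokens: number | null,
  visibleContextTokens: number
): MetricsSketch {
  // Prompt input = uncached input + cache writes + cache reads.
  const promptInputTokens =
    (usage.input_tokens ?? 0) +
    (usage.cache_creation_input_tokens ?? 0) +
    (usage.cache_read_input_tokens ?? 0);
  // Context used = prompt input + generated output.
  const contextUsedTokens = promptInputTokens + (usage.output_tokens ?? 0);
  return {
    promptInputTokens,
    contextUsedTokens,
    contextUsedPercent: percentOf(contextUsedTokens, contextWindowTokens),
    visiblePercentOfPrompt: percentOf(visibleContextTokens, promptInputTokens),
  };
}

// Button label: prefer the derived percent, then a fallback (for example a live
// snapshot value), and only then the plain 'Context' caption.
function formatContextButtonLabel(
  percent: number | null,
  fallbackPercent?: number | null
): string {
  const resolved = percent ?? fallbackPercent;
  return resolved === null || resolved === undefined ? 'Context' : `${resolved.toFixed(1)}%`;
}

const example = sketchContextMetrics(
  {
    input_tokens: 1_200,
    cache_creation_input_tokens: 400,
    cache_read_input_tokens: 600,
    output_tokens: 200,
  },
  200_000,
  550
);
// promptInputTokens = 2_200, contextUsedTokens = 2_400
const exampleLabel = formatContextButtonLabel(example.contextUsedPercent);
```

Keeping the two denominators separate (context window for "Context Used", prompt input for "Visible Context") is what avoids the mixed semantics the audit describes; the sketch deliberately does not handle the OpenAI field shapes or the "fake zero" Codex case.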
diff --git a/src/renderer/components/team/activity/ActivityItem.tsx b/src/renderer/components/team/activity/ActivityItem.tsx index f29ff996..429cbc27 100644 --- a/src/renderer/components/team/activity/ActivityItem.tsx +++ b/src/renderer/components/team/activity/ActivityItem.tsx @@ -668,11 +668,8 @@ export const ActivityItem = memo( }, [message.timestamp]); const structured = parseStructuredAgentMessage(message.text); - const bootstrapDisplay = useMemo(() => getBootstrapPromptDisplay(message), [message]); - const bootstrapAcknowledgement = useMemo( - () => getBootstrapAcknowledgementDisplay(message), - [message] - ); + const bootstrapDisplay = getBootstrapPromptDisplay(message); + const bootstrapAcknowledgement = getBootstrapAcknowledgementDisplay(message); // Only flag agent messages as rate-limited, not user's own quotes const rateLimited = message.from !== 'user' && isRateLimitMessage(message.text); // Highlight messages containing API errors @@ -681,22 +678,16 @@ export const ActivityItem = memo( const isAuthError = isApiError && AUTH_ERROR_PATTERNS.some((p) => p.test(message.text)); // Never collapse rate limit messages as noise — they must be visible const noiseLabel = structured && !rateLimited ? getNoiseLabel(structured) : null; - const idleSemantic = useMemo(() => classifyIdleNotification(message), [message]); + const idleSemantic = classifyIdleNotification(message); const systemLabel = !structured && !rateLimited ? getSystemMessageLabel(message.text) : null; const isManaged = collapseMode === 'managed'; const isExpanded = isManaged ? 
!isCollapsed : true; - const parsedCrossTeamPrefix = useMemo(() => parseCrossTeamPrefix(message.text), [message.text]); - const qualifiedRecipient = useMemo(() => parseQualifiedRecipient(message.to), [message.to]); - const crossTeamSentTarget = useMemo( - () => getCrossTeamSentTarget(message.to, teamName, localMemberNames), - [message.to, teamName, localMemberNames] - ); - const crossTeamSentMemberName = useMemo( - () => getCrossTeamSentMemberName(message.to), - [message.to] - ); + const parsedCrossTeamPrefix = parseCrossTeamPrefix(message.text); + const qualifiedRecipient = parseQualifiedRecipient(message.to); + const crossTeamSentTarget = getCrossTeamSentTarget(message.to, teamName, localMemberNames); + const crossTeamSentMemberName = getCrossTeamSentMemberName(message.to); const isCrossTeam = message.source === CROSS_TEAM_SOURCE || parsedCrossTeamPrefix !== null; const isCrossTeamSent = message.source === CROSS_TEAM_SENT_SOURCE || crossTeamSentTarget !== null; @@ -827,7 +818,7 @@ export const ActivityItem = memo( slashCommandMeta, structured, ]); - const summaryText = useMemo(() => extractMarkdownPlainText(rawSummary), [rawSummary]); + const summaryText = extractMarkdownPlainText(rawSummary); const commentTaskRef = message.messageKind === 'task_comment_notification' ? (message.taskRefs?.[0] ?? null) : null; const commentTaskDisplayId = diff --git a/src/renderer/utils/contextMath.ts b/src/renderer/utils/contextMath.ts index f83c5474..17987f95 100644 --- a/src/renderer/utils/contextMath.ts +++ b/src/renderer/utils/contextMath.ts @@ -40,11 +40,15 @@ export interface RemainingContext { * Returns null if input data is unavailable. 
*/ export function computeRemainingContext( - totalInputTokens: number | undefined, - contextWindow: number = DEFAULT_CONTEXT_WINDOW + usedContextTokens: number | undefined, + contextWindow?: number ): RemainingContext | null { - if (totalInputTokens === undefined || totalInputTokens <= 0) return null; - const remainingPct = Math.max(((contextWindow - totalInputTokens) / contextWindow) * 100, 0); + if (usedContextTokens === undefined || usedContextTokens < 0) return null; + const resolvedContextWindow = contextWindow ?? DEFAULT_CONTEXT_WINDOW; + const remainingPct = Math.max( + ((resolvedContextWindow - usedContextTokens) / resolvedContextWindow) * 100, + 0 + ); const urgency: ContextUrgency = remainingPct < 20 ? 'critical' : remainingPct < 40 ? 'warning' : 'normal'; return { remainingPct, urgency }; diff --git a/src/shared/types/team.ts b/src/shared/types/team.ts index bbdaa950..308e1a2e 100644 --- a/src/shared/types/team.ts +++ b/src/shared/types/team.ts @@ -787,12 +787,22 @@ export interface LeadActivitySnapshot { } export interface LeadContextUsage { - /** Total tokens currently in context (input + cache_creation + cache_read) */ - currentTokens: number; + /** Prompt-side tokens currently occupying the context window. */ + promptInputTokens: number | null; + /** Tokens generated in the latest response. */ + outputTokens: number | null; + /** Total occupied context window tokens (prompt input + output). */ + contextUsedTokens: number | null; /** Model's context window size */ - contextWindow: number; - /** Usage percentage (0-100) */ - percent: number; + contextWindowTokens: number | null; + /** Context usage percentage (0-100) */ + contextUsedPercent: number | null; + /** Which usage contract produced the prompt-side numbers. 
*/ + promptInputSource: + | 'anthropic_usage' + | 'openai_responses_usage' + | 'openai_chat_usage' + | 'unavailable'; /** ISO timestamp of last update */ updatedAt: string; } diff --git a/src/shared/utils/__tests__/contextMetrics.test.ts b/src/shared/utils/__tests__/contextMetrics.test.ts new file mode 100644 index 00000000..73405d9d --- /dev/null +++ b/src/shared/utils/__tests__/contextMetrics.test.ts @@ -0,0 +1,260 @@ +import { describe, expect, it } from 'vitest'; + +import { deriveContextMetrics, inferContextWindowTokens } from '../contextMetrics'; + +describe('contextMetrics', () => { + it('derives exact Anthropic prompt and context usage', () => { + const metrics = deriveContextMetrics({ + providerId: 'anthropic', + modelName: 'claude-sonnet-4-5-20250929', + usage: { + input_tokens: 1_200, + cache_creation_input_tokens: 400, + cache_read_input_tokens: 600, + output_tokens: 200, + }, + visibleContextTokens: 550, + }); + + expect(metrics.contextWindowTokens).toBe(200_000); + expect(metrics.promptInputTokens).toBe(2_200); + expect(metrics.contextUsedTokens).toBe(2_400); + expect(metrics.promptInputSource).toBe('anthropic_usage'); + expect(metrics.contextUsedPercentOfContextWindow).toBeCloseTo(1.2); + expect(metrics.visibleContextPercentOfPromptInput).toBeCloseTo(25); + }); + + it('derives exact OpenAI Responses usage', () => { + const metrics = deriveContextMetrics({ + modelName: 'gpt-5.4', + usage: { + input_tokens: 5_000, + output_tokens: 250, + }, + visibleContextTokens: 900, + }); + + expect(metrics.contextWindowTokens).toBe(1_050_000); + expect(metrics.promptInputTokens).toBe(5_000); + expect(metrics.contextUsedTokens).toBe(5_250); + expect(metrics.promptInputSource).toBe('openai_responses_usage'); + expect(metrics.promptInputPercentOfContextWindow).toBeCloseTo(0.47619, 4); + }); + + it('derives exact OpenAI chat usage without double-counting cache or reasoning breakdowns', () => { + const metrics = deriveContextMetrics({ + providerId: 'codex', + modelName: 
'gpt-5.4', + usage: { + prompt_tokens: 2_006, + completion_tokens: 300, + prompt_tokens_details: { + cached_tokens: 1_920, + }, + completion_tokens_details: { + reasoning_tokens: 120, + }, + }, + visibleContextTokens: 900, + }); + + expect(metrics.contextWindowTokens).toBe(1_050_000); + expect(metrics.promptInputTokens).toBe(2_006); + expect(metrics.outputTokens).toBe(300); + expect(metrics.contextUsedTokens).toBe(2_306); + expect(metrics.promptInputSource).toBe('openai_chat_usage'); + }); + + it('does not double-count OpenAI cached-token breakdowns in Responses usage', () => { + const metrics = deriveContextMetrics({ + providerId: 'codex', + modelName: 'gpt-5.2-codex', + usage: { + input_tokens: 7_500, + output_tokens: 120, + input_tokens_details: { + cached_tokens: 7_168, + }, + output_tokens_details: { + reasoning_tokens: 80, + }, + }, + }); + + expect(metrics.contextWindowTokens).toBe(400_000); + expect(metrics.promptInputTokens).toBe(7_500); + expect(metrics.outputTokens).toBe(120); + expect(metrics.contextUsedTokens).toBe(7_620); + expect(metrics.promptInputSource).toBe('openai_responses_usage'); + }); + + it('marks Codex prompt-side usage unavailable when telemetry reports fake zeros', () => { + const metrics = deriveContextMetrics({ + providerId: 'codex', + modelName: 'gpt-5.4-mini', + usage: { + input_tokens: 0, + cache_creation_input_tokens: 0, + cache_read_input_tokens: 0, + output_tokens: 35, + }, + visibleContextTokens: 700, + }); + + expect(metrics.contextWindowTokens).toBe(400_000); + expect(metrics.promptInputTokens).toBeNull(); + expect(metrics.contextUsedTokens).toBeNull(); + expect(metrics.promptInputSource).toBe('unavailable'); + expect(metrics.visibleContextPercentOfPromptInput).toBeNull(); + }); + + it('infers Anthropic native 1M windows from current raw model ids', () => { + expect( + inferContextWindowTokens({ + providerId: 'anthropic', + modelName: 'claude-opus-4-7', + }) + ).toBe(1_000_000); + expect( + inferContextWindowTokens({ + 
providerId: 'anthropic', + modelName: 'claude-opus-4-6', + }) + ).toBe(1_000_000); + expect( + inferContextWindowTokens({ + providerId: 'anthropic', + modelName: 'claude-sonnet-4-6', + }) + ).toBe(1_000_000); + }); + + it('keeps older raw Anthropic models at 200K unless 1M is explicitly requested', () => { + expect( + inferContextWindowTokens({ + providerId: 'anthropic', + modelName: 'claude-sonnet-4-5-20250929', + }) + ).toBe(200_000); + expect( + inferContextWindowTokens({ + providerId: 'anthropic', + modelName: 'opus[1m]', + }) + ).toBe(1_000_000); + expect( + inferContextWindowTokens({ + providerId: 'anthropic', + modelName: 'claude-sonnet-4-5-20250929[1m]', + }) + ).toBe(1_000_000); + }); + + it('respects limitContext for Anthropic even when the raw model supports 1M', () => { + expect( + inferContextWindowTokens({ + providerId: 'anthropic', + modelName: 'claude-opus-4-6', + limitContext: true, + }) + ).toBe(200_000); + }); + + it('infers Anthropic correctly from 1M aliases even when providerId is omitted', () => { + const metrics = deriveContextMetrics({ + modelName: 'opus[1m]', + usage: { + input_tokens: 1_500, + output_tokens: 100, + }, + }); + + expect(metrics.providerId).toBe('anthropic'); + expect(metrics.contextWindowTokens).toBe(1_000_000); + expect(metrics.promptInputTokens).toBe(1_500); + expect(metrics.contextUsedTokens).toBe(1_600); + expect(metrics.promptInputSource).toBe('anthropic_usage'); + }); + + it('supports Codex/OpenAI model-specific context windows', () => { + expect( + inferContextWindowTokens({ + providerId: 'codex', + modelName: 'gpt-5.4-pro', + }) + ).toBe(1_050_000); + expect( + inferContextWindowTokens({ + providerId: 'codex', + modelName: 'gpt-5.4-mini', + }) + ).toBe(400_000); + expect( + inferContextWindowTokens({ + providerId: 'codex', + modelName: 'codex-mini-latest', + }) + ).toBe(200_000); + }); + + it('covers the current team Codex model matrix with documented context windows', () => { + expect( + inferContextWindowTokens({ 
+ providerId: 'codex', + modelName: 'gpt-5.4-mini', + }) + ).toBe(400_000); + expect( + inferContextWindowTokens({ + providerId: 'codex', + modelName: 'gpt-5.3-codex', + }) + ).toBe(400_000); + expect( + inferContextWindowTokens({ + providerId: 'codex', + modelName: 'gpt-5.3-codex-spark', + }) + ).toBe(400_000); + expect( + inferContextWindowTokens({ + providerId: 'codex', + modelName: 'gpt-5.2', + }) + ).toBe(400_000); + expect( + inferContextWindowTokens({ + providerId: 'codex', + modelName: 'gpt-5.2-codex', + }) + ).toBe(400_000); + expect( + inferContextWindowTokens({ + providerId: 'codex', + modelName: 'gpt-5.1-codex-mini', + }) + ).toBe(400_000); + expect( + inferContextWindowTokens({ + providerId: 'codex', + modelName: 'gpt-5.1-codex-max', + }) + ).toBe(400_000); + }); + + it('prefers an explicit context window override over inferred model defaults', () => { + const metrics = deriveContextMetrics({ + providerId: 'anthropic', + modelName: 'claude-opus-4-6', + contextWindowTokens: 200_000, + usage: { + input_tokens: 1_000, + output_tokens: 100, + }, + }); + + expect(metrics.contextWindowTokens).toBe(200_000); + expect(metrics.promptInputTokens).toBe(1_000); + expect(metrics.contextUsedTokens).toBe(1_100); + }); +}); diff --git a/src/shared/utils/__tests__/teamProvider.test.ts b/src/shared/utils/__tests__/teamProvider.test.ts new file mode 100644 index 00000000..50c60c1c --- /dev/null +++ b/src/shared/utils/__tests__/teamProvider.test.ts @@ -0,0 +1,17 @@ +import { describe, expect, it } from 'vitest'; + +import { inferTeamProviderIdFromModel } from '../teamProvider'; + +describe('inferTeamProviderIdFromModel', () => { + it('recognizes Anthropic aliases with 1m suffixes', () => { + expect(inferTeamProviderIdFromModel('opus[1m]')).toBe('anthropic'); + expect(inferTeamProviderIdFromModel('sonnet[1m]')).toBe('anthropic'); + expect(inferTeamProviderIdFromModel('haiku[1m]')).toBe('anthropic'); + }); + + it('recognizes full provider-scoped model ids', () => { + 
expect(inferTeamProviderIdFromModel('claude-opus-4-6')).toBe('anthropic'); + expect(inferTeamProviderIdFromModel('gpt-5.4')).toBe('codex'); + expect(inferTeamProviderIdFromModel('gemini-2.5-pro')).toBe('gemini'); + }); +}); diff --git a/src/shared/utils/contextMetrics.ts b/src/shared/utils/contextMetrics.ts new file mode 100644 index 00000000..f37ff56f --- /dev/null +++ b/src/shared/utils/contextMetrics.ts @@ -0,0 +1,236 @@ +import { inferTeamProviderIdFromModel } from './teamProvider'; + +import type { TeamProviderId } from '@shared/types/team'; + +const ANTHROPIC_DEFAULT_CONTEXT_WINDOW = 200_000; +const ANTHROPIC_EXTENDED_CONTEXT_WINDOW = 1_000_000; +const OPENAI_COMPACT_CONTEXT_WINDOW = 200_000; +const OPENAI_DEFAULT_CONTEXT_WINDOW = 400_000; +const OPENAI_LONG_CONTEXT_WINDOW = 1_050_000; + +export interface ContextUsageLike { + input_tokens?: number; + output_tokens?: number; + cache_read_input_tokens?: number; + cache_creation_input_tokens?: number; + prompt_tokens?: number; + completion_tokens?: number; + total_tokens?: number; + input_tokens_details?: { + cached_tokens?: number; + }; + prompt_tokens_details?: { + cached_tokens?: number; + }; + output_tokens_details?: { + reasoning_tokens?: number; + }; + completion_tokens_details?: { + reasoning_tokens?: number; + }; +} + +export type PromptInputSource = + | 'anthropic_usage' + | 'openai_responses_usage' + | 'openai_chat_usage' + | 'unavailable'; + +export interface DerivedContextMetrics { + providerId: TeamProviderId | undefined; + modelName: string | undefined; + contextWindowTokens: number | null; + promptInputTokens: number | null; + outputTokens: number | null; + contextUsedTokens: number | null; + visibleContextTokens: number; + promptInputSource: PromptInputSource; + contextUsedSource: PromptInputSource | 'unavailable'; + promptInputPercentOfContextWindow: number | null; + contextUsedPercentOfContextWindow: number | null; + visibleContextPercentOfPromptInput: number | null; +} + +interface 
InferContextWindowTokensParams {
+  providerId?: TeamProviderId;
+  modelName?: string;
+  limitContext?: boolean;
+}
+
+interface DeriveContextMetricsParams extends InferContextWindowTokensParams {
+  usage?: ContextUsageLike | null;
+  contextWindowTokens?: number | null;
+  visibleContextTokens?: number;
+}
+
+function readFiniteNumber(value: unknown): number | null {
+  return typeof value === 'number' && Number.isFinite(value) ? value : null;
+}
+
+function readPositiveNumber(value: unknown): number | null {
+  const num = readFiniteNumber(value);
+  return num !== null && num > 0 ? num : null;
+}
+
+function computePercent(tokens: number | null, totalTokens: number | null): number | null {
+  if (tokens === null || totalTokens === null || totalTokens <= 0) {
+    return null;
+  }
+  if (!Number.isFinite(tokens) || tokens <= 0) {
+    return 0;
+  }
+  return Math.min((tokens / totalTokens) * 100, 100);
+}
+
+function isOpenAiLongContextModel(modelName: string | undefined): boolean {
+  const normalized = modelName?.trim().toLowerCase();
+  if (!normalized) {
+    return false;
+  }
+
+  return (
+    normalized === 'gpt-5.4' ||
+    normalized.startsWith('gpt-5.4-202') ||
+    normalized === 'gpt-5.4-pro' ||
+    normalized.startsWith('gpt-5.4-pro-202')
+  );
+}
+
+function isOpenAiCompactContextModel(modelName: string | undefined): boolean {
+  const normalized = modelName?.trim().toLowerCase();
+  if (!normalized) {
+    return false;
+  }
+
+  return normalized === 'codex-mini-latest' || normalized.startsWith('codex-mini-latest-');
+}
+
+function isAnthropicNativeLongContextModel(modelName: string | undefined): boolean {
+  const normalized = modelName?.trim().toLowerCase();
+  if (!normalized) {
+    return false;
+  }
+
+  return (
+    normalized.startsWith('claude-opus-4-7') ||
+    normalized.startsWith('claude-opus-4-6') ||
+    normalized.startsWith('claude-sonnet-4-6') ||
+    normalized.startsWith('claude-mythos')
+  );
+}
+
+function hasOpenAiPromptDetails(usage: ContextUsageLike): boolean {
+  return (
+
    readFiniteNumber(usage.input_tokens_details?.cached_tokens) !== null ||
+    readFiniteNumber(usage.prompt_tokens_details?.cached_tokens) !== null
+  );
+}
+
+export function inferContextWindowTokens({
+  providerId,
+  modelName,
+  limitContext,
+}: InferContextWindowTokensParams): number | null {
+  const resolvedProviderId = providerId ?? inferTeamProviderIdFromModel(modelName);
+  const normalizedModel = modelName?.trim().toLowerCase();
+
+  if (resolvedProviderId === 'anthropic') {
+    if (limitContext) {
+      return ANTHROPIC_DEFAULT_CONTEXT_WINDOW;
+    }
+    if (normalizedModel?.includes('[1m]') || isAnthropicNativeLongContextModel(normalizedModel)) {
+      return ANTHROPIC_EXTENDED_CONTEXT_WINDOW;
+    }
+    return ANTHROPIC_DEFAULT_CONTEXT_WINDOW;
+  }
+
+  if (resolvedProviderId === 'codex') {
+    if (isOpenAiCompactContextModel(normalizedModel)) {
+      return OPENAI_COMPACT_CONTEXT_WINDOW;
+    }
+    return isOpenAiLongContextModel(normalizedModel)
+      ? OPENAI_LONG_CONTEXT_WINDOW
+      : OPENAI_DEFAULT_CONTEXT_WINDOW;
+  }
+
+  return null;
+}
+
+export function deriveContextMetrics({
+  usage,
+  providerId,
+  modelName,
+  contextWindowTokens,
+  visibleContextTokens = 0,
+  limitContext,
+}: DeriveContextMetricsParams): DerivedContextMetrics {
+  const resolvedProviderId = providerId ?? inferTeamProviderIdFromModel(modelName);
+  const resolvedContextWindowTokens =
+    readPositiveNumber(contextWindowTokens) ??
+    inferContextWindowTokens({
+      providerId: resolvedProviderId,
+      modelName,
+      limitContext,
+    });
+  const safeVisibleContextTokens =
+    Number.isFinite(visibleContextTokens) && visibleContextTokens > 0 ? visibleContextTokens : 0;
+  const safeUsage = usage ?? {};
+  const outputTokens =
+    readFiniteNumber(safeUsage.output_tokens) ?? readFiniteNumber(safeUsage.completion_tokens);
+  const promptTokens = readFiniteNumber(safeUsage.prompt_tokens);
+  const inputTokens = readFiniteNumber(safeUsage.input_tokens);
+  const cacheReadTokens = readFiniteNumber(safeUsage.cache_read_input_tokens) ??
0;
+  const cacheCreationTokens = readFiniteNumber(safeUsage.cache_creation_input_tokens) ?? 0;
+
+  let promptInputTokens: number | null = null;
+  let promptInputSource: PromptInputSource = 'unavailable';
+
+  if (promptTokens !== null) {
+    promptInputTokens = promptTokens;
+    promptInputSource = 'openai_chat_usage';
+  } else if (inputTokens !== null) {
+    const shouldUseAnthropicFormula =
+      resolvedProviderId === 'anthropic' || cacheReadTokens > 0 || cacheCreationTokens > 0;
+
+    if (shouldUseAnthropicFormula) {
+      promptInputTokens = inputTokens + cacheReadTokens + cacheCreationTokens;
+      promptInputSource = 'anthropic_usage';
+    } else {
+      const missingOpenAiPromptTelemetry =
+        resolvedProviderId === 'codex' &&
+        inputTokens === 0 &&
+        cacheReadTokens === 0 &&
+        cacheCreationTokens === 0 &&
+        !hasOpenAiPromptDetails(safeUsage);
+
+      if (!missingOpenAiPromptTelemetry) {
+        promptInputTokens = inputTokens;
+        promptInputSource = 'openai_responses_usage';
+      }
+    }
+  }
+
+  const contextUsedTokens =
+    promptInputTokens !== null && outputTokens !== null ? promptInputTokens + outputTokens : null;
+
+  return {
+    providerId: resolvedProviderId,
+    modelName,
+    contextWindowTokens: resolvedContextWindowTokens,
+    promptInputTokens,
+    outputTokens,
+    contextUsedTokens,
+    visibleContextTokens: safeVisibleContextTokens,
+    promptInputSource,
+    contextUsedSource: contextUsedTokens !== null ?
promptInputSource : 'unavailable',
+    promptInputPercentOfContextWindow: computePercent(
+      promptInputTokens,
+      resolvedContextWindowTokens
+    ),
+    contextUsedPercentOfContextWindow: computePercent(
+      contextUsedTokens,
+      resolvedContextWindowTokens
+    ),
+    visibleContextPercentOfPromptInput: computePercent(safeVisibleContextTokens, promptInputTokens),
+  };
+}
diff --git a/src/shared/utils/modelParser.ts b/src/shared/utils/modelParser.ts
index 9bfdb686..c2b10711 100644
--- a/src/shared/utils/modelParser.ts
+++ b/src/shared/utils/modelParser.ts
@@ -3,7 +3,7 @@
  * Parses model identifiers into friendly display names and metadata.
  */
 
-/** Default context window size for Claude models (all current models use 200K) */
+/** Fallback context window size when a more exact model-specific window is unavailable. */
 export const DEFAULT_CONTEXT_WINDOW = 200_000;
 
 /** Known model families with specific styling */
diff --git a/src/shared/utils/teamProvider.ts b/src/shared/utils/teamProvider.ts
index d1acdfff..b4890da4 100644
--- a/src/shared/utils/teamProvider.ts
+++ b/src/shared/utils/teamProvider.ts
@@ -22,20 +22,33 @@ export function inferTeamProviderIdFromModel(
   if (!normalized) {
     return undefined;
   }
+  const normalizedWithoutExtendedContextSuffix = normalized.replace(/(?:\[1m\])+$/, '');
 
-  if (normalized.startsWith('gpt-') || normalized.startsWith('codex')) {
+  if (
+    normalized.startsWith('gpt-') ||
+    normalized.startsWith('codex') ||
+    normalizedWithoutExtendedContextSuffix.startsWith('gpt-') ||
+    normalizedWithoutExtendedContextSuffix.startsWith('codex')
+  ) {
     return 'codex';
   }
 
-  if (normalized.startsWith('gemini')) {
+  if (
+    normalized.startsWith('gemini') ||
+    normalizedWithoutExtendedContextSuffix.startsWith('gemini')
+  ) {
     return 'gemini';
   }
 
   if (
     normalized.startsWith('claude') ||
+    normalizedWithoutExtendedContextSuffix.startsWith('claude') ||
     normalized === 'opus' ||
+    normalizedWithoutExtendedContextSuffix === 'opus' ||
     normalized === 'sonnet' ||
-    normalized ===
'haiku'
+    normalizedWithoutExtendedContextSuffix === 'sonnet' ||
+    normalized === 'haiku' ||
+    normalizedWithoutExtendedContextSuffix === 'haiku'
   ) {
     return 'anthropic';
   }
diff --git a/test/main/services/team/TeamProvisioningServicePrepare.test.ts b/test/main/services/team/TeamProvisioningServicePrepare.test.ts
index bdd11285..6d641ce9 100644
--- a/test/main/services/team/TeamProvisioningServicePrepare.test.ts
+++ b/test/main/services/team/TeamProvisioningServicePrepare.test.ts
@@ -505,6 +505,68 @@ describe('TeamProvisioningService prepare/auth behavior', () => {
     );
   });
 
+  it('preserves a requested 1M Anthropic window when runtime logs strip the [1m] suffix', () => {
+    const svc = new TeamProvisioningService();
+    const run = {
+      request: {
+        providerId: 'anthropic',
+        model: 'opus[1m]',
+        limitContext: false,
+      },
+      leadContextUsage: null,
+    } as any;
+
+    (svc as any).updateLeadContextUsageFromUsage(
+      run,
+      {
+        input_tokens: 12,
+        cache_creation_input_tokens: 34,
+        cache_read_input_tokens: 56,
+        output_tokens: 7,
+      },
+      'claude-opus-4-6'
+    );
+
+    expect(run.leadContextUsage).toMatchObject({
+      promptInputTokens: 102,
+      outputTokens: 7,
+      contextUsedTokens: 109,
+      contextWindowTokens: 1_000_000,
+      promptInputSource: 'anthropic_usage',
+    });
+  });
+
+  it('preserves a limited 200K Anthropic window when runtime logs strip the [1m] suffix', () => {
+    const svc = new TeamProvisioningService();
+    const run = {
+      request: {
+        providerId: 'anthropic',
+        model: 'opus',
+        limitContext: true,
+      },
+      leadContextUsage: null,
+    } as any;
+
+    (svc as any).updateLeadContextUsageFromUsage(
+      run,
+      {
+        input_tokens: 12,
+        cache_creation_input_tokens: 34,
+        cache_read_input_tokens: 56,
+        output_tokens: 7,
+      },
+      'claude-opus-4-6'
+    );
+
+    expect(run.leadContextUsage).toMatchObject({
+      promptInputTokens: 102,
+      outputTokens: 7,
+      contextUsedTokens: 109,
+      contextWindowTokens: 200_000,
+      promptInputSource: 'anthropic_usage',
+    });
+  });
+
+  it('emits a lead-message refresh after
provisioning reaches ready', async () => {
     const svc = new TeamProvisioningService();
     const emitter = vi.fn();
diff --git a/test/renderer/components/common/TokenUsageDisplay.test.ts b/test/renderer/components/common/TokenUsageDisplay.test.ts
new file mode 100644
index 00000000..ce584d6d
--- /dev/null
+++ b/test/renderer/components/common/TokenUsageDisplay.test.ts
@@ -0,0 +1,117 @@
+import React, { act } from 'react';
+import { createRoot } from 'react-dom/client';
+import { afterEach, describe, expect, it, vi } from 'vitest';
+
+import { TokenUsageDisplay } from '../../../../src/renderer/components/common/TokenUsageDisplay';
+
+import type { ContextStats } from '../../../../src/renderer/types/contextInjection';
+
+const contextStats: ContextStats = {
+  newInjections: [],
+  accumulatedInjections: [
+    {
+      id: 'claude-md-1',
+      category: 'claude-md',
+      path: '/workspace/CLAUDE.md',
+      source: 'project-local',
+      displayName: 'CLAUDE.md',
+      isGlobal: false,
+      estimatedTokens: 200,
+      firstSeenInGroup: 'ai-0',
+    },
+    {
+      id: 'mentioned-file-1',
+      category: 'mentioned-file',
+      path: '/workspace/file.ts',
+      displayName: 'file.ts',
+      estimatedTokens: 300,
+      firstSeenTurnIndex: 0,
+      firstSeenInGroup: 'ai-0',
+      exists: true,
+    },
+  ],
+  totalEstimatedTokens: 500,
+  tokensByCategory: {
+    claudeMd: 200,
+    mentionedFiles: 300,
+    toolOutputs: 0,
+    thinkingText: 0,
+    taskCoordination: 0,
+    userMessages: 0,
+  },
+  newCounts: {
+    claudeMd: 0,
+    mentionedFiles: 0,
+    toolOutputs: 0,
+    thinkingText: 0,
+    taskCoordination: 0,
+    userMessages: 0,
+  },
+};
+
+async function flushReact(): Promise<void> {
+  await Promise.resolve();
+  await Promise.resolve();
+}
+
+describe('TokenUsageDisplay', () => {
+  afterEach(() => {
+    document.body.innerHTML = '';
+    vi.restoreAllMocks();
+  });
+
+  it('keeps visible context scoped to prompt input instead of context window semantics', async () => {
+    vi.stubGlobal('IS_REACT_ACT_ENVIRONMENT', true);
+
+    const host = document.createElement('div');
+
    document.body.appendChild(host);
+    const root = createRoot(host);
+
+    await act(async () => {
+      root.render(
+        React.createElement(TokenUsageDisplay, {
+          inputTokens: 1000,
+          cacheReadTokens: 500,
+          cacheCreationTokens: 500,
+          outputTokens: 250,
+          contextStats,
+        })
+      );
+      await flushReact();
+    });
+
+    const trigger = host.querySelector('[aria-haspopup="true"]');
+    expect(trigger).toBeInstanceOf(HTMLElement);
+
+    await act(async () => {
+      trigger?.dispatchEvent(new KeyboardEvent('keydown', { key: 'Enter', bubbles: true }));
+      await flushReact();
+    });
+
+    const popover = document.querySelector('[role="tooltip"]');
+    expect(popover).toBeTruthy();
+    expect(popover?.textContent).toContain('2,250');
+    expect(popover?.textContent).toContain('500 (25.0% of prompt input)');
+    expect(popover?.textContent).not.toContain('of context');
+
+    const visibleContextToggle = Array.from(document.querySelectorAll('[role="button"]')).find(
+      (element) => element.textContent?.includes('Visible Context')
+    );
+    expect(visibleContextToggle).toBeTruthy();
+
+    await act(async () => {
+      visibleContextToggle?.dispatchEvent(new MouseEvent('click', { bubbles: true }));
+      await flushReact();
+    });
+
+    expect(popover?.textContent).toContain('CLAUDE.md ×1');
+    expect(popover?.textContent).toContain('(10.0%)');
+    expect(popover?.textContent).toContain('@files ×1');
+    expect(popover?.textContent).toContain('(15.0%)');
+
+    await act(async () => {
+      root.unmount();
+      await flushReact();
+    });
+  });
+});