agent-ecosystem/docs/research/agent-launch-architecture-comparison-2026-05-07.md
2026-05-07 23:26:37 +03:00

15 KiB

Agent launch architecture comparison research

Research date: 2026-05-07

Purpose: record factual research on how different systems launch or execute agents. This is informational context only, not an implementation recommendation.

Scope

Systems compared:

System Repository / source Snapshot
Our Agent Teams Local claude_team + agent_teams_orchestrator Local working tree, 2026-05-07
Paperclip paperclipai docs/code research from earlier pass Public docs / local research
Gastown github.com/gastownhall/gastown cloned cfbdf3c
GoClaw Enterprise / Teams github.com/nextlevelbuilder/goclaw cloned a97e502
GoClaw OpenClaw-compatible gateway github.com/roelfdiedericks/goclaw cloned 6a7ccdb

Primary external references:

Topic Source
Gastown README https://github.com/gastownhall/gastown/blob/main/README.md
Gastown agent provider integration https://github.com/gastownhall/gastown/blob/main/docs/agent-provider-integration.md
GoClaw agent loop https://github.com/nextlevelbuilder/goclaw/blob/main/docs/01-agent-loop.md
GoClaw agent teams https://github.com/nextlevelbuilder/goclaw/blob/main/docs/11-agent-teams.md
GoClaw team WS events https://github.com/nextlevelbuilder/goclaw/blob/main/docs/13-ws-team-events.md
Paperclip agent runtime https://github.com/paperclipai/docs/blob/main/agents-runtime.md
Paperclip adapters overview https://paperclip.inc/docs/adapters/overview/

Short answer

There are four distinct launch/execution models:

Model Used by Essence
External live CLI process Our Agent Teams App/orchestrator launches real teammate runtimes and tracks bootstrap, PID, stderr, process health, runtime evidence, task/message state.
Bounded adapter run Paperclip A heartbeat or job starts a short agent run, adapter invokes CLI/provider, result is captured, run exits or times out.
Tmux session orchestration Gastown tmux is the universal runtime adapter. Agents run in terminal sessions, receive input through tmux, and are observed through panes/session state.
In-process agent loop GoClaw Enterprise / Teams Agent execution is a Go Loop.Run(ctx, RunRequest) scheduled through lanes. The agent is a logical loop inside the gateway, not necessarily a separate CLI teammate process.

What “in-process agent loop” removes from our live teammate product

In-process loop does not mean “bad”. It is often cleaner. But compared to our external process teammate model, it removes or changes several product properties.

Product property External process teammate, our model In-process GoClaw-style loop
Real process identity Each teammate can have PID/RSS/stdout/stderr/process lifetime. Agent run is a gateway invocation; no independent teammate PID by default.
CLI-realism Claude/Codex/OpenCode behave as their real CLI runtimes, including auth, prompts, provider errors, stderr quirks. Provider/driver behavior is normalized inside gateway; fewer raw CLI lifecycle surfaces.
Per-member restart semantics Restart means kill/relaunch or reattach a concrete runtime for that member. Restart is usually cancel/reschedule a logical run/session.
Bootstrap evidence We can distinguish process alive, bootstrap submitted, bootstrap confirmed, delivery proof, task proof. The loop itself is already the controlled runtime; less need for low-level bootstrap proof.
UI runtime cards UI can show memory, process state, liveness source, failed/stalled bootstrap, exact runtime diagnostics. UI tends to show run/session/task status rather than OS/process-level teammate state.
TTY/process debugging Process/tmux mode can expose raw CLI behavior when needed. Debugging is gateway traces/events/logs, not a live CLI pane/process per member.
Failure classes Auth prompt, no stdin, CLI did not submit bootstrap, process died, stale PID, provider CLI stderr. Mostly provider/tool/session/run errors inside the loop.
Isolation boundary OS process boundary per teammate. Mostly logical/session isolation inside one gateway process, unless it delegates to external providers/tools.

Important distinction: in-process loop is simpler and can be more stable for gateway/chat products. It is not a drop-in replacement for a desktop product whose value includes live external teammate runtimes.

Our Agent Teams launch/execution model

Our current direction is app-managed live external teammate runtime.

Observed local architecture:

Layer Role
claude_team Electron app UI, provisioning, runtime projection, team messages, tasks, diagnostics, retries.
agent_teams_orchestrator Multi-agent runtime orchestration, teammate spawning, provider/runtime bridging.
Process backend Default for app-launched teammates after recent changes. Launch-owned processes are tracked as runtime entities.
Optional tmux mode Debug/manual mode, not production default. Useful for real TTY inspection.
App-managed bootstrap Backend injects/records startup context and requires durable readiness evidence instead of trusting “process exists”.
Runtime projection Maps launch state, process liveness, bootstrap proof, delivery proof, task state and diagnostics to UI.

Key properties:

Dimension Current behavior
Agent lifetime Long-lived teammate process/session, not just one request.
Availability proof Process alive is not enough. Need bootstrap/runtime evidence.
Provider mix Claude, Codex, OpenCode can coexist in one team.
User experience Live team room: cards, memory, tasks, messages, runtime errors, restart/retry controls.
Complexity cost High. Many edge cases around launch, cleanup, stale state, delivery, work-sync, retries.

Technical assessment:

Criterion Score
Live team product fit 9.2/10
Mixed provider fidelity 8.7/10
Runtime proof strictness 8.8/10
Simplicity 5.8/10
Maintainability today 7.2/10
Overall technical score 8.5/10

Paperclip launch/execution model

Paperclip is closest to a bounded job/heartbeat runner.

Research summary from earlier pass:

Piece Behavior
Agent invocation Heartbeat or scheduled run calls adapter execution.
Runtime Adapter starts/calls CLI or provider, captures output/status/errors.
Lifecycle Run exits, times out, or is cancelled.
Concurrency Wakeups coalesce if agent is already running.
Persistence Status/logs/tokens/errors are stored per run.

This is operationally clean because there is no expectation that every teammate is a continuously alive process with card-level runtime state.

Technical assessment:

Criterion Score
Bounded execution design 9.1/10
Simplicity 8.8/10
Failure boundedness 8.7/10
Live teammate room fit 6.3/10
External CLI fidelity 7.5/10
Overall technical score 8.2/10

Gastown launch/execution model

Gastown is tmux-first.

Facts from gastownhall/gastown:

Piece Behavior
Main runtime adapter tmux sessions.
Universal integration Any CLI that runs in terminal can be started and controlled.
Work unit Beads/issues and convoys.
Worker identity Polecats have persistent identity and reusable worktrees.
Session lifetime Sessions are ephemeral; identity and sandbox can persist.
Communication Mail, nudges, hooks, Beads state, tmux input/output.
Monitoring Witness, Deacon, Dogs, Doctor, cleanup commands.
Provider integration Built-in/custom presets with command, args, env, process names, hooks, readiness delay/prompt.

Gastown explicitly documents a Tier 0 tmux shim: start CLI in tmux, send work through keystrokes, detect liveness through pane process, read output through captured pane. It also notes that this level is timing-sensitive and lacks delivery confirmation.

Core model:

gt sling <bead> <rig>
  -> allocate or reuse polecat identity/worktree
  -> create tmux session
  -> set env: GT_ROLE, GT_RIG, GT_POLECAT, BD_ACTOR, GT_AGENT, etc.
  -> inject startup beacon / prompt / hook context
  -> nudge with instructions if provider needs fallback
  -> Witness/Deacon patrol health and cleanup

Technical assessment:

Criterion Score
Terminal-native ops 9.0/10
Persistent worker identity 8.7/10
Cleanup / doctor culture 8.8/10
Delivery proof strictness 6.4/10
Live product state consistency 6.8/10
Overall technical score 8.0/10

GoClaw Enterprise / Teams launch/execution model

This is nextlevelbuilder/goclaw, the relevant GoClaw for agent teams.

Core architecture from docs/code:

Piece Behavior
Agent unit Loop configured with provider, model, tools, workspace and agent type.
Run entrypoint Loop.Run(ctx, RunRequest).
Loop pattern Think -> Act -> Observe, with max iterations and tool execution.
Scheduler First-class lane scheduler.
Lanes main, subagent, team, cron.
Queueing Per-session queues with debounce, drop policy, max concurrent.
Team model Lead/member, task board, mailbox, delegation.
Task semantics Atomic claim, status lifecycle, dependencies, blocker escalation, task events.
Events Typed WS events for delegation, tasks, team messages and agent lifecycle.

Core execution shape:

Inbound message / teammate message / cron / delegation
  -> Scheduler.Schedule(lane, RunRequest)
  -> SessionQueue serializes or bounds per session
  -> Lane worker admits execution
  -> Router.Get(agentID)
  -> Loop.Run(ctx, req)
  -> Provider call + tools + finalization
  -> Events + stored session/task/trace state

GoClaw team member execution is conceptually a scheduled agent run, not an externally spawned teammate CLI process with bootstrap/check-in.

Technical assessment:

Criterion Score
Scheduler architecture 9.2/10
Agent loop clarity 8.9/10
Team task model 8.8/10
Typed event model 8.8/10
Real external teammate runtime fidelity 6.6/10
Live process UI fit 6.5/10
Overall technical score 8.7/10

GoClaw OpenClaw-compatible gateway model

This is roelfdiedericks/goclaw. It is a different project than nextlevelbuilder/goclaw.

High-level facts:

Piece Behavior
Product class Personal AI gateway / OpenClaw-compatible bot runtime.
Main strengths Transcript search, memory graph, channels, persistent memory, delegated runs, ACP sessions.
Delegated work subagent_spawn, subagent_fanout, subagent_status, subagent_cancel.
Runner DefaultRunner starts active runs as goroutines with run IDs, timeout/cancel, optional concurrency lane semaphore.
UI/control /runners dashboard, SSE events, Telegram/TUI summaries.
Cursor integration ACP attachment to live Cursor session.

Runner shape:

subagent_spawn / fanout
  -> DefaultRunner.Start(ctx, RunSpec)
  -> create RunRecord queued
  -> goroutine waits for lane admission
  -> execute function runs child work
  -> registry records completed/failed/canceled/timeout
  -> events emitted

This is closer to Paperclip-style delegated bounded runs than to our live teammate process model.

Technical assessment:

Criterion Score
Personal gateway/memory architecture 8.8/10
Delegated run boundedness 8.5/10
Channel/memory richness 9.0/10
Live external teammate fidelity 5.8/10
Team room fit 6.4/10
Overall technical score 8.1/10

Direct comparison table

System Launch/execution primitive Separate OS process per agent? Long-lived teammate? Task board Team messages Scheduler Tmux Best fit
Our Agent Teams Launch-owned external CLI/process runtime Yes Yes Yes Yes Partial/ad-hoc today Optional/debug Desktop live mixed-provider team room
Paperclip Bounded adapter heartbeat run Usually per run No Limited/not central Not team-room focused Yes, job-like No core tmux Reliable background/job agents
Gastown Tmux session + worktree + Beads Yes, through tmux Session ephemeral, identity persistent Beads/convoys Mail/nudges Scheduler/capacity exists Core Terminal-native multi-agent ops
GoClaw Enterprise In-process scheduled agent loop Not by default Logical sessions/runs Yes Yes First-class lanes No core tmux Multi-agent gateway/platform
GoClaw OpenClaw-compatible Delegated goroutine runner + gateway sessions Not by default Logical runs/sessions Not primary team board in same way Channels Runner lane semaphore No core tmux Personal gateway, memory, delegated runs

Honest overall scores

System Overall technical score Why
GoClaw Enterprise / Teams 8.7/10 Cleanest scheduler/team/task/event architecture among compared systems.
Our Agent Teams 8.5/10 Best fit for real live external Claude/Codex/OpenCode teammate product, but high complexity.
Paperclip 8.2/10 Very clean bounded runtime model, but not a live team-room system.
GoClaw OpenClaw-compatible 8.1/10 Strong personal gateway/memory/delegated run model, less comparable to our team runtime.
Gastown 8.0/10 Strong terminal ops and lifecycle culture, but tmux-first delivery/readiness is less proof-strict.

Research conclusions

The systems optimize for different truths:

System Optimized for
Our Agent Teams User-visible live team of real external coding agents.
Paperclip Bounded, simple, resumable background agent runs.
Gastown Terminal-native agent ops at scale with durable work identity.
GoClaw Enterprise Clean gateway-native multi-agent scheduling and team task orchestration.
GoClaw OpenClaw-compatible Long-memory personal agent gateway with delegated subruns.

Most useful conceptual takeaways for future reference:

Idea Source Why it matters
First-class scheduler lanes GoClaw Enterprise Separates main/team/subagent/cron load and makes cancellation/backpressure more deterministic.
Typed team event catalog GoClaw Enterprise Makes UI and state transitions easier to reason about.
Persistent identity vs ephemeral session Gastown Useful framing for member identity, runtime session, task ownership and cleanup.
Bounded adapter runs Paperclip Good model for cron, background checks and non-live workers.
Patrol/doctor cleanup culture Gastown Good operational model for stale runtime/process/data cleanup.

Non-recommendation note: this document intentionally does not propose changing our architecture. It records observed models for future design discussions.