- Introduced new planning files including PROJECT.md, REQUIREMENTS.md, ROADMAP.md, and STATE.md to outline the vision and requirements for SSH multi-context workspaces. - Added ARCHITECTURE.md and CONCERNS.md to detail the codebase structure and address technical debt, known bugs, and security considerations. - Created CONVENTIONS.md to establish coding standards and practices for the project. - Updated .gitignore to exclude demo files and added configuration for planning tools. This commit lays the groundwork for enhancing SSH functionality and user experience in managing multiple workspaces.
24 KiB
Pitfalls Research
Domain: Multi-context workspace switching for Electron + Zustand apps Researched: 2026-02-12 Confidence: HIGH
Critical Pitfalls
Pitfall 1: Destructive Context Switching with Incomplete State Snapshots
What goes wrong:
Switching from SSH to local (or vice versa) calls getFullResetState() which wipes ALL selections and loaded data, destroying the user's previous workspace state. When switching back, users must re-navigate to their project, re-open sessions, and restore their scroll position manually.
Why it happens:
Current implementation treats context switch as a "hard reset" instead of maintaining separate state snapshots per context. The connectionSlice spreads getFullResetState() on connect/disconnect, clearing selectedProjectId, selectedSessionId, conversation, tabs, and all derived state.
How to avoid:
- Snapshot before switching: Capture full AppState before connect/disconnect
- Scope snapshots by context: Use
Map<'local' | SshHostKey, AppState>to store separate state per context - Partial restoration: Only restore state compatible with the new context (project IDs may differ between local/SSH)
- Preserve UI preferences: Tab layout, expanded groups, scroll positions are context-agnostic and should always restore
Warning signs:
- User complaints about "losing their place" when switching contexts
- Empty dashboard after SSH connect, despite having viewed sessions before
- No tabs open after switching back to local mode
- Context panel expansions reset on every switch
Phase to address: Phase 1 (Core Infrastructure) — state snapshot/restore must be foundational before adding multi-context features.
Pitfall 2: EventEmitter Listener Accumulation on Repeated Context Switches
What goes wrong:
FileWatcher, SshConnectionManager, and NotificationManager are EventEmitters. Each context switch may call setFileSystemProvider() or re-initialize services without removing old listeners, causing:
- Memory leaks (listeners hold references to previous provider instances)
- Duplicate event emissions (one event triggers 2x, 3x handlers)
- Stale event handlers executing against wrong context
Why it happens:
Services use .on() to attach listeners but don't call .removeAllListeners() or track cleanup functions. The initializeNotificationListeners() function in store/index.ts returns a cleanup function, but there's no guarantee it's called before re-initialization.
How to avoid:
- Centralized cleanup: Create
dispose()methods for all EventEmitter services - Lifecycle tracking: Maintain
Map<string, () => void>of cleanup functions keyed by service name - Pre-switch cleanup: Call all cleanup functions BEFORE provider swap
- Idempotent initialization: Check
if (this.isInitialized) returnto prevent double-init - Avoid
removeAllListeners()on ipcRenderer: Always specify channel name — blanket removal breaks Electron internals
Warning signs:
- Process memory usage (RSS) grows 50-100MB per context switch
- DevTools heap snapshot shows multiple FileWatcher/SshConnectionManager instances
- Console logs show duplicate "file-change" events for same file
- IPC handlers fire 2x for single user action
Phase to address: Phase 1 (Core Infrastructure) — service lifecycle management is critical before multi-context.
Sources:
- Error: Removing all listeners from ipcRenderer will make Electron internals stop working
- IPC in Electron - Ray
Pitfall 3: Stale Closures in Zustand Actions Capturing Old Provider References
What goes wrong:
Zustand actions like fetchProjects() capture the FileSystemProvider in their closure at store creation time. After calling sshConnectionManager.getProvider() which returns a new SshFileSystemProvider, existing actions still reference the old LocalFileSystemProvider, causing operations to execute against the wrong context.
Why it happens:
JavaScript closures capture variables at definition time. When a Zustand action calls window.electronAPI.getProjects(), the IPC handler calls projectScanner.scan(), which calls this.fsProvider.readdir(). If the scanner's provider reference wasn't updated, it uses the stale provider.
How to avoid:
- Late binding via getter: Services should call
getProvider()on every operation, not cache at construction - Provider as parameter: Pass provider explicitly to service methods instead of storing as instance variable
- Re-initialize services: After provider swap, call
fileWatcher.setFileSystemProvider(newProvider)on ALL services - Service registry pattern: Centralize provider injection so swap happens in one place
- Verify in tests: Mock provider swap and assert service calls hit new provider
Warning signs:
- SSH connect succeeds but dashboard shows local projects
- File operations fail with "ENOENT" after provider swap
- Console shows "LocalFileSystemProvider" operations when SSH is connected
- User switches to SSH but sees local data until app restart
Phase to address: Phase 1 (Core Infrastructure) — provider injection architecture must be correct from the start.
Sources:
Pitfall 4: IPC Race Conditions During Context Switch with In-Flight Requests
What goes wrong: User clicks "Connect SSH" → IPC handler starts scanning remote projects → user quickly clicks "Disconnect" → scan completes and populates store with SSH data → store now shows SSH projects in local mode, creating data corruption.
Why it happens: Async IPC calls (getProjects, getSessions) don't track which context initiated them. By the time a response arrives, the context may have switched, but Zustand applies the stale response anyway.
How to avoid:
- Context ID stamping: Include
contextId(UUID per connection) in every IPC request/response - Response validation: Check
if (currentContextId !== response.contextId) returnbefore applying - Request cancellation: Track in-flight requests via AbortController and cancel on context switch
- State machine: Use FSM with states:
idle → connecting → connected → disconnecting → idle. Reject operations in wrong states. - Sequential transitions: Wait for disconnect cleanup to complete before starting connect
Warning signs:
- Dashboard briefly flickers between local and SSH data during transitions
- Error notifications appear for wrong context after switching
- Search results from previous context appear after switch
- Console shows "Cannot read property of null" after rapid connect/disconnect
Phase to address: Phase 1 (Core Infrastructure) — race condition prevention is non-negotiable for reliability.
Sources:
Pitfall 5: FileWatcher Polling Timer Not Cleared on SSH Disconnect
What goes wrong:
SSH mode uses setInterval() polling (5s) instead of fs.watch(). When user disconnects, stop() clears pollingTimer, but the cleanup happens in finally block — if an exception occurs, timer keeps running. After 10 switches, 10 polling timers consume CPU checking a disconnected SSH provider.
Why it happens:
Interval timers require explicit clearInterval(), and error paths may skip cleanup. The pollingTimer property is set to null without actually clearing the interval, causing orphaned timers.
How to avoid:
- Cleanup in finally: Always clear timers in
finallyblock, not just normal path - Defensive clearing: Before creating new timer, clear existing:
if (this.timer) clearInterval(this.timer) - Timer registry: Track all active timers in Set<NodeJS.Timeout> and clear all on dispose
- Polling refactor: Consider using a single interval that checks
isRemote()flag instead of separate modes - Memory profiling: Add setInterval/clearInterval tracking to detect orphaned timers
Warning signs:
- CPU usage increases by 5-10% per context switch
- Chrome DevTools → Performance shows multiple "pollForChanges" concurrent executions
- Memory usage grows 20MB per switch (timer closures hold provider references)
- App becomes sluggish after 5-10 SSH switches
Phase to address: Phase 1 (Core Infrastructure) — timer lifecycle bugs cause performance degradation.
Pitfall 6: SSH Connection Not Properly Disposed, Keeping Socket Open
What goes wrong:
SshConnectionManager.disconnect() calls client.end(), but if the SFTP channel is still processing operations, the socket remains open (TCP FIN not sent). After 10 context switches, 10 SSH connections consume file descriptors and remote server resources.
Why it happens:
ssh2 Client .end() is graceful (waits for channel closure), but .destroy() is immediate. If you don't wait for SFTP operations to complete before calling .end(), the connection may linger. The dispose() method swallows errors, hiding cleanup failures.
How to avoid:
- Force close on disconnect: Use
client.destroy()instead of.end()for immediate termination - SFTP channel close: Call
sftp.end()explicitly beforeclient.end() - Connection timeout: Set 5s timeout on disconnect — if client doesn't close, force destroy
- Socket tracking: Expose
net.Socketfrom Client and verifysocket.destroyed === truepost-cleanup - Resource monitoring: Log open file descriptors (lsof) to detect leaks
Warning signs:
lsof -p <pid>shows 10+ TCP connections to SSH host- SSH host reports "Max sessions exceeded" after many switches
netstatshows multiple ESTABLISHED connections to port 22- Disconnect takes 5+ seconds (waiting for timeout)
Phase to address: Phase 1 (Core Infrastructure) — connection leaks cause operational issues at scale.
Sources:
- Diagnosing and Fixing Memory Leaks in Electron Applications
- Viacheslav Eremin | Memory Leaks in Electron application
Pitfall 7: Tab State Restoration Without Context Validation
What goes wrong:
Snapshot captures open tabs with projectId: "abc-local-path". User switches to SSH where projects have different IDs. Restoration tries to open tab with local projectId against SSH provider → tab shows "Project not found" error or stale local data.
Why it happens:
ProjectId encoding includes file path (-Users-name-project), so local and SSH projects for the same logical directory have different IDs. Tab restoration doesn't validate whether a project exists in the new context.
How to avoid:
- Path normalization: Introduce
logicalProjectPath(e.g.,/home/user/project) separate from encoded ID - Cross-context mapping: Maintain
Map<logicalPath, {localId, sshId}>to translate IDs during restore - Graceful degradation: If project doesn't exist in new context, close tab or show empty state instead of error
- Restore validation: Before applying snapshot, call
window.electronAPI.getProjects()and filter tabs to existing projects - User notification: Show toast "3 tabs closed - projects not available in SSH mode"
Warning signs:
- User switches to SSH, tabs show "Failed to load session"
- Console errors "Project ID abc-local not found" after context switch
- Tab bar shows ghost tabs (visible but non-functional)
- Click on restored tab triggers navigation to dashboard instead
Phase to address: Phase 2 (State Persistence) — after Phase 1 establishes snapshots, validation ensures safe restoration.
Pitfall 8: Partial Snapshot Creates Inconsistent Derived State
What goes wrong:
Snapshot captures selectedProjectId, selectedSessionId, but omits sessionDetail, conversation, chunks. Restoration sets selections without corresponding data → store state says "session XYZ selected" but detail is null → UI renders empty or crashes on null access.
Why it happens: Developers manually list fields to snapshot instead of capturing full AppState. Derived state (detail, conversation) is computed from selections, but if selections restore before data loads, the UI sees inconsistent state.
How to avoid:
- Snapshot everything: Capture
AppStatewholesale, excluding only ephemeral UI flags (loading, error) - Atomic restoration: Use
setState(() => snapshot)so all fields update simultaneously - Lazy detail loading: On restoration, if
selectedSessionIdexists butsessionDetailis null, trigger background fetch - State versioning: Include
snapshotVersion: 1to handle schema changes (e.g., new fields added) - Whitelist vs blacklist: Easier to exclude ephemeral fields (
loading,error) than enumerate 50+ persistent fields
Warning signs:
- After switch, session viewer shows spinner indefinitely
- Console error "Cannot read property 'chunks' of null"
- Dashboard shows selected project but no sessions loaded
- Tab labels show "undefined" or "[object Object]"
Phase to address: Phase 1 (Core Infrastructure) — snapshot completeness is foundational to state restoration.
Sources:
Technical Debt Patterns
Shortcuts that seem reasonable but create long-term problems.
| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
|---|---|---|---|
| Storing provider as instance variable | Simple, no getter needed | Stale closure bugs on provider swap | Never — always use getter |
| Not tracking cleanup functions | Less boilerplate code | Memory leaks accumulate with usage | Never — cleanup is mandatory |
| Hard reset on context switch | Easy to implement (3 lines) | User loses workspace state | Never — destroys UX |
Using removeAllListeners() |
Clears everything at once | Breaks Electron IPC internals | Never — specify channel |
| Sync IPC for connect | Blocks until complete | UI freezes 2-5s on slow networks | Never — SSH is async |
| Single global DataCache | Shared cache across contexts | Cache pollution (local data in SSH mode) | Never — scope cache per context |
| Manual field-by-field snapshot | Fine-grained control | Fragile, breaks when state schema evolves | Only for specific performance tuning |
| Not versioning snapshots | Don't need migration code | Can't safely restore after app updates | Only in MVP — add versioning in Phase 2 |
Integration Gotchas
Common mistakes when connecting to external services.
| Integration | Common Mistake | Correct Approach |
|---|---|---|
| SSH2 Client | Using .end() without waiting for SFTP close |
Call sftp.end(), wait 100ms, then client.destroy() |
| SFTP operations | Not catching ENOENT on remote path access | Wrap all sftp.stat() in try/catch, validate paths exist |
| IPC context bridge | Passing large objects (10MB+ session JSON) | Stream via chunks or use temp file + file path |
| FileWatcher SSH polling | Polling every 1s (too aggressive) | 5s minimum — SSH has latency |
| Zustand subscriptions | Subscribing to full store on every render | Use selectors: useStore((s) => s.field, shallow) |
| React useEffect | Empty deps array with state access | Include all state in deps or use functional updates |
Performance Traps
Patterns that work at small scale but fail as usage grows.
| Trap | Symptoms | Prevention | When It Breaks |
|---|---|---|---|
| Cache not scoped by context | Search returns SSH results in local mode | Separate caches: localCache, sshCache |
After first context switch |
| Snapshot entire 50MB state | 500ms freeze during switch | Exclude non-serializable (functions, DOM refs) | 10+ open sessions |
| No debounce on provider swap events | 10 re-renders per switch | Debounce 100ms, batch provider updates | Rapid switching (QA testing) |
| Loading all tabs eagerly | UI freezes restoring 20 tabs | Virtual scrolling + lazy load tab content | 10+ open tabs |
| Not canceling in-flight IPC | Stale responses overwrite new data | AbortController per IPC call | Slow network (SSH over VPN) |
| Synchronous validation during snapshot | Blocks UI during switch | Validate asynchronously post-restoration | Large state (100+ projects) |
Security Mistakes
Domain-specific security issues beyond general web security.
| Mistake | Risk | Prevention |
|---|---|---|
| Storing SSH passwords in snapshot | Plaintext credentials in memory dump | Never snapshot credentials — only host/port/user |
| Not validating SSH host keys | MITM attacks | Use ssh2 hostVerifier callback, check known_hosts |
| Exposing SSH provider to renderer | Renderer can execute arbitrary commands | Keep provider in main process only, expose via IPC |
| Logging full AppState snapshots | Sensitive data in logs | Redact: password, privateKey, sessionContent |
| Not sanitizing project paths | Path traversal attacks | Validate paths with path.resolve(), reject ".." |
UX Pitfalls
Common user experience mistakes in this domain.
| Pitfall | User Impact | Better Approach |
|---|---|---|
| No visual feedback during switch | User doesn't know if click worked | Show connecting spinner + progress bar |
| Immediate switch without confirmation | Accidental clicks lose workspace | "Switch will close X tabs. Continue?" dialog |
| No indication of current context | User forgets if in local/SSH | Persistent badge in header: "Local" or "SSH: hostname" |
| Context switch closes all tabs | User loses 10+ open sessions | Restore tabs if projects exist in new context |
| No way to view both contexts | Can't compare local vs SSH data | Split view or dual panes (future enhancement) |
| Errors shown as raw exceptions | "ENOTFOUND" means nothing to user | Friendly: "Cannot connect to SSH host. Check network." |
"Looks Done But Isn't" Checklist
Things that appear complete but are missing critical pieces.
- Context switch — Looks like it works because SSH connects, but missing: cleanup old listeners, cancel in-flight IPC, validate restored tabs
- State snapshot — Appears to save state, but missing: exclude ephemeral fields, version snapshots, validate restorable
- Provider swap — FileWatcher updated, but missing: update ProjectScanner, SessionParser, ErrorDetector, NotificationManager
- Tab restoration — Tabs re-open, but missing: validate project exists, load session detail, restore scroll position
- IPC request tracking — Responses return, but missing: context ID validation, abort on disconnect, timeout handling
- Service dispose — dispose() method exists, but missing: remove EventEmitter listeners, clear timers, close connections
- Cache invalidation — Cache cleared on switch, but missing: scope cache per context, invalidate on reconnect
- Error handling — Try/catch present, but missing: rollback state on failure, show user-friendly message, log to sentry
Recovery Strategies
When pitfalls occur despite prevention, how to recover.
| Pitfall | Recovery Cost | Recovery Steps |
|---|---|---|
| Destructive switch lost state | MEDIUM | Phase 2: Add persistent storage (localStorage) to survive restarts |
| Listener leaks | LOW | Add dispose tracking, expose debug.listenerCount() IPC for diagnosis |
| Stale closure | HIGH | Refactor service architecture to inject provider per call (breaking change) |
| IPC race condition | MEDIUM | Add context ID retroactively, reject responses with mismatched ID |
| Timer leak | LOW | DevTools audit: setInterval/clearInterval coverage, add cleanup tests |
| SSH connection leak | LOW | Add connection pool with max limit, force destroy after 30s timeout |
| Tab validation missing | MEDIUM | Add migration: filter invalid tabs on app startup |
| Partial snapshot | HIGH | Ship hotfix: capture full state, deprecate manual field list |
Pitfall-to-Phase Mapping
How roadmap phases should address these pitfalls.
| Pitfall | Prevention Phase | Verification |
|---|---|---|
| Destructive context switching | Phase 1 | Unit test: switch → switch back → assert state restored |
| EventEmitter listener accumulation | Phase 1 | Memory profiler: assert RSS stable after 10 switches |
| Stale closures | Phase 1 | Integration test: mock provider swap, assert service calls new provider |
| IPC race conditions | Phase 1 | E2E test: rapid connect/disconnect, assert no stale data |
| FileWatcher polling timer | Phase 1 | Unit test: call stop(), assert pollingTimer cleared |
| SSH connection disposal | Phase 1 | Integration test: assert socket.destroyed after disconnect |
| Tab restoration validation | Phase 2 | E2E test: open tabs → switch SSH → assert only valid tabs restored |
| Partial snapshot | Phase 1 | Integration test: snapshot → restore → assert detail populated |
Sources
Official Documentation
Electron State Management
- Syncing State between Electron Contexts - Bruno Scheufler
- Advanced Electron.js architecture - LogRocket Blog
- Zutron: Streamlined Electron State Management
- Notes on Electron Processes, Context Isolation, and IPC
Memory Leaks and Cleanup
- Diagnosing and Fixing Memory Leaks in Electron Applications
- Viacheslav Eremin | Memory Leaks in Electron application
- Error: Removing all listeners from ipcRenderer
- IPC in Electron - Ray
React Stale Closures
- Be Aware of Stale Closures when Using React Hooks
- How to Fix "Stale Closure" Issues in React Hooks
- React Stale Closure: Common Problems and Easy Solutions
- Hooks, Dependencies and Stale Closures | TkDodo's blog
Zustand and State Management
- Zustand - react state management made easy
- Stores with React Context: Memory Leak Issue or Redux DevTools Bug?
- In an ESModule, will not cleaning up subscriber result in a memory leak?
VS Code Multi-Root Workspaces (Reference Architecture)
- Adopting Multi Root Workspace APIs
- Multi-Root Workspaces in VS Code
- Workspace Management: Multi-project VS Code Setup
Dependency Injection and Service Lifecycle
- Dependency injection guidelines - .NET
- Service lifetimes (dependency injection) - .NET
- TSyringe: Lightweight dependency injection for TypeScript
Pitfalls research for: Multi-context workspace switching in Electron + Zustand Researched: 2026-02-12