# Butterfly Desktop Environment — Progress

## Overview

A remote desktop environment with a Rust (Actix) backend, Angular 21 frontend, and Rust VM agent. The system mimics a traditional Windows-like desktop in the browser, receiving display/audio from VM agents with minimal lag. **Full remote control** — viewers can move the mouse, click, type, and scroll on the remote machine in real time.

## Architecture

```
┌─────────────┐     WebSocket      ┌──────────────────┐     WebSocket      ┌──────────────┐
│ Angular 21  │◄──────────────────►│ Rust Actix Server│◄──────────────────►│ VM Agent exe │
│ (Browser)   │   display frames   │ (REST + WS)      │   display frames   │ (Rust)       │
│ Viewer      │   HUD commands ▸   │ relay hub        │   HUD commands ▸   │ captures     │
│ controls    │                    │                  │                    │ screen +     │
│ remote      │                    │                  │                    │ injects      │
│ desktop     │                    │                  │                    │ input        │
└─────────────┘                    └──────────────────┘                    └──────────────┘
```

## Checklist

### Phase 1: Rust Backend ✅ (builds & runs)

- [x] `server/Cargo.toml` — Dependencies: actix-web 4, actix-ws 0.4, actix-cors, dashmap, parking_lot, serde, uuid, chrono
- [x] `server/src/main.rs` — Actix HTTP server with CORS, compression, static file serving, SPA fallback
- [x] `server/src/config.rs` — Env-based config (BUTTERFLY_HOST, BUTTERFLY_PORT, etc.)
- [x] `server/src/models.rs` — Session, AgentConnection, WsMessage enum (serde-tagged), ApiResponse, HealthInfo (with connected_viewers)
- [x] `server/src/state.rs` — AppState with DashMap sessions/agents/viewers/agent_channels, FrameBuffer ring buffer, broadcast/forward methods
- [x] `server/src/api/` — REST endpoints: GET/POST/DELETE /api/sessions, GET /api/health, POST /api/sessions/{id}/hud (wired to agent channel)
- [x] `server/src/ws/` — WebSocket handler: agent/viewer connect, per-instance mpsc channels, real bidirectional relay, viewer catch-up, heartbeat timeout
- [x] `server/src/stream/` — StreamStats tracker (frame count, byte relay, uptime)
- [x] `server/static/index.html` — Placeholder loading page
- [x] `cargo build` succeeds

### Phase 1.5: Backend Relay Fix ✅

- [x] Replaced stub `broadcast_to_viewers()` with real mpsc channel-based broadcast to all connected viewers
- [x] Replaced stub `forward_to_agent()` with real mpsc channel send to agent WS task
- [x] Added viewer registry (DashMap) per session
- [x] Added agent channel registry (DashMap) per session
- [x] New viewers receive the latest buffered frame immediately on connect
- [x] HUD command REST endpoint now forwards through agent channel
- [x] Refactored WS handler to use `tokio::select!` for multiplexed read/write with resettable idle timeout
- [x] Fixed `stats()` to count only active sessions (not all sessions)
- [x] Added `connected_viewers` to HealthInfo
- [x] Removed unused dependencies (rand, tokio-stream)

### Phase 2: Angular 21 Frontend ✅ (builds & serves)

- [x] Project scaffold with Angular CLI 21
- [x] Windows-like desktop shell (taskbar, start menu, window manager)
- [x] Remote display component (per-instance WebSocket, canvas frame rendering, FPS counter)
- [x] HUD overlay (mouse click/move/wheel, keyboard down/up forwarding)
- [x] Window Manager service (open, close, focus, minimize, maximize, drag, resize)
- [x] WebSocket service (typed message streams, heartbeat)
- [x] API service (health, sessions CRUD, HUD command forwarding)
- [x] Built-in apps: File Explorer, Terminal, Text Editor, Settings, Web Browser
- [x] Session picker dialog (create/connect to remote sessions)
- [x] Production build: 322KB total (84KB gzipped), output to `dist/browser/`
- [x] Dark theme with animated gradient desktop background

### Phase 2.5: Frontend Bug Fixes ✅

- [x] Taskbar clock now updates every second (was static computed signal)
- [x] Terminal auto-scrolls on output (added AfterViewChecked hook)
- [x] Remote display uses per-instance WebSocket (was shared singleton — broke multi-session)
- [x] Remote display canvas resizes with container via ResizeObserver
- [x] Browser iframe uses DomSanitizer for safe URL binding
- [x] Browser refresh uses key-based reload instead of URL hack
- [x] Removed unused imports (RouterModule, ViewChild, etc.)
- [x] HealthInfo interface updated with connected_viewers

### Phase 3: VM Agent Executable ✅

- [x] `agent/Cargo.toml` — Dependencies: scrap, enigo, tokio-tungstenite, image, base64, clap, reqwest, serde
- [x] `agent/src/protocol.rs` — AgentWsMessage enum matching server WsMessage, builder helpers
- [x] `agent/src/config.rs` — CLI args: --server, --session, --fps, --quality, --display, --audio, --heartbeat, --reconnect
- [x] `agent/src/capture.rs` — Screen capture via `scrap` (DXGI/X11/CoreGraphics), BGRA→RGB, JPEG encoding, base64, frame stats
- [x] `agent/src/input.rs` — **Full remote control**: mouse move/click/dblclick/scroll, keyboard with 60+ key mappings (browser code→enigo Key), modifier handling, key_type for strings
- [x] `agent/src/main.rs` — Entry point: auto session creation via REST, WebSocket connect, `tokio::select!` loop (capture + receive + heartbeat), auto-reconnect, graceful shutdown

#### Remote Control Commands Supported

| Command | Description | Params |
|---------|-------------|--------|
| `mouse_move` | Move cursor | `x`, `y` |
| `mouse_down` | Press button | `button` (0=left, 1=mid, 2=right) |
| `mouse_up` | Release button | `button` |
| `mouse_click` | Click button | `button` |
| `mouse_dblclick` | Double-click | `button` |
| `scroll` | Scroll wheel | `deltaX`, `deltaY` |
| `key_down` | Press key | `key`, `code`, `ctrl`, `shift`, `alt`, `meta` |
| `key_up` | Release key | `key`, `code`, `ctrl`, `shift`, `alt`, `meta` |
| `key_click` | Type key | `key`, `code` |
| `key_type` | Type string | `text` |

### Phase 4: Integration & Polish 🔲 (next)

- [ ] End-to-end testing (agent → server → browser)
- [ ] Audio capture and playback (cpal + Web Audio API)
- [ ] Authentication (JWT / API keys)
- [ ] Performance optimization (binary WS frames, delta encoding)
- [ ] Start menu search filtering
- [ ] Window snap/edge-docking
- [ ] Touch support for mobile viewers
- [ ] Clipboard forwarding
- [ ] Multi-monitor support

## Recent Commits

- `0961634` agent: main.rs — entry point, WS client, capture loop, input dispatch, auto-reconnect
- `e1e6442` agent: input.rs — full remote control with 60+ key mappings
- `4c93b47` agent: capture.rs — screen capture, BGRA→RGB, JPEG encoding, base64
- `5a26c7c` agent: config.rs — CLI args and configuration
- `56f6e88` agent: protocol.rs — WsMessage types matching server
- `50e5df0` agent: Cargo.toml — project dependencies
- `dd70696` api: add connected_viewers to HealthInfo interface
- `2344060` ws/handler: implement real bidirectional relay
- `29eda76` state: add viewer/agent channel registries
- `dcfaceb` desktop: production build works (328KB, 85KB gzip)
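
## Appendix: Command Decoding Sketch

As a reading aid for the Remote Control Commands table under Phase 3, here is a minimal, std-only sketch of how an agent might decode a command name and its params into a typed value before injecting input. It is not the code in `agent/src/input.rs`. The names `Param`, `Params`, `MouseButton`, `KeyEvent`, `HudCommand`, and `parse_command` are hypothetical, the `HashMap` stands in for whatever JSON value the real agent deserializes with serde, and integer coordinates and scroll deltas are an assumption.

```rust
use std::collections::HashMap;

/// Loosely typed parameter value (hypothetical stand-in for a parsed JSON value).
#[derive(Debug, Clone, PartialEq)]
enum Param {
    Int(i64),
    Text(String),
    Flag(bool),
}

type Params = HashMap<String, Param>;

/// Button numbering follows the table: 0=left, 1=mid, 2=right.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MouseButton {
    Left,
    Middle,
    Right,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct KeyEvent {
    key: String,
    code: String,
    ctrl: bool,
    shift: bool,
    alt: bool,
    meta: bool,
}

/// One variant per row of the command table.
#[derive(Debug, Clone, PartialEq, Eq)]
enum HudCommand {
    MouseMove { x: i64, y: i64 },
    MouseDown(MouseButton),
    MouseUp(MouseButton),
    MouseClick(MouseButton),
    MouseDblClick(MouseButton),
    Scroll { delta_x: i64, delta_y: i64 },
    KeyDown(KeyEvent),
    KeyUp(KeyEvent),
    KeyClick { key: String, code: String },
    KeyType { text: String },
}

fn int(p: &Params, name: &str) -> Option<i64> {
    match p.get(name)? {
        Param::Int(v) => Some(*v),
        _ => None,
    }
}

fn text(p: &Params, name: &str) -> Option<String> {
    match p.get(name)? {
        Param::Text(v) => Some(v.clone()),
        _ => None,
    }
}

/// Missing or non-boolean modifier flags are treated as "not pressed" (an assumption).
fn flag(p: &Params, name: &str) -> bool {
    matches!(p.get(name), Some(Param::Flag(true)))
}

fn button(p: &Params) -> Option<MouseButton> {
    match int(p, "button")? {
        0 => Some(MouseButton::Left),
        1 => Some(MouseButton::Middle),
        2 => Some(MouseButton::Right),
        _ => None,
    }
}

fn key_event(p: &Params) -> Option<KeyEvent> {
    Some(KeyEvent {
        key: text(p, "key")?,
        code: text(p, "code")?,
        ctrl: flag(p, "ctrl"),
        shift: flag(p, "shift"),
        alt: flag(p, "alt"),
        meta: flag(p, "meta"),
    })
}

/// Returns `None` for unknown commands or missing/mistyped required params.
fn parse_command(name: &str, p: &Params) -> Option<HudCommand> {
    Some(match name {
        "mouse_move" => HudCommand::MouseMove { x: int(p, "x")?, y: int(p, "y")? },
        "mouse_down" => HudCommand::MouseDown(button(p)?),
        "mouse_up" => HudCommand::MouseUp(button(p)?),
        "mouse_click" => HudCommand::MouseClick(button(p)?),
        "mouse_dblclick" => HudCommand::MouseDblClick(button(p)?),
        "scroll" => HudCommand::Scroll {
            delta_x: int(p, "deltaX")?,
            delta_y: int(p, "deltaY")?,
        },
        "key_down" => HudCommand::KeyDown(key_event(p)?),
        "key_up" => HudCommand::KeyUp(key_event(p)?),
        "key_click" => HudCommand::KeyClick { key: text(p, "key")?, code: text(p, "code")? },
        "key_type" => HudCommand::KeyType { text: text(p, "text")? },
        _ => return None,
    })
}
```

Decoding into an enum before touching the input layer keeps malformed or unknown commands from reaching the injection code: anything that fails validation yields `None` and can be dropped or logged. For example, `parse_command("mouse_down", &params)` with `button` set to `Param::Int(2)` yields `Some(HudCommand::MouseDown(MouseButton::Right))`.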