Butterfly Desktop Environment — Progress

Overview

A remote desktop environment with a Rust (Actix) backend, an Angular 21 frontend, and a Rust VM agent. The system mimics a traditional Windows-like desktop in the browser and receives display/audio from VM agents with minimal lag. Viewers have full remote control: they can move the mouse, click, type, and scroll on the remote machine in real time.

Architecture

┌─────────────┐     WebSocket      ┌──────────────────┐     WebSocket      ┌─────────────┐
│  Angular 21 │◄──────────────────►│  Rust Actix Server│◄──────────────────►│ VM Agent exe │
│  (Browser)  │   display frames   │  (REST + WS)      │   display frames   │  (Rust)      │
│  Viewer     │   HUD commands ▸   │  relay hub        │   HUD commands ▸   │  captures    │
│  controls   │                    │                    │                    │  screen +    │
│  remote     │                    │                    │                    │  injects     │
│  desktop    │                    │                    │                    │  input       │
└─────────────┘                    └──────────────────┘                    └─────────────┘

Checklist

Phase 1: Rust Backend (builds & runs)

  • server/Cargo.toml — Dependencies: actix-web 4, actix-ws 0.4, actix-cors, dashmap, parking_lot, serde, uuid, chrono
  • server/src/main.rs — Actix HTTP server with CORS, compression, static file serving, SPA fallback
  • server/src/config.rs — Env-based config (BUTTERFLY_HOST, BUTTERFLY_PORT, etc.)
  • server/src/models.rs — Session, AgentConnection, WsMessage enum (serde-tagged), ApiResponse, HealthInfo (with connected_viewers)
  • server/src/state.rs — AppState with DashMap sessions/agents/viewers/agent_channels, FrameBuffer ring buffer, broadcast/forward methods
  • server/src/api/ — REST endpoints: GET/POST/DELETE /api/sessions, GET /api/health, POST /api/sessions/{id}/hud (wired to agent channel)
  • server/src/ws/ — WebSocket handler: agent/viewer connect, per-instance mpsc channels, real bidirectional relay, viewer catch-up, heartbeat timeout
  • server/src/stream/ — StreamStats tracker (frame count, byte relay, uptime)
  • server/static/index.html — Placeholder loading page
  • cargo build succeeds
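
The env-based configuration listed for server/src/config.rs can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the `Config` struct, its field names, and the default values (`127.0.0.1`, `8080`) are assumptions. Only the variable names BUTTERFLY_HOST and BUTTERFLY_PORT come from the checklist above.

```rust
use std::env;

/// Hypothetical server settings; field names and defaults are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub host: String,
    pub port: u16,
}

impl Config {
    /// Build a config from any key -> value lookup, falling back to the
    /// assumed defaults when a variable is missing or cannot be parsed.
    pub fn from_lookup<F>(lookup: F) -> Self
    where
        F: Fn(&str) -> Option<String>,
    {
        let host = lookup("BUTTERFLY_HOST").unwrap_or_else(|| "127.0.0.1".to_string());
        let port = lookup("BUTTERFLY_PORT")
            .and_then(|raw| raw.trim().parse::<u16>().ok())
            .unwrap_or(8080);
        Config { host, port }
    }

    /// Read the settings from the real process environment.
    pub fn from_env() -> Self {
        Self::from_lookup(|key| env::var(key).ok())
    }
}
```

Taking the lookup as a closure keeps the parsing logic testable without mutating the process environment; `Config::from_env()` is the entry point a server would call at startup.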

Phase 1.5: Backend Relay Fix

  • Replaced stub broadcast_to_viewers() with real mpsc channel-based broadcast to all connected viewers
  • Replaced stub forward_to_agent() with real mpsc channel send to agent WS task
  • Added viewer registry (DashMap<String, Vec>) per session
  • Added agent channel registry (DashMap<String, Mutex>) per session
  • New viewers receive the latest buffered frame immediately on connect
  • HUD command REST endpoint now forwards through agent channel
  • Refactored WS handler to use tokio::select! for multiplexed read/write with resettable idle timeout
  • Fixed stats() to count only active sessions (not all sessions)
  • Added connected_viewers to HealthInfo
  • Removed unused dependencies (rand, tokio-stream)
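
The relay behaviour described in this phase (broadcast to all viewers, forward to the agent, catch-up for new viewers) can be sketched as follows. This is a simplified, single-threaded stand-in under stated assumptions: it uses `std::sync::mpsc` and a plain `HashMap` where the server uses async mpsc channels and DashMap, and the `Relay`, `Session`, and `FrameBuffer` layouts, the buffer capacity of 4, and all method signatures are hypothetical.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::mpsc::{channel, Receiver, Sender};

/// Bounded buffer that keeps only the most recent frames.
pub struct FrameBuffer {
    frames: VecDeque<Vec<u8>>,
    capacity: usize,
}

impl FrameBuffer {
    pub fn new(capacity: usize) -> Self {
        FrameBuffer { frames: VecDeque::with_capacity(capacity), capacity: capacity.max(1) }
    }

    pub fn push(&mut self, frame: Vec<u8>) {
        if self.frames.len() == self.capacity {
            self.frames.pop_front(); // drop the oldest frame
        }
        self.frames.push_back(frame);
    }

    pub fn latest(&self) -> Option<&Vec<u8>> {
        self.frames.back()
    }
}

/// Per-session relay state (hypothetical layout).
pub struct Session {
    viewers: Vec<Sender<Vec<u8>>>,
    agent: Option<Sender<String>>,
    buffer: FrameBuffer,
}

#[derive(Default)]
pub struct Relay {
    sessions: HashMap<String, Session>,
}

impl Relay {
    fn session(&mut self, id: &str) -> &mut Session {
        self.sessions.entry(id.to_string()).or_insert_with(|| Session {
            viewers: Vec::new(),
            agent: None,
            buffer: FrameBuffer::new(4), // capacity is an arbitrary choice
        })
    }

    /// Register the agent's command channel and return its receiving end.
    pub fn register_agent(&mut self, id: &str) -> Receiver<String> {
        let (tx, rx) = channel();
        self.session(id).agent = Some(tx);
        rx
    }

    /// Add a viewer; it immediately receives the latest buffered frame, if any.
    pub fn add_viewer(&mut self, id: &str) -> Receiver<Vec<u8>> {
        let (tx, rx) = channel();
        let session = self.session(id);
        if let Some(frame) = session.buffer.latest() {
            let _ = tx.send(frame.clone());
        }
        session.viewers.push(tx);
        rx
    }

    /// Buffer the frame, send it to every viewer, and prune disconnected ones.
    /// Returns the number of viewers that are still connected.
    pub fn broadcast_to_viewers(&mut self, id: &str, frame: Vec<u8>) -> usize {
        let session = self.session(id);
        session.buffer.push(frame.clone());
        session.viewers.retain(|tx| tx.send(frame.clone()).is_ok());
        session.viewers.len()
    }

    /// Forward a HUD command to the session's agent; false if none is connected.
    pub fn forward_to_agent(&self, id: &str, command: &str) -> bool {
        self.sessions
            .get(id)
            .and_then(|s| s.agent.as_ref())
            .map(|tx| tx.send(command.to_string()).is_ok())
            .unwrap_or(false)
    }
}
```

Pruning senders whose receiver has gone away during the broadcast keeps the viewer list from growing as browsers disconnect, without needing a separate cleanup pass.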

Phase 2: Angular 21 Frontend (builds & serves)

  • Project scaffold with Angular CLI 21
  • Windows-like desktop shell (taskbar, start menu, window manager)
  • Remote display component (per-instance WebSocket, canvas frame rendering, FPS counter)
  • HUD overlay (mouse click/move/wheel, keyboard down/up forwarding)
  • Window Manager service (open, close, focus, minimize, maximize, drag, resize)
  • WebSocket service (typed message streams, heartbeat)
  • API service (health, sessions CRUD, HUD command forwarding)
  • Built-in apps: File Explorer, Terminal, Text Editor, Settings, Web Browser
  • Session picker dialog (create/connect to remote sessions)
  • Production build: 322KB total (84KB gzipped), output to dist/browser/
  • Dark theme with animated gradient desktop background

Phase 2.5: Frontend Bug Fixes

  • Taskbar clock now updates every second (was static computed signal)
  • Terminal auto-scrolls on output (added AfterViewChecked hook)
  • Remote display uses per-instance WebSocket (was shared singleton — broke multi-session)
  • Remote display canvas resizes with container via ResizeObserver
  • Browser iframe uses DomSanitizer for safe URL binding
  • Browser refresh uses key-based reload instead of URL hack
  • Removed unused imports (RouterModule, ViewChild, etc.)
  • HealthInfo interface updated with connected_viewers

Phase 3: VM Agent Executable

  • agent/Cargo.toml — Dependencies: scrap, enigo, tokio-tungstenite, image, base64, clap, reqwest, serde
  • agent/src/protocol.rs — AgentWsMessage enum matching server WsMessage, builder helpers
  • agent/src/config.rs — CLI args: --server, --session, --fps, --quality, --display, --audio, --heartbeat, --reconnect
  • agent/src/capture.rs — Screen capture via scrap (DXGI/X11/CoreGraphics), BGRA→RGB, JPEG encoding, base64, frame stats
  • agent/src/input.rs — Full remote control: mouse move/click/dblclick/scroll, keyboard with 60+ key mappings (browser code→enigo Key), modifier handling, key_type for strings
  • agent/src/main.rs — Entry point: auto session creation via REST, WebSocket connect, tokio::select! loop (capture + receive + heartbeat), auto-reconnect, graceful shutdown
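
The BGRA→RGB step mentioned for agent/src/capture.rs can be illustrated with a small std-only function. This is a sketch under assumptions, not the agent's actual code: the function name, its signature, and the explicit `stride` parameter (bytes per source row, since capture backends may pad rows) are hypothetical. JPEG encoding and base64 are left out because they depend on external crates.

```rust
/// Convert a BGRA buffer to tightly packed RGB, dropping the alpha channel.
/// `stride` is the number of bytes per source row, which may exceed
/// `width * 4` when rows are padded. Returns None if the buffer is too small.
pub fn bgra_to_rgb(bgra: &[u8], width: usize, height: usize, stride: usize) -> Option<Vec<u8>> {
    let row_bytes = width.checked_mul(4)?;
    if stride < row_bytes {
        return None;
    }
    // The final row does not need trailing padding.
    let needed = match height {
        0 => 0,
        h => (h - 1).checked_mul(stride)?.checked_add(row_bytes)?,
    };
    if bgra.len() < needed {
        return None;
    }

    let mut rgb = Vec::with_capacity(width * height * 3);
    for y in 0..height {
        let start = y * stride;
        let row = &bgra[start..start + row_bytes];
        for px in row.chunks_exact(4) {
            // Source order is B, G, R, A; emit R, G, B.
            rgb.extend_from_slice(&[px[2], px[1], px[0]]);
        }
    }
    Some(rgb)
}
```

Validating the buffer length up front means the inner loop can index rows directly, and a malformed frame yields `None` rather than a panic in the capture loop.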

Remote Control Commands Supported

| Command | Description | Params |
|---|---|---|
| mouse_move | Move cursor | x, y |
| mouse_down | Press button | button (0=left, 1=mid, 2=right) |
| mouse_up | Release button | button |
| mouse_click | Click button | button |
| mouse_dblclick | Double-click | button |
| scroll | Scroll wheel | deltaX, deltaY |
| key_down | Press key | key, code, ctrl, shift, alt, meta |
| key_up | Release key | key, code, ctrl, shift, alt, meta |
| key_click | Type key | key, code |
| key_type | Type string | text |
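
One plausible way to model the mouse and text commands above is an enum that expands compound commands into primitive actions. This is a hedged sketch: the `Command` and `Action` types and the `expand` function are hypothetical, the expansion of mouse_click into press plus release is an assumption about agent behaviour, and the key_down/key_up/key_click commands are omitted for brevity. Only the button codes (0=left, 1=mid, 2=right) are taken from the table.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MouseButton {
    Left,
    Middle,
    Right,
}

impl MouseButton {
    /// Map the numeric button code from the table (0=left, 1=mid, 2=right).
    pub fn from_code(code: u8) -> Option<Self> {
        match code {
            0 => Some(MouseButton::Left),
            1 => Some(MouseButton::Middle),
            2 => Some(MouseButton::Right),
            _ => None,
        }
    }
}

/// A subset of the remote control commands (hypothetical representation).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Command {
    MouseMove { x: i32, y: i32 },
    MouseDown(MouseButton),
    MouseUp(MouseButton),
    MouseClick(MouseButton),
    MouseDblClick(MouseButton),
    Scroll { delta_x: i32, delta_y: i32 },
    KeyType(String),
}

/// Primitive actions an input backend would perform (hypothetical).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Action {
    MoveTo(i32, i32),
    Press(MouseButton),
    Release(MouseButton),
    Wheel(i32, i32),
    Text(String),
}

/// Expand a command into the primitive actions it implies.
pub fn expand(cmd: &Command) -> Vec<Action> {
    match cmd {
        Command::MouseMove { x, y } => vec![Action::MoveTo(*x, *y)],
        Command::MouseDown(b) => vec![Action::Press(*b)],
        Command::MouseUp(b) => vec![Action::Release(*b)],
        Command::MouseClick(b) => vec![Action::Press(*b), Action::Release(*b)],
        Command::MouseDblClick(b) => vec![
            Action::Press(*b),
            Action::Release(*b),
            Action::Press(*b),
            Action::Release(*b),
        ],
        Command::Scroll { delta_x, delta_y } => vec![Action::Wheel(*delta_x, *delta_y)],
        Command::KeyType(text) => vec![Action::Text(text.clone())],
    }
}
```

Separating the command model from the input backend makes the dispatch logic checkable without injecting real input events on the machine running the tests.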

Phase 4: Integration & Polish 🔲 (next)

  • End-to-end testing (agent → server → browser)
  • Audio capture and playback (cpal + Web Audio API)
  • Authentication (JWT / API keys)
  • Performance optimization (binary WS frames, delta encoding)
  • Start menu search filtering
  • Window snap/edge-docking
  • Touch support for mobile viewers
  • Clipboard forwarding
  • Multi-monitor support

Recent Commits

  • 0961634 agent: main.rs — entry point, WS client, capture loop, input dispatch, auto-reconnect
  • e1e6442 agent: input.rs — full remote control with 60+ key mappings
  • 4c93b47 agent: capture.rs — screen capture, BGRA→RGB, JPEG encoding, base64
  • 5a26c7c agent: config.rs — CLI args and configuration
  • 56f6e88 agent: protocol.rs — WsMessage types matching server
  • 50e5df0 agent: Cargo.toml — project dependencies
  • dd70696 api: add connected_viewers to HealthInfo interface
  • 2344060 ws/handler: implement real bidirectional relay
  • 29eda76 state: add viewer/agent channel registries
  • dcfaceb desktop: production build works (328KB, 85KB gzip)