projects/progress.md

6.8 KiB

Butterfly Desktop Environment — Progress

Overview

A remote desktop environment with a Rust (Actix) backend, Angular 21 frontend, and Rust VM agent. The system mimics a traditional Windows-like desktop in the browser, receiving display/audio from VM agents with minimal lag. Full remote control — viewers can move the mouse, click, type, and scroll on the remote machine in real time.

Low-latency pipeline: H.264 hardware-accelerated encoding (openh264), binary WebSocket frames (no JSON/base64 overhead), WebCodes GPU-accelerated decoding in the browser. Target: 5-15ms end-to-end latency on LAN for gaming.

Architecture

┌─────────────┐   Binary WS frames   ┌──────────────────┐   Binary WS frames   ┌─────────────┐
│  Angular 21 │◄────────────────────►│  Rust Actix Server│◄────────────────────►│ VM Agent exe │
│  (Browser)  │   H.264/JPEG data    │  (dumb pipe)      │   H.264/JPEG data    │  (Rust)      │
│  WebCodes   │   JSON text only ◄──►│  zero-copy relay  │   JSON text only ◄──►│  openh264    │
│  GPU decode │   (HUD, heartbeat)   │                    │   (HUD, heartbeat)   │  BGRA→YUV420 │
└─────────────┘                      └──────────────────┘                      └─────────────┘

Wire protocol:
  Binary WS frame = [1B type][4B timestamp][4B width][4B height][payload...]
  Text WS frame   = {"msg_type": "...", ...}  (JSON control messages)

Checklist

Phase 1: Rust Backend

  • Actix HTTP server, REST API, WebSocket handler, frame relay

Phase 2: Angular 21 Frontend

  • Windows-like desktop shell, taskbar, start menu, window manager
  • Built-in apps: File Explorer, Terminal, Text Editor, Settings, Browser
  • Session picker, API/WebSocket services, dark theme

Phase 3: VM Agent Executable

  • Screen capture (scrap), input injection (enigo), auto-reconnect

Phase 3.6: System Service Installation

CLI Subcommands

  • agent/src/cli.rs — Subcommand-based CLI (run, service install/uninstall/start/stop/status/restart)
  • agent/src/config.rs — Renamed AgentConfigRunOptions, added Clone, to_service_args(), hidden --windows-service flag

Service Management (Linux systemd)

  • agent/src/service.rs — systemd unit file generation with proper ExecStart, restart policy, env capture
  • Auto-detects display environment variables (DISPLAY, WAYLAND_DISPLAY, XDG_RUNTIME_DIR, DBUS_SESSION_BUS_ADDRESS)
  • systemctl enable --now on install, daemon-reload after changes
  • Configurable user (--user), working directory (--working-directory), log file (--log-file)

Service Management (Windows)

  • agent/src/service.rs — Windows Service registration via sc.exe (create, start, stop, delete, query)
  • Auto-start on boot, failure recovery (restart after 5s, reset after 1 day)
  • agent/src/main.rs — Full Windows Service runtime via windows-service crate (SCM dispatcher, service control handler, SERVICE_RUNNING/STOPPED state reporting)

Graceful Shutdown

  • tokio::signal::ctrl_c() handler in foreground mode
  • tokio::sync::watch shutdown channel propagated through reconnect loop
  • WebSocket close frame sent on shutdown
  • Windows SCM STOP/SHUTDOWN events trigger clean shutdown
  • Reconnect loop checks shutdown before attempting reconnection

Dependencies

  • agent/Cargo.tomlwindows-service 0.7 (cfg windows), clap env feature, base64 0.22

Phase 3.5: Low-Latency Video Pipeline

Agent (H.264 + binary frames)

  • agent/src/protocol.rs — Binary frame format (13-byte header: type + timestamp + width + height + payload)
  • agent/src/encoder.rs — H.264 encoder (openh264, optional feature), JPEG fallback, BGRA→I420 conversion
  • agent/src/capture.rs — Raw BGRA output (encoding moved to encoder)
  • agent/src/config.rs--encoder h264|jpeg flag, default 60fps
  • agent/src/main.rs — Binary WS frames for video, JSON text for control, capture+encode loop
  • agent/Cargo.toml — openh264 optional dep, cfg_if, release optimizations (LTO, codegen-units=1)

Server (zero-copy binary relay)

  • server/src/state.rs — Binary FrameBuffer (Vec<Vec>), WsOutMessage enum (Binary|Text), broadcast_binary_frame
  • server/src/ws/handler.rs — Binary frames from agent → broadcast to viewers (zero-copy); text frames for JSON control; viewer catch-up with latest binary frame

Frontend (WebCodes H.264 + JPEG fallback)

  • WebCodes VideoDecoder for H.264 GPU-accelerated decoding
  • Binary WebSocket frame parsing (13-byte header)
  • Annex-B NAL unit parsing, SPS/PPS extraction, AVCC description builder
  • Automatic codec detection from SPS (profile/level guessing)
  • JPEG fallback when H.264 unavailable
  • HUD input forwarding unchanged (JSON text frames)

Latency Comparison

Stage Old (JPEG+JSON) New (H.264+Binary)
Encode 15-30ms (CPU) 1-5ms (openh264)
Frame size 200-500KB 10-50KB
Network 2-5ms 0.5-1ms
Decode 3-5ms 1-2ms (GPU)
Total 25-45ms ~5-15ms

Recent Commits

  • 16f3f7a fix: CLI dispatch, clap env feature, base64 dep, stray attribute cleanup
  • fdd1dcb fix: resolve compilation errors from crate version mismatches (openh264, enigo, image APIs)
  • 84f559c agent: main.rs — subcommand CLI dispatch, shutdown signal (Ctrl+C + Windows SCM)
  • d0e8bf5 agent: service.rs — Linux systemd + Windows service management
  • 920ea0e agent: cli.rs — subcommand CLI (run, service install/uninstall/start/stop/status/restart)
  • 7a44809 agent: config.rs — rename AgentConfig to RunOptions, add Clone, windows_service field
  • eae74e9 agent: Cargo.toml — add windows-service 0.7 dependency
  • 60b23bc fix: Uint8Array to Blob cast for TS compatibility
  • 63e4513 frontend: WebCodes H.264 decoder, binary WS frames, AVCC description builder
  • 05cfe9e server: binary frame relay (zero-copy), text JSON for control
  • 31a862b server: binary FrameBuffer, WsOutMessage enum
  • 081cb0d agent: Cargo.toml v0.2.0 — openh264 optional feature
  • 86f0e4e agent: main.rs — binary WS frames, encoder pipeline
  • b7c254a agent: encoder.rs — H.264 + JPEG encoder abstraction
  • cf617d0 agent: capture.rs — raw BGRA output
  • b690b07 agent: config.rs — --encoder h264|jpeg flag
  • a97ebed agent: protocol.rs — binary video frame format
  • 1468097 docs: Phase 3 VM Agent complete
  • e1e6442 agent: input.rs — full remote control