Problem Statement
VNC streaming in local/self-hosted deployments has several critical limitations that prevent users from effectively monitoring and controlling browser automation:
1. No Live View in local mode
The VNC streaming infrastructure is designed for cloud mode only, where each browser runs on a dedicated VM with its own IP address. In self-hosted Docker deployments, users see "Starting the stream..." indefinitely because no VNC server is configured for local browsers.
2. No multi-session support
When running multiple workflows in parallel, there's no way to view individual browser sessions. The single shared X display (:99) means only one browser can be viewed at a time, and there's no mechanism to isolate different workflow streams.
3. Take Control doesn't persist
When a user clicks "Take Control" to manually interact with the browser, this state is lost whenever the VNC WebSocket connection briefly disconnects (which happens frequently during normal operation). Users must repeatedly click "Take Control" to regain manual control.
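One way to picture the fix for item 3 is to key the control state on the workflow run rather than on the WebSocket connection, so a reconnect reads the stored value instead of starting over. The following is a minimal sketch under that assumption; `ControlStateStore`, `VncConnection`, and the `workflow_run_id` values are hypothetical names for illustration, not Skyvern's actual classes.

```python
from dataclasses import dataclass, field
from typing import Dict, Literal

Controller = Literal["agent", "user"]


@dataclass
class ControlStateStore:
    """Hypothetical store: remembers who controls each workflow run's browser."""

    _states: Dict[str, Controller] = field(default_factory=dict)

    def get(self, workflow_run_id: str) -> Controller:
        # Runs nobody has taken over default to agent control.
        return self._states.get(workflow_run_id, "agent")

    def set(self, workflow_run_id: str, controller: Controller) -> None:
        self._states[workflow_run_id] = controller

    def clear(self, workflow_run_id: str) -> None:
        # Called when the workflow run finishes, not when a socket drops.
        self._states.pop(workflow_run_id, None)


class VncConnection:
    """Hypothetical per-WebSocket object that holds no control state itself."""

    def __init__(self, workflow_run_id: str, store: ControlStateStore) -> None:
        self.workflow_run_id = workflow_run_id
        self.store = store

    @property
    def controller(self) -> Controller:
        # Always read from the shared store, so a new connection
        # sees whatever the previous connection left behind.
        return self.store.get(self.workflow_run_id)

    def take_control(self) -> None:
        self.store.set(self.workflow_run_id, "user")

    def release_control(self) -> None:
        self.store.set(self.workflow_run_id, "agent")


store = ControlStateStore()
first = VncConnection("wr_1", store)
first.take_control()
del first  # simulate the WebSocket dropping

reconnected = VncConnection("wr_1", store)
print(reconnected.controller)
```

Because the connection object only reads from the store, dropping and recreating it does not reset the state; only an explicit release or the end of the run does.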
Current Behavior
- Live view shows "Starting the stream..." indefinitely in Docker deployments
- Multiple parallel workflows cannot be viewed individually
- "Take Control" state resets to "agent" after every VNC reconnection
- Only cloud deployments with dedicated browser VMs have working VNC streaming
Expected Behavior
- Each workflow run should have its own independent VNC stream
- Users should be able to view any running workflow's browser in real-time from the UI
- "Take Control" state should persist across VNC reconnections
- Multiple workflows should run in parallel without display conflicts
- Self-hosted Docker deployments should have the same VNC capabilities as cloud deployments
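The first and fourth expectations above imply that each workflow run gets its own X display and VNC port instead of sharing `:99`. The sketch below shows one possible allocation scheme; `DisplayAllocator`, `DisplaySlot`, the starting display number 100, the base port 5900, and the session limit are all assumptions made for illustration, not part of Skyvern.

```python
import threading
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DisplaySlot:
    display: str   # X display name, e.g. ":100"
    vnc_port: int  # TCP port a VNC server for this display would listen on


class DisplayAllocator:
    """Hypothetical allocator handing each workflow run a unique display and port."""

    def __init__(self, first_display: int = 100, max_sessions: int = 10,
                 base_port: int = 5900) -> None:
        self._first_display = first_display
        self._max_sessions = max_sessions
        self._base_port = base_port
        self._by_run: Dict[str, int] = {}  # workflow_run_id -> slot index
        self._lock = threading.Lock()

    def _slot(self, index: int) -> DisplaySlot:
        return DisplaySlot(
            display=f":{self._first_display + index}",
            vnc_port=self._base_port + index,
        )

    def acquire(self, workflow_run_id: str) -> DisplaySlot:
        with self._lock:
            # Idempotent: asking again for the same run returns the same slot.
            if workflow_run_id in self._by_run:
                return self._slot(self._by_run[workflow_run_id])
            used = set(self._by_run.values())
            for index in range(self._max_sessions):
                if index not in used:
                    self._by_run[workflow_run_id] = index
                    return self._slot(index)
            raise RuntimeError("no free display slots")

    def release(self, workflow_run_id: str) -> None:
        with self._lock:
            self._by_run.pop(workflow_run_id, None)


allocator = DisplayAllocator(max_sessions=2)
print(allocator.acquire("wr_1"))
print(allocator.acquire("wr_2"))
```

Starting the X server and VNC server for each slot is left out here; the point is only that parallel runs never share a display or a port, and that a finished run returns its slot for reuse.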
Use Cases
- Development & debugging: Developers need to watch browser automation in real-time to debug issues
- Manual intervention: Users need to take control of workflows to handle CAPTCHAs or unexpected prompts
- Parallel execution: Running multiple workflows simultaneously while monitoring each one independently
- Persistent sessions: Maintaining browser sessions across multiple sequential tasks
Related Issues
This addresses user requests and problems reported in:
Environment
- Deployment: Self-hosted Docker / docker-compose
- Skyvern version: 0.2.x through current