Component
data-management/viewer/ (frontend + backend)
Problem Statement
The dataviewer's episode viewer currently renders a single camera stream per episode. Bimanual VLA policies (TwinVLA, π₀, RDT-1B) require up to three camera views during training and inference:
- Front/ego-centric camera — workspace overview
- Left wrist camera — left arm end-effector view
- Right wrist camera — right arm end-effector view
RoboTwin 2.0 datasets include front_image, wrist_image_left, and wrist_image_right per episode step. LeRobot datasets may include additional camera keys. Without multi-camera visualization, operators cannot:
- Verify camera coverage and alignment across views
- Inspect occlusion or lighting issues in individual camera streams
- Correlate spatial relationships between arm-mounted and workspace cameras
- Quality-check the exact visual inputs the VLA model receives during training
Proposed Solution
Add multi-camera display support to the episode viewer:
- Auto-detect camera keys from LeRobot dataset metadata (scan observation.images.* keys)
- Grid layout — display all camera views simultaneously in a responsive grid (1×1 for single camera, 1×3 for three cameras, 2×2 for four, etc.)
- Camera selector — allow toggling individual cameras on/off via a camera panel
- Synchronized scrubbing — all camera views stay frame-synchronized when scrubbing the timeline
- Per-camera zoom — click a camera view to expand it to full width while keeping others visible as thumbnails
- Camera labels — display the LeRobot image key name as an overlay on each view
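The auto-detect and grid-layout steps above can be sketched as follows. This is a minimal illustration, not the viewer's actual code; the function names and the shape of the `info` dict passed in are assumptions based on the LeRobot info.json layout described in the Technical Notes.

```python
import math

def detect_camera_keys(info: dict) -> list[str]:
    """Return every observation.images.* feature key from a parsed info.json dict.

    Assumes camera features are listed under info["features"], as noted below
    for LeRobot datasets.
    """
    prefix = "observation.images."
    return sorted(k for k in info.get("features", {}) if k.startswith(prefix))

def grid_shape(n_cameras: int) -> tuple[int, int]:
    """Pick a (rows, cols) grid: 1x1 for one view, 1x3 for three, 2x2 for four."""
    if n_cameras <= 3:
        return (1, max(n_cameras, 1))
    cols = math.ceil(math.sqrt(n_cameras))
    return (math.ceil(n_cameras / cols), cols)

# Example: a RoboTwin 2.0-style feature set with three cameras plus state.
info = {
    "features": {
        "observation.images.front_image": {},
        "observation.images.wrist_image_left": {},
        "observation.images.wrist_image_right": {},
        "observation.state": {},
    }
}
keys = detect_camera_keys(info)  # three observation.images.* keys, state excluded
rows, cols = grid_shape(len(keys))  # (1, 3) for three cameras
```

The grid heuristic mirrors the layout rule in the bullet above (1×1, 1×3, 2×2); the frontend would map the resulting shape onto CSS grid tracks.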
Technical Notes
- LeRobot v3.0 stores images as MP4 videos under videos/{camera_key}_episode_{idx}.mp4
- Camera keys are defined in the dataset's info.json under features.observation.images
- The backend already streams video frames; the change is primarily frontend layout and state management
- Consider using CSS grid with auto-fit for responsive layout
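Given the storage layout in the first note, the backend can resolve one MP4 per camera for a given episode. A minimal sketch, assuming a dataset root directory and the videos/{camera_key}_episode_{idx}.mp4 template quoted above (the helper name and `root` parameter are hypothetical):

```python
from pathlib import Path

def episode_video_paths(
    root: Path, camera_keys: list[str], episode_idx: int
) -> dict[str, Path]:
    """Map each camera key to its per-episode MP4 under videos/.

    Follows the videos/{camera_key}_episode_{idx}.mp4 layout noted above
    for LeRobot v3.0 datasets.
    """
    return {
        key: root / "videos" / f"{key}_episode_{episode_idx}.mp4"
        for key in camera_keys
    }

paths = episode_video_paths(
    Path("data"), ["front_image", "wrist_image_left", "wrist_image_right"], 12
)
# e.g. paths["front_image"] is data/videos/front_image_episode_12.mp4
```

Since the backend already streams frames, the frontend only needs one stream handle per detected camera key and a shared frame index so scrubbing stays synchronized across views.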
Acceptance Criteria
Context
- training/vla/ (branch feat/vla-twinvla-robotwin)
- evaluation/sil/bimanual_robot_types.py