feat(rollout): decouple policy deployment from data recording with new `lerobot-rollout` CLI by imstevenpmwork · Pull Request #3413 · huggingface/lerobot

imstevenpmwork · 2026-04-19T22:48:26Z

Summary / Motivation

Introduces a clean architectural split between data collection (lerobot-record) and policy deployment (lerobot-rollout). Previously, lerobot-record handled teleoperation, policy inference, RTC async inference, and DAgger-style human-in-the-loop corrections in a single script with growing conditional complexity. This PR extracts all policy-related execution into a new lerobot.rollout module with a dedicated CLI, leaving lerobot-record as a focused teleoperation-only tool.
The new module follows a Strategy + Factory + Dependency Injection architecture: strategies are pluggable (base, sentry, highlight, dagger), inference backends are swappable (sync, rtc), and all shared state is assembled once into a RolloutContext bundle. This makes each deployment mode independently testable and extensible without touching the others.

Related issues

Related: N/A

What changed

New src/lerobot/rollout/ module — Complete policy deployment engine with:
- 4 strategies: BaseStrategy (autonomous, no recording), SentryStrategy (continuous recording + auto-upload), HighlightStrategy (ring buffer + keystroke save), DAggerStrategy (human-in-the-loop RaC with keyboard/pedal input)
- 2 inference engines: SyncInferenceEngine (inline) and RTCInferenceEngine (async background thread with action queue)
- RolloutContext dependency injection bundle with sub-contexts (runtime, hardware, policy, processor, dataset)
- ThreadSafeRobot wrapper for lock-protected concurrent hardware access
- RolloutRingBuffer memory-bounded circular buffer for highlight capture
- Polymorphic CLI dispatch via draccus ChoiceRegistry (--strategy.type=..., --inference.type=...)
New lerobot-rollout CLI (src/lerobot/scripts/lerobot_rollout.py) registered in pyproject.toml
Simplified lerobot-record — Stripped all policy/inference code; teleop is now required.
Extracted shared utilities:
- DatasetRecordConfig → src/lerobot/configs/dataset.py (shared between record and rollout)
- ActionInterpolator → src/lerobot/utils/action_interpolator.py (decoupled from RTC policy module)
- start_pedal_listener → src/lerobot/utils/pedal.py (evdev foot pedal support with graceful degradation)
Thread safety fix in policies/rtc/action_queue.py — added missing lock guards on qsize(), empty(), get_action_index()
Removed examples/hil/hil_data_collection.py (1184 lines), examples/hil/hil_utils.py, examples/rtc/eval_with_real_robot.py (673 lines) — replaced by lerobot-rollout
New examples — rollout.py and updated evaluate.py for lekiwi, phone-to-so100, and so100-EE setups
New docs — docs/source/inference.mdx covering all strategies and examples; updated hil_data_collection.mdx, il_robots.mdx, rtc.mdx

How was this tested (or how to run locally)

Full Tests CI run: https://github.com/huggingface/lerobot/actions/runs/24641227376
Minimal Tests added: tests/test_rollout.py — covers config validation, ring buffer, thread-safe robot, strategy/inference factory dispatch, DAgger state machine transitions, and context dataclass fields. Run with pytest -q tests/test_rollout.py -svv
Basic Manual HW checks, 2 cameras & OMX robot (mostly for testing the rollout strategies):
- lerobot-rollout --strategy.type=base
- lerobot-rollout --strategy.type=sentry
- lerobot-rollout --strategy.type=dagger with keyboard
- lerobot-rollout --strategy.type=highlight with keyboard
- lerobot-record pure teleoperation — no regressions
Deep E2E tests (same setup):
- I tested the behavior with a currently working workflow in main and in this branch. Results were successful as the behavior was the same between these two.
- Model: https://huggingface.co/imstevenpmwork/super_poulain_smolvla
- Dataset: https://huggingface.co/datasets/imstevenpmwork/super_poulain_draft

Checklist (required before merge)

Linting/formatting run (pre-commit run -a)
All tests pass locally (pytest)
Documentation updated
CI is green
Community Review: I have reviewed another contributor's open PR and linked it here: #

Reviewer notes

RTC needs HW testing and validation.
Anyone in the community is free to review the PR.
More development details in Feat/decouple record script #3380

HuggingFaceDocBuilderDev · 2026-04-19T22:50:16Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

imstevenpmwork · 2026-04-20T00:03:37Z

@claude Review this PR

claude · 2026-04-20T00:03:56Z

Claude finished @imstevenpmwork's task in 4m 1s —— View job

PR Review: `lerobot-rollout` CLI

I read all 54 changed files in detail. This is a well-structured PR that cleanly separates policy deployment from data recording. The Strategy + Factory + DI architecture is solid and the code is well-documented. Below is my full technical review.

Overall Assessment

Architecture: The decomposition into strategies (base/sentry/highlight/dagger), inference backends (sync/rtc), and context injection is clean and extensible. Each strategy is self-contained and the InferenceEngine ABC allows backend-agnostic action dispatch via send_next_action. Strong separation of concerns.

Code quality: Consistently documented, good logging, clean module boundaries. The codebase follows existing LeRobot patterns (draccus ChoiceRegistry, lazy imports, __all__ exports).

Bugs & Correctness Issues

1. DatasetRecordConfig.__post_init__ always appends timestamp — breaks --resume
src/lerobot/configs/dataset.py:74-77

def __post_init__(self) -> None:
    if self.repo_id:
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        self.repo_id = f"{self.repo_id}_{timestamp}"

When --resume=true, build_rollout_context calls LeRobotDataset.resume(cfg.dataset.repo_id) — but repo_id already has a fresh timestamp appended, so it will attempt to resume a dataset that doesn't exist. The timestamp should be conditional on not resuming, or this logic should be handled at the call site rather than in __post_init__.

Fix this →

2. DAgger strategy missing dataset validation in RolloutConfig.__post_init__
src/lerobot/rollout/configs.py:191-193

needs_dataset = isinstance(self.strategy, (SentryStrategyConfig, HighlightStrategyConfig))

DAggerStrategyConfig is not included in needs_dataset, but both _run_continuous and _run_corrections_only call dataset.add_frame() and dataset.save_episode() unconditionally. A missing --dataset.repo_id with DAgger will crash at runtime instead of failing fast at config validation.

Fix this →

3. DAgger correction phase incorrectly applies record_stride to human frames
src/lerobot/rollout/strategies/dagger.py:449 and 615

record_stride = max(1, cfg.interpolation_multiplier)
# ...
if record_tick % record_stride == 0:  # in CORRECTING phase
    dataset.add_frame(frame)
record_tick += 1

When interpolation_multiplier > 1, this skips every Nth human correction frame. But during corrections, the teleoperator provides frames at base FPS — there's no interpolation. This would discard human demonstration data. record_stride should only apply during the autonomous phase.

Fix this →

4. send_next_action silently produces partial action dicts
src/lerobot/rollout/strategies/core.py:269

action_dict = {k: interp[i].item() for i, k in enumerate(ordered_keys) if i < len(interp)}

If the interpolated tensor has fewer elements than ordered_action_keys, some keys are silently dropped. Sending a partial action dict to the robot could cause unpredictable hardware behavior. This should be a hard error, or at minimum a warning.

5. Highlight strategy has no episode lock on background push
src/lerobot/rollout/strategies/highlight.py:257-277 — _background_push calls safe_push_to_hub without _episode_lock, unlike Sentry (sentry.py:211-222) and DAgger (dagger.py:719-730) which both hold the lock. If the main loop calls dataset.save_episode() concurrently with a push, there could be a data race on the dataset's on-disk state.

Design Concerns

6. Global default constants changed — affects all users
src/lerobot/datasets/utils.py:74-75

DEFAULT_DATA_FILE_SIZE_IN_MB = 50   # was 100
DEFAULT_VIDEO_FILE_SIZE_IN_MB = 100  # was 200

These constants are used throughout the dataset system, not just by rollout. Halving the chunk sizes changes episode rotation behavior for existing pipelines, training scripts, and any code that depends on these defaults. This should be called out in the PR description and potentially discussed separately.

7. PEFT-via-record test removed with no rollout equivalent
tests/test_cli_peft.py — 82 lines deleted (test_peft_record_loads_policy). This verified that PEFT adapters could be loaded through the record CLI with policy inference. Since lerobot-record no longer loads policies, the test was removed — but there's no equivalent test for PEFT loading through lerobot-rollout. This is a test coverage gap for a supported workflow.

8. No integration tests for strategy control loops
The tests in test_rollout.py cover configs, ring buffer, state machine, and factories — but none exercise a strategy's run() method, even with mocked hardware. The complex control loops in sentry, highlight, and dagger (episode rotation, push timing, state transitions during recording) are only tested manually per the PR description.

Minor Issues

9. Dead code: _teleop_smooth_move_to
src/lerobot/rollout/strategies/dagger.py:173-194 — This helper is defined but never called. All 5 call sites are commented out with TODO notes. Should either be removed or the TODO resolved.

10. Duplicate frame in Highlight ring buffer after episode save
src/lerobot/rollout/strategies/highlight.py:170-187 — When the user presses save during live recording, the frame at line 170 is added to the dataset and the episode is saved. Then at line 184, _recording_live is now False so the same frame is appended to the ring buffer at line 187. This means the first frame of the next highlight will duplicate the last frame of the previous episode.

11. RolloutRingBuffer single-threaded assumption has no runtime guard
src/lerobot/rollout/ring_buffer.py:41-44 — The docstring notes the buffer is single-threaded, but there's no assertion or lock to catch accidental misuse. Since _current_bytes accounting would break under concurrent access, a lightweight debug-mode check could prevent subtle bugs.

12. context.py:build_rollout_context is ~200 lines
The function performs 8 sequential phases (policy loading, processors, hardware, features, dataset, policy processors, inference, assembly). While the ordering matters for fail-fast behavior, extracting some phases into named helpers would improve readability.

What Looks Good

Thread-safe robot wrapper (robot_wrapper.py) — Clean lock-protected I/O with read-only property proxying. Correct design for RTC concurrent access.
DAgger state machine (dagger.py:88-164) — Transition validation table with thread-safe consume/request pattern is robust and well-tested.
Ring buffer memory accounting (ring_buffer.py) — Proper byte estimation for torch tensors, numpy arrays, and primitives. Eviction loop is correct.
RTC error handling (rtc.py) — Consecutive error counting with threshold, graceful global shutdown propagation, and compile warmup management are production-quality.
Action interpolator extraction — Clean decoupling from RTC module to utils/ with backward-compat re-export.
Signal handler consolidation — ProcessSignalHandler moved from rl/process.py to utils/process.py for shared use across record and rollout.

Summary

This is a significant, well-designed architectural improvement. The separation of lerobot-record (teleop-only) from lerobot-rollout (policy deployment) is a clean split that will reduce maintenance burden and enable independent strategy evolution.

Must-fix before merge: Issues 1-3 (resume broken by timestamp, DAgger dataset validation, correction frame loss).
Should-fix: Issues 4-5 (partial action dict, highlight push lock).
Worth discussing: Issue 6 (global default changes) and test coverage (7-8).

…move rename_map config

feat(scripts): lerobot-rollout

bc06cb4

imstevenpmwork self-assigned this Apr 19, 2026

imstevenpmwork marked this pull request as ready for review April 19, 2026 23:01

imstevenpmwork requested review from pkooij and s1lent4gnt April 19, 2026 23:03

huggingface deleted a comment from claude Bot Apr 19, 2026

This comment was marked as resolved.

Sign in to view

imstevenpmwork added 5 commits April 20, 2026 15:34

fix(rollout) require dataset in dagger + use duration too

14c7a25

fix(docs): dagger num_episodes

02d8a34

test(rollout): fix expectations

b49e401

fix(rollout): features check

195b777

fix(rollout): device and task propagation + feature pos + warn fps + …

8d1401a

…move rename_map config

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(rollout): decouple policy deployment from data recording with new `lerobot-rollout` CLI#3413

feat(rollout): decouple policy deployment from data recording with new `lerobot-rollout` CLI#3413
imstevenpmwork wants to merge 6 commits intomainfrom
feat/lerobot-rollout

imstevenpmwork commented Apr 19, 2026 •

edited

Loading

Uh oh!

HuggingFaceDocBuilderDev commented Apr 19, 2026

Uh oh!

imstevenpmwork commented Apr 20, 2026

Uh oh!

claude Bot commented Apr 20, 2026 •

edited

Loading

Uh oh!

This comment was marked as resolved.

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

imstevenpmwork commented Apr 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary / Motivation

Related issues

What changed

How was this tested (or how to run locally)

Checklist (required before merge)

Reviewer notes

Uh oh!

HuggingFaceDocBuilderDev commented Apr 19, 2026

Uh oh!

imstevenpmwork commented Apr 20, 2026

Uh oh!

claude Bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: lerobot-rollout CLI

Overall Assessment

Bugs & Correctness Issues

Design Concerns

Minor Issues

What Looks Good

Summary

Uh oh!

This comment was marked as resolved.

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

imstevenpmwork commented Apr 19, 2026 •

edited

Loading

claude Bot commented Apr 20, 2026 •

edited

Loading

PR Review: `lerobot-rollout` CLI