feat: end-to-end cloud-connect test harness + Playwright-led skill by emranemran · Pull Request #962 · daydreamlive/scope

emranemran · 2026-04-20T04:59:43Z

Summary

Ships everything needed for a fresh contributor (or Claude Code agent) to test the deployed Livepeer cloud path end-to-end. Combines the original bash tooling (originally #962), the Playwright e2e updates (originally #970, now folded in), and discoverability / routing / deploy-orchestration work so "test cloud" reliably deploys the user's current code and runs the real UI flow.

Since Livepeer mode is the only supported cloud path going forward (the old direct / fal_app.py + CloudConnectionManager mode is being deprecated), this PR establishes the single cloud-testing entry point.

What "test cloud" now does

An agent matching the skill's trigger phrases ("test cloud", "verify cloud streaming", cloud-connect error, edits to livepeer_fal_app.py / livepeer_app.py, etc.) walks through:

1. Ask the user which fal app name + env to deploy to (defaulted from .env.local if set).
2. Derive SCOPE_CLOUD_APP_ID from the confirmed app+env following fal's URL convention (main → no suffix; other envs → --<env> suffix).
3. Sanity-check SCOPE_CLOUD_API_KEY / SCOPE_USER_ID; stop + ask if missing.
4. Free port 8000 so run-app.sh owns it.
5. Deploy via SCOPE_FAL_APP_NAME=… SCOPE_FAL_ENV=… ./deploy-staging.sh. Abort on failure so Playwright never runs against stale code.
6. Start scope with the derived URL.
7. Run Playwright (cd e2e && npx playwright test). Full round-trip through livepeer trickle to the fal runner and back.
8. Report the result. On success, every session-lifecycle Kafka event has fired and is ready to verify in ClickHouse.
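
The derive and sanity-check steps above can be sketched in shell. This is a minimal illustration, not the skill's actual implementation: the function names `derive_cloud_app_id` and `missing_vars` are hypothetical, and whether the app name carries an owner prefix such as `daydream/` is an assumption. The preview-deployment example on this page also appends `/ws` to SCOPE_CLOUD_APP_ID; the sketch leaves that to the caller.

```shell
# Hypothetical helpers; the names are illustrative and not taken from the repo.

# Print the fal app ID for an app name and environment.
# Convention described above: "main" gets no suffix, other envs get "--<env>".
derive_cloud_app_id() {
  local app="$1" env="${2:-main}"
  if [ "$env" = "main" ]; then
    printf '%s\n' "$app"
  else
    printf '%s--%s\n' "$app" "$env"
  fi
}

# Print the names of any listed variables that are unset or empty
# (space-separated); prints an empty line when all are set.
missing_vars() {
  local name value missing=""
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  printf '%s\n' "${missing# }"
}
```

For example, `derive_cloud_app_id daydream/scope-pr-962 preview` prints `daydream/scope-pr-962--preview`, which matches the preview App ID posted by the deployment bot on this PR. A caller could stop and ask the user whenever `missing_vars SCOPE_CLOUD_API_KEY SCOPE_USER_ID` prints anything.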

Two test paths, one skill

Playwright e2e (primary) — drives the real Perform-mode UI with a synthetic camera (--use-fake-device-for-media-stream), verifies the output video comes back from the cloud. Produces every session-lifecycle Kafka event: pipeline_load_start, pipeline_loaded, session_created, stream_started, stream_heartbeat, session_closed. ~2–8 min depending on cold/warm.
test-cloud-connect.sh (secondary) — POSTs /api/v1/cloud/connect and polls status. Only verifies the wrapper-layer websocket_connected / websocket_disconnected pair. Bisect-friendly exit codes (0/1/2/3). Good for git bisect run or "did the fal container come up?".
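
The "connect, then poll status" pattern of the secondary path can be sketched as a generic polling helper. This is an assumption-laden illustration, not the contents of test-cloud-connect.sh: `poll_until` is a hypothetical name, and the status check itself is passed in as a command because the status endpoint is not spelled out here.

```shell
# Hypothetical helper; not taken from test-cloud-connect.sh.
# Usage: poll_until TIMEOUT_S INTERVAL_S CMD [ARGS...]
# Runs CMD repeatedly until it succeeds or roughly TIMEOUT_S seconds of
# sleeping have accumulated. Returns 0 on success, 1 on timeout.
poll_until() {
  local timeout_s="$1" interval_s="$2" elapsed=0
  shift 2
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
}
```

A script built this way could map the 0 and 1 results onto its own exit codes, which keeps the status check replaceable without touching the retry loop.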

Files

run-app.sh — launches daydream-scope in livepeer cloud mode, sources secrets from .env.local. (Fixed a bash-quoting bug that broke under word-splitting.)
deploy-staging.sh — now parametrized on SCOPE_FAL_APP_NAME / SCOPE_FAL_ENV / SCOPE_FAL_AUTH. Works from any cwd. Ships tracked so any contributor can run it.
test-cloud-connect.sh — HTTP-only orchestration script (push → CI wait → deploy-staging → connect → status poll).
.env.example — required/optional env vars grouped into client-side (SCOPE_CLOUD_APP_ID, SCOPE_CLOUD_API_KEY, SCOPE_USER_ID) and deploy-side (SCOPE_FAL_APP_NAME, SCOPE_FAL_ENV, SCOPE_FAL_AUTH).
e2e/playwright.config.ts — camera permission + fake-device launch args so headless Chromium completes browser↔local-scope WebRTC and actually delivers frames.
e2e/tests/cloud-streaming.spec.ts — new-UI selectors (Workflow/Perform toggle), Camera input, output-video assertion.
e2e/README.md — rewritten. Old version documented stale env vars that don't exist anymore; new one points at the skill for canonical setup.
.agents/skills/testing-livepeer-fal-deploy/SKILL.md — leads with Playwright, prescribes the ask-user → deploy → run flow above.
CLAUDE.md — new "Cloud testing — use this skill" section routes "test cloud" prompts to the skill. Legacy Local Cloud Testing and MCP Server Testing with Local Cloud Dev sections get deprecation markers pointing at the new path.

Verified

npx playwright test passes against scope-livepeer-emran with the passthrough pipeline (~2.8 min cold, ~17 s warm). Output video plays; heartbeats fire.
When combined with the runner-side changes in PR #969 ("feat: bring livepeer runner Kafka events to parity with cloud-relay"; open, deployed to scope-livepeer-emran for verification but not merged), the full event set lands in ClickHouse correlated by user_id and connection_id = manifest_id.

Test plan

Fresh clone: .env.example documents every required env var.
Set .env.local values; ./deploy-staging.sh picks up SCOPE_FAL_APP_NAME + SCOPE_FAL_ENV.
./run-app.sh starts scope on :8000.
cd e2e && npm install && npx playwright install chromium then npx playwright test → passes.
./test-cloud-connect.sh --skip-push --skip-build-wait --skip-deploy exits 0 with CONNECTED.
Asking Claude Code "test cloud" in this repo routes to the skill and the agent walks the ask-user → deploy → Playwright flow.

Supersedes / related

Closes #970 ("fix(e2e): update cloud-streaming test for graph-mode UI redesign"): the Playwright updates are cherry-picked into this PR.
Complements #969 ("feat: bring livepeer runner Kafka events to parity with cloud-relay", still open): the runner-side Kafka event parity work that this PR's Playwright test can verify end-to-end. That PR's code was deployed to scope-livepeer-emran to validate; it needs to be reviewed and merged separately.

🤖 Generated with Claude Code

Ship the scripts and skill we developed while debugging the livepeer fal deploy path, so other contributors can run the same test loop from their own Claude Code session.

- test-cloud-connect.sh orchestrates push → CI build-cloud wait → deploy-staging → local scope start → /cloud/connect → status poll with bisect-friendly exit codes (0/1/2/3/4). Supports --skip-push, --skip-build-wait, --skip-deploy, --full-session, --keep-scope.
- run-app.sh launches daydream-scope in livepeer cloud mode, sourcing secrets from .env.local (gitignored).
- .env.example documents the required SCOPE_CLOUD_APP_ID / SCOPE_CLOUD_API_KEY / SCOPE_USER_ID env vars plus optional LIVEPEER_DEBUG.
- .agents/skills/testing-livepeer-fal-deploy/SKILL.md teaches future agents when to use this loop, common failure signatures (ACCESS_DENIED, All orchestrators failed, did not receive ready message), and the known gap around /api/v1/session/start not being livepeer-compatible.

Users need to supply their own deploy-staging.sh (a thin wrapper around `fal deploy src/scope/cloud/livepeer_fal_app.py --app <name> --auth public --env main`); the test script errors out with a clear message if it's missing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>

coderabbitai · 2026-04-20T04:59:50Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


github-actions · 2026-04-20T05:27:03Z

🚀 fal.ai Preview Deployment


App ID	`daydream/scope-pr-962--preview`
WebSocket	`wss://fal.run/daydream/scope-pr-962--preview/ws`
Commit	`88be832`

Livepeer Runner


App ID	`daydream/scope-livepeer-pr-962--preview`
WebSocket	`wss://fal.run/daydream/scope-livepeer-pr-962--preview/ws`
Auth	`private`

Testing Livepeer Mode

SCOPE_CLOUD_MODE=livepeer SCOPE_CLOUD_APP_ID="daydream/scope-livepeer-pr-962--preview/ws" uv run daydream-scope

The redesign in #886 replaced the streaming-first landing with a Workflow/Perform toggle and removed the "Daydream Scope" heading the test was polling for. The test has been dead since then.

Updates:

- Wait on the Perform-mode toggle appearing instead of the missing heading
- Explicitly switch to Perform mode before the cloud/pipeline/stream steps; the default is now Workflow (graph mode), where those controls aren't rendered
- Find the cloud button by title attribute (covers all three states: "Connect to cloud", "Connecting to cloud...", "Cloud connected")
- Bump the cloud-connect timeout to 180s so fal cold-starts have room
- Verify frame flow by polling any <video> element rather than locating the old "Video Output" card wrapper
- Stop uses the start-stream-button toggle (PlayOverlay changes role when streaming) with a text-based fallback

Verified locally: full flow passes in ~3 minutes against a warm fal deploy with the passthrough pipeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>

The previous iteration of this test false-positively passed. It polled any <video> for playback, which always finds the local input preview playing even when the browser↔local-scope WebRTC never completes and no frames ever reach the cloud. The result: ClickHouse saw only websocket_connected / pipeline_loaded / websocket_disconnected, nothing that requires a real round-trip through the livepeer runner.

Two fixes:

1. Feed the browser a synthetic camera via --use-fake-device-for-media-stream (plus the Camera input toggle in the UI). This lets getUserMedia() succeed and a real WebRTC peer connection between browser and local scope complete end to end, which triggers CloudTrack._start() → LivepeerClient.start_media() and the "start_stream" trickle control message the runner needs.
2. Assert on the video inside the "Video Output" card, not any <video>. That element only renders when a remoteStream is set, so waiting on its visibility and currentTime > 0 is a true round-trip signal.

After frames start flowing, idle 15s so stream_heartbeat events (~every 10s on the runner side) have a chance to fire.

Verified locally: test passes in ~2.8 min against scope-livepeer-emran with passthrough. Full event set lands in ClickHouse when paired with the parity PR (#969).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>

Rewrite the testing-livepeer-fal-deploy SKILL so the primary recommended path is the Playwright test (folded in from #970 via cherry-pick). It's the only path that exercises the full livepeer trickle round-trip and produces every session-lifecycle Kafka event. Keep test-cloud-connect.sh as a secondary, bash-only smoke test for "did the fal container come up?" / bisecting cloud-connect regressions.

Also fix run-app.sh: the previous form tried to inline-prefix env vars via ${VAR:+VAR=$VAR} on a backslash-continued command, which breaks under bash word-splitting ("SCOPE_CLOUD_API_KEY=sk_... command not found"). Switch to `export` + `exec uv run`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>
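
The run-app.sh fix described above can be illustrated with a small sketch. The shell recognizes `NAME=value` assignment prefixes before it performs parameter expansion, so a word produced by `${VAR:+VAR=$VAR}` is treated as the command name, which is consistent with the "command not found" error quoted in the commit message. The helper name `export_if_set` is hypothetical, and the final `exec` line is shown only as a comment; this is not the actual content of run-app.sh.

```shell
# Hypothetical helper; illustrates the export-then-exec pattern only.
# Export the named variable to child processes, but only when it is
# set to a non-empty value in the current shell.
export_if_set() {
  local name="$1" value
  eval "value=\${$name:-}"
  if [ -n "$value" ]; then
    export "$name"
  fi
}

# Intended use (assumed, left as comments so nothing is launched here):
#   export_if_set SCOPE_CLOUD_APP_ID
#   export_if_set SCOPE_CLOUD_API_KEY
#   export_if_set SCOPE_USER_ID
#   exec uv run daydream-scope
```

Exporting first and then calling `exec` avoids building the command line out of expanded text, so no value can be mistaken for a command name.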

Three small changes so contributors and Claude Code agents actually find the skill instead of reinventing the test flow.

- Broaden the skill's `description` with more trigger phrases so agents match on common prompts like "test the fal deploy", "run playwright", "verify kafka events", "diagnose fal", and the various observed error strings.
- Add a "Testing the deployed Livepeer fal path" section to CLAUDE.md (right above the MCP testing sections) pointing at the skill and distinguishing the Playwright e2e from the bash smoke test. CLAUDE.md is auto-loaded, so agents see this on startup.
- Rewrite `e2e/README.md`: the old version referenced stale env vars (`DAYDREAM_TEST_EMAIL` / `DAYDREAM_TEST_PASSWORD`, and an outdated `FAL_WS_URL`) and a flow that no longer matches. New README points at the skill for the canonical setup and gives a quick-reference block for the current `VITE_DAYDREAM_API_KEY` + `.env.local` flow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>

Livepeer cloud mode is the only supported cloud path going forward; the old direct/cloud-relay mode (fal_app.py + CloudConnectionManager + SCOPE_CLOUD_MODE=direct) is being deprecated. So "test cloud" from a user should no longer be ambiguous.

- Add "test cloud" as an explicit trigger in the skill's description.
- Rewrite the CLAUDE.md testing section to be the single cloud entry point with a clear routing directive: any "test cloud", "verify cloud streaming", or cloud-connect error → this skill.
- Mark the legacy "Local Cloud Testing" and "MCP Server Testing with Local Cloud Dev" sections as DEPRECATED so agents don't accidentally land on them, while keeping the content for anyone unblocking in-flight work on the old path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>

"test cloud" should actually test the user's current working tree, not whatever code happens to be deployed. Previously the skill documented running Playwright directly against scope-livepeer-emran, which could silently false-pass against a stale deploy.

Three changes make this work:

- Parametrize `deploy-staging.sh` on SCOPE_FAL_APP_NAME + SCOPE_FAL_ENV (+ optional SCOPE_FAL_AUTH), defaulted from .env.local. Also make it track its own HERE path so it works when called from any cwd.
- Document both vars in .env.example, grouped into client-side and deploy-side sections. SCOPE_CLOUD_APP_ID stays, but we now note that the skill derives it from the app+env the user confirms at test time (fal's URL convention: main has no suffix, other envs get --<env>).
- Rewrite the SKILL's "Running the Playwright test" section with an explicit flow: ask for app+env → sanity-check secrets → free port 8000 → deploy → start scope with derived URL → run Playwright. This is what agents should follow when a user says "test cloud".

Also add `deploy-staging.sh` to the repo (it was previously an untracked per-user file); needed so any contributor following the skill can actually run the deploy step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>
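
The parametrization described above might look roughly like the following sketch. It only prints the `fal deploy` command line (a dry run) so the variable handling is easy to inspect. The function name `fal_deploy_cmd` is hypothetical, and the `public` / `main` defaults are assumptions taken from the example command in the first commit message, not confirmed contents of deploy-staging.sh.

```shell
# Hypothetical dry-run helper; not the actual deploy-staging.sh.
# Prints the fal deploy command that would run for the current
# SCOPE_FAL_APP_NAME / SCOPE_FAL_ENV / SCOPE_FAL_AUTH values.
fal_deploy_cmd() {
  local app="${SCOPE_FAL_APP_NAME:-}"
  local env="${SCOPE_FAL_ENV:-main}"      # assumed default
  local auth="${SCOPE_FAL_AUTH:-public}"  # assumed default
  if [ -z "$app" ]; then
    echo "SCOPE_FAL_APP_NAME is required" >&2
    return 1
  fi
  printf 'fal deploy src/scope/cloud/livepeer_fal_app.py --app %s --auth %s --env %s\n' \
    "$app" "$auth" "$env"
}

# To work from any cwd, a script can resolve its own directory first, e.g.:
#   HERE="$(cd "$(dirname "$0")" && pwd)"
#   cd "$HERE"
```

With SCOPE_FAL_APP_NAME=scope-livepeer-emran and the other two variables unset, the function prints `fal deploy src/scope/cloud/livepeer_fal_app.py --app scope-livepeer-emran --auth public --env main`.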

Squash of feat/test-cloud-connect-tooling (PR #962) onto this branch so we can exercise the parity changes end-to-end via Playwright + skill-driven "test cloud" flow. This commit is a throwaway for verification: once the parity code is signed off, revert this single commit before opening PR #969 for review so the diff stays focused.

Squashed from:

- feat: add end-to-end cloud-connect test harness and skill
- fix(e2e): update cloud-streaming test for graph-mode UI redesign
- fix(e2e): actually exercise the livepeer trickle path
- feat: lead SKILL with Playwright + fix run-app.sh env var quoting
- docs: make the testing-livepeer-fal-deploy skill discoverable
- docs: route all "test cloud" prompts to the livepeer skill
- feat: skill asks for fal app+env, deploys, then runs Playwright

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>

j0sh

Very cool! 🔥 🔥 Let's wait until #972 is merged then some of the info here can be updated since cloud relay mode won't exist anymore

@Scenario

Per Emran's blessing in chat, absorbing PR #962 ("end-to-end cloud-connect test harness + Playwright-led skill") into this PR so the two systems ship as one cohesive story instead of two PRs with overlapping concerns.

The two surfaces stay invokable separately, as Emran requested:

- `testing-livepeer-fal-deploy` skill: triggered by "test cloud", "verify cloud streaming", "run the e2e test", cloud-connect errors. Engineer-driven ad-hoc verification: ask user → deploy → run Playwright → report. Drives e2e/tests/cloud-streaming.spec.ts via npx playwright.
- product-tests/: automated CI gating, every PR, scenarios + chaos + regression + multimodal. Drives pytest + the @Scenario harness.

Two different questions ("did my deploy work?" vs "is the product broken?") get two different tools. CLAUDE.md routing makes the distinction explicit.

Files folded in (verbatim from PR #962, authored by emranemran):

- .agents/skills/testing-livepeer-fal-deploy/SKILL.md
- .env.example
- deploy-staging.sh
- run-app.sh
- test-cloud-connect.sh
- e2e/playwright.config.ts (camera permission + fake-device launch args)
- e2e/tests/cloud-streaming.spec.ts (Perform-mode + camera + output video)
- e2e/README.md (rewritten to point at the skill)

CLAUDE.md merged: adds Emran's "Cloud testing — use this skill" routing section, with a note distinguishing his ad-hoc skill from the product-tests CI gate. Deprecation markers on the legacy "Local Cloud Testing" section preserved.

Closes #962 once this lands.

Co-Authored-By: Emran M <emranemran@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Hunter Hillman <hthillman@gmail.com>

The CLAUDE.md "Cloud testing — use this skill" section that should have landed in 8fe40ed didn't get staged before the commit. Adding it now: routes "test cloud" / "verify cloud streaming" / cloud-connect errors to the testing-livepeer-fal-deploy skill, with a note distinguishing it from the product-tests CI gate. Deprecation markers on legacy "Local Cloud Testing" section preserved.

Co-Authored-By: Emran M <emranemran@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Hunter Hillman <hthillman@gmail.com>

emranemran mentioned this pull request Apr 20, 2026

feat: add Kafka event publishing to Livepeer fal runner #956

Merged

emranemran and others added 3 commits April 20, 2026 15:56

emranemran changed the title ~~feat: add end-to-end cloud-connect test harness and skill~~ feat: end-to-end cloud-connect test harness + Playwright-led skill Apr 20, 2026

emranemran mentioned this pull request Apr 20, 2026

fix(e2e): update cloud-streaming test for graph-mode UI redesign #970

Closed

3 tasks

emranemran and others added 3 commits April 20, 2026 16:48

j0sh approved these changes Apr 21, 2026

hthillman mentioned this pull request Apr 28, 2026

Add product-tests: retry/close gates + scenario/chaos suite #984

Draft

10 tasks

feat: end-to-end cloud-connect test harness + Playwright-led skill #962
emranemran wants to merge 7 commits into main from feat/test-cloud-connect-tooling

2 participants
