feat: end-to-end cloud-connect test harness + Playwright-led skill #962
Open
emranemran wants to merge 7 commits into main
Ship the scripts and skill we developed while debugging the livepeer fal deploy path, so other contributors can run the same test loop from their own Claude Code session.

- test-cloud-connect.sh orchestrates push → CI build-cloud wait → deploy-staging → local scope start → /cloud/connect → status poll with bisect-friendly exit codes (0/1/2/3/4). Supports --skip-push, --skip-build-wait, --skip-deploy, --full-session, --keep-scope.
- run-app.sh launches daydream-scope in livepeer cloud mode, sourcing secrets from .env.local (gitignored).
- .env.example documents the required SCOPE_CLOUD_APP_ID / SCOPE_CLOUD_API_KEY / SCOPE_USER_ID env vars plus optional LIVEPEER_DEBUG.
- .agents/skills/testing-livepeer-fal-deploy/SKILL.md teaches future agents when to use this loop, common failure signatures (ACCESS_DENIED, All orchestrators failed, did not receive ready message), and the known gap around /api/v1/session/start not being livepeer-compatible.

Users need to supply their own deploy-staging.sh (a thin wrapper around `fal deploy src/scope/cloud/livepeer_fal_app.py --app <name> --auth public --env main`); the test script errors out with a clear message if it's missing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>
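The commit calls the exit codes bisect-friendly, but this page does not say what each code means. The sketch below only restates how `git bisect run` classifies a script's exit status (0 is good, 125 is skip, any other code from 1 to 127 is bad, anything else aborts the bisect), so each non-zero code listed above would mark a commit as bad. `bisect_verdict` is a hypothetical helper name used purely for illustration.

```bash
#!/usr/bin/env bash
# Illustration only: how `git bisect run` would interpret a test script's
# exit status. The meaning of each individual code in test-cloud-connect.sh
# is not documented on this page and is not assumed here.

# bisect_verdict CODE
# Prints good, bad, skip, or abort for the given exit status.
bisect_verdict() {
  local code="$1"
  if [ "$code" -eq 0 ]; then
    echo good
  elif [ "$code" -eq 125 ]; then
    echo skip
  elif [ "$code" -ge 1 ] && [ "$code" -le 127 ]; then
    echo bad
  else
    echo abort
  fi
}

# Show the verdict for the codes mentioned in the commit, plus the two
# special cases (125 and anything above 127).
for code in 0 1 2 3 4 125 128; do
  printf 'exit %s -> %s\n' "$code" "$(bisect_verdict "$code")"
done
```

Because distinct non-zero codes all read as "bad" to git, the finer-grained codes mostly help a human or agent see which stage failed once the bisect has landed on a commit.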
The redesign in #886 replaced the streaming-first landing with a Workflow/Perform toggle and removed the "Daydream Scope" heading the test was polling for. The test has been dead since then.

Updates:

- Wait on the Perform-mode toggle appearing instead of the missing heading
- Explicitly switch to Perform mode before the cloud/pipeline/stream steps; the default is now Workflow (graph mode), where those controls aren't rendered
- Find the cloud button by title attribute (covers all three states: "Connect to cloud", "Connecting to cloud...", "Cloud connected")
- Bump the cloud-connect timeout to 180s so fal cold-starts have room
- Verify frame flow by polling any <video> element rather than locating the old "Video Output" card wrapper
- Stop uses the start-stream-button toggle (PlayOverlay changes role when streaming) with a text-based fallback

Verified locally: full flow passes in ~3 minutes against a warm fal deploy with the passthrough pipeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>
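The 180s budget is a wait-with-deadline: keep checking until the condition holds or the clock runs out. The bash smoke test is described as polling cloud status in a similar spirit, so here is a minimal shell sketch of that pattern. `poll_until` and the commented `cloud_is_connected` check are hypothetical names, not taken from the scripts in this PR.

```bash
#!/usr/bin/env bash
# Sketch of a deadline-bounded poll. Not the implementation used by the
# Playwright test or by test-cloud-connect.sh.

# poll_until TIMEOUT_SECS INTERVAL_SECS CMD [ARGS...]
# Runs CMD repeatedly until it succeeds (returns 0) or TIMEOUT_SECS have
# elapsed (returns 1). CMD is always tried at least once.
poll_until() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline
  deadline=$(( $(date +%s) + timeout ))
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
}

# Trivial demo: `true` succeeds on the first attempt.
poll_until 2 0 true && echo "ready"

# Hypothetical use with a 180s budget and a 5s interval:
#   poll_until 180 5 cloud_is_connected || echo "timed out waiting for cloud"
```

A generous timeout costs nothing on a warm deploy because the loop returns as soon as the check passes; it only matters on a cold start.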
The previous iteration of this test false-positively passed. It polled any <video> for playback, which always finds the local input preview playing even when the browser↔local-scope WebRTC never completes and no frames ever reach the cloud. The result: ClickHouse saw only websocket_connected / pipeline_loaded / websocket_disconnected, nothing that requires a real round-trip through the livepeer runner.

Two fixes:

1. Feed the browser a synthetic camera via --use-fake-device-for-media-stream (plus the Camera input toggle in the UI). This lets getUserMedia() succeed and a real WebRTC peer connection between browser and local scope complete end to end, which triggers CloudTrack._start() → LivepeerClient.start_media() and the "start_stream" trickle control message the runner needs.
2. Assert on the video inside the "Video Output" card, not any <video>. That element only renders when a remoteStream is set, so waiting on its visibility and currentTime > 0 is a true round-trip signal.

After frames start flowing, idle 15s so stream_heartbeat events (~every 10s on the runner side) have a chance to fire.

Verified locally: test passes in ~2.8 min against scope-livepeer-emran with passthrough. Full event set lands in ClickHouse when paired with the parity PR (#969).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>
Rewrite the testing-livepeer-fal-deploy SKILL so the primary recommended path is the Playwright test (folded in from #970 via cherry-pick). It's the only path that exercises the full livepeer trickle round-trip and produces every session-lifecycle Kafka event. Keep test-cloud-connect.sh as a secondary, bash-only smoke test for "did the fal container come up?" / bisecting cloud-connect regressions.

Also fix run-app.sh: the previous form tried to inline-prefix env vars via ${VAR:+VAR=$VAR} on a backslash-continued command, which breaks under bash word-splitting ("SCOPE_CLOUD_API_KEY=sk_... command not found"). Switch to `export` + `exec uv run`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>
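The run-app.sh failure can be reproduced in a few lines. The shell recognizes `NAME=value` prefixes when it parses a command, before parameter expansion, so a word that only becomes `NAME=value` after expansion is looked up as the command name instead. The sketch below uses a made-up `DEMO_KEY` variable and `sh -c` as a stand-in for the real `uv run` invocation; it is not the contents of run-app.sh.

```bash
#!/usr/bin/env bash
# Reproduces the "NAME=value: command not found" failure and the
# export + exec fix. DEMO_KEY and the child command are stand-ins.

DEMO_KEY="sk_demo"

# Broken form: the expansion yields the word "DEMO_KEY=sk_demo", which the
# shell then tries to run as a command (exit status 127).
broken_status=0
broken_output="$( ${DEMO_KEY:+DEMO_KEY=$DEMO_KEY} sh -c 'printf %s "$DEMO_KEY"' 2>&1 )" \
  || broken_status=$?

# Fixed form: export the variable, then exec the child so it inherits it.
# (The commit describes the real script as using `export` + `exec uv run`.)
fixed_status=0
fixed_output="$( export DEMO_KEY; exec sh -c 'printf %s "$DEMO_KEY"' )" \
  || fixed_status=$?

printf 'broken: status=%s\n' "$broken_status"
printf 'fixed:  status=%s output=%s\n' "$fixed_status" "$fixed_output"
```

Using `exec` also means the launched process replaces the wrapper shell, so signals such as Ctrl-C reach it directly.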
Three small changes so contributors and Claude Code agents actually find the skill instead of reinventing the test flow.

- Broaden the skill's `description` with more trigger phrases so agents match on common prompts like "test the fal deploy", "run playwright", "verify kafka events", "diagnose fal", and the various observed error strings.
- Add a "Testing the deployed Livepeer fal path" section to CLAUDE.md (right above the MCP testing sections) pointing at the skill and distinguishing the Playwright e2e from the bash smoke test. CLAUDE.md is auto-loaded, so agents see this on startup.
- Rewrite `e2e/README.md`: the old version referenced stale env vars (`DAYDREAM_TEST_EMAIL` / `DAYDREAM_TEST_PASSWORD`, and an outdated `FAL_WS_URL`) and a flow that no longer matches. The new README points at the skill for the canonical setup and gives a quick-reference block for the current `VITE_DAYDREAM_API_KEY` + `.env.local` flow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>
Livepeer cloud mode is the only supported cloud path going forward; the old direct/cloud-relay mode (fal_app.py + CloudConnectionManager + SCOPE_CLOUD_MODE=direct) is being deprecated. So "test cloud" from a user should no longer be ambiguous.

- Add "test cloud" as an explicit trigger in the skill's description.
- Rewrite the CLAUDE.md testing section to be the single cloud entry point with a clear routing directive: any "test cloud", "verify cloud streaming", or cloud-connect error → this skill.
- Mark the legacy "Local Cloud Testing" and "MCP Server Testing with Local Cloud Dev" sections as DEPRECATED so agents don't accidentally land on them, while keeping the content for anyone unblocking in-flight work on the old path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>

"test cloud" should actually test the user's current working tree, not whatever code happens to be deployed. Previously the skill documented running Playwright directly against scope-livepeer-emran, which could silently false-pass against a stale deploy.

Three changes make this work:

- Parametrize `deploy-staging.sh` on SCOPE_FAL_APP_NAME + SCOPE_FAL_ENV (+ optional SCOPE_FAL_AUTH), defaulted from .env.local. Also make it track its own HERE path so it works when called from any cwd.
- Document both vars in .env.example, grouped into client-side and deploy-side sections. SCOPE_CLOUD_APP_ID stays, but we now note that the skill derives it from the app+env the user confirms at test time (fal's URL convention: main has no suffix, other envs get --<env>).
- Rewrite the SKILL's "Running the Playwright test" section with an explicit flow: ask for app+env → sanity-check secrets → free port 8000 → deploy → start scope with derived URL → run Playwright. This is what agents should follow when a user says "test cloud".

Also add `deploy-staging.sh` to the repo (it was previously an untracked per-user file); needed so any contributor following the skill can actually run the deploy step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>
emranemran added a commit that referenced this pull request on Apr 21, 2026
Squash of feat/test-cloud-connect-tooling (PR #962) onto this branch so we can exercise the parity changes end-to-end via Playwright + skill-driven "test cloud" flow.

This commit is a throwaway for verification; once the parity code is signed off, revert this single commit before opening PR #969 for review so the diff stays focused.

Squashed from:

- feat: add end-to-end cloud-connect test harness and skill
- fix(e2e): update cloud-streaming test for graph-mode UI redesign
- fix(e2e): actually exercise the livepeer trickle path
- feat: lead SKILL with Playwright + fix run-app.sh env var quoting
- docs: make the testing-livepeer-fal-deploy skill discoverable
- docs: route all "test cloud" prompts to the livepeer skill
- feat: skill asks for fal app+env, deploys, then runs Playwright

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: emranemran <emran.mah@gmail.com>
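The last squashed commit relies on fal's naming convention as described in this PR (an app deployed to `main` has no suffix, other envs get `--<env>`) and on the `fal deploy` wrapper invocation quoted in the first commit. A minimal sketch of both follows. The function names, the `my-scope-app` app name, and the `main` / `public` defaults are assumptions for illustration; the full format of `SCOPE_CLOUD_APP_ID` is not shown on this page, so only the suffix rule is modelled, and the deploy command is printed rather than run.

```bash
#!/usr/bin/env bash
# Sketch only: not the shipped deploy-staging.sh or the skill's logic.

# fal_app_for_env APP [ENV]
# Prints the app name with the env suffix applied: "main" adds nothing,
# any other env appends "--<env>".
fal_app_for_env() {
  local app="$1" env="${2:-main}"
  if [ "$env" = "main" ]; then
    printf '%s\n' "$app"
  else
    printf '%s\n' "${app}--${env}"
  fi
}

# deploy_command APP [ENV] [AUTH]
# Prints (does not execute) a fal deploy command in the shape quoted in
# the PR. Defaulting ENV to main and AUTH to public is an assumption.
deploy_command() {
  local app="$1" env="${2:-main}" auth="${3:-public}"
  printf 'fal deploy src/scope/cloud/livepeer_fal_app.py --app %s --auth %s --env %s\n' \
    "$app" "$auth" "$env"
}

fal_app_for_env my-scope-app main
fal_app_for_env my-scope-app staging
deploy_command my-scope-app staging
```

Deriving the suffixed name from the same app and env that were just deployed is what keeps the client pointed at the fresh deploy rather than a stale one.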
hthillman added a commit that referenced this pull request on Apr 28, 2026
Per Emran's blessing in chat, absorbing PR #962 ("end-to-end cloud-connect test harness + Playwright-led skill") into this PR so the two systems ship as one cohesive story instead of two PRs with overlapping concerns.

The two surfaces stay invokable separately, as Emran requested:

- `testing-livepeer-fal-deploy` skill: triggered by "test cloud", "verify cloud streaming", "run the e2e test", cloud-connect errors. Engineer-driven ad-hoc verification: ask user → deploy → run Playwright → report. Drives e2e/tests/cloud-streaming.spec.ts via npx playwright.
- product-tests/: automated CI gating, every PR, scenarios + chaos + regression + multimodal. Drives pytest + the @Scenario harness.

Two different questions ("did my deploy work?" vs "is the product broken?") get two different tools. CLAUDE.md routing makes the distinction explicit.

Files folded in (verbatim from PR #962, authored by emranemran):

- .agents/skills/testing-livepeer-fal-deploy/SKILL.md
- .env.example
- deploy-staging.sh
- run-app.sh
- test-cloud-connect.sh
- e2e/playwright.config.ts (camera permission + fake-device launch args)
- e2e/tests/cloud-streaming.spec.ts (Perform-mode + camera + output video)
- e2e/README.md (rewritten to point at the skill)

CLAUDE.md merged: adds Emran's "Cloud testing — use this skill" routing section, with a note distinguishing his ad-hoc skill from the product-tests CI gate. Deprecation markers on the legacy "Local Cloud Testing" section preserved.

Closes #962 once this lands.

Co-Authored-By: Emran M <emranemran@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Hunter Hillman <hthillman@gmail.com>
hthillman added a commit that referenced this pull request on Apr 28, 2026
The CLAUDE.md "Cloud testing — use this skill" section that should have landed in 8fe40ed didn't get staged before the commit. Adding it now: routes "test cloud" / "verify cloud streaming" / cloud-connect errors to the testing-livepeer-fal-deploy skill, with a note distinguishing it from the product-tests CI gate. Deprecation markers on legacy "Local Cloud Testing" section preserved.

Co-Authored-By: Emran M <emranemran@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Hunter Hillman <hthillman@gmail.com>
Summary
Ships everything needed for a fresh contributor (or Claude Code agent) to test the deployed Livepeer cloud path end-to-end. Combines the original bash tooling (originally #962), the Playwright e2e updates (originally #970, now folded in), and discoverability / routing / deploy-orchestration work so "test cloud" reliably deploys the user's current code and runs the real UI flow.
Since Livepeer mode is the only supported cloud path going forward (the old direct / `fal_app.py` + `CloudConnectionManager` mode is being deprecated), this PR establishes the single cloud-testing entry point.

What "test cloud" now does
An agent matching the skill's trigger phrases ("test cloud", "verify cloud streaming", cloud-connect error, edits to `livepeer_fal_app.py` / `livepeer_app.py`, etc.) walks through:

1. Ask for the fal app + env (defaults from `.env.local` if set).
2. Derive `SCOPE_CLOUD_APP_ID` from the confirmed app+env following fal's URL convention (`main` → no suffix; other envs → `--<env>` suffix).
3. Sanity-check `SCOPE_CLOUD_API_KEY` / `SCOPE_USER_ID`; stop + ask if missing.
4. Free port 8000 so `run-app.sh` owns it.
5. Deploy with `SCOPE_FAL_APP_NAME=… SCOPE_FAL_ENV=… ./deploy-staging.sh`. Abort on failure: no running Playwright against stale code.
6. Run Playwright (`cd e2e && npx playwright test`). Full round-trip through livepeer trickle to the fal runner and back.

Two test paths, one skill
- Playwright e2e (primary): feeds the browser a synthetic camera (`--use-fake-device-for-media-stream`), verifies the output video comes back from the cloud. Produces every session-lifecycle Kafka event: `pipeline_load_start`, `pipeline_loaded`, `session_created`, `stream_started`, `stream_heartbeat`, `session_closed`. ~2–8 min depending on cold/warm.
- `test-cloud-connect.sh` (secondary): POSTs `/api/v1/cloud/connect` and polls status. Only verifies the wrapper-layer `websocket_connected` / `websocket_disconnected` pair. Bisect-friendly exit codes (0/1/2/3). Good for `git bisect run` or "did the fal container come up?".

Files
- `run-app.sh`: launches daydream-scope in livepeer cloud mode, sources secrets from `.env.local`. (Fixed a bash-quoting bug that broke under word-splitting.)
- `deploy-staging.sh`: now parametrized on `SCOPE_FAL_APP_NAME` / `SCOPE_FAL_ENV` / `SCOPE_FAL_AUTH`. Works from any cwd. Ships tracked so any contributor can run it.
- `test-cloud-connect.sh`: HTTP-only orchestration script (push → CI wait → deploy-staging → connect → status poll).
- `.env.example`: required/optional env vars grouped into client-side (`SCOPE_CLOUD_APP_ID`, `SCOPE_CLOUD_API_KEY`, `SCOPE_USER_ID`) and deploy-side (`SCOPE_FAL_APP_NAME`, `SCOPE_FAL_ENV`, `SCOPE_FAL_AUTH`).
- `e2e/playwright.config.ts`: camera permission + fake-device launch args so headless Chromium completes browser↔local-scope WebRTC and actually delivers frames.
- `e2e/tests/cloud-streaming.spec.ts`: new-UI selectors (Workflow/Perform toggle), Camera input, output-video assertion.
- `e2e/README.md`: rewritten. Old version documented stale env vars that don't exist anymore; new one points at the skill for canonical setup.
- `.agents/skills/testing-livepeer-fal-deploy/SKILL.md`: leads with Playwright, prescribes the ask-user → deploy → run flow above.
- `CLAUDE.md`: new "Cloud testing — use this skill" section routes "test cloud" prompts to the skill. Legacy `Local Cloud Testing` and `MCP Server Testing with Local Cloud Dev` sections get deprecation markers pointing at the new path.

Verified
- `npx playwright test` passes against `scope-livepeer-emran` with the passthrough pipeline (~2.8 min cold, ~17 s warm). Output video plays; heartbeats fire.
- With the parity PR (#969; […] `scope-livepeer-emran` for verification but not merged), the full event set lands in ClickHouse correlated by `user_id` and `connection_id = manifest_id`.

Test plan
- `.env.example` documents every required env var.
- […] `.env.local` values; `./deploy-staging.sh` picks up `SCOPE_FAL_APP_NAME` + `SCOPE_FAL_ENV`.
- `./run-app.sh` starts scope on :8000.
- `cd e2e && npm install && npx playwright install chromium` then `npx playwright test` → passes.
- `./test-cloud-connect.sh --skip-push --skip-build-wait --skip-deploy` exits 0 with CONNECTED.

Supersedes / related
- […] `scope-livepeer-emran` to validate; it needs to be reviewed and merged separately.

🤖 Generated with Claude Code