Add an in-app agent that translates natural-language intent into working Scope graphs and runtime parameter tweaks. Accessible from a new Agent button next to Graph in the toolbar; a right-side resizable drawer hosts the conversation, streaming responses, and workflow-proposal cards.

Structural changes (new graph / pipeline load) require an explicit approve step; runtime params (prompts, noise, LoRA weights) auto-apply.

Multi-provider BYOK: Anthropic (default), any OpenAI-compatible endpoint (OpenAI, OpenRouter, Groq, together.ai, Fireworks), and self-hosted (Ollama, vLLM, LM Studio). Provider + model are configured under a new Settings → Agent tab; API keys continue to live in the API Keys tab (extended to cover anthropic, openai, llm_custom).

Agent tools are pure-Python in `agent_tool_impls.py` so both the MCP server and the in-app agent share the same surface. Everything the agent knows about Scope is discovered at runtime through introspection tools (pipeline registry, schema metadata, blueprints, LoRAs, assets, node-type manifest), so new pipelines and nodes work on day 0 without touching agent code.
Backend
- `src/scope/server/agent_tool_impls.py`: shared tool surface
- `src/scope/server/agent_state.py`: session store + provider config
- `src/scope/server/agent_providers.py`: Anthropic + OpenAI-compat + self-hosted providers behind a single event protocol
- `src/scope/server/agent_loop.py`: SSE turn runner with propose→approve handshake and vision feedback
- `src/scope/server/app.py`: `/api/v1/agent/chat` (SSE), `/agent/decision`, `/agent/config`, `/agent/sessions`, `/agent/node-catalog`; `/api/v1/keys/*` extended
- `anthropic>=0.40` added to pyproject

Frontend
- `frontend/src/components/agent/`: AgentDrawer, ChatTranscript, MessageBubble, ToolCallBlock, Composer, WorkflowProposalCard
- `frontend/src/contexts/AgentContext.tsx` + `lib/agentClient.ts`: state + SSE-over-POST reader
- `frontend/src/components/settings/AgentProviderTab.tsx`: new settings tab, wired into SettingsDialog + Header
- `frontend/src/components/graph/GraphToolbar.tsx`: Agent button
- `frontend/src/data/nodes/manifest.json`: UI node-type catalog so the agent can compose arbitrary node graphs without hardcoding

Signed-off-by: Hunter Hillman <hthillman@gmail.com>
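The shared-surface idea (one pure-Python tool module consumed by both the MCP server and the in-app agent) can be sketched as a simple registry; all names here are illustrative, not the actual Scope API:

```python
# Illustrative sketch of the shared tool surface pattern: each tool is a
# pure function plus a spec entry, registered once, so every consumer
# (MCP server, in-app agent loop) iterates the same TOOLS / TOOL_SPECS.
from typing import Any, Callable

TOOLS: dict[str, Callable[..., Any]] = {}
TOOL_SPECS: list[dict[str, Any]] = []

def tool(name: str, description: str):
    """Register a pure-Python tool once; all agent frontends share it."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = fn
        TOOL_SPECS.append({"name": name, "description": description})
        return fn
    return decorator

@tool("list_pipelines", "List registered pipelines with id and name.")
def list_pipelines(registry: dict[str, dict]) -> list[dict]:
    # The real tool reads the live pipeline registry; here it takes a dict.
    return [
        {"id": pid, "name": meta.get("name", pid)}
        for pid, meta in registry.items()
    ]
```

Because the functions take plain data and return plain data, the MCP server and the SSE agent loop can both dispatch through `TOOLS[name](...)` without diverging.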
…mpact tool UI
list_pipelines and get_pipeline_schema were reading .get("schemas", {})
from /api/v1/pipelines/schemas, but the endpoint returns {"pipelines": {...}}.
This made every pipeline (including plugin pipelines like ltx2/helios)
invisible to the agent, which then probed repeatedly and hit the round cap.
- Read from "pipelines" key; include name/description/supported_modes in the summary.
- On unknown pipeline_id, return the list of available ids so the agent can recover.
- Raise MAX_TOOL_ROUNDS 12→40 (a real workflow build legitimately chains many calls).
- Compact tool-call UI: shared subtle container, lighter chrome, smaller icons
so many tool calls no longer feel heavy while staying visible.
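A hedged sketch of the read-path fix described above (the endpoint shape comes from the commit message; these function names are hypothetical, not the real tool implementations):

```python
# The bug: the tools read .get("schemas", {}) from a response whose real
# top-level key is "pipelines", so every pipeline looked nonexistent.
def summarize_pipelines(schemas_response: dict) -> list[dict]:
    pipelines = schemas_response.get("pipelines", {})  # was: .get("schemas", {})
    return [
        {
            "id": pid,
            "name": info.get("name"),
            "description": info.get("description"),
            "supported_modes": info.get("supported_modes", []),
        }
        for pid, info in pipelines.items()
    ]

def get_schema(schemas_response: dict, pipeline_id: str) -> dict:
    pipelines = schemas_response.get("pipelines", {})
    if pipeline_id not in pipelines:
        # Return the available ids so the agent can recover in one step
        # instead of probing repeatedly and hitting the round cap.
        return {"error": f"unknown pipeline_id {pipeline_id!r}",
                "available": sorted(pipelines)}
    return pipelines[pipeline_id]
```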
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Hunter Hillman <hthillman@gmail.com>
On approval, the agent now writes the proposed graph into the React Flow canvas via a registered importer; the user presses Play to start. Previously apply_workflow POSTed to /pipeline/load + /session/start, which (a) never surfaced the graph in the UI and (b) failed in cloud mode because the backend tried to load pipelines on the local instance.

- AgentContext: registerGraphImporter() pattern; decideProposal writes to the canvas before notifying the backend, toasts "Press Play to start".
- GraphEditor: expose loadGraphConfig via useImperativeHandle (delegates to the existing loadGraphFromParsed).
- StreamPage: register an importer that routes to the GraphEditor ref.
- apply_workflow: strip pipeline-load / session-start side effects; just validate the hash, clear the pending proposal, return the pipelines list.
- System prompt: "approval writes the graph to the canvas automatically — apply_workflow only confirms it. Never call start_session, load_pipeline, or any session-starting tool." Continuation message updated to match.
- .claude/launch.json: add a scope-cloud dev entry for preview testing against the Livepeer cloud relay.

Signed-off-by: Hunter Hillman <hthillman@gmail.com>
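The approve-then-import handshake can be sketched in plain Python (the real implementation is in the React frontend; every name here is hypothetical and the sketch only shows the ordering the commit describes):

```python
# Sketch of the registered-importer pattern: the canvas registers one
# importer callback; approving a proposal writes the graph through that
# importer FIRST, then notifies the backend, and never starts a session.
from typing import Callable, Optional

_graph_importer: Optional[Callable[[dict], None]] = None

def register_graph_importer(importer: Callable[[dict], None]) -> None:
    """Called once by the page that owns the graph editor."""
    global _graph_importer
    _graph_importer = importer

def decide_proposal(proposal: dict, approved: bool,
                    notify_backend: Callable[[str, bool], None]) -> str:
    if approved and _graph_importer is not None:
        _graph_importer(proposal["graph"])    # canvas gets the graph first
    notify_backend(proposal["id"], approved)  # backend only confirms
    return "Press Play to start" if approved else "Proposal dismissed"
```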
The agent kept concluding "propose_workflow only accepts backend nodes,
UI nodes must be added manually." That's wrong — GraphConfig silently
ignores extra fields, and the original dict (with ui_state intact) is
what gets stored on the proposal and sent to the frontend. The frontend
already handles ui_state via graphConfigToFlow. So UI nodes CAN round-
trip through propose_workflow; the agent just didn't know the shape.
Verified with a Python repro: {nodes, edges, ui_state: {nodes: [trigger,
slider], edges: [...]}} passes GraphConfig(**graph).validate_structure()
cleanly and the original dict keeps ui_state.
- System prompt GRAPH SHAPE section: explicit split between top-level
nodes/edges (backend: source|pipeline|sink|record) and ui_state.nodes/
ui_state.edges (everything else). Note blueprint grafting goes under
ui_state, not top-level.
- propose_workflow tool description: spell out both parts; warn that UI
nodes in top-level nodes will fail validation; point at
get_current_graph for a concrete example.
- get_blueprint tool description: flag that its nodes/edges are UI-typed
and need to land in ui_state when grafted.
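The top-level vs ui_state split above, as a concrete dict; the node ids, the slider, and the edge shapes are made up for illustration:

```python
# Backend-typed nodes/edges live at top level; everything UI-only goes
# under ui_state. GraphConfig silently ignores the extra ui_state field,
# so this whole dict round-trips through propose_workflow unchanged.
graph = {
    "nodes": [  # backend types only: source | pipeline | sink | record
        {"id": "cam", "type": "source"},
        {"id": "pipe", "type": "pipeline"},
        {"id": "out", "type": "sink"},
    ],
    "edges": [
        {"source": "cam", "target": "pipe"},
        {"source": "pipe", "target": "out"},
    ],
    "ui_state": {  # UI-only nodes (sliders, triggers, blueprint grafts, ...)
        "nodes": [{"id": "noise", "type": "slider"}],
        "edges": [{"source": "noise", "target": "pipe"}],
    },
}
```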
Signed-off-by: Hunter Hillman <hthillman@gmail.com>
Workflows proposed by the agent were often missing wires (no VACE node,
no prompt connection) or used invalid handle prefixes ('parameter:'),
forcing the user to patch the graph manually. Fix that end-to-end:
- Add _validate_proposal() invoked by propose_workflow. Checks handle
format ('param:'/'stream:' only), edge source/target existence,
pipeline-handle presence, subgraph internal consistency, and emits
warnings for likely-missing wires (VACE → pipeline, prompt → pipeline).
Errors bounce back to the agent with actionable messages so it can
iterate; warnings flow through on success.
- Add get_pipeline_handles(pipeline_id) tool. Returns the exact
stream_inputs / stream_outputs / param_inputs a ui_state edge may
target, including aggregate handles (param:__prompt / __vace /
__loras) that depend on pipeline capability flags.
- Update node manifest: document VACE's real input handles
(ref_image, first_frame, last_frame, context_scale), clarify subgraph
dynamic ports via data.subgraphInputs/Outputs, describe control switch
mode, and add a $handle_convention key.
- Append a WIRING cheat sheet to SYSTEM_PROMPT with canonical slider /
prompt / VACE patterns so the agent has a concrete template to copy.
- Rewrite propose_workflow tool description to spell out the param: vs
stream: convention and require a get_pipeline_handles call before
wiring to a pipeline.
Tests: tests/test_agent_tool_impls.py covers the validator's reject
paths (bad prefix, missing target, unknown pipeline handle, subgraph
inconsistency, external port mismatch), its accept path, its VACE /
prompt warnings, and the handle deriver's aggregate inclusion.
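A minimal sketch of the handle-prefix portion of this validation (the real `_validate_proposal` checks far more, and this helper name is hypothetical):

```python
# Edge handles must follow the 'param:' / 'stream:' convention; anything
# else (e.g. the invalid 'parameter:' prefix) is returned to the agent as
# an actionable error message so it can iterate on the proposal.
def check_handles(edges: list[dict]) -> list[str]:
    errors = []
    for edge in edges:
        for key in ("sourceHandle", "targetHandle"):
            handle = edge.get(key)
            if handle and not handle.startswith(("param:", "stream:")):
                errors.append(
                    f"edge {edge.get('source')}->{edge.get('target')}: "
                    f"{key} {handle!r} must start with 'param:' or 'stream:'"
                )
    return errors
```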
Signed-off-by: Hunter Hillman <hthillman@gmail.com>
Adds an evals/ package that drives the real agent endpoint in-process (via httpx.ASGITransport + asgi-lifespan) and grades proposals with deterministic structural checks. Three starter-workflow cases are prepopulated (mythical-creature, dissolving-sunflower, ltx-text-to-video); authoring more is dropping a YAML into evals/cases/.

Excluded from default pytest (addopts -m 'not eval') and from PR CI — a manual-dispatch workflow runs it on demand so we can measure pass rate without burning API budget on every push.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Hunter Hillman <hthillman@gmail.com>
Adds two cases spanning the specificity range real users send:
- complex-krea-prompt-switch-record: precise multi-concept prompt
(krea pipeline + VACE reference image + prompt_list with >=5 items
driven by trigger + record node). Exercises the full new check surface.
- vague-capture-moments: deliberately vague "play with my webcam and
capture anything cool" — tests the agent's ability to fill gaps, with
graders that only assert clearly-implied structure.
New grader checks:
- pipelines_count_at_least — min pipeline count without pinning ids
- node_present — asserts N UI nodes of a type exist; optional min_items
for prompt_list
- wire_present { kind: pipeline_to_record } — pipeline stream output
into a record node
- wire_present { kind: prompt_list_to_pipeline } — prompt_list to
pipeline's param:__prompt
- wire_present { kind: trigger_to_prompt_list } — value source into
a prompt_list's param:trigger / param:cycle
README documents the precise-vs-vague authoring pattern and the new
check reference.
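Two of the checks above, sketched as plain predicates over a proposal graph (the real graders live in evals/grader.py; note that a later commit extends node_present to also search top-level nodes, which this early sketch does not):

```python
# pipelines_count_at_least: minimum pipeline count without pinning ids.
def pipelines_count_at_least(graph: dict, minimum: int) -> bool:
    count = sum(1 for n in graph.get("nodes", [])
                if n.get("type") == "pipeline")
    return count >= minimum

# node_present: at least `count` UI nodes of a given type under ui_state.
def node_present(graph: dict, node_type: str, count: int = 1) -> bool:
    ui_nodes = graph.get("ui_state", {}).get("nodes", [])
    return sum(1 for n in ui_nodes if n.get("type") == node_type) >= count
```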
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Hunter Hillman <hthillman@gmail.com>
Driven by the Krea eval failure: `complex-krea-prompt-switch-record` flagged 9/9 structural checks, and analysis split the failures into real agent mistakes (ignored pipeline name, invented handles, skipped prompt_list) plus two infrastructure bugs masking correct behavior.

Agent improvements (src/scope/server/agent_loop.py):
- "Honor the user's pipeline name" rule (CORE PRINCIPLES 1a). If the user names a pipeline, match against list_pipelines id+name and use it; never silently substitute.
- "Never fabricate a parameter name" rule (WIRING). Reference-image conditioning always goes through VACE; forbid invented handles like param:i2v_image.
- Promote prompt_list to the canonical "switch between N prompts with a button press" pattern; demote the subgraph switcher to a fallback for non-button-driven cases.
- Completeness checklist gains items for recording (add a record node + stream edge when the user says save/capture) and prompt-list switching.

Manifest fix (frontend/src/data/nodes/manifest.json):
- The prompt_list entry was lying about its handle names. The output was documented as "prompts", but the real React Flow handle built by PromptListNode.tsx is `param:prompt` (singular). The trigger and cycle input handles were missing entirely, as was the data.promptListItems data field. Corrected all three.

Grader fixes (evals/grader.py):
- node_present now searches top-level graph.nodes for source/pipeline/sink/record types; it still searches ui_state.nodes for UI-only types (slider, vace, prompt_list, ...).
- wire_present kind=pipeline_to_record accepts top-level stream edges (the canonical form a record node is wired with), in addition to the existing ui_state form.
- Both bugs were false-negating the Krea run's correctly-wired top-level record node.

New eval cases (evals/cases/):
- complex-pipeline-name-respect — user names "krea"; the agent must pick krea-realtime-video, not substitute.
- complex-reference-image-no-invented-handles — user asks for reference-image conditioning on longlive; the grader asserts the image → vace → pipeline path, closing the invented-handle loophole.
- vague-save-the-output — "saves whatever I make" is phrasing for a record node; exercises the new completeness item.

Verification: regrading the saved Krea artifact flips 2 failures to pass (record detection) and leaves 7 real agent failures. A live smoke on vague-save-the-output (passthrough) passes 1/1. Regression on starter-ltx-text-to-video is 3/3. The two GPU-gated cases (complex-pipeline-name-respect, complex-reference-image-no-invented-handles) correctly trigger the agent's "can't do what you asked, ask first" behavior when the named pipeline isn't registered locally; they're authored for GPU production CI.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Hunter Hillman <hthillman@gmail.com>
User reported the agent produced a working workflow with two sink nodes, one of which had no incoming edge and shouldn't have been there. Validation passed (disconnected sinks are legal at the schema level), so only a grader check can guard against this.

Adds a new `orphan_sinks` check that, for each top-level sink, looks for at least one top-level stream edge targeting it. Any sink without one is flagged. Wired into the forbid list of all 8 existing cases so it's a universal regression guard rather than dependent on a single reproducer prompt.

Verified the check against the saved r01 artifacts: all pass (the previous Krea run had one correctly-wired sink, not an orphan).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Hunter Hillman <hthillman@gmail.com>
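The check reduces to one set difference over the top-level graph; a minimal sketch, assuming the node/edge dict shape used elsewhere in this PR:

```python
# A sink is an orphan if no top-level stream edge targets it; return the
# offending ids so the grader can report exactly which sinks are dangling.
def orphan_sinks(graph: dict) -> list[str]:
    sinks = {n["id"] for n in graph.get("nodes", [])
             if n.get("type") == "sink"}
    wired = {e.get("target") for e in graph.get("edges", [])}
    return sorted(sinks - wired)
```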
…i_node reflow
Drawer was a full-height fixed overlay that obscured the graph; now it's a
flex sibling of StreamPage, so opening it resizes the canvas instead of
covering it. API-key / provider changes now dispatch a
`scope:agent-config-changed` window event that AgentContext listens on, so
the "no API key configured" banner clears without a reload.
SYSTEM_PROMPT tightened: propose_workflow is explicitly structural-only,
runtime-tweakable params must go through update_parameters (not a new
proposal), and the STYLE block bans meta-narration phrases the model was
leaking into chat ("Let me...", "Hmm...", "The field is X (labeled Y)...").
Added a LAYOUT section telling the agent to keep UI-state nodes LEFT of
x=0 so they don't collide with the frontend's top-level auto-layout strip.
As a safety net, propose_workflow now runs _reflow_ui_nodes after
validation: AABB-test every UI node against every other UI node and
against the predicted top-level column rectangles; if anything overlaps,
reassign all UI nodes into three deterministic columns at x=-320/-620/-920
by role (sliders/primitives/triggers/math closest in, prompt lists middle,
image/vace/lora/subgraph outermost). No-op when the agent's layout is
already clean. Verified on the known-bad complex-krea-prompt-switch-record
proposal.
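The AABB test at the heart of the reflow can be sketched as follows; the box sizes cited come from the commit messages, and the role-based column reassignment is omitted:

```python
# Axis-aligned bounding-box overlap: two boxes intersect iff they overlap
# on both axes. The reflow runs this pairwise over UI nodes (and against
# the predicted top-level columns) and only relayouts when a hit is found.
def boxes_overlap(a: dict, b: dict) -> bool:
    # a, b: {"x", "y", "w", "h"} rectangles
    return (a["x"] < b["x"] + b["w"] and b["x"] < a["x"] + a["w"] and
            a["y"] < b["y"] + b["h"] and b["y"] < a["y"] + a["h"])

def any_overlap(boxes: list[dict]) -> bool:
    return any(
        boxes_overlap(boxes[i], boxes[j])
        for i in range(len(boxes))
        for j in range(i + 1, len(boxes))
    )
```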
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Hunter Hillman <hthillman@gmail.com>
…eak cases

Two new regression guards:
- overlapping_nodes (forbid check): catches positioning bugs that render as visually stacked nodes even when edges are correct. Mirrors the reflow's bbox math — 240x140 for UI nodes (280 tall for image/vace/subgraph), 200x60 for top-level nodes at x=50/350/650/950. Added to all 8 existing cases and to the new layout-nodes-spaced case.
- forbid_proposal (Case field): set true on cases where the agent must NOT emit a workflow_proposal because update_parameters is the right tool. The runner treats presence of a proposal as a failure instead of the usual "no proposal means fail" short-circuit.

Two new cases:
- layout-nodes-spaced: longlive + 2 sliders + prompt_list with triggers, exercising the many-UI-nodes-alongside-multiple-top-level-nodes surface where overlaps were observed.
- runtime-tweak-no-repropose: user asks to change noise_scale on a running graph; any workflow_proposal fails the case.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Hunter Hillman <hthillman@gmail.com>
Summary
Adds an in-app agent that translates natural-language intent into working Scope graphs and live parameter tweaks. Accessible from a new Agent button next to Graph in the toolbar; a right-side resizable drawer hosts the conversation, streaming replies, tool-call blocks, and workflow-proposal cards.
- Multi-provider BYOK; provider + model configured under Settings → Agent, with API keys in the API Keys tab (extended to cover `anthropic`, `openai`, `llm_custom`).
- Agent tools are pure-Python in `agent_tool_impls.py` and shared between the MCP server and the in-app agent, so behavior stays identical.

What's inside
Backend (`src/scope/server/`)
- `agent_tool_impls.py` — shared tool surface (list/load pipelines, get schemas, capture frame, propose/apply workflow, update params, recording, logs, hardware, etc.)
- `agent_state.py` — session store, provider config, proposal handshake with graph hash
- `agent_providers.py` — Anthropic + OpenAI-compatible + self-hosted providers behind a single internal event protocol
- `agent_loop.py` — SSE turn runner; streams text deltas, tool calls, tool results, proposals
- `app.py` — `/api/v1/agent/chat` (SSE), `/agent/decision`, `/agent/config`, `/agent/sessions`, `/agent/node-catalog`; keys endpoint extended
- `anthropic>=0.40` added to `pyproject.toml`

Frontend (`frontend/src/`)
- `components/agent/` — `AgentDrawer`, `ChatTranscript`, `MessageBubble`, `ToolCallBlock`, `Composer`, `WorkflowProposalCard`
- `contexts/AgentContext.tsx` + `lib/agentClient.ts` — React state + SSE-over-POST reader (fetch + ReadableStream, since EventSource can't POST)
- `components/settings/AgentProviderTab.tsx` — new settings tab, wired into `SettingsDialog` + `Header`
- `components/graph/GraphToolbar.tsx` — Agent button next to Graph
- `data/nodes/manifest.json` — UI node-type catalog so the agent can compose arbitrary graphs without hardcoding node types

Target flows this unlocks
- … (`longlive`), grafts in the `manual-prompt-switcher` blueprint, preloads prompts, proposes the graph. User approves → applied.
- … `capture_frame`, reasons over the JPEG, calls `get_pipeline_schema` for the current pipeline, and tunes the relevant knob (e.g. `vace_context_scale`). Auto-applied.

Test plan
- `uv run ruff check src/` — clean
- `uv run ruff format --check src/` — clean
- `npm run build` — builds
- `npm run format:check` — clean
- `npm run lint` — 0 errors (49 pre-existing warnings, no new ones)
- `uv run daydream-scope` starts clean; log shows `Agent session store initialized`
- `GET /api/v1/agent/config`, `GET /api/v1/agent/node-catalog`, `GET /api/v1/agent/sessions`, `GET /api/v1/keys` (includes anthropic/openai/llm_custom)
- … `CLAUDE • CLAUDE-SONNET-4-6`; missing-key banner renders
- `capture_frame` + `update_parameters` auto-apply

Out of MVP
🤖 Generated with Claude Code