Research: Economy Mode E2E gaps (#585) #685
Closed
Labels
go:needs-research: Needs investigation
squad: Squad triage inbox; Lead will assign to a member
squad:fido: Assigned to FIDO (Quality Owner)
type:spike: Research/investigation; produces a plan, not code
Research for #585. 34 unit tests exist, but there is no E2E validation.
Gaps: no CLI invocation test; no spawn-flow test proving that economy mode propagates; no CostTracker integration test; no API for comparing baseline against economy; selectModelForTask is internal-only.
Key files: models.ts (ECONOMY_MODEL_MAP + estimateCost), model-selector.ts (selectModelForTask), cost-tracker.ts (per-agent metrics), economy.ts (CLI toggle), economy-mode.test.ts (34 unit tests), economy-mode/SKILL.md (policy).
Key insight: per-token pricing is cheaper under economy mode, but total cost depends on total tokens consumed. If the economy model generates 15x more tokens, the savings are zeroed out. E2E tests must therefore measure total tokens, not just the per-token rate.
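The arithmetic behind this can be sketched as follows. This is a minimal illustration, not the repo's implementation: the `Pricing` type, both helper functions, and the prices are hypothetical and were chosen only so that the break-even point lands at 15x. Real rates would come from models.ts.

```typescript
// Hypothetical pricing shape; NOT the type used in models.ts.
interface Pricing {
  usdPerMillionTokens: number;
}

// Total cost is tokens consumed times the per-token rate.
function totalCost(tokens: number, pricing: Pricing): number {
  return (tokens * pricing.usdPerMillionTokens) / 1e6;
}

// Token multiplier at which the economy model costs as much as the baseline.
// Above this multiplier, economy mode is more expensive overall.
function breakEvenMultiplier(baseline: Pricing, economy: Pricing): number {
  return baseline.usdPerMillionTokens / economy.usdPerMillionTokens;
}

// Illustrative prices only (assumed, not taken from the repo).
const baselinePricing: Pricing = { usdPerMillionTokens: 15 };
const economyPricing: Pricing = { usdPerMillionTokens: 1 };

const baselineTokens = 200000;
const baselineCost = totalCost(baselineTokens, baselinePricing);

// Economy model using 3x the tokens: still cheaper than baseline.
const economyCostAt3x = totalCost(baselineTokens * 3, economyPricing);

// Economy model using 15x the tokens: the savings are gone.
const economyCostAt15x = totalCost(baselineTokens * 15, economyPricing);

const multiplier = breakEvenMultiplier(baselinePricing, economyPricing);

console.log({ baselineCost, economyCostAt3x, economyCostAt15x, multiplier });
```

An E2E comparison would feed measured token totals from a baseline run and an economy run into a calculation of this kind, rather than comparing per-token rates alone.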
Proposed tests: CLI toggle round-trip, model resolution under economy mode, CostTracker comparison of baseline vs economy, full pipeline with a real session.
Refs #585