docs: add OpenAPI spec for query fan-out report endpoints #2402

Merged
ekremney merged 3 commits into main from ekremney/fanout-report-openapi
May 13, 2026

@ekremney ekremney commented May 12, 2026

Summary

OpenAPI-only PR. Defines the contract for the upcoming Query Fan-Out feature so the UI can build against a stable schema while the backend implementation lands in a follow-up.

  • GET /org/{spaceCatId}/brands/{brandId}/fanout-report: 302 Found redirect to a presigned S3 URL when a report exists; 404 Not Found otherwise. The Lambda never loads the report body; the client follows the redirect straight to S3.
  • POST /org/{spaceCatId}/brands/{brandId}/fanout-report: synchronously regenerates the report (DB reads → Semrush fan-out sweep → curation → S3 write) and returns 201 Created with no body. The caller follows up with a GET to fetch the presigned URL.
  • New schemas: FanoutReport, FanoutReportTopic, FanoutReportSubQuery, FanoutReportRanking.

Full decision log: tmp/fanout/questions.md.


UI integration guide

The flow the UI sees:

  1. GET /org/{org}/brands/{brand}/fanout-report
  2. If 302 → follow Location header → fetch the gzipped JSON from S3 (browsers + fetch auto-decompress via Content-Encoding: gzip).
  3. If 404 → no report yet; show empty state, optionally surface a "Generate report" CTA that calls POST on the same path.
  4. After POST returns 201, redo step 1 to fetch the URL.
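
The four steps above can be sketched as a small client helper. This is only an illustration under stated assumptions: `baseUrl`, the injectable `fetchImpl`, and the `{ state, report }` return shape are hypothetical names that do not appear in the spec, and authentication headers are omitted.

```javascript
// Builds the report path from the PR description; baseUrl is an assumption.
function reportUrl(baseUrl, org, brand) {
  return `${baseUrl}/org/${encodeURIComponent(org)}/brands/${encodeURIComponent(brand)}/fanout-report`;
}

// Steps 1-3: GET the report. With redirect "follow" (the fetch default) the
// 302 is followed to S3, so the caller only sees the final response.
async function loadFanoutReport(baseUrl, org, brand, fetchImpl = fetch) {
  const res = await fetchImpl(reportUrl(baseUrl, org, brand), {
    method: 'GET',
    redirect: 'follow',
  });
  if (res.status === 404) return { state: 'empty', report: null }; // no report yet
  if (!res.ok) throw new Error(`Unexpected status ${res.status}`);
  return { state: 'ready', report: await res.json() };
}

// Step 4: POST regenerates synchronously and returns 201 with no body,
// so the report itself is fetched with a follow-up GET.
async function regenerateFanoutReport(baseUrl, org, brand, fetchImpl = fetch) {
  const res = await fetchImpl(reportUrl(baseUrl, org, brand), { method: 'POST' });
  if (res.status !== 201) throw new Error(`Unexpected status ${res.status}`);
  return loadFanoutReport(baseUrl, org, brand, fetchImpl);
}
```

Passing `fetchImpl` explicitly keeps the helper testable against a mock before the backend lands.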

The S3 JSON body conforms to FanoutReport. UI consumption mapping below — one section per figure in Query Fanout in Project Serenity.pdf.

Top-level (every figure)

| PDF element | FanoutReport field | Notes |
| --- | --- | --- |
| "Data as of …" / freshness stamp | isoDate | Semrush snapshot date echoed from upstream. |
| Brand identity (header) | brandName, brandDomains | brandDomains is the apex set the curation used for ranking-match; useful for "why didn't this rank?" debug. |
| Market / model context | country ("US"), llm ("chatgpt"), windowDays (7) | Constants in MVP; promoted to query params later. |
| Topic list | topics[] | At most 5 topics, ordered by priorityScore descending. May be empty ([]). |
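
To make the top-level mapping concrete, here is a hand-written sample and a check of the two topic-list guarantees stated above (at most 5 topics, descending priorityScore). Every value in `sampleReport` is made up for illustration, including the date format and the score range, and `checkTopLevel` is a hypothetical helper rather than part of the contract.

```javascript
// Hypothetical sample document; values are illustrative, not real data.
const sampleReport = {
  isoDate: '2026-05-10', // format assumed
  brandName: 'Acme',
  brandDomains: ['acme.com'],
  country: 'US',
  llm: 'chatgpt',
  windowDays: 7,
  topics: [
    { name: 'Best CRM for SMB', priorityScore: 0.82, subQueries: [] },
    { name: 'CRM pricing', priorityScore: 0.41, subQueries: [] },
  ],
};

// Returns a list of human-readable problems; an empty list means the
// topic-list guarantees hold. An empty topics array is valid.
function checkTopLevel(report) {
  if (!Array.isArray(report.topics)) return ['topics must be an array'];
  const problems = [];
  if (report.topics.length > 5) problems.push('more than 5 topics');
  for (let i = 1; i < report.topics.length; i += 1) {
    if (report.topics[i - 1].priorityScore < report.topics[i].priorityScore) {
      problems.push(`topics[${i}] breaks descending priorityScore order`);
    }
  }
  return problems;
}
```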

Figure 1 — Auto-suggested topic cards

Each card on the entry screen maps to one element of topics[]:

| PDF card field | Source | Notes |
| --- | --- | --- |
| Card title (e.g. "Best CRM for SMB") | topics[].name | Pair with matchedTopicName when Semrush rewrote the input. |
| "matched to: X" hint (optional) | topics[].matchedTopicName | Show only when it differs meaningfully from name. |
| Priority badge ("High" / "Medium") | Derived from topics[].priorityScore | UI owns the thresholds: the API ships the raw score and the UI applies its own bucket cut-offs (e.g. by score percentile or hard breakpoints). |
| "Low citation rate with wide coverage gap …" descriptive copy | UI-generated | Not in the API. Compose from the score + rates client-side. |
| Topic volume: 2,400/mo | topics[].volume | |
| Citation rate: 3% across tracked prompts | topics[].citationRate × 100 | null means no execution data; display a placeholder instead of a percentage. |
| Coverage: 24% of fan-out query universe | See "Coverage derivation" below. | Computed from subQueries[].brandPosition. |
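
The card fields that need client-side logic could look roughly like this. The threshold values, the "Low" bucket, the `'n/a'` placeholder, and the case-insensitive comparison are all assumptions made for the sketch; the contract only ships the raw priorityScore, the rates, and the two names.

```javascript
// Hypothetical cut-offs: the UI owns these, the API only ships the raw score.
const DEFAULT_THRESHOLDS = { high: 0.7, medium: 0.4 };

function priorityBadge(priorityScore, thresholds = DEFAULT_THRESHOLDS) {
  if (priorityScore >= thresholds.high) return 'High';
  if (priorityScore >= thresholds.medium) return 'Medium';
  return 'Low'; // assumed third bucket
}

// Rates arrive as fractions; null means no execution data.
function formatRate(rate, placeholder = 'n/a') {
  return rate === null ? placeholder : `${Math.round(rate * 100)}%`;
}

// Shows the "matched to" hint only when the names differ beyond case and
// surrounding whitespace (one possible reading of "differs meaningfully").
function matchedHint(topic) {
  const { name, matchedTopicName } = topic;
  if (!matchedTopicName) return null;
  const same = matchedTopicName.trim().toLowerCase() === name.trim().toLowerCase();
  return same ? null : `matched to: ${matchedTopicName}`;
}
```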

Coverage derivation (used in Fig 1 cards and Fig 2 bar):

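// Bucket each sub-query by the brand's position (null = brand does not rank).
const subs = topic.subQueries;
const total = subs.length;
const ranking  = subs.filter(s => s.brandPosition !== null && s.brandPosition <= 5).length; // pos. 1–5
const lowRank  = subs.filter(s => s.brandPosition !== null && s.brandPosition >= 6).length; // pos. 6 and below
const notRanking = subs.filter(s => s.brandPosition === null).length;
// Coverage counts every sub-query where the brand ranks at all; guard against an empty list.
const coveragePct = total === 0 ? 0 : Math.round(((ranking + lowRank) / total) * 100);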

Figure 2 — Topic baseline & fan-out coverage bar

Shown after the user picks a topic from Fig 1. All from topics[selectedIndex]:

| PDF element | Source |
| --- | --- |
| Topic title (Best CRM for SMB) | topics[].name |
| Subhead (8 tracked prompts · 28 deduplicated fan-out sub-queries) | topics[].promptsTotal and topics[].subQueries.length |
| Topic volume: 2,400 monthly searches | topics[].volume |
| Mention rate: 47% of tracked prompts | topics[].mentionRate × 100 |
| Citation rate: 3% of tracked prompts | topics[].citationRate × 100 |
| Ranking for 7 of 28 sub-queries (25% coverage) | (ranking + lowRank) and total from the snippet above. |
| Segmented bar | See below. |

Coverage bar segments (MVP — 3 segments, not 4):

| PDF label | MVP value (this contract) |
| --- | --- |
| Ranking (pos. 1–5) | ranking count |
| Low rank (pos. 6–10) | lowRank count |
| "Not ranking — competitor" + "Not ranking — 3rd party" | Collapsed to a single Not ranking segment in MVP; value = notRanking count |

⚠️ The competitor-vs-3rd-party split shown in the PDF requires competitor-aware Semrush metadata that's not in iteration 1 of the Semrush contract. The schema deliberately ships no flags for it — when Semrush adds it, we'll extend the schema additively (no breaking change). For now Fig 2 collapses to 3 segments.
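
A possible way to turn one topic's sub-queries into the three MVP segments plus the headline coverage figure. `coverageSummary` and the segment keys are hypothetical names; the bucketing mirrors the coverage derivation snippet, so any position above 5 lands in the low-rank segment.

```javascript
// Single pass over subQueries[]; returns counts and rounded percentages
// for the three-segment MVP bar.
function coverageSummary(subQueries) {
  const total = subQueries.length;
  let ranking = 0;
  let lowRank = 0;
  let notRanking = 0;
  for (const { brandPosition } of subQueries) {
    if (brandPosition === null) notRanking += 1;
    else if (brandPosition <= 5) ranking += 1;
    else lowRank += 1;
  }
  // Avoid dividing by zero when a topic has no sub-queries.
  const pct = (n) => (total === 0 ? 0 : Math.round((n / total) * 100));
  return {
    total,
    covered: ranking + lowRank,
    coveragePct: pct(ranking + lowRank),
    segments: [
      { key: 'ranking', count: ranking, pct: pct(ranking) },
      { key: 'lowRank', count: lowRank, pct: pct(lowRank) },
      { key: 'notRanking', count: notRanking, pct: pct(notRanking) },
    ],
  };
}
```

Because each segment percentage is rounded independently, the three values may not sum to exactly 100; a bar renderer would likely size segments from the raw counts instead.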

Figure 3 — Content opportunity cards

No mapping (skipped for now).

Figure 4 — Sub-query table

One row per subQueries[] element, with three filter tabs:

| Column | Source | Display logic |
| --- | --- | --- |
| Sub-query | subQueries[].keyword | |
| Intent | subQueries[].intent | Passed through from Semrush (e.g. Comparison, Commercial, Informational). |
| Volume | subQueries[].volume | |
| Your ranking | subQueries[].brandPosition | null → ✗ Not ranking; 1..5 → ✓ Pos. N; 6..10 → ↓ Pos. N. |
| Top domain | subQueries[].topDomain | The PDF also shows a Competitor / 3rd party chip, which is not in MVP (see Fig 2 note). |
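
The "Your ranking" display logic and the row mapping might be written as below. `formatRanking` and `toTableRow` are illustrative names, and treating every position above 5 as the low-rank arrow (rather than only 6..10) is an assumption chosen to match the coverage derivation.

```javascript
// Maps brandPosition to the cell text described in the table above.
function formatRanking(brandPosition) {
  if (brandPosition === null) return '✗ Not ranking';
  if (brandPosition <= 5) return `✓ Pos. ${brandPosition}`;
  return `↓ Pos. ${brandPosition}`;
}

// Projects one subQueries[] element onto the table columns.
function toTableRow(subQuery) {
  return {
    subQuery: subQuery.keyword,
    intent: subQuery.intent,
    volume: subQuery.volume,
    ranking: formatRanking(subQuery.brandPosition),
    topDomain: subQuery.topDomain,
  };
}
```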

Tabs are pure client-side filters:

| Tab | Filter |
| --- | --- |
| All sub-queries | no filter |
| Gaps only | brandPosition === null \|\| brandPosition >= 6 |
| Content opportunities | brandPosition === null |
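
Since the tabs are pure client-side filters, they reduce to three predicates over subQueries[]. The tab keys (`all`, `gaps`, `opportunities`) and the function name are hypothetical; the predicates are taken directly from the table above.

```javascript
// One predicate per tab; a Map avoids accidental hits on inherited keys.
const TAB_FILTERS = new Map([
  ['all', () => true],
  ['gaps', (s) => s.brandPosition === null || s.brandPosition >= 6],
  ['opportunities', (s) => s.brandPosition === null],
]);

function filterSubQueries(subQueries, tab) {
  const predicate = TAB_FILTERS.get(tab);
  if (!predicate) throw new Error(`Unknown tab: ${tab}`);
  return subQueries.filter(predicate);
}
```

Note that "Content opportunities" is always a subset of "Gaps only", so the tab counts can be derived once and reused.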

Behavior callouts the UI should be aware of

  1. brandDomains matching is exact-hostname, with www. stripped. Subdomains are kept as distinct entries (blog.acme.com ≠ acme.com). If a customer's brand-page only lists acme.com but the SERP credit lives on blog.acme.com, that ranking will not be attributed to the brand. Surface a customer-facing hint when this is likely (e.g. when brandDomains.length === 1 and many subQueries[].topDomain values share the brand's apex but differ at the subdomain).
  2. Topic-count cap is 5. When customers ask "where's topic X?", the answer may be "it didn't make the top 5 by priorityScore" — UI may want to expose this as a tooltip or empty-state message.
  3. Hardcoded constants in MVP. country = "US", llm = "chatgpt", windowDays = 7. If the UI surfaces a market/model picker, treat these as "the only options for now."
  4. Empty topics array is valid. topics: [] means either (a) the brand has no tracked topics, or (b) every topic was dropped during curation (e.g. low Semrush similarity score). Don't conflate with 404.
  5. POST is admin-only-by-convention but not enforced. Anyone with LLMO entitlement can POST. UI should still gate the "Generate report" button to roles that actually need it, to avoid accidental cost.
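
Callout 1 can be approximated on the client as follows. This is a sketch, not the curation's actual matcher: lowercasing, the suffix-based subdomain test, and the `minHits = 3` reading of "many" are assumptions, and all function names are hypothetical.

```javascript
// Mirrors the stated rule: exact hostname match with a leading "www." stripped.
// Lowercasing and trimming are assumptions added for robustness.
function normalizeHost(host) {
  return host.trim().toLowerCase().replace(/^www\./, '');
}

function isBrandMatch(topDomain, brandDomains) {
  const host = normalizeHost(topDomain);
  return brandDomains.some((d) => normalizeHost(d) === host);
}

// Suggests the customer-facing hint when the brand lists a single domain and
// several top domains are subdomains of it (e.g. blog.acme.com vs acme.com).
function shouldShowSubdomainHint(brandDomains, subQueries, minHits = 3) {
  if (brandDomains.length !== 1) return false;
  const apex = normalizeHost(brandDomains[0]);
  const hits = subQueries.filter((s) => {
    if (!s.topDomain) return false; // tolerate missing values
    const host = normalizeHost(s.topDomain);
    return host !== apex && host.endsWith(`.${apex}`);
  }).length;
  return hits >= minHits;
}
```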

Test plan

  • npm run docs:lint — clean for new schemas (4 OpenAPI 3.1 errors fixed; 194 pre-existing warnings on other specs unchanged).
  • npm run docs:build — assembled to docs/index.html without errors.
  • UI: render docs/index.html and confirm the new endpoints + schemas look correct under the llmo tag.
  • UI: trial-mock a Fig 1 card and a Fig 4 row against a hand-edited sample FanoutReport JSON to confirm the field mapping above is sufficient.

🤖 Generated with Claude Code

@ekremney ekremney requested a review from claudiaboldis May 12, 2026 14:50
@github-actions

This PR will trigger a minor release when merged.

ekremney and others added 3 commits May 12, 2026 17:22
Defines the contract for the upcoming Query Fan-Out feature (LLMO):
- GET /org/{spaceCatId}/brands/{brandId}/fanout-report — 302 to a presigned
  S3 URL when a report exists, 404 otherwise. Lambda never loads the body.
- POST /org/{spaceCatId}/brands/{brandId}/fanout-report — synchronously
  generates the report, writes it to S3, returns 201 with no body.

Adds the FanoutReport / FanoutReportTopic / FanoutReportSubQuery /
FanoutReportRanking schemas describing the JSON document the GET redirect
points to. Implementation lands in a follow-up PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Drops topicId, description, matchedTopicId, similarityScore, promptsMention,
promptsCitation — debug-only or pre-filtered redundancy. Keeps matchedTopicName
for the "Semrush rewrote the input" hint case.

Final per-topic shape (9 fields):
  topicUuid, name, matchedTopicName, volume, promptsTotal,
  mentionRate, citationRate, priorityScore, subQueries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@quazar/ai-seo-ts now includes v2/fanout/. Pinning the intent values to
KEYWORD_INTENT_ENUM (COMMERCIAL / INFORMATIONAL / NAVIGATIONAL /
TRANSACTIONAL / UNSPECIFIED) and documenting that the curation step picks
intents[0] from Semrush's multi-label array.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ekremney ekremney force-pushed the ekremney/fanout-report-openapi branch from 15cdce8 to 4daed04 Compare May 12, 2026 15:23
codecov Bot commented May 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@ekremney ekremney changed the title feat: add OpenAPI spec for query fan-out report endpoints docs: add OpenAPI spec for query fan-out report endpoints May 13, 2026
@ekremney ekremney merged commit 79fb911 into main May 13, 2026
21 checks passed
@ekremney ekremney deleted the ekremney/fanout-report-openapi branch May 13, 2026 07:30