feat: generate path-level prerender suggestions in audit worker #2500

Draft
ssilare-adobe wants to merge 6 commits into main from feat/path-level-prerender-suggestions

@ssilare-adobe ssilare-adobe commented May 11, 2026

Summary

Implements path-level prerender suggestion generation in the audit worker — the third tier between per-URL and domain-wide suggestions.

New: src/prerender/path-suggestions.js

  • extractPathType(url) — derives /first-segment/* path pattern from a URL
  • RcvPathQualificationStrategy — qualification gates: urlCount >= 10, valuablePercent >= 33%, pathScore >= 1.5 (where pathScore = weightedValuableTraffic + avgContentGainRatio, mirroring the rcv-scoring-dashboard's DEFAULT_WEIGHTS.threshold = 1.5)
  • buildPathTypeSuggestions — groups eligible per-URL suggestions by path prefix, fetches Athena agentic traffic hits (getAgenticHitsMapFromAthena), qualifies each path, returns suggestion objects keyed as {pathPattern}|prerender with pathType: true discriminator
  • findPreservablePathSuggestions — finds existing path suggestions to preserve across re-audits (active/deployed ones are not overwritten; only metrics are refreshed)
  • markSuggestionsAsCoveredByPaths — self-healing: on every audit run clears stale coveredByDomainWide refs pointing to undeployed path suggestions, and marks newly-covered per-URL suggestions. Note: coveredByDomainWide is a shared field — when set by this function it stores a path suggestion ID (not a domain-wide suggestion ID); the self-heal distinguishes them via pathType === true.
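
The path derivation and qualification gates described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/prerender/path-suggestions.js: the function `qualifiesAsPath`, the helper `pathSuggestionKey`, the constant names, the shape of the stats object, and the choice to return `null` for root or unparseable URLs are all assumptions. Only the thresholds, the `pathScore` formula, and the `{pathPattern}|prerender` key format come from the description.

```javascript
// Thresholds taken from the PR description; the constant names are assumed.
const MIN_URL_COUNT = 10;
const MIN_VALUABLE_PERCENT = 33;
const MIN_PATH_SCORE = 1.5;

// Derive a "/first-segment/*" pattern from a URL.
// Assumption: URLs with no path segment (or unparseable URLs) yield null.
function extractPathType(url) {
  let pathname;
  try {
    pathname = new URL(url).pathname;
  } catch {
    return null;
  }
  const firstSegment = pathname.split('/').filter(Boolean)[0];
  return firstSegment ? `/${firstSegment}/*` : null;
}

// Apply the three gates. The stats object shape is hypothetical.
function qualifiesAsPath({
  urlCount, valuablePercent, weightedValuableTraffic, avgContentGainRatio,
}) {
  const pathScore = weightedValuableTraffic + avgContentGainRatio;
  return urlCount >= MIN_URL_COUNT
    && valuablePercent >= MIN_VALUABLE_PERCENT
    && pathScore >= MIN_PATH_SCORE;
}

// Key format from the description: {pathPattern}|prerender
function pathSuggestionKey(pathPattern) {
  return `${pathPattern}|prerender`;
}
```

For example, under these assumptions `extractPathType('https://example.com/blog/post-1')` gives `/blog/*`, and a path with 12 URLs, 40% valuable, and a combined score of 1.6 passes all three gates.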

Updated: src/utils/agentic-urls.js

  • Added getAgenticHitsMapFromAthena(site, context, limit) — queries Athena over a 4-week window and returns a Map<pathname, totalHits> used to weight path qualification scores
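
As a rough illustration of the `Map<pathname, totalHits>` return shape, the sketch below folds query rows into such a map. It does not call Athena. The helper name `toHitsMap`, the `{ url, total_hits }` row shape, and the handling of absolute versus relative URLs are assumptions made for the example; the actual `getAgenticHitsMapFromAthena` may differ.

```javascript
// Hypothetical helper: fold (url, total_hits) rows into Map<pathname, totalHits>.
// Assumption: `url` may be absolute or path-only; rows for the same pathname are summed.
function toHitsMap(rows) {
  const hits = new Map();
  for (const { url, total_hits: totalHits } of rows) {
    let pathname;
    try {
      // The base is only used to resolve path-only values.
      pathname = new URL(url, 'https://placeholder.invalid').pathname;
    } catch {
      continue; // skip rows whose URL cannot be parsed
    }
    const count = Number(totalHits) || 0;
    hits.set(pathname, (hits.get(pathname) || 0) + count);
  }
  return hits;
}
```

Keying by pathname rather than full URL lets the hits be matched against per-URL suggestions regardless of query strings or host variations, which is one plausible reason for the map shape described above.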

Updated: src/cdn-logs-report/utils/query-builder.js + new SQL

  • Added top-agentic-urls-with-hits.sql — returns (url, total_hits) for the top agentic URLs, aggregated over the window

Updated: src/prerender/handler.js

  • Path suggestions opt-in via prerender.pathSuggestionsEnabled site config flag (defaults false)
  • On enabled sites: finds preservable paths → builds new paths → refreshes metrics on preserved ones → adds both to allSuggestions
  • mapNewSuggestion rank: domain-wide = 999999, path = 100000 (PATH_TYPE_SUGGESTION_RANK), per-URL = 0
  • Calls markSuggestionsAsCoveredByPaths after every syncSuggestions
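
The three-tier rank assignment could look roughly like the sketch below. Only `PATH_TYPE_SUGGESTION_RANK` and the three numeric values come from the description; the other constant names, the function `rankFor`, and the `isDomainWide` flag are hypothetical stand-ins for however the handler identifies each tier.

```javascript
const DOMAIN_WIDE_SUGGESTION_RANK = 999999; // value from the PR; name assumed
const PATH_TYPE_SUGGESTION_RANK = 100000; // name and value from the PR
const PER_URL_SUGGESTION_RANK = 0; // value from the PR; name assumed

// Pick a rank for a suggestion's data object.
// Assumption: domain-wide suggestions carry an `isDomainWide` flag;
// path suggestions carry the `pathType: true` discriminator.
function rankFor(data) {
  if (data.isDomainWide === true) return DOMAIN_WIDE_SUGGESTION_RANK;
  if (data.pathType === true) return PATH_TYPE_SUGGESTION_RANK;
  return PER_URL_SUGGESTION_RANK;
}
```

Placing the path rank between the per-URL and domain-wide values keeps the three tiers in distinct bands when suggestions are ordered by rank.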

Sequence

PR 3 of 4 — depends on spacecat-shared (PR 1) and spacecat-api-service (PR 2). project-elmo-ui (PR 4) depends on this.

Test plan

  • 41 new unit tests for path-suggestions.js (100% coverage)
  • 19 new unit tests for agentic-urls.js additions (100% coverage)
  • New handler tests cover pathSuggestionsEnabled=true/false branching and metric refresh
  • All prerender + agentic-urls tests pass

codecov Bot commented May 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@github-actions
This PR will trigger a minor release when merged.
