feat: generate path-level prerender suggestions in audit worker #2500
Draft
ssilare-adobe wants to merge 6 commits into
Codecov Report: All modified and coverable lines are covered by tests.
This PR will trigger a minor release when merged.
…param to qualify, assert buildPathTypeSuggestions not called when disabled
…ainWide multiplexing
Summary
Implements path-level prerender suggestion generation in the audit worker — the third tier between per-URL and domain-wide suggestions.
New: `src/prerender/path-suggestions.js`
- `extractPathType(url)`: derives the `/first-segment/*` path pattern from a URL
- `RcvPathQualificationStrategy`: qualification gates `urlCount >= 10`, `valuablePercent >= 33%`, `pathScore >= 1.5` (where `pathScore = weightedValuableTraffic + avgContentGainRatio`, mirroring the rcv-scoring-dashboard's `DEFAULT_WEIGHTS.threshold = 1.5`)
- `buildPathTypeSuggestions`: groups eligible per-URL suggestions by path prefix, fetches Athena agentic traffic hits (`getAgenticHitsMapFromAthena`), qualifies each path, and returns suggestion objects keyed as `{pathPattern}|prerender` with a `pathType: true` discriminator
- `findPreservablePathSuggestions`: finds existing path suggestions to preserve across re-audits (active/deployed ones are not overwritten; only metrics are refreshed)
- `markSuggestionsAsCoveredByPaths`: self-healing. On every audit run it clears stale `coveredByDomainWide` refs pointing to undeployed path suggestions and marks newly covered per-URL suggestions. Note: `coveredByDomainWide` is a shared field. When set by this function it stores a path suggestion ID (not a domain-wide suggestion ID); the self-heal distinguishes the two via `pathType === true`.

Updated: `src/utils/agentic-urls.js`
- `getAgenticHitsMapFromAthena(site, context, limit)`: queries Athena over a 4-week window and returns a `Map<pathname, totalHits>` used to weight path qualification scores

Updated: `src/cdn-logs-report/utils/query-builder.js` + new SQL
- `top-agentic-urls-with-hits.sql`: returns `(url, total_hits)` for the top agentic URLs, aggregated over the window

Updated: `src/prerender/handler.js`
- `prerender.pathSuggestionsEnabled` site config flag (defaults false)
- `allSuggestions`
- `mapNewSuggestion` rank: domain-wide = 999999, path = 100000 (`PATH_TYPE_SUGGESTION_RANK`), per-URL = 0
- `markSuggestionsAsCoveredByPaths` after every `syncSuggestions`

Sequence
PR 3 of 4 — depends on spacecat-shared (PR 1) and spacecat-api-service (PR 2). project-elmo-ui (PR 4) depends on this.
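
For reviewers who want a concrete picture of the path derivation and the qualification gates described in the summary, here is a minimal sketch. It is not the code in `src/prerender/path-suggestions.js`: the function `qualifiesPath`, the constant names, the shape of the stats object, and the `null` return for URLs without a path segment are all assumptions made for illustration. Only the thresholds and the `pathScore` formula come from the summary.

```javascript
// Illustrative sketch only. Constant names and qualifiesPath are hypothetical;
// the threshold values are the ones quoted in the PR summary.
const MIN_URL_COUNT = 10;
const MIN_VALUABLE_PERCENT = 33;
const MIN_PATH_SCORE = 1.5;

// Derive a "/first-segment/*" pattern from an absolute URL.
// Returning null for unparsable URLs or root paths is an assumption.
function extractPathType(url) {
  let pathname;
  try {
    pathname = new URL(url).pathname;
  } catch {
    return null;
  }
  const [first] = pathname.split('/').filter(Boolean);
  return first ? `/${first}/*` : null;
}

// Apply the three gates to pre-computed per-path stats (hypothetical shape).
function qualifiesPath({
  urlCount,
  valuablePercent,
  weightedValuableTraffic,
  avgContentGainRatio,
}) {
  const pathScore = weightedValuableTraffic + avgContentGainRatio;
  return urlCount >= MIN_URL_COUNT
    && valuablePercent >= MIN_VALUABLE_PERCENT
    && pathScore >= MIN_PATH_SCORE;
}
```

Under these assumptions, a path with 12 URLs, 40% valuable URLs, and a combined score of 1.6 would qualify, while the same path with only 9 URLs would not.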
Test plan
- `path-suggestions.js` (100% coverage)
- `agentic-urls.js` additions (100% coverage)
- `pathSuggestionsEnabled=true/false` branching and metric refresh
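
The flag-branching case could be exercised roughly as follows. This is a sketch under stated assumptions, not the handler or its test suite: `collectSuggestions` and `makeSpy` are hypothetical helpers, and the config shape `{ prerender: { pathSuggestionsEnabled } }` is inferred from the flag name. The idea it illustrates is the one named in the commit message: the path builder should not be called when the flag is off.

```javascript
// Hypothetical gate: only build path suggestions when the site config flag is on.
// A missing flag is treated as false, matching "defaults false" in the summary.
async function collectSuggestions(siteConfig, perUrlSuggestions, buildPathTypeSuggestions) {
  const enabled = siteConfig?.prerender?.pathSuggestionsEnabled === true;
  if (!enabled) {
    return perUrlSuggestions;
  }
  const pathSuggestions = await buildPathTypeSuggestions(perUrlSuggestions);
  return [...perUrlSuggestions, ...pathSuggestions];
}

// Minimal hand-rolled spy so the sketch has no test-library dependency.
function makeSpy(result) {
  const spy = async (...args) => {
    spy.calls.push(args);
    return result;
  };
  spy.calls = [];
  return spy;
}
```

A test would pass a spy as the builder, run once with the flag unset and once with it set to `true`, and check `spy.calls.length` in each case.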