Skip to content

docs(preflight): design for end-to-end request-id tracing (SITES-43567)#2386

Open
noozoo wants to merge 2 commits into
mainfrom
SITES-43567
Open

docs(preflight): design for end-to-end request-id tracing (SITES-43567)#2386
noozoo wants to merge 2 commits into
mainfrom
SITES-43567

Conversation

@noozoo
Copy link
Copy Markdown

@noozoo noozoo commented May 11, 2026

This PR provides a doc for discussion about https://jira.corp.adobe.com/browse/SITES-43567

The design is for addition of an x-preflight-request-id in the allowed header, and code to save it, include it in logs, and provide it to calls to Mysticat.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@noozoo noozoo requested a review from GeezerPelletier May 11, 2026 19:59
@codecov
Copy link
Copy Markdown

codecov Bot commented May 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@GeezerPelletier
Copy link
Copy Markdown
Contributor

PR Review: docs(preflight): design for end-to-end request-id tracing

Design doc only — no code changes. The four-part decomposition (CORS, middleware, logging, forwarding) is clean and the deployment ordering note at the bottom is the right call. A few items worth pinning before implementation starts.


Items to resolve before implementation

1. UUID validation — pin the approach

The doc says "validates it as a UUID (regex or crypto.randomUUID round-trip)." A crypto.randomUUID() call generates a new UUID — it cannot validate an incoming string. The implementation should use a UUID v4 regex (or a lightweight uuid package validate() call). This matters because preflightRequestId flows into log records and outbound headers to Mystique — a loose validation opens a log-injection / header-injection vector if a caller sends a crafted value. Pin the approach explicitly: recommend uuid.validate(value) && uuid.version(value) === 4.

2. getProfile?.() — verify this is a method, not a getter

context.attributes?.authInfo?.getProfile?.()?.user_id

If getProfile is a plain property (getter) on authInfo, calling it as getProfile?.() returns undefined silently. Worth confirming the actual helix-universal authInfo API shape — if it's a getter, the expression becomes context.attributes?.authInfo?.profile?.user_id.

3. t0 — where is the timer started?

The success-path log snippet uses durationMs: Date.now() - t0 but t0 is not mentioned elsewhere in the doc. Name where it's captured (const t0 = Date.now() at the top of createBetaPreflightJob) so the implementation is unambiguous.

4. imsOrgId header path — confirm for the beta path

context.pathInfo?.headers?.['x-gw-ims-org-id']

Some helix-universal routes surface request headers at context.pathInfo.headers, others at context.invocation.event.headers. Confirm the beta path controller already reads other headers from context.pathInfo so this accessor is consistent with the rest of the file.


Smaller notes

Legacy path and request ID — the doc scopes logging to the beta path only, which is reasonable given the legacy path is slated for deletion. Worth adding one sentence to the doc explicitly stating that preflightRequestId is intentionally not forwarded on the legacy path, so a future reader doesn't treat it as a gap to fill.

Coralogix field indexing — given that preflightRequestId is the join key for cross-service tracing (SpaceCat ↔ Mystique), confirm the field name is indexed in Coralogix before cutover. A searchable structured field is useless if it lands in an unindexed blob.

step sourcing in error vs. success logs — the pre-lookup error log uses step: data.step and the success log uses a separate step variable. These should be the same value, but naming them consistently (always data.step, or always a destructured step) avoids confusion in the implementation.


Summary

Good to proceed to implementation once items 1–4 above are addressed (or consciously deferred with a note). Item 1 (validation approach) is the only one with a security implication — the others are implementation clarity. The overall design is sound: the middleware placement, preflightLogFields helper pattern, and conditional header-forwarding via hasText are all correct.


Reviewed with Claude Code

Copy link
Copy Markdown
Contributor

@GeezerPelletier GeezerPelletier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please go over my review comments.

@github-actions
Copy link
Copy Markdown

This PR will trigger no release when merged.

@noozoo
Copy link
Copy Markdown
Author

noozoo commented May 12, 2026

All issues addressed, as follows:

  1. UUID validation — pinned
    You're right, crypto.randomUUID() is a generator. Replaced with isValidUUID from @adobe/spacecat-shared-utils (in-repo convention; already used in opportunities.js, url-store.js, remove-delegate.js). It wraps uuid.validate() pinned to v4 — the exact uuid.validate(value) && uuid.version(value) === 4 shape you suggested. Also added the security rationale (log/header-injection, response-splitting via newlines/control chars/oversized payloads) and a short example wrapper showing absent/invalid → field omitted from logs and not forwarded.

  2. getProfile?.() — verified as a method
    12+ call sites across src/ invoke authInfo.getProfile() with parens — ephemeral-run.js:77, url-store.js:120, user-activities.js:162, access-control-util.js:209, llmo/llmo.js:530, plg-onboarding.js:1452, etc. If it were a getter every one would be invoking the result of a property access. Definitively a method; the snippet stands.

  3. t0 — named
    Doc now says: capture const t0 = Date.now(); as the first statement of createBetaPreflightJob (and getBetaPreflightJobStatusAndResult), before the badRequest branches. Otherwise the validation-branch error logs would emit durationMs: NaN.

  4. imsOrgId accessor — confirmed context.pathInfo.headers
    Verified and recorded under Identity Fields:

The JSDoc on createBetaPreflightJob already declares it; the in-controller reads via resolvePromiseToken → src/support/utils.js use context.pathInfo?.headers?.cookie / .authorization.
Repo-wide grep finds only pathInfo.headers accessors across every controller and support module; zero invocation.event.headers reads under src/.
Structural reason: enrichPathInfo is in the global wrapper chain (src/index.js) so every route gets it populated.
One subtlety the doc now calls out: the new preflightRequestIdWrapper runs before enrichPathInfo, so it reads from request.headers directly; the controller-level imsOrgId read happens after and uses context.pathInfo.headers.

Smaller Notes

  1. Legacy path
    Added a paragraph stating preflightRequestId is intentionally not propagated on the legacy path (SQS to spacecat-audit-worker, not direct HTTP to Mystique; would need both a message-shape change and a worker-side change), and that the MFE only sets the header on beta-path requests so the value is naturally absent on legacy traffic.

  2. Coralogix indexing
    New ## Pre-Cutover Checklist section: confirm with the observability owner that preflightRequestId is registered as an indexed structured field in Coralogix (not just present in the JSON body), since otherwise the cross-service join becomes a full-text scan. Same check extended to imsOrgId, imsUserId, siteId, jobId.

  3. step sourcing
    Slightly more than a naming nit — the two values were also semantically different: pre-validation data.step is raw (mixed case, possibly undefined); local step is data.step.toLowerCase() and only assigned after the step-validation branch. So "always data.step" regresses case normalization, and "always step" doesn't work because the local doesn't exist in the pre-validation branches. Resolution: move the destructure to the top alongside t0:

const t0 = Date.now();
const { data = {} } = context;
const { url, siteId } = data;
const step = data.step?.toLowerCase();
Now step is one variable in scope from line 1 (undefined for malformed, lowercased otherwise); every call site uses it. The validation check simplifies to includes(step). Both the error and success snippets in the doc are updated. Added a one-liner that the success-path siteId: site.getId() deliberately overrides the function-scoped siteId (request may have arrived without one and the site was looked up by preview URL).

@noozoo noozoo requested a review from GeezerPelletier May 12, 2026 20:17
@noozoo noozoo temporarily deployed to dev-branches May 12, 2026 20:27 — with GitHub Actions Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants