Support Sourcepoint GPP consent for EC generation #642

Open
ChristianPavilonis wants to merge 80 commits into feature/edge-cookies-final from edge-cookie-sourcepoint-consent

@ChristianPavilonis (Collaborator)

Summary

  • Add client-side Sourcepoint JS integration that auto-discovers _sp_user_consent_* from localStorage and mirrors GPP consent into __gpp / __gpp_sid cookies
  • Extend server-side GPP decoding to extract sale_opt_out from US GPP sections (IDs 7–23)
  • Add GPP US consent branch to allows_ec_creation() between existing TCF and us_privacy checks
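The gating order described in the summary can be sketched roughly as below. This is an illustration under stated assumptions, not the PR's code: the `ConsentSignals` struct, its field names, checking GPC before everything else, and the deny-by-default fallthrough are all hypothetical. Only the relative order of the TCF, GPP US, and `us_privacy` branches, the fact that GPC overrides a permissive GPP US section, and the 7–23 section range come from the description.

```rust
/// Hypothetical, simplified view of the consent signals found on a request.
/// `None` means the signal is absent.
#[derive(Debug, Default, Clone, Copy)]
pub struct ConsentSignals {
    /// Global Privacy Control signal (Sec-GPC: 1).
    pub gpc: bool,
    /// TCF result, if a TCF string was present (assumed pre-evaluated).
    pub tcf_allows_storage: Option<bool>,
    /// `sale_opt_out` decoded from a US GPP section, if present.
    pub gpp_us_sale_opt_out: Option<bool>,
    /// Opt-out flag from the legacy us_privacy string, if present.
    pub us_privacy_opt_out: Option<bool>,
}

/// US GPP sections use IDs 7 through 23, per the PR summary.
pub fn is_us_gpp_section(section_id: u16) -> bool {
    (7..=23).contains(&section_id)
}

/// Sketch of the decision order: GPC, then TCF, then GPP US, then us_privacy.
pub fn allows_ec_creation_sketch(s: &ConsentSignals) -> bool {
    if s.gpc {
        return false; // GPC blocks EC even if GPP US is permissive
    }
    if let Some(allowed) = s.tcf_allows_storage {
        return allowed;
    }
    if let Some(opted_out) = s.gpp_us_sale_opt_out {
        return !opted_out; // the new branch added by this PR
    }
    if let Some(opted_out) = s.us_privacy_opt_out {
        return !opted_out;
    }
    false // assumption: no usable signal means no EC
}
```

With these assumptions, a request carrying only a permissive GPP US section is allowed, and the same request with GPC set is blocked.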

Closes #640

Test plan

  • Rust tests pass (cargo test --workspace — 992 tests including 8 new)
  • JS tests pass (npx vitest run — 288 tests including 6 new)
  • Clippy clean
  • Verify EC generation succeeds for a Sourcepoint-only site in a regulated US state with GPP consent present
  • Verify GPC still blocks EC when GPP US section is permissive
  • Verify existing TCF and us_privacy EC gating behavior unchanged

🤖 Generated with Claude Code

@ChristianPavilonis marked this pull request as draft on April 16, 2026 16:42
@ChristianPavilonis marked this pull request as ready for review on April 16, 2026 21:49

- Rename 'Synthetic ID' to 'Edge Cookie (EC)' across all external-facing
  identifiers, config, internal Rust code, and documentation
- Simplify EC hash generation to use only client IP (IPv4 or /64-masked
  IPv6) with HMAC-SHA256, removing User-Agent, Accept-Language,
  Accept-Encoding, random_uuid inputs and Handlebars template rendering
- Downgrade EC ID generation logs to trace level since client IP and EC
  IDs are sensitive data
- Remove unused counter_store and opid_store config fields and KV store
  declarations (vestigial from template-based generation)
- Remove handlebars dependency
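The IP input described above (IPv4 as-is, IPv6 masked to /64) can be sketched with the standard library alone. The function name `normalize_ip_sketch` and the choice of a string as output are assumptions for illustration; the HMAC-SHA256 step is left out because it needs an external crate.

```rust
use std::net::{IpAddr, Ipv6Addr};

/// Hypothetical sketch: reduce a client IP to the value that would be hashed.
/// IPv4 addresses are kept whole; IPv6 addresses keep only the first 64 bits.
pub fn normalize_ip_sketch(ip: IpAddr) -> String {
    match ip {
        IpAddr::V4(v4) => v4.to_string(),
        IpAddr::V6(v6) => {
            let s = v6.segments();
            // Zero the lower four 16-bit groups (the interface identifier).
            Ipv6Addr::new(s[0], s[1], s[2], s[3], 0, 0, 0, 0).to_string()
        }
    }
}
```

Two IPv6 clients in the same /64 produce the same normalized value, which is why a later commit in this PR relies on the random suffix of the full EC ID for uniqueness.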

Breaking changes: wire field synthetic_fresh → ec_fresh, response headers
X-Synthetic-ID → X-TS-EC, cookie synthetic_id → ts-ec, query param
synthetic_id → ts-ec, config section [synthetic] → [edge_cookie].

Closes #462

The EC rename commit (984ba2b) accidentally re-introduced the
reject_placeholder_secrets() call and InsecureDefault tests that were
intentionally removed in 4c29dbf. Replace with log::warn() for
placeholder detection and restore the simple smoke test.

…igration

- Add ec/ module with EcContext lifecycle, generation, cookies, and consent
- Compute cookie domain from publisher.domain, move EC cookie helpers
- Fix auction consent gating, restore cookie_domain for non-EC cookies
- Add integration proxy revocation, refactor EC parsing, clean up ec_hash
- Remove fresh_id and ec_fresh per EC spec §12.1
- Migrate [edge_cookie] config to [ec] per spec §14

Implement Story 3 (#536): KV-backed identity graph with compare-and-swap
concurrency, partner ID upserts, tombstone writes for consent withdrawal,
and revive semantics. Includes schema types, metadata, 300s last-seen
debounce, and comprehensive unit tests.

Also incorporates earlier foundation work: EC module restructure, config
migration from [edge_cookie] to [ec], cookie domain computation, consent
gating fixes, and integration proxy revocation support.
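The 300s last-seen debounce mentioned above might look like the following. The function and constant names are hypothetical, and whether the real comparison is inclusive at exactly 300 seconds is an assumption.

```rust
/// Debounce window from the commit message (300 seconds).
pub const LAST_SEEN_DEBOUNCE_SECS: u64 = 300;

/// Hypothetical sketch: only rewrite `last_seen` in the KV entry when the
/// stored value is at least one debounce window old. `saturating_sub`
/// keeps a clock that moved backwards from triggering a write.
pub fn should_write_last_seen(stored_last_seen: u64, now: u64) -> bool {
    now.saturating_sub(stored_last_seen) >= LAST_SEEN_DEBOUNCE_SECS
}
```

The point of the debounce is to avoid a KV write on every request from an active user.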

Implement Story 4 (#537): partner KV store with API key hashing,
POST /admin/partners/register with basic-auth protection, strict
field validation (ID format, URL allowlists, domain normalization),
and pull-sync config validation. Includes index-based API key lookup
and comprehensive unit tests.

Implement Story 5 (#538): centralize EC cookie set/delete and KV
tombstone writes in finalize_response(), replacing inline mutation
scattered across publisher and proxy handlers. Adds consent-withdrawal
cleanup, EC header propagation on proxy requests, and docs formatting.

Implement Story 8 (#541): POST /api/v1/sync with Bearer API key auth,
per-partner rate limiting, batch size cap, per-mapping validation and
rejection reasons, 200/207 response semantics, tolerant Bearer parsing,
and KV-abort on store unavailability.
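What "tolerant Bearer parsing" means in detail is not spelled out in the commit message. One plausible reading, shown here purely as an assumption, is accepting a case-insensitive scheme and extra whitespace around the token. The function name is hypothetical.

```rust
/// Hypothetical sketch of tolerant `Authorization: Bearer <token>` parsing.
/// Accepts any casing of "Bearer" and extra spaces; rejects other schemes,
/// empty tokens, and tokens containing whitespace.
pub fn parse_bearer_sketch(header_value: &str) -> Option<&str> {
    let trimmed = header_value.trim();
    let (scheme, rest) = trimmed.split_once(char::is_whitespace)?;
    if !scheme.eq_ignore_ascii_case("bearer") {
        return None;
    }
    let token = rest.trim();
    if token.is_empty() || token.contains(char::is_whitespace) {
        None
    } else {
        Some(token)
    }
}
```

The returned token would then be matched against the stored partner API key hash; that step is not shown.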

Implement Story 9 (#542): server-to-server pull sync that runs after
send_to_client() on organic traffic only. Refactors the Fastly adapter
entrypoint from #[fastly::main] to explicit Request::from_client() +
send_to_client() to enable post-send background work.

Pull sync enumerates pull-enabled partners, checks staleness against
pull_sync_ttl_sec, validates URL hosts against the partner allowlist,
enforces hourly rate limits, and dispatches concurrent outbound GETs
with Bearer auth. Responses with uid:null or 404 are no-ops; valid
UIDs are upserted into the identity graph.

Includes EC ID format validation to prevent dispatch on spoofed values,
partner list_registered() for KV store enumeration, and configurable
pull_sync_concurrency (default 3).
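The staleness check and response handling described above can be sketched as two small functions. All names here are hypothetical, and treating every non-2xx status (not just 404) as a no-op is an assumption.

```rust
/// Hypothetical sketch: a partner mapping is due for a pull when it has
/// never been pulled or when the last pull is older than pull_sync_ttl_sec.
pub fn is_pull_stale(last_pulled_at: Option<u64>, now: u64, pull_sync_ttl_sec: u64) -> bool {
    match last_pulled_at {
        None => true,
        Some(t) => now.saturating_sub(t) >= pull_sync_ttl_sec,
    }
}

/// What to do with a partner's pull response.
#[derive(Debug, PartialEq, Eq)]
pub enum PullOutcome {
    /// Nothing to write (404, `uid: null`, or an unusable response).
    NoOp,
    /// Upsert this partner UID into the identity graph.
    Upsert(String),
}

/// Hypothetical sketch: `uid` is the already-parsed `uid` field of the
/// response body, with JSON `null` mapped to `None`.
pub fn classify_pull_response(status: u16, uid: Option<&str>) -> PullOutcome {
    match (status, uid) {
        (404, _) | (_, None) => PullOutcome::NoOp,
        (200..=299, Some(u)) if !u.is_empty() => PullOutcome::Upsert(u.to_string()),
        _ => PullOutcome::NoOp, // assumption: other statuses are ignored
    }
}
```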

Implement Story 11 (#544): Viceroy-driven E2E tests covering full EC
lifecycle (generation, pixel sync, identify, batch sync, consent
withdrawal, auth rejection). Adds EC test helpers with manual cookie
tracking, minimal origin server with graceful shutdown, and required
KV store fixtures. Fixes integration build env vars.

Consolidate is_valid_ec_hash and current_timestamp into single canonical
definitions to eliminate copy-paste drift across the ec/ module tree. Fix
serialization error variants in admin and batch_sync to use Ec instead of
Configuration. Add scaling and design-decision documentation for partner
store enumeration, rate limiter burstiness, and plaintext pull token storage.
Use test constructors consistently in identify and finalize tests.

- Rename ssc_hash → ec_hash in batch sync wire format (§9.3)
- Strip x-ts-* prefix headers in copy_custom_headers (§15)
- Strip dynamic x-ts-<partner_id> headers in clear_ec_on_response (§5.2)
- Add PartnerNotFound and PartnerAuthFailed error variants (§16)
- Rename Ec error variant → EdgeCookie (§16)
- Validate EC IDs at read time, discard malformed values (§4.2)
- Add rotating hourly offset for pull sync partner dispatch (§10.3)
- Add _pull_enabled secondary index for O(1+N) pull sync reads (§13.1)
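The "rotating hourly offset" for pull sync dispatch is not detailed in the commit message. One way to read it, offered only as an assumption, is rotating the partner list by the current hour so that a different partner comes first each hour. The function name is hypothetical.

```rust
/// Hypothetical sketch: rotate the partner list left by an offset derived
/// from the current hour, so dispatch order changes hour to hour.
pub fn rotate_by_hour<T: Clone>(partners: &[T], now_unix_secs: u64) -> Vec<T> {
    if partners.is_empty() {
        return Vec::new();
    }
    let hour = now_unix_secs / 3600;
    let offset = (hour % partners.len() as u64) as usize;
    let mut rotated = partners.to_vec();
    rotated.rotate_left(offset);
    rotated
}
```

If only the first few partners are dispatched per request, a rotation like this would spread the work across all of them over time.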

…nd cleanup

- Add body size limit (64 KiB) to partner registration
- Validate partner UID length (max 512 bytes) in batch sync and sync pixel
- Replace linear scan with binary search in encode_eids_header
- Use constant-time comparison inline in partner lookup, remove unused verify_api_key
- Remove unused PartnerAuthFailed error variant, fix PartnerNotFound → 404
- Add Access-Control-Max-Age CORS header to identify endpoint
- Tighten consent-denied integration test to expect only 403
- Add stability doc-comment to normalize_ip
- Log warning instead of silent fallback on SystemTime failure
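For readers unfamiliar with the constant-time comparison mentioned in this list, the idea can be shown with a std-only sketch. This is illustrative and hand-rolled; a later commit message in this PR mentions the `subtle` crate, which is the better choice in real code. The function name is hypothetical.

```rust
/// Hypothetical sketch of a constant-time byte comparison. Every byte pair
/// is examined regardless of where the first mismatch occurs, so timing
/// does not reveal the mismatch position. The early return on length is
/// acceptable when both inputs are fixed-length hashes.
pub fn constant_time_eq_sketch(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // accumulate differences without branching
    }
    diff == 0
}
```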

…ror variants

Resolve integration issues from rebasing onto feature/ssc-update:
- Restore prepare_runtime() and validate_cookie_domain() lost in conflict resolution
- Add InsecureDefault error variant and wire reject_placeholder_secrets() into get_settings()
- Add sha2/subtle imports for constant-time auth comparison
- Fix error match arms (Ec → EdgeCookie, remove nonexistent PartnerAuthFailed)
- Fix orchestrator error handling to use send_to_client() pattern
- Remove dead cookie helpers superseded by ec/cookies module

Subresource requests (fonts, images, CSS) may omit the Sec-GPC header,
causing the server to incorrectly generate ts-ec cookies when the user
has opted out via Global Privacy Control. Gate generate_if_needed() on
the request Accept header containing text/html so only navigations
trigger EC identity creation.
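The Accept-header gate described above amounts to a single check. The sketch below assumes a hypothetical helper name and a case-insensitive substring match; the real condition inside generate_if_needed() may differ.

```rust
/// Hypothetical sketch: treat a request as a navigation only when its
/// Accept header mentions text/html. Subresource requests (fonts, images,
/// CSS) normally do not, so they never trigger EC generation.
pub fn accept_indicates_navigation(accept: Option<&str>) -> bool {
    match accept {
        Some(value) => value.to_ascii_lowercase().contains("text/html"),
        None => false,
    }
}
```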

Move admin route matching and basic-auth coverage to /_ts/admin for a hard cutover, and align tests and docs so operational guidance matches runtime behavior.

Addresses issue #612 - spec now correctly documents that the full EC ID
({64-hex}.{6-alnum}) is used as the KV store key, not just the 64-char
hash prefix.

Changes:
- Updated §4.1: ec_hash() now documented as for logging/debugging only
- Updated §7.2: KV key description changed from '64-character hex hash'
  to 'Full EC ID in {64-char hex}.{6-char alphanumeric} format'
- Updated §7.3: All KvIdentityGraph method parameters renamed from
  ec_hash to ec_id with proper documentation
- Updated §9.3: Batch sync request field renamed from ec_hash to ec_id
- Updated §9.4: Validation and error reasons updated (invalid_ec_hash
  → invalid_ec_id, ec_hash_not_found → ec_id_not_found)
- Updated §10.4: Pull sync URL parameter changed from ec_hash to ec_id
- Updated consent pipeline integration throughout to use full EC ID
- Updated all rate limiting descriptions (per EC ID, not per hash)

Rationale: The random suffix provides uniqueness for users behind the
same NAT/proxy infrastructure who would otherwise share identical
IP-derived hash prefixes.
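The `{64-hex}.{6-alnum}` EC ID shape described above can be checked with a short validator. The function name is hypothetical, and accepting uppercase hex digits is an assumption; the spec may require lowercase.

```rust
/// Hypothetical sketch: validate the full EC ID format, a 64-character
/// hex hash, a dot, then a 6-character alphanumeric suffix.
pub fn is_valid_ec_id_sketch(id: &str) -> bool {
    match id.split_once('.') {
        Some((hash, suffix)) => {
            hash.len() == 64
                && hash.bytes().all(|b| b.is_ascii_hexdigit())
                && suffix.len() == 6
                && suffix.bytes().all(|b| b.is_ascii_alphanumeric())
        }
        None => false,
    }
}
```

A bare 64-character hash without the suffix is rejected, which matches the move from ec_hash to ec_id as the KV key.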

Extends EC KV schema for cross-property identity resolution:

- Add asn field to GeoInfo (from Fastly geo.as_number())
- Add asn and dma fields to KvGeo for network identification
- Add KvDomainVisit and KvPubProperties for consortium-level domain tracking
- Add pub_properties field to KvEntry with 50-domain cap
- Track publisher domain visits in KvEntry::new() and update_last_seen()
- Respect existing 300s debounce for organic requests only

All new fields use Option types or serde(default) for backward compatibility.
Existing v1 entries continue to deserialize without error.

Implements cluster size evaluation to distinguish individual users from
shared networks (VPNs, corporate offices):

- Add KvNetwork struct with cluster_size and last_evaluated timestamp
- Add network field to KvEntry with TTL-gated cluster rechecks
- Add cluster_size to KvMetadata and IdentifyResponse
- Implement count_hash_prefix_keys() to list keys with common prefix
- Implement evaluate_cluster() on KvIdentityGraph (one-page, 100-key limit)
- Call cluster evaluation in handle_identify endpoint
- Return cluster_size in JSON body and x-ts-cluster-size header
- Add cluster_trust_threshold (default 10) and cluster_recheck_secs (default 3600) config

Cluster evaluation uses best-effort semantics: size unknown if list exceeds
100 keys. Cache hit avoids re-evaluation within recheck interval.
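The best-effort semantics above can be sketched as two helpers: one that turns a single listing page into a cluster size (or "unknown"), and one that decides whether a cached result is still fresh. Function names and parameter shapes are hypothetical; only the 100-key limit and the recheck interval come from the commit message.

```rust
/// One-page listing limit from the commit message.
pub const CLUSTER_LIST_LIMIT: usize = 100;

/// Hypothetical sketch: the size is known only when the whole cluster fit
/// in a single page of at most 100 keys.
pub fn cluster_size_from_listing(keys_on_page: usize, has_more_pages: bool) -> Option<usize> {
    if has_more_pages || keys_on_page > CLUSTER_LIST_LIMIT {
        None // size unknown
    } else {
        Some(keys_on_page)
    }
}

/// Hypothetical sketch: re-evaluate only when there is no cached result or
/// the cached one is older than cluster_recheck_secs (default 3600).
pub fn needs_cluster_recheck(last_evaluated: Option<u64>, now: u64, cluster_recheck_secs: u64) -> bool {
    match last_evaluated {
        None => true,
        Some(t) => now.saturating_sub(t) >= cluster_recheck_secs,
    }
}
```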

Derives coarse browser signals from TLS/H2/UA on every request to gate
EC identity operations. Unrecognized clients (known_browser != true) are
proxied normally but leave no trace in the identity graph.

- Add KvDevice struct (is_mobile, ja4_class, platform_class, h2_fp_hash,
  known_browser) and device field on KvEntry, written once on creation
- Add ec/device.rs with DeviceSignals::derive(), UA parsing, JA4 Section 1
  extraction, H2 fingerprint hashing, known browser allowlist (Chrome/
  Safari/Firefox)
- Add is_mobile and known_browser to KvMetadata for fast propagation checks
- Wire DeviceSignals through EcContext to KvEntry creation path
- Add bot gate in Fastly adapter: suppress KV graph, ec_finalize_response,
  and pull sync when known_browser != Some(true)

…bot gate

Document all KV schema additions implemented in the preceding commits:
geo extensions (asn/dma), publisher domain tracking, network cluster
evaluation, device signal derivation, and the bot gate architecture.

- Add §7A Device Signals and Bot Gate (signal derivation, allowlist,
  bot gate behavior matrix, KvDevice write policy, privacy rationale)
- Update §7.2 with full KvEntry schema including KvGeo, KvPubProperties,
  KvDomainVisit, KvDevice, KvNetwork, and extended KvMetadata
- Update §2 architecture diagram with Phase 0 bot gate step
- Update §4.3 EcContext with device_signals field
- Update §5.4 lifecycle with Phase 0 and ec_finalize gating
- Update §11 /identify with cluster_size in JSON and x-ts-cluster-size header
- Update §14 config with cluster_trust_threshold and cluster_recheck_secs
- Update §17.1 main.rs pseudocode with full bot gate wiring

The known_browser fingerprint allowlist (3 entries) was too narrow and
blocked legitimate browsers whose JA4/H2 combinations were not listed.

Replace the gate with DeviceSignals::looks_like_browser() which checks
for signal presence: ja4_class.is_some() && platform_class.is_some().
Real browsers always produce both; raw HTTP clients typically lack one
or both. known_browser is still computed and stored on KvDevice for
analytics but no longer gates identity operations.
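The replacement gate is simple enough to show directly. The struct below is a trimmed, hypothetical stand-in for DeviceSignals with only the fields the check needs; the field types are assumptions.

```rust
/// Hypothetical, reduced version of the device signals struct.
#[derive(Debug, Default, Clone)]
pub struct DeviceSignalsSketch {
    pub ja4_class: Option<String>,
    pub platform_class: Option<String>,
    /// Still computed and stored for analytics, but no longer used as a gate.
    pub known_browser: Option<bool>,
}

impl DeviceSignalsSketch {
    /// Signal-presence check from the commit message: both a JA4 class and
    /// a platform class must have been derived.
    pub fn looks_like_browser(&self) -> bool {
        self.ja4_class.is_some() && self.platform_class.is_some()
    }
}
```

A browser whose fingerprint is not on the allowlist now passes as long as both signals are present, while a raw HTTP client missing either one is still kept out of the identity graph.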

Prebid's liveIntentIdSystem.js uses a dynamic require() inside a
build-flag-guarded branch that their gulp pipeline dead-codes via
constant folding. esbuild leaves the require() in the output, causing
ReferenceError: require is not defined at browser runtime.

Remove from the bundle until we add an esbuild resolver plugin (or
switch to Prebid's own build pipeline) — tracked as a follow-up in the
design spec.

Introduces TSJS_PREBID_USER_IDS env var (mirroring TSJS_PREBID_ADAPTERS)
to control which Prebid User ID submodules are bundled. The hardcoded
imports in index.ts are replaced with a generated file written by
build-all.mjs at build time, defaulting to the same 13-submodule set.

- build-all.mjs: generatePrebidUserIds() validates names, denylists
  liveIntentIdSystem, and writes _user_ids.generated.ts. Existence check
  also probes dist/src/public/ to handle modules shipped as .ts in sources
  (sharedIdSystem).
- index.ts: replaces 13 hardcoded submodule imports with
  import './_user_ids.generated'
- _user_ids.generated.ts: committed default with all 13 submodules
- Tests: updated mocks and regression guard; added 9 syncPrebidEidsCookie
  behavior tests
- Docs: new "User ID Modules" section in prebid.md with TSJS_PREBID_USER_IDS
  usage; spec follow-up #1 marked complete

__gpp and __gpp_sid are read by the Rust server over HTTPS; they must be
Secure. Also sets Max-Age=86400 (matching ts-eids) so stale consent state
doesn't outlast the session, and replaces the legacy expires= deletion
pattern with Max-Age=0.

@ChristianPavilonis force-pushed the edge-cookie-sourcepoint-consent branch from 3bbdb1d to 8a6df3a on April 22, 2026 20:33
@ChristianPavilonis force-pushed the feature/edge-cookies-final branch from 9261993 to d8c943d on April 27, 2026 16:26