Conversation
A user who takes >10 minutes to enter their OTP and clicks Resend would land on /auth/complete with "Authentication session expired", because the auth_flow row, the epds_auth_flow cookie, and the better-auth OTP verification row all shared the same 10-minute TTL. Resend successfully issued a new OTP, but by then the auth_flow + cookie had already been purged so /auth/complete had no flow_id to thread. Fix: decouple auth_flow lifetime from OTP lifetime. The auth_flow row and cookie now live for 60 minutes (lifted to lib/auth-flow.ts so login-page.ts and recovery.ts share a single constant), while OTP expiry stays at 10 min inside better-auth. A 10-min OTP timeout + Resend now keeps the same OAuth ticket alive end-to-end. Test infra: - e2e scenario @otp-expiry in passwordless-authentication.feature reproduces the user-reported flow without a wall-clock wait. - Test-only auth-service hooks POST /_internal/test/expire-otp and POST /_internal/test/expire-auth-flow (gated by EPDS_TEST_HOOKS=1, blocked under NODE_ENV=production, authenticated via EPDS_INTERNAL_SECRET) backdate the corresponding rows so the scenario runs in seconds. - Lifted verifyInternalSecret into @certified-app/shared, removing the duplicate copy in pds-core/src/index.ts. - Mounted the test-hooks router before csrfProtection because it is called by a non-browser test runner that authenticates via x-internal-secret rather than CSRF tokens. - Added .github/workflows/e2e-tests.yml wiring for E2E_INTERNAL_SECRET. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Keep users on the in-progress sign-in flow when they click the legal links — the links now carry target="_blank" rel="noopener noreferrer", matching the existing "Powered by" logo link. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Anchors already default to cursor:pointer; making it explicit on .terms-link and .powered-by guards against any inherited override and signals clickability unambiguously. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The pre-route welcome-page guard intercepts the upstream stock welcome
page when a request resolves to a device with zero bound accounts. But
upstream renders a second authentication UI — its sign-in-view (handle
+ password form) — whenever every binding upstream considers has
loginRequired: true. ePDS accounts are passwordless, so any path into
that form is unusable; the guard must treat it identically to the
welcome page.
Three independent triggers force every binding loginRequired:
Row 5 — stored PAR `parameters.prompt === 'login'`
Row 6 — every binding's `account_device.updated_at` exceeds 7d
Row 9 — `login_hint` resolves to a binding that's individually stale
even though other bindings on the device are fresh
Add scenarios for all three to features/session-reuse-bugs.feature
under a new "Sign-in-view leaks" section, and update the feature's
docstring to describe both authentication UIs the guard must suppress.
The new "no upstream password field is rendered anywhere on the page"
assertion pins the contract: a single `input[type="password"]` on the
landing page is a leak.
Row 5 reproduces against a fresh docker-compose stack today: post-OTP,
the browser lands on pds-core's /oauth/authorize, the guard passes
through, and upstream's sign-in-view renders. Rows 6 and 9 require a
pds-core /_internal/test/expire-device-account hook (mirroring
auth-service's expire-otp / expire-auth-flow hooks) that doesn't exist
yet — those step bodies will fail at runtime until the hook lands.
Drive-by: the existing returning-user step (and two of the new step
bodies) called `page.fill('#code', otp)` against the auth-service login
page's hidden #code input. Replace with the existing fillOtp helper,
which fills the visible per-digit .otp-box inputs that feed the hidden
field via JS (and auto-submits the verify form on the last digit).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a /_internal/test/expire-device-account hook to pds-core so the
e2e suite can reproduce rows 6 and 9 of the welcome-page-guard scenario
taxonomy without waiting out upstream's 7-day authenticationMaxAge.
Hook design mirrors auth-service's /_internal/test/expire-otp:
- Mounted only when EPDS_TEST_HOOKS=1 is set on the core service.
- Refuses to start if NODE_ENV=production (defence in depth on top
of the gate flag).
- Authenticates via the existing x-internal-secret header
(verifyInternalSecret in @certified-app/shared).
- Body: {did, deviceId?}. Backdates account_device.updatedAt to 8
days ago for matching rows. Omitting deviceId backdates every
device row for the DID (row 6's single-account device); passing
both keys narrows to the targeted (deviceId, did) pair (row 9's
multi-account device).
Wires up the matching e2e steps:
- "the device's account_device row has been backdated past 7 days"
- "the device has two bound accounts" (drives a second sign-up in
the same browser context so upsertDeviceAccount binds to the
existing device, then picks the second user's handle)
- "the test user's account_device row has been backdated past 7
days" + "the other user's account_device row is fresh"
- When-step "the demo client starts a new OAuth flow" reused
verbatim for row 6; row 9 reuses the existing login_hint variant.
All three sign-in-view scenarios now reproduce against a fresh
docker-compose stack:
- Row 5 lands on upstream's sign-in-view (handle + password form).
- Row 6 lands on upstream's chooser with the single stale binding;
clicking it would yield sign-in-view (still a leak — the chooser
has no other path).
- Row 9 lands on upstream's chooser with both bindings; the hinted
(stale) account leads to sign-in-view on click while the other
(fresh) one auto-SSOs — equally a leak for the hinted user.
Drive-by: feature comment for the sign-in-view section now notes the
chooser-then-sign-in-view variant explicitly, so future readers don't
assume sign-in-view is always rendered up front.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…gin pages Match the main sign-in page's footer on the Account Settings sign-in flow — both the email-entry step and the "Enter your code" step now render the inlined Certified wordmark below the card. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hook The original implementation in f5baf3c opened a fresh better-sqlite3 connection to account.sqlite. That introduces a second handle on the same WAL file as PDS's own accountManager, which led to opaque visibility issues during e2e: rows backdated by the hook weren't always visible to subsequent guard reads on the same request. Switch the hook to reuse pds.ctx.accountManager.db.db (the Kysely instance PDS itself uses for upsertDeviceAccount and friends). One handle, one WAL view, no two-connection coordination problems. Drops the better-sqlite3 + @types/better-sqlite3 deps from pds-core entirely — they were only ever pulled in for the test hook. Also tidies away the now-unused PDS_DATA_DIRECTORY lookup, since we no longer need to locate account.sqlite on disk. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the rows 5/6/9 gap in the welcome-page-guard taxonomy (see #131). The guard previously short-circuited only on rows 1–4 (empty/malformed/stale device cookies, zero bindings) — i.e. the upstream "welcome page". It missed the upstream sign-in-view (handle + password form), which renders whenever every binding upstream considers has loginRequired: true. Three independent triggers force upstream into that state: Row 5 — stored PAR parameters.prompt === 'login' Row 6 — every binding's account_device.updatedAt is older than upstream's authenticationMaxAge (default 7 days) Row 9 — login_hint matches a binding which is itself stale, even when other bindings on the device are fresh ePDS accounts are passwordless (random unguessable password set at creation, never exposed), so any path into the upstream sign-in-view is a contract violation — the user gets a form they cannot submit. After the existing empty-bindings bounce, the guard now also: 1. Reads the stored PAR via provider.requestManager.store.readRequest and bounces when parameters.prompt === 'login'. 2. Computes upstream's candidate-binding set by mirroring its matchesHint logic (sub or preferred_username only) and bounces when every candidate has provider.checkLoginRequired === true. Both bounces go to auth-service with the existing prompt=login bounce URL, which forces the OTP flow. Also widens the e2e landing-assertion locator to accept both auth-service login-page initial-step variants (#step-email when there's no hint; #step-otp when login_hint resolves to a known email). Both paths land on the same auth-service login page — the assertion just needs to confirm "on auth-service, not pds-core". Unit tests extended to cover both bounce paths via stubbed provider.requestManager.store.readRequest and provider.checkLoginRequired. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous step impl set login_hint as a query param on the demo's /api/oauth/login redirect, which makes the demo append it to the /oauth/authorize URL — but NOT to the PAR body. Upstream's matchesHint runs against parameters loaded from the request_uri (i.e. the stored PAR), so a URL-only hint never reaches the candidate-filtering code that row 9 depends on. The realistic third-party OAuth-client pattern that exposes the row 9 sign-in-view leak puts login_hint in the PAR body. Switch the step to login_hint_location=body so the stored PAR carries the hint and the guard can filter the candidate-binding list down to the single stale binding (which then trips the every-candidate-loginRequired check and bounces). Renamed the step to "as a PAR-body login_hint" so the Gherkin reads unambiguously and doesn't shadow the existing URL-hint step (used by the unrelated "Login hint matches a bound account" scenario, which must stay on the URL-hint path). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Threads addressed:
- "completes the OTP flow" / "do we already have a step for this?"
(e2e/step-definitions/session-reuse-bugs.steps.ts:628,650): the
monolithic step duplicated the existing canonical sequence used
elsewhere in the suite. Replace it in row 5 with the established
composition: "enters the test email" → "OTP email arrives" →
"enters the OTP code", and drop the bespoke step body entirely.
- "this is bloating main()" (packages/pds-core/src/index.ts:1102):
extract the EPDS_TEST_HOOKS=1 wiring (the expire-device-account
route + its production-mode guard + the secret check) into a new
lib/test-hooks.ts. main() now installs it via a single
installTestHooks() call, matching the pattern used for the other
middleware modules.
- "guard is now misnamed" (welcome-page-guard.ts:248): expand the
file's top docstring to describe both UIs the guard now suppresses
(welcome page rows 1–4; sign-in-view rows 5/6/9) and explain why
the filename is kept stable. The function/file names stay as-is to
avoid churning every import site for no behavioural gain — the
docstring carries the current scope.
The fourth thread (welcome-page-guard.ts:281, "isn't replicating
upstream's render decision brittle?") is a real architectural
question; left for separate discussion in the PR rather than papered
over in code.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…er and recovery pages Extracts the footer HTML + SVG read into a shared POWERED_BY_HTML / POWERED_BY_CSS pair in lib/page-helpers.ts so the SVG is read once, then applies it to /auth/choose-handle and both /auth/recover steps. account-login is refactored to consume the same helper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eleted, and error pages Extends the shared POWERED_BY_HTML / POWERED_BY_CSS helper to the remaining auth-service pages: the /account dashboard, the backup-email verify confirmation, the post-deletion screen, and the generic error page used by 404/500 handlers and session-expired guards. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-tab fix(login-page): open Terms and Privacy links in a new tab
…evice-bindings helper
Two CI-driven changes:
1. Sonar flagged a 16-line duplicated block in welcome-page-guard.ts —
loadDeviceBindings duplicated lib/device-accounts.ts's cookie-pair
validation + listDeviceAccounts call. Extract loadDeviceBindings as
a shared export from lib/device-accounts.ts; have both consumers
call it. loadDeviceAccountEmails is now a thin projection over the
shared helper. The logCtx string parameter keeps log lines from
the two call sites distinguishable.
2. Coveralls flagged uncovered new lines on welcome-page-guard.ts (the
row 5/6/9 bounce branches) and lib/test-hooks.ts (the entire file).
Add focused unit tests:
- welcome-page-guard.test.ts: rows 5/6/9 each get a "must bounce"
test, plus three pass-through cases (row 7 hint→fresh; mixed
freshness with no hint; hint matching no binding) and one
fail-open case (store.readRequest throws → pass through with
logged error).
- test-hooks.test.ts: new suite covering installTestHooks. EPDS_TEST_HOOKS
unset → 404 (route not mounted); NODE_ENV=production → throws at
install; missing/wrong x-internal-secret → 401; missing did → 400;
happy paths for both DID-only and (DID, deviceId) bodies; underlying
query throwing → 500 with logged error; bigint numUpdatedRows
coerced to Number.
All 828 tests pass; format, lint, typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sonar flagged 123 dup lines on the previous push (107 in
welcome-page-guard.test.ts, 16 in test-hooks.test.ts) — an artefact of
9-line and 11-line scenario blocks repeating the same setup boilerplate
across 7 and 9 tests respectively.
- welcome-page-guard.test.ts: extract a `signinViewCase()` helper that
builds the (provider, mw, req, res, next) tuple from a single options
bag, plus `expectBounce()` / `expectPassThrough()` tiny assertions.
Each test body collapses to "supply the inputs that vary, assert
the outcome".
- test-hooks.test.ts: extract a `setupApp()` helper that returns
{app, fakeUpdate, logger} ready to drive. Hoist the secret + auth
header into module-level constants. The describe() beforeEach now
also sets EPDS_TEST_HOOKS=1 so the per-test "process.env.EPDS_TEST_HOOKS = '1'"
boilerplate is gone.
828 tests still pass; format, lint, typecheck clean. Behaviour
unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The handle-picker/recovery and settings/error footer rollouts were landing as separate changesets alongside the original account-login one. Fold all three into a single changeset describing the full auth-service footer coverage so the next release notes read as one change instead of three near-duplicates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The guard's scope grew past the welcome page to also cover upstream's
sign-in-view (rows 5/6/9 of the failure-mode taxonomy). The
welcome-page name no longer captures what the guard actually does —
it suppresses BOTH stock authentication UIs that ePDS must never
surface.
Renames:
- packages/pds-core/src/welcome-page-guard.ts → auth-ui-guard.ts
- packages/pds-core/src/__tests__/welcome-page-guard.test.ts →
auth-ui-guard.test.ts
- createWelcomePageGuard() → createAuthUiGuard()
- inner welcomePageGuard middleware → authUiGuard
- install/error log strings updated to "Auth-UI guard …"
Updates references in:
- packages/pds-core/src/index.ts (import + install block + section
header + fail-closed error messages)
- packages/pds-core/src/cookie-domain.ts (two doc comments)
- packages/pds-core/src/lib/device-accounts.ts (top docstring)
- packages/auth-service/src/routes/login-page.ts (one inline
comment about the orphan-cookie bounce path)
- docs/design/{cross-client-session-reuse,pds-white-boxing}.md
The branch name "welcome-guard-inadequate" stays as-is — it's an
ephemeral throwaway. The describe(...) test labels are updated
naturally by the symbol rename.
Behaviour unchanged. 828 tests pass; format, lint, typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-by-footer feat(auth-service): add Powered by Certified footer to /account/login
…OTP cycle
Manual repro on the PR branch surfaced a regression that the row 5 e2e
didn't catch. With the new auth-ui-guard in place, prompt=login flows
were stuck in an infinite OTP loop:
1. Demo posts PAR with prompt=login.
2. Auth-service shouldReuseSession returns false (sees prompt=login),
serves OTP form. User completes OTP.
3. /auth/complete signs an epds-callback URL.
4. epds-callback authenticates the user, upserts the device-account
binding, redirects to pds-core /oauth/authorize?request_uri=...
5. The auth-ui-guard reads the stored PAR. parameters.prompt is
STILL 'login' — it was never consumed. The guard bounces back
to auth-service.
6. Auth-service serves OTP again.
7. Goto 5.
The user-visible symptom: complete OTP, get a fresh OTP form again, and
again, forever. The original e2e for row 5 only asserted "after the
bounce, the browser is on the auth-service email-and-OTP form" — which
is still satisfied at every iteration of the loop.
Fix: epds-callback already mutates the stored PAR in Step 6 to inject
login_hint after the user is freshly authenticated. Strip prompt='login'
at the same hop. Other prompt values ('consent', 'select_account', etc.)
stay untouched — only 'login' is loop-forming. By the time the mutation
happens, the user IS freshly authenticated; the prompt's contract is
satisfied.
The mutation is gated by epds-callback's existing HMAC-signed callback
chain (auth-service signs after server-side OTP verification, pds-core
verifies the signature before any account ops). The same gate already
protects the login_hint injection — we're piggybacking on it.
Test changes:
- features/session-reuse-bugs.feature: rows 5/6/9 now drive the
full recovery sequence (post-bounce OTP cycle) and assert the
user reaches the demo's signed-in welcome page. Drops the
redundant "no upstream password field" follow-up — landing on
the welcome page proves the user got there, full stop.
- Renamed shared step "the browser is redirected back to the demo
client with a valid session" → "the demo client's welcome page
confirms the user is signed in" so the Gherkin matches what the
impl actually verifies (URL is /welcome AND body contains
"You are signed in." AND session_id cookie is set). Three
feature files touched. The old name described the redirect but
not the positive landing-page check that was already present
in the implementation.
- Removed the "no upstream password field is rendered anywhere on
the page" step impl from session-reuse-bugs.steps.ts; no longer
referenced.
All e2e session-reuse scenarios pass; 828 unit tests pass; format,
lint, typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rivate-taxonomy refs Three changes from PR #129 review feedback: 1. **OIDC prompt is space-delimited.** The exact-string check `params?.prompt === 'login'` (and the matching strip in epds-callback) missed valid multi-token values like `prompt="login consent"`. A third-party client sending such a prompt could bypass the auth-ui-guard's bounce condition, reopening the leak this PR set out to close. Per OIDC Core 1.0 §3.1.2.1 prompt is a space-delimited list, so the fix is to tokenise. Adds `parsePromptTokens()` and `promptHasLogin()` helpers to `auth-ui-guard.ts`, used by: - the guard's bounce condition (`promptHasLogin(params?.prompt)`); - epds-callback's strip after a successful OTP, which now removes only the `login` token and preserves the rest (so a hypothetical `prompt="login consent"` becomes `prompt="consent"` rather than silently dropping consent). 2. **Test hooks default off.** `docker-compose.yml` previously set `EPDS_TEST_HOOKS=${EPDS_TEST_HOOKS:-1}`, defaulting test-only mutation routes ON in any non-production environment. The production failsafe (`NODE_ENV=production` throws at startup) is still there, but the wider default-on stance violated security-by-default. Both core and auth services now default to `:-0`; opt in for e2e runs by setting `EPDS_TEST_HOOKS=1` in `.env` or the shell. Comments in `docker-compose.yml` and `.env.example` updated. 3. **Drop private-taxonomy references.** Comments and test labels used "row 5/6/9" / "rows 1–4" referring to the failure-mode taxonomy in #131. Those numbers mean nothing to anyone reading the code without that issue open. Replaced with descriptive labels — e.g. `it('bounces when the stored PAR forces re-authentication via prompt=login', ...)` — so test output and in-source comments are self-explanatory. The Gherkin scenarios in `features/session-reuse-bugs.feature` are renamed similarly (e.g. `Scenario: Forced re-authentication via prompt=login`). Adds 10 new unit tests (838 total, was 828) covering the tokeniser edge cases and a multi-token prompt bounce. Whole session-reuse e2e profile passes (19/19). Includes the previously-missing changeset: .changeset/sign-in-no-longer-dead-ends-on-password-form.md (End-user-only audience; the bug fix is transparent to client app developers and operators.) Closes the two CodeRabbit-flagged threads on PR #129. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Welcome-page guard: cover upstream sign-in-view (rows 5/6/9)
…valid OTP" The verify form auto-submits when the 6th digit lands. A second submit triggered shortly after (Enter, mobile SMS autofill firing input on multiple boxes, paste+input race) called /sign-in/email-otp again with a now-consumed code; the response was "Invalid OTP", which rendered briefly while the first call's success-path redirect was unloading the page. Visible symptom: red "Invalid OTP" flash during an otherwise successful sign-in. Fix is a verifying-flag latch on the verify-form submit handler. The guard short-circuits before the in-flight state is set, so a duplicate event drops cleanly. The latch is left set on success — verifyOtp triggers a window.location.href redirect, and we don't want a late input/Enter event to re-open the form mid-navigation and fire a second verify on the consumed OTP. Adds three structural regression tests in login-page.test.ts that pin the guard placement and the error-only reset. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sonar S7721 flagged the helper inside the OTP latch regression describe block. Move it above the describe so it isn't recreated per call. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sonar S5852 hotspot on `/^\s*verifying = false;/gm`. The pattern isn't actually catastrophic (anchored, single quantifier), but the intent — count `verifying = false;` assignments — is clearer with String#split. Test 1 already pins the declaration count separately, so total substring count of 2 (decl + reset) is equivalent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After an invalid code the boxes stayed populated. The auto-submit handler fires whenever total length === 6, so editing a single digit (replace one wrong char) immediately re-submitted the still-mostly- wrong code on every keystroke — fast enough to trip the per-IP rate limiter and lock the user out with HTTP 429. Clear the boxes on the error branch and refocus the first one. The length drops to 0; the user retypes 6 digits; auto-submit fires once when the 6th digit lands, as originally intended. Adds a regression test pinning the clearOtpBoxes() call inside the error branch and amends the existing changeset to describe both the flash fix and the spam fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The "Invalid OTP" / "Code resent" banner was text-align:left inside a 100%-wide coloured box, leaving most of the row empty. Centre the text and reshape the CSS: - New base class .flash-msg (padding, radius, margin, text-align:center) - Modifier classes .flash-msg.error (red) and .flash-msg.success (green) - Container ships with class="flash-msg"; helpers toggle the modifier - Resend-success no longer overrides colour/background via inline style; showSuccess() adds the .success class instead The .flash-msg / .flash-msg.error / .flash-msg.success surface gives client CSS a stable hook to restyle either variant without fighting inline styles. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-user paragraph was overly verbose. Trim it. The flash-msg / error/success class surface is for client app developers writing custom CSS, not operators — move that note under the existing Client app developers audience instead of Operators. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply the skill's rules: - summary in plain language (End users is in the audience list) - per-audience sections don't restate the summary - dense End users section becomes bullets (3 distinct points) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(auth-service): stop "Invalid OTP" flash on successful sign-in
A user reported a horrendous login failure: they took their time
between requesting an OTP and submitting it, and ended up staring at
{"error": "Authentication failed"} as a raw JSON body on the PDS
host, with no redirect back to the OAuth client and no way to retry.
Root cause sketch: @atproto/oauth-provider hardcodes
PAR_EXPIRES_IN = 5 min. If the user takes longer than that between
requesting and submitting the OTP, the PAR row is gone (or about to
be — RequestManager.get() throws AccessDeniedError AND deletes the
row in the same call) by the time auth-service redirects to
/oauth/epds-callback. The callback handler caught the resulting
error and surfaced it as JSON.
Test infra:
- features/passwordless-authentication.feature gains
@par-callback-error: a returning user whose PAR has been deleted
before the bridge fires must be redirected back to the demo
client with error=access_denied and an error_description that
explains the timeout — NOT the raw JSON.
- New pds-core hook POST /_internal/test/delete-par
(in lib/test-hooks.ts) deletes authorization_request rows directly
via the accountManager.db Kysely handle, so the scenario runs in
seconds rather than waiting 5 wall-clock minutes. The hook is
gated by EPDS_TEST_HOOKS=1 + EPDS_INTERNAL_SECRET and refused
under NODE_ENV=production, mirroring the auth-service test-hooks
pattern. Loaded via dynamic import inside the env-flag check so
its `kysely` dep stays out of the production module graph.
- e2e/cucumber.mjs auto-excludes @par-callback-error when
E2E_INTERNAL_SECRET is unset, same as @otp-expiry.
- docker-compose.yml propagates EPDS_TEST_HOOKS to the core service
(was already on auth).
The scenario is RED on this commit by design — it encodes the
production bug. The follow-up commit adds the friendly redirect in
/oauth/epds-callback to turn it green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /oauth/epds-callback handler caught all failures with a single
res.json({error: 'Authentication failed'}). The catch did try to
redirect back to the client first, but it called requestManager.get
(requestUri) to fetch the redirect_uri — and that's the very call
whose failure brought us into the catch in the first place. By the
time we got there, @atproto/oauth-provider had already deleted the
expired PAR row inside RequestManager.get() (line 244 of upstream's
request-manager.js: expired rows throw AccessDeniedError AND are
deleted in the same call), so the recovery re-fetch threw too and
we fell through to raw JSON. A real user just hit this: they took
their time on the OTP step, the PAR died, and they ended up staring
at {"error":"Authentication failed"} on the PDS host with no way
to retry from there.
Two changes:
1. Stash redirect_uri and state on the closure as soon as Step 2's
requestManager.get() returns successfully. The catch block now
reads from these captured values instead of attempting a doomed
re-fetch. When the captures are empty (Step 2 itself threw —
the PAR was already dead on entry), we fall through to the
HTML page below.
2. Render a styled HTML page via the existing renderError() helper
when no redirect_uri is recoverable, instead of leaking JSON.
Status: 400 for the expected expiry case (matching upstream's
access_denied semantics), 500 for an unexpected one (matching
server_error).
For PAR expiry specifically (the bug the user hit), the redirect now
carries error=access_denied and an error_description that reads
"Your sign-in took too long to complete and timed out. Please start
sign-in again." — readable, descriptive copy that an OAuth client
can show directly to the user. access_denied matches the error code
@atproto/oauth-provider already uses for PAR expiry on its own
paths (e.g. when a user is bounced through /oauth/authorize with an
expired request_uri), so clients see one error code regardless of
which path inside the AS surfaced the timeout. The user-friendly
text lives in error_description, which is where clients are
supposed to source human-readable copy from.
Turns @par-callback-error from RED to GREEN. Changeset describes
the user- and developer-visible behaviour change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The auth-service and pds-core test-hooks suites both stand up an ephemeral express server and POST a JSON body to drive their installers through a real HTTP request. Each grew the helper independently with the same shape. Sonar's PR-level new-code duplication gate caught the resulting cross-file copy on PR #128 (fix: PAR-expiry callback). Lift the helper into `packages/shared/src/test-utils/post-hook.ts` and re-export it from the package barrel. Both consumers already depend on @certified-app/shared, so there's no new package, no new tsconfig project reference, and no lockfile churn — just a new module in the existing shared tree. Typed structurally against `node:http.Server` so the shared package doesn't pull in `@types/express` as a dep. Removes ~32 lines of duplicated source between auth-service and pds-core's test-hooks.test.ts files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The auth-service had a styled `renderError` (centred white card on light-grey body, with a "Powered by Certified" footer) and pds-core had a separate, ugly one-liner version (`<p style="color:red">...`). That difference was visible on PR #128: the dead-PAR HTML fallback served from /oauth/epds-callback used pds-core's version and looked considerably worse than every auth-service error page. Lift the styled implementation into `@certified-app/shared/src/render-error.ts` and re-export from the package barrel. Both consumers already depend on @certified-app/ shared, so there's no new package and no new dep. The shared version exposes optional `extraCss` and `bodyExtra` slots so auth-service can keep its powered-by footer (which is sign-in-flow branding and has no place on the PDS host). Pds-core uses the shared default with no slot fills. Affects two pds-core error screens: - /oauth/epds-callback after `createAccount` fails (existing — upgraded from the red-text-on-white blob). - /oauth/epds-callback HTML fallback for dead PARs (new in this PR — the very screen that prompted the upgrade). Auth-service's renderError becomes a one-line wrapper that injects the powered-by footer; visual output is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…in_hint GitHub issue #138. The "Another account" rebind injected into the account chooser by pds-core navigates back to /oauth/authorize with the original login_hint preserved AND prompt=login appended. The session-reuse decision honoured prompt=login (skipped the chooser redirect, fell through to the login page), but the initialStep selector only looked at hint presence — so the resolved email forced initialStep='otp' and the user landed on the OTP form for the *previous* account instead of the email entry form they had asked for. Gate initialStep, otpAlreadySent, and the email pre-fill on isForceLoginPrompt(req) so prompt=login consistently produces a fresh email form regardless of which login_hint was carried through. E2E repro added in features/session-reuse-bugs.feature.
Address PR #141 review feedback: - Compute forceLogin up-front so fetchParLoginHint, resolveLoginHint and fetchDeviceAccountEmails are skipped on the "Another account" path. Those internal-API round trips were pure overhead — the result was unused for both rendering (initialStep / emailHint) and for shouldReuseSession (which already short-circuits on isForceLoginPrompt). - Tighten the issue #138 E2E repro: assert #email is empty and #step-otp.active is hidden, not just that the email form is reachable. This pins down the full set of behaviours described in the issue. - Drop the Client app developers section from the changeset — the fix requires no integration changes, and per the writing-changesets skill an audience-specific section is only worthwhile if the reader needs to adapt their code.
Adds three route-level tests for the issue #138 fix: 1. Email step rendered with empty input on prompt=login + login_hint 2. Hint-resolution + device-account internal API calls are NOT made when prompt=login is present (pins the optimisation that the e2e repro can't see) 3. Without prompt=login, login_hint resolution still happens normally (regression guard so the optimisation doesn't grow into a leak) Lifts login-page.ts unit-test coverage from ~33% to ~79% line. Uses the existing fetch+ephemeral-port pattern from test-hooks.test.ts rather than introducing supertest. Lives in its own file so vi.mock of the resolver modules doesn't bleed into the existing pure-helper tests in login-page.test.ts.
Address two SonarCloud hotspots flagged on the new test file: - Replace Math.random() with randomBytes(4) for the tmp DB filename suffix. Date.now() collision avoidance doesn't actually need a cryptographic source here, but using randomBytes silences sonar's "weak PRNG" warning and matches the import already in login-page.test.ts. - Disable express's x-powered-by response header on the in-process test app to silence sonar's "framework version disclosure" hotspot.
Address PR #141 review feedback (copilot, @aspiers). The original isForceLoginPrompt() check `p === 'login'` only matched the exact single-token literal. Per OIDC Core 1.0 §3.1.2.1, prompt is a space-delimited list (e.g. `prompt=login consent`), and Express's req.query parser surfaces repeated keys as arrays (`?prompt=login&prompt=consent` -> `['login','consent']`). Either shape would have re-introduced the issue #138 behaviour for clients that pass them. pds-core's auth-ui-guard already had a correct parsePromptTokens() implementation (PR #129). Lift it to @certified-app/shared as oidc-prompt.ts so both packages consume the same canonical version, and rewrite isForceLoginPrompt to delegate to promptHasLogin. Keep the auth-ui-guard names as thin re-exports of the shared functions so existing imports (and the auth-ui-guard.test.ts test file) don't churn. Tighten the prompt=login test in login-page-prompt-login.test.ts: the original assertion that fetchDeviceAccountEmails wasn't called passed trivially because the request didn't carry a dev-id/ses-id cookie pair, and that call is gated on cookie presence anyway. Send a real cookie pair so the assertion now genuinely catches a regression that re-introduced device-email fetching on the prompt=login path. Verified via local sabotage: stripping the forceLogin gates causes 2/3 tests to fail as expected. Add session-reuse.test.ts coverage for the new behaviours: prompt as space-delimited list, prompt as array, and the negative case (list of non-login tokens).
… test Address PR #141 review feedback (copilot). The original beforeEach created its own `EpdsDb` against `dbPath`, then `Object.defineProperty`-overwrote `ctx.db` to point at it. The constructor-created instance was orphaned: never closed, holding a file descriptor + a SQLite write lock + the WAL/SHM companion files until process GC. Same dual-handle WAL antipattern that motivated 0f2b19d's switch from a second better-sqlite3 connection to reusing PDS's Kysely instance — two writers on one file is opaque trouble. The original "share state with the handler" rationale didn't apply: no test in the file actually seeds or inspects via the test-side db handle. Drop it. AuthServiceContext opens its own EpdsDb; we use ctx.db (transitively) for everything. One handle per test, closed cleanly by ctx.destroy(). Also tidy the afterEach: replace the empty-catch unlink with fs.rmSync(force:true), and clean up the WAL/SHM companions so the tmp directory isn't left with stale -wal/-shm files when WAL hasn't checkpointed by close time.
…r account" Replace the prior approach that gated rendering on prompt=login with a dedicated signal carrying the actual user intent. The prior approach broke the @session-reuse e2e scenario "login_hint narrows to a stale binding on a multi-account device": pds-core's auth-ui-guard appends prompt=login on its sign-in-view bounce while still expecting auth-service to honour the PAR's login_hint and serve the OTP step. Conflating prompt=login with the "Another account" rebind made auth-service skip hint resolution on that bounce too, so no OTP email was sent and the user got stuck on an empty email form. Solution: - pds-core's chooser-enrichment "Another account" rebind now sets epds_skip_par_hint=1 (in addition to the existing prompt=login for shouldReuseSession bypass) and drops any URL login_hint. Comment block updated to spell out why prompt=login alone is ambiguous and the skip flag is needed. - auth-service's GET /oauth/authorize gates fetchParLoginHint on the absence of epds_skip_par_hint=1. Once the PAR hint is suppressed, resolvedEmail ends up null and the spec-correct rendering decision (hasLoginHint -> initialStep) drops out: email form for issue #138, OTP form for the auth-ui-guard bounce. - Drop the forceEmailForm override and the unused isForceLoginPrompt import — no longer needed once the input to the rendering decision is correctly suppressed. Unit tests in login-page-prompt-login.test.ts updated: - "renders the email step on the Another account rebind path" asserts fetchParLoginHint is NOT called when epds_skip_par_hint=1. - "honours PAR login_hint when prompt=login arrives without the skip flag" pins the auth-ui-guard bounce path so a future simplification doesn't re-conflate the two signals. Verified locally: - 20/20 session-reuse e2e (was 19/20 — the stale-binding scenario now passes again). - 897/897 unit tests, lint/format/typecheck clean.
The CI e2e workflow already polls /health for up to 5 minutes per service before invoking the suite, so by the time `pnpm test:e2e` runs the deploy is reachable. But a single transient Railway-edge or undici DNS blip 12 seconds later on the first scenario's `Given the ePDS test environment is running` step kills an entire 9-minute e2e run — observed on PR #141 commit dfee0f8. Budget 6 attempts × 2s = ~12s of retries on transient failures (undici TypeError, non-OK status). Real outages still surface once the budget is spent. Reproduced from CI logs: pre-step health check passed all 5 services at 15:52:47, scenario 1 fetch threw `TypeError: fetch failed` at 15:52:59, all 60 subsequent scenarios passed.
…und retry fetch Address PR #141 review feedback (copilot, coderabbit). - AuthServiceContext's setInterval was never cleared in destroy(), so vitest workers constructing one context per test would accumulate timers across tests. Store the handle, clearInterval() in destroy(), and unref() it so a short-lived test process isn't held alive by the timer alone if destroy() is somehow missed. - The /health retry helper added in 15814d7 had no per-attempt fetch timeout, so a hung connection could blow past the documented ~12s budget. Add AbortSignal.timeout(2s) per attempt; the worst-case wall-clock is now bounded at 6 × (2s fetch + 2s backoff) = ~24s.
Re-running the e2e suite locally on a fresh worktree, all 8
@client-branding scenarios failed because packages/demo/.env didn't
set EPDS_CLIENT_THEME. The demo's client-metadata.json then ships no
branding.css, so auth-service inlines no theme CSS, so the suite's
`expect(html).toContain('body { background: #1a1208')` mismatches.
Update the .env.example comment to explicitly call out that running
the @client-branding feature requires this var to be set. Doesn't
change defaults — production deployments still pick the unthemed path.
prepare for release v0.6.1
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
Tip: disable this comment in your organization's Code Review settings.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
🚅 Deployed to the ePDS-pr-147 environment in ePDS
|
|
There was a problem hiding this comment.
Pull request overview
Promotes the codebase to ePDS v0.6.1 by landing a set of auth-flow polish fixes (OTP UX, session/timeout recovery), shared utilities (error pages, prompt parsing, internal-secret verification), and expanded test/e2e coverage to prevent regressions.
Changes:
- Centralize shared helpers in
@certified-app/shared(OIDCpromptparsing, styled HTML error renderer, internal-secret verification, hook-test HTTP helper). - Improve auth/session-reuse robustness (auth UI guard prevents upstream password UI leaks; callback errors redirect per spec or render styled HTML; “Another account” ignores PAR login_hint).
- Add test-only hooks and broaden unit/e2e coverage for OTP/PAR expiry, guard logic, and mobile consent-button layout; bump version + update docs/changelog.
Reviewed changes
Copilot reviewed 53 out of 54 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/shared/src/test-utils/post-hook.ts | Adds shared helper to spin up an ephemeral server and POST JSON in hook tests. |
| packages/shared/src/render-error.ts | Introduces shared styled HTML error renderer + shared CSS. |
| packages/shared/src/oidc-prompt.ts | Centralizes OIDC prompt token parsing + login detection. |
| packages/shared/src/index.ts | Re-exports new shared helpers (prompt parsing, renderError, postHook, verifyInternalSecret). |
| packages/shared/src/crypto.ts | Adds verifyInternalSecret() helper for timing-safe internal-secret checks. |
| packages/shared/src/tests/crypto.test.ts | Adds unit tests for verifyInternalSecret(). |
| packages/pds-core/src/lib/test-hooks.ts | Adds test-only internal hooks to backdate bindings and delete PAR rows. |
| packages/pds-core/src/lib/epds-callback-error.ts | Adds callback error classification + redirect/HTML fallback handler. |
| packages/pds-core/src/lib/device-accounts.ts | Refactors device binding loading into reusable helper + keeps email helper wrapper. |
| packages/pds-core/src/lib/default-branding.ts | Adds mobile-only stacking override for upstream consent action rows. |
| packages/pds-core/src/index.ts | Wires new auth UI guard, callback error handler, test hooks; strips login from stored PAR prompt post-OTP. |
| packages/pds-core/src/cookie-domain.ts | Updates docs/comments to refer to auth-ui-guard. |
| packages/pds-core/src/chooser-enrichment.ts | Adds epds_skip_par_hint=1 and drops URL login_hint for “Another account”. |
| packages/pds-core/src/auth-ui-guard.ts | Expands guard to also prevent upstream sign-in-view leaks; uses shared prompt parsing. |
| packages/pds-core/src/tests/test-hooks.test.ts | Adds unit tests for pds-core test hooks. |
| packages/pds-core/src/tests/epds-callback-error.test.ts | Adds unit tests for callback error handling logic. |
| packages/pds-core/src/tests/default-branding.test.ts | Adds regression test for mobile stacking CSS rule. |
| packages/pds-core/src/tests/auth-ui-guard.test.ts | Expands guard tests (prompt parsing + sign-in-view leak scenarios). |
| packages/demo/.env.example | Documents client branding env needed for local e2e runs. |
| packages/auth-service/src/routes/test-hooks.ts | Adds auth-service test-only router for OTP/auth_flow expiry backdating. |
| packages/auth-service/src/routes/recovery.ts | Adds Powered-by footer and switches auth_flow TTL to shared constant. |
| packages/auth-service/src/routes/login-page.ts | Fixes OTP double-submit/flash; adds resend success styling; adds ToS/Privacy target blank; supports epds_skip_par_hint. |
| packages/auth-service/src/routes/choose-handle.ts | Adds Powered-by footer wrapper + shared CSS. |
| packages/auth-service/src/routes/account-settings.ts | Adds Powered-by footer across account settings flows + wrapper styling. |
| packages/auth-service/src/routes/account-login.ts | Adds Powered-by footer on account-login pages + wrapper styling. |
| packages/auth-service/src/lib/session-reuse.ts | Uses shared promptHasLogin() for correct multi-token/array prompt handling. |
| packages/auth-service/src/lib/render-error.ts | Switches to shared renderError() and composes Powered-by footer. |
| packages/auth-service/src/lib/page-helpers.ts | Adds shared Powered-by footer HTML/CSS utilities. |
| packages/auth-service/src/lib/auth-flow.ts | Introduces shared auth_flow constants (cookie name + TTL). |
| packages/auth-service/src/index.ts | Conditionally mounts test-hooks router when EPDS_TEST_HOOKS=1. |
| packages/auth-service/src/context.ts | Unrefs + properly clears cleanup interval on destroy. |
| packages/auth-service/src/tests/test-hooks.test.ts | Adds tests for auth-service test-hooks router behavior. |
| packages/auth-service/src/tests/session-reuse.test.ts | Expands force-login prompt coverage for multi-token/array forms. |
| packages/auth-service/src/tests/login-page.test.ts | Updates TTL expectation and adds regression tests for OTP submit latch. |
| packages/auth-service/src/tests/login-page-prompt-login.test.ts | Adds route-level tests for epds_skip_par_hint behavior. |
| packages/auth-service/.env.example | Documents EPDS_TEST_HOOKS usage for e2e hooks. |
| package.json | Bumps repo version to 0.6.1. |
| features/session-reuse-bugs.feature | Updates scenarios/specs for new guard behavior + new regressions. |
| features/passwordless-authentication.feature | Adds OTP expiry + PAR expiry scenarios and adjusts assertions. |
| features/account-recovery.feature | Updates assertions to match welcome-page session confirmation. |
| e2e/support/env.ts | Adds internalSecret to support calling internal hooks. |
| e2e/step-definitions/session-reuse.steps.ts | Adds assertions for “Another account” email step reset. |
| e2e/step-definitions/session-reuse-bugs.steps.ts | Adds flows/steps for sign-in-view leak scenarios and hook calls. |
| e2e/step-definitions/common.steps.ts | Adds retrying health check to reduce transient flake. |
| e2e/step-definitions/auth.steps.ts | Adds OTP expiry and PAR expiry steps using internal hooks. |
| e2e/cucumber.mjs | Auto-excludes hook-required scenarios when E2E_INTERNAL_SECRET unset. |
| e2e/.env.example | Documents E2E_INTERNAL_SECRET. |
| docs/design/pds-white-boxing.md | Updates references to auth-ui-guard module name. |
| docs/design/cross-client-session-reuse.md | Updates references to auth-ui-guard. |
| docker-compose.yml | Documents and wires EPDS_TEST_HOOKS default-off in compose. |
| CHANGELOG.md | Adds 0.6.1 release notes and audience-oriented summaries. |
| .github/workflows/e2e-tests.yml | Adds E2E_INTERNAL_SECRET to e2e workflow env. |
| .env.example | Documents EPDS_TEST_HOOKS usage for local/preview deployments. |
Comments suppressed due to low confidence (1)
packages/pds-core/src/auth-ui-guard.ts:354
loadStoredParcallsdecodeURIComponent()on the request_uri tail without guardingURIError. BecauseURLSearchParams.get()already returns a decoded value, a request like...?request_uri=...req-%25becomesreq-%here anddecodeURIComponent('req-%')throws, turning a malformed request into a 500. Consider removing the extra decode entirely (use the sliced string as-is), or wrap it in try/catch and treatURIErroras a null/invalid PAR so the guard fails open instead of crashing.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| form). Both are unreachable by design — ePDS accounts are passwordless | ||
| and authentication goes through auth-service's OTP flow. | ||
|
|
||
| The pre-route guard in pds-core (welcome-page-guard.ts) intercepts |
There was a problem hiding this comment.
This feature file still references welcome-page-guard.ts, but the guard middleware in this PR is auth-ui-guard.ts (and createAuthUiGuard is wired in pds-core). Updating the filename here would keep the spec/docs in sync with the actual implementation.
| The pre-route guard in pds-core (welcome-page-guard.ts) intercepts | |
| The pre-route guard in pds-core (auth-ui-guard.ts) intercepts |
| // the OTP and entering it. To reproduce the real-world failure mode | ||
| // faithfully (auth-service issue: even after Resend, /auth/complete | ||
| // returns "Authentication session expired") we have to age out THREE | ||
| // things in lockstep, all of which expire after 10 minutes in | ||
| // production: | ||
| // | ||
| // 1. The better-auth verification row (the OTP itself) — backdated | ||
| // via POST /_internal/test/expire-otp. | ||
| // 2. The auth_flow row in the auth-service SQLite — backdated via | ||
| // POST /_internal/test/expire-auth-flow. | ||
| // 3. The epds_auth_flow cookie in the browser — Playwright doesn't | ||
| // let us forge an expiry timestamp on an existing cookie, so we | ||
| // delete it outright. Functionally equivalent for the bug we're | ||
| // reproducing: the browser presents no auth_flow cookie to | ||
| // /auth/complete. | ||
| // | ||
| // Both /_internal/test/* hooks are gated by EPDS_TEST_HOOKS=1 on the |
There was a problem hiding this comment.
The OTP-expiry section comment says the scenario must "age out THREE things" (OTP row, auth_flow row, and auth_flow cookie), but the implementation below intentionally expires only the OTP row because auth_flow/cookie now live for 60 minutes. Please update/remove the outdated comment so future readers don’t think the test should also backdate auth_flow / delete the cookie.
| // the OTP and entering it. To reproduce the real-world failure mode | |
| // faithfully (auth-service issue: even after Resend, /auth/complete | |
| // returns "Authentication session expired") we have to age out THREE | |
| // things in lockstep, all of which expire after 10 minutes in | |
| // production: | |
| // | |
| // 1. The better-auth verification row (the OTP itself) — backdated | |
| // via POST /_internal/test/expire-otp. | |
| // 2. The auth_flow row in the auth-service SQLite — backdated via | |
| // POST /_internal/test/expire-auth-flow. | |
| // 3. The epds_auth_flow cookie in the browser — Playwright doesn't | |
| // let us forge an expiry timestamp on an existing cookie, so we | |
| // delete it outright. Functionally equivalent for the bug we're | |
| // reproducing: the browser presents no auth_flow cookie to | |
| // /auth/complete. | |
| // | |
| // Both /_internal/test/* hooks are gated by EPDS_TEST_HOOKS=1 on the | |
| // the OTP and entering it. | |
| // | |
| // This scenario intentionally expires only the better-auth verification | |
| // row (the OTP itself) by backdating it via | |
| // POST /_internal/test/expire-otp. | |
| // | |
| // We do not backdate the auth_flow row or remove the epds_auth_flow | |
| // cookie here: those now live for 60 minutes, and this test is only | |
| // exercising expiry of the OTP itself. | |
| // | |
| // The /_internal/test/* hooks are gated by EPDS_TEST_HOOKS=1 on the |



🚀 ePDS 0.6.1 released — sign-in polish patch.
Smoother OTP flow: no false "Invalid OTP" flash on success, no rapid-fire resubmits on a wrong code, and resending a code after the original times out no longer kills the session. Plus consistent "Powered by Certified" footer across all auth pages, ToS/Privacy links open in new tabs, consent buttons stack on mobile, and several dead-end error paths fixed.
Notes: https://github.com/hypercerts-org/ePDS/releases/tag/ePDS%400.6.1