Cognito Authentication Flows

How sign-in, password reset, and signup behave across the three Phenom Cognito pools. Includes the custom_message Lambda, the signup-disabled lockdown, mobile-app integration notes, and the operations playbook. Reflects live state as of 2026-05-19.

Companion to Cognito Email via SES, which documents the email-delivery layer (SES wiring, identity policy, DKIM). This page documents the auth flows that ride on top: sign-in, password reset, signup lockdown, and operations.

Pool inventory (live)

Three Cognito user pools, all in us-east-1, all in AWS account 657033058608:

Name | Pool ID | Estimated users | Used by
phenom-staging | us-east-1_n8gO6SbP6 | 13 | Mobile app production build (intentional staging-as-prod)
phenom-prod | us-east-1_knEL7cqS3 | 4 | Reserved for future production cutover
phenom-dev-local | us-east-1_AkG9mnbjA | 5 | Local-dev workstation (localhost:8080)

App clients with callbacks:

Pool | Client name | Client ID | Callback URLs
phenom-staging | phenom-dev-hasura-client | 6sjjnkaeagnqgkmbl1mr5rtfsr | http://localhost:3000/*
phenom-staging | phenom-dev-synapse-oidc | 73q703cql980nrvq554a6sta54 | https://chat-testing.thephenom.app/_synapse/client/oidc/callback
phenom-prod | phenom-prod-hasura-client | 8uun49ru7f3fdvmlc12vqig3a | https://www.thephenom.app/*
phenom-dev-local | phenom-dev-hasura-client-local | 2eq1vf0nvl5o3rha2vshm8j0mn | http://localhost:8080/*
phenom-dev-local | phenom-dev-nest-ops | 5u6atviker41lm8qknqua56sdc | https://nest-ops.thephenom.app/oauth2/idpresponse

Hosted UI domains (Cognito-managed):

  • https://us-east-1n8go6sbp6.auth.us-east-1.amazoncognito.com (phenom-staging)
  • https://phenom-prod-hasura-auth.auth.us-east-1.amazoncognito.com (phenom-prod)
  • https://phenom-dev-hasura-auth.auth.us-east-1.amazoncognito.com (phenom-dev-local)

Self-signup disabled (issue #72)

All three pools enforce admin-only user creation. Live state (verified 2026-05-19):

admin_create_user_config {
  allow_admin_create_user_only = true
}

  • SignUp API returns NotAuthorizedException: SignUp is not permitted for this user pool.
  • Hosted UI /signup is hidden. The “Sign up” link is removed from the hosted UI /login page.
  • User creation continues via admin-create-user (and via Terraform).
  • Password reset, admin invite, and existing auth flows are unaffected.

Live probe to confirm:

aws cognito-idp sign-up \
  --client-id 6sjjnkaeagnqgkmbl1mr5rtfsr \
  --username probe@example.com --password 'NoSignups!2026' \
  --region us-east-1
# → NotAuthorizedException: SignUp is not permitted for this user pool

Implemented in phenom-infra commit 75afcd8, which closed issue #72.
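
For a scripted version of the probe, the sketch below issues the same SignUp call over Cognito's JSON-over-HTTP protocol and classifies the answer. It assumes Node 18+ (global fetch). The function names and the OPEN / LOCKED / INCONCLUSIVE labels are hypothetical, not part of phenom-infra.

```javascript
// Hypothetical helper names. The endpoint, Content-Type, and X-Amz-Target follow
// the Cognito JSON-over-HTTP protocol that the CLI and SDKs speak.
const COGNITO_ENDPOINT = "https://cognito-idp.us-east-1.amazonaws.com/";

// Build the fetch() arguments for an unauthenticated SignUp call.
function buildSignUpProbe(clientId, username, password) {
  return {
    url: COGNITO_ENDPOINT,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/x-amz-json-1.1",
        "X-Amz-Target": "AWSCognitoIdentityProviderService.SignUp",
      },
      body: JSON.stringify({ ClientId: clientId, Username: username, Password: password }),
    },
  };
}

// Classify the probe result. Cognito error bodies carry the exception name in
// `__type`; a namespace prefix before "#" is stripped defensively.
function classifyProbe(httpStatus, body) {
  if (httpStatus === 200) return "OPEN"; // a user was created: lockdown is NOT in effect
  const type = String((body && body.__type) || "").split("#").pop();
  if (type === "NotAuthorizedException") return "LOCKED";
  return "INCONCLUSIVE"; // e.g. throttling or a password-policy error
}

// Run the probe end to end (needs network access).
async function probeSignup(clientId) {
  const { url, init } = buildSignUpProbe(clientId, "probe@example.com", "NoSignups!2026");
  const res = await fetch(url, init);
  return classifyProbe(res.status, await res.json().catch(() => null));
}
```

An OPEN result means the probe user was actually created and needs cleaning up, so the script is only safe to run against pools that are expected to be locked.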

Password reset flow

What the user experiences

  1. Client (mobile app, hosted UI, or web) calls ForgotPassword against the pool’s app client.
  2. Cognito generates a 6-digit code and invokes the custom_message Lambda with triggerSource = "CustomMessage_ForgotPassword".
  3. The Lambda returns a branded HTML body containing the code prominently plus a “Reset password” button linking to https://www.thephenom.app/reset-password?code={####}&email=<user>. Cognito substitutes the literal {####} placeholder with the actual code before sending to SES.
  4. SES delivers the email from Phenom <noreply@thephenom.app> (DKIM-signed; details on the SES page).
  5. User enters the code + new password on the reset surface. The client calls ConfirmForgotPassword. Done.

The custom_message Lambda

PR #71 (merged 2026-05-18) added a per-environment custom_message Lambda:

  • phenom-dev-cognito-custom-message (shared by phenom-staging and phenom-dev-local)
  • phenom-prod-cognito-custom-message (phenom-prod)

The Lambda intercepts only CustomMessage_ForgotPassword. All other trigger sources (sign-up, admin invite, attribute verification, MFA challenge) pass through untouched, so Cognito uses its built-in defaults for those. The Lambda code lives at environments/{development,production}/lambda-functions/cognito-custom-message/index.js (parallel copies; identical contents).

The Lambda environment variable PASSWORD_RESET_URL controls the link destination. It defaults to https://www.thephenom.app/reset-password (set via local.password_reset_url in each environment’s locals.tf).

The Lambda never sees the real reset code. It receives only the {####} placeholder, and Cognito performs the substitution after the Lambda returns. This is a useful security property: the code cannot land in the Lambda’s CloudWatch logs.
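
A minimal sketch of a handler with the behaviour described above, for orientation only. The event fields (triggerSource, request.codeParameter, request.userAttributes, response.emailSubject, response.emailMessage) are the standard Cognito custom-message trigger shape; the subject line, markup, and the applyCustomMessage name are illustrative assumptions and not the contents of the deployed index.js.

```javascript
// Sketch only: subject, markup, and helper name are illustrative.
const DEFAULT_RESET_URL = "https://www.thephenom.app/reset-password";

// Mutates and returns the Cognito event, as custom-message triggers are expected to.
function applyCustomMessage(event, resetUrl = DEFAULT_RESET_URL) {
  // Every other trigger source passes through untouched, so Cognito
  // falls back to its built-in message for it.
  if (event.triggerSource !== "CustomMessage_ForgotPassword") return event;

  // codeParameter is the literal placeholder "{####}", never the real code.
  const code = event.request.codeParameter;
  const email = encodeURIComponent(event.request.userAttributes.email || "");
  // The placeholder is left unencoded so Cognito can still find and replace it.
  const link = `${resetUrl}?code=${code}&email=${email}`;

  event.response.emailSubject = "Reset your Phenom password";
  event.response.emailMessage =
    `<p>Your password reset code is <strong>${code}</strong>.</p>` +
    `<p><a href="${link}">Reset password</a></p>`;
  return event;
}

// Lambda entry point; in a CommonJS index.js this would be assigned to exports.handler.
async function handler(event) {
  const env = typeof process !== "undefined" ? process.env : {};
  return applyCustomMessage(event, env.PASSWORD_RESET_URL || DEFAULT_RESET_URL);
}
```

Because the function only ever handles the placeholder, logging the whole event for debugging would still not expose a usable code.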

Mobile-app integration (what the app needs to do)

The phenom-infra side is complete. The mobile app needs to:

  1. Call ForgotPassword when the user taps “Forgot password”:

    await cognito.forgotPassword({
      ClientId: COGNITO_CLIENT_ID,  // 6sjjnkaeagnqgkmbl1mr5rtfsr for the current live build
      Username: email,
    })

    This triggers the email. The API also returns CodeDeliveryDetails (destination, medium) which the app should surface (“Code sent to t***@example.com”).

  2. Call ConfirmForgotPassword when the user enters the code + new password:

    await cognito.confirmForgotPassword({
      ClientId: COGNITO_CLIENT_ID,
      Username: email,
      ConfirmationCode: code,
      Password: newPassword,
    })

  3. No callback URL needed. ForgotPassword and ConfirmForgotPassword are public Cognito endpoints; they do not use the OAuth callback flow. The client just needs ClientId + Username + (for confirm) ConfirmationCode + Password.

  4. Auth flows configured on the staging client (as of 2026-05-19, matching prod): ALLOW_USER_SRP_AUTH, ALLOW_USER_PASSWORD_AUTH, ALLOW_REFRESH_TOKEN_AUTH. Use SRP for sign-in.
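
Two small helpers the app could wrap around the calls in steps 1 and 2, sketched under assumptions: describeDelivery and isPlausibleCode are hypothetical names, and the fallback wording is invented. CodeDeliveryDetails.Destination is the field the ForgotPassword response returns, already masked by Cognito.

```javascript
// Turn the CodeDeliveryDetails object from a ForgotPassword response into a
// user-facing line. Destination arrives already masked by Cognito.
function describeDelivery(details) {
  if (!details || !details.Destination) {
    return "Check your inbox for a reset code."; // hypothetical fallback wording
  }
  return `Code sent to ${details.Destination}`;
}

// Cheap client-side check before calling ConfirmForgotPassword:
// the reset code is six digits, so anything else can be rejected locally.
function isPlausibleCode(input) {
  return /^\d{6}$/.test(String(input).trim());
}
```

Rejecting malformed codes locally avoids spending Cognito attempts on obvious typos; the server remains the only authority on whether a well-formed code is correct.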

Reset URL destination (live as of 2026-05-19)

https://www.thephenom.app/reset-password is live. The implementation lives in the Phenom-earth/www repo at web/reset-password/ (PR #117, merged 2026-05-19T15:46:30Z).

Behaviour:

  • GET /reset-password returns HTTP 308 redirect to /reset-password/ (trailing-slash convention). Query string is preserved through the redirect, so the Lambda’s URL works as-is without a Lambda change.
  • GET /reset-password/?email=…&code=… returns HTTP 200 and renders the form: the email field is read-only and prefilled, the code is prefilled when the URL value is exactly six digits, and focus jumps to the new-password field.
  • On submit the page POSTs to https://cognito-idp.us-east-1.amazonaws.com/ with X-Amz-Target: AWSCognitoIdentityProviderService.ConfirmForgotPassword. Zero SDK dependency, vanilla JS, ~150 lines.
  • The page hardcodes the staging hasura client (6sjjnkaeagnqgkmbl1mr5rtfsr) today because that is the pool the live mobile app build targets. When the prod-pool cutover happens, swap the constant in web/reset-password/reset-password.js or add a ?cid= URL parameter and a small routing layer.
  • Cognito error mapping handled for the common cases: CodeMismatchException, ExpiredCodeException, InvalidPasswordException, LimitExceededException, TooManyFailedAttemptsException, UserNotFoundException. Other errors surface the raw Cognito message.
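
The behaviour listed above can be sketched in a few functions. This is an illustration under assumptions, not a copy of reset-password.js: the function names and the friendly wording are invented, while the endpoint, the X-Amz-Target value, the request fields, and the exception names come from the description above.

```javascript
// Sketch of the page logic; names and friendly wording are illustrative.
const COGNITO_ENDPOINT = "https://cognito-idp.us-east-1.amazonaws.com/";

// Read ?email= and ?code= from the query string. The code is kept only when
// it is exactly six digits, mirroring the prefill rule.
function parsePrefill(search) {
  const params = new URLSearchParams(search);
  const code = params.get("code") || "";
  return {
    email: params.get("email") || "",
    code: /^\d{6}$/.test(code) ? code : "",
  };
}

// Build the fetch() arguments for an SDK-free ConfirmForgotPassword call.
function buildConfirmRequest({ clientId, email, code, password }) {
  return {
    url: COGNITO_ENDPOINT,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/x-amz-json-1.1",
        "X-Amz-Target": "AWSCognitoIdentityProviderService.ConfirmForgotPassword",
      },
      body: JSON.stringify({
        ClientId: clientId,
        Username: email,
        ConfirmationCode: code,
        Password: password,
      }),
    },
  };
}

const FRIENDLY_ERRORS = new Map([
  ["CodeMismatchException", "That code is not correct. Check the email and try again."],
  ["ExpiredCodeException", "That code has expired. Request a new one."],
  ["InvalidPasswordException", "The new password does not meet the password policy."],
  ["LimitExceededException", "Too many attempts. Wait a while and try again."],
  ["TooManyFailedAttemptsException", "Too many failed attempts. Wait a while and try again."],
  ["UserNotFoundException", "We could not find an account for that email."],
]);

// Map a Cognito error body ({ __type, message }) to display text,
// falling back to the raw Cognito message for unmapped errors.
function friendlyError(body) {
  const type = String((body && body.__type) || "").split("#").pop();
  return FRIENDLY_ERRORS.get(type) || (body && body.message) || "Password reset failed.";
}
```

On submit the page would pass the result of buildConfirmRequest to fetch(url, init) and, on a non-2xx response, show friendlyError(await res.json()).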

Verified end-to-end via Interceptor on ai (logged-in CF Access cookie): the production URL serves the form, the form prefills correctly, and a synthetic code submission round-trips through Cognito and renders the user-friendly error. The success path (valid code + valid password) has not yet been exercised in automation; it requires reading a fresh code from WorkMail.

Mobile-app users would still benefit from in-app reset (no email click needed). Universal Links / App Links remain an option for a future iteration; they require apple-app-site-association + assetlinks.json served from www plus mobile-app entitlements.

Operations

Test accounts

One CONFIRMED test user lives in each pool (created 2026-05-19 for password-reset validation; the usernames are pool-distinguishable so the email recipient can tell which pool sent the reset):

Pool | Test user | Notes
phenom-staging | test-staging@thephenom.app | CONFIRMED, email_verified
phenom-prod | test-prod@thephenom.app | CONFIRMED, email_verified
phenom-dev-local | test-devlocal@thephenom.app | CONFIRMED, email_verified

Mail to *@thephenom.app is routed via SES inbound (inbound-smtp.us-east-1.amazonaws.com MX) to the WorkMail organisation m-85dbc6db1b474331af97f5ce0e777740. Shared initial password is held by on-call; rotate after live validation work and use admin-set-user-password to reset.

Testing playbook

Trigger a forgot-password from the CLI:

aws cognito-idp forgot-password \
  --client-id 6sjjnkaeagnqgkmbl1mr5rtfsr \
  --username test-staging@thephenom.app \
  --region us-east-1

Fetch the last five minutes of the Lambda log:

aws logs filter-log-events \
  --log-group-name /aws/lambda/phenom-dev-cognito-custom-message \
  --start-time $(( ($(date +%s) - 300) * 1000 )) \
  --region us-east-1

Watch the SES Send metric (the date -d syntax below is GNU date; on macOS/BSD use date -u -v-10M +%FT%TZ for the start time):

aws cloudwatch get-metric-statistics \
  --namespace AWS/SES --metric-name Send \
  --start-time $(date -u -d '10 minutes ago' +%FT%TZ) \
  --end-time $(date -u +%FT%TZ) \
  --period 60 --statistics Sum --region us-east-1

Confirm reset (after the user reads the code from the inbox):

aws cognito-idp confirm-forgot-password \
  --client-id 6sjjnkaeagnqgkmbl1mr5rtfsr \
  --username test-staging@thephenom.app \
  --confirmation-code XXXXXX \
  --password 'NewPassword!2026Aa#' \
  --region us-east-1

Hosted-UI URLs for manual demos

phenom-staging
https://us-east-1n8go6sbp6.auth.us-east-1.amazoncognito.com/forgotPassword?client_id=6sjjnkaeagnqgkmbl1mr5rtfsr&response_type=token&scope=email+openid+profile&redirect_uri=http%3A%2F%2Flocalhost%3A3000%2F

phenom-prod
https://phenom-prod-hasura-auth.auth.us-east-1.amazoncognito.com/forgotPassword?client_id=8uun49ru7f3fdvmlc12vqig3a&response_type=token&scope=email+openid+profile&redirect_uri=https%3A%2F%2Fwww.thephenom.app%2F

phenom-dev-local
https://phenom-dev-hasura-auth.auth.us-east-1.amazoncognito.com/forgotPassword?client_id=2eq1vf0nvl5o3rha2vshm8j0mn&response_type=token&scope=email+openid+profile&redirect_uri=http%3A%2F%2Flocalhost%3A8080%2F

All three Hosted UI domains are live (HTTP 200 as of 2026-05-19).
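
The three demo URLs share one shape, so they can be generated rather than hand-edited. A small sketch, with forgotPasswordUrl as a hypothetical helper name; the query parameters and their order are taken from the URLs above.

```javascript
// Build a Hosted UI forgot-password URL for a given domain, client, and redirect.
function forgotPasswordUrl(domain, clientId, redirectUri) {
  const query = new URLSearchParams({
    client_id: clientId,
    response_type: "token",
    scope: "email openid profile", // serialised as email+openid+profile
    redirect_uri: redirectUri, // percent-encoded by URLSearchParams
  });
  return `https://${domain}/forgotPassword?${query}`;
}

// Example: reproduces the phenom-staging URL listed above.
const stagingUrl = forgotPasswordUrl(
  "us-east-1n8go6sbp6.auth.us-east-1.amazoncognito.com",
  "6sjjnkaeagnqgkmbl1mr5rtfsr",
  "http://localhost:3000/"
);
```

The redirect_uri still has to match one of the callback URLs registered on the app client, so the helper does not make arbitrary redirects work.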

CI/CD

Workflows

  • Chat Infrastructure CI (.github/workflows/chat-ci.yml in phenom-infra). Plan + test + security + deploy for the development environment. Tier 4 auto-applies a narrow target on push to main: terraform apply -target=module.chat_synapse -target=module.chat_mcp_server. Path filter: modules/chat-*/** and environments/development/**.
  • Production Infrastructure CI (.github/workflows/prod-infra-ci.yml, added 2026-05-19). Plan + security + manual apply for the production environment. Plans on every push to main and every PR touching environments/production/** or shared modules/**. Apply is never automatic. The operator triggers workflow_dispatch with confirm: CONFIRM and an audit-trail reason string. Closes the prod drift-detection gap.

Both workflows authenticate via OIDC to the same IAM role (phenom-dev-github-actions). On 2026-05-19 the role was extended with cognito-idp:Get*, lambda:GetFunction* / ListVersionsByFunction / GetPolicy, secretsmanager:GetResourcePolicy, ses:Get* / List*, sesv2:GetEmailIdentity*, and rds:ListTagsForResource to satisfy AWS provider 6.x refresh calls. It was also granted access to the phenom-production-tfstate bucket so the same role can plan against both environments.

Terraform state

  • Backends: S3 (phenom-{development,production}-tfstate buckets in us-east-1).
  • DynamoDB state lock (new 2026-05-19): table terraform-locks. Both backends now declare dynamodb_table = "terraform-locks" and encrypt = true. Prevents concurrent-apply state corruption. Lock acquire/release is visible in terraform apply output.

Lambda packaging gotcha

data "archive_file" blocks zip the entire source_dir, including untracked files. A stray bun.lock or .DS_Store in a Lambda source directory caused source_code_hash drift between machines. Resolved 2026-05-19 (phenom-infra commit fd2f5bf):

  • .gitignore now excludes **/lambda-functions/**/bun.lock and **/.DS_Store globally.
  • Drifted Lambdas (hasura_action_phenom_handler dev+prod, hasura_cognito_sync_users dev+prod, file_validator dev) were re-applied with clean source dirs so deployed zips match repo source.

If you see source_code_hash drift on the next plan, check for untracked files in the Lambda source dir before applying.
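
One way to automate that check, sketched under assumptions: feed the output of git ls-files --others (run without exclude flags, so ignored files are listed too) into a filter that keeps only paths inside a lambda-functions directory. strayLambdaFiles is a hypothetical helper, not an existing script in phenom-infra.

```javascript
// Given the text output of `git ls-files --others`, return the untracked paths
// that sit inside any lambda-functions directory and would therefore be
// swept into an archive_file zip.
function strayLambdaFiles(gitOutput) {
  return gitOutput
    .split("\n")
    .map((line) => line.trim())
    .filter((path) => path.length > 0)
    .filter((path) => path.split("/").includes("lambda-functions"));
}

// Example input, shaped like the git output.
const sample = [
  "environments/development/lambda-functions/cognito-custom-message/bun.lock",
  "environments/production/lambda-functions/cognito-custom-message/.DS_Store",
  "notes/scratch.txt",
  "",
].join("\n");

const stray = strayLambdaFiles(sample);
// stray holds the two lambda-functions paths; notes/scratch.txt is ignored.
```

Legitimately untracked build inputs, such as an installed node_modules directory, would also show up in this list and need judging case by case.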

Recent commits

phenom-infra:

Date | Commit | What
2026-05-16 | 9bff2ef | PR #69 SES email delivery on all three pools
2026-05-18 | cd3f582 | PR #71 custom_message Lambda with reset URL
2026-05-18 | 9204f46 | IAM extension for AWS provider 6.x refresh
2026-05-19 | 1f20a65 | IAM add cognito-idp:Get*
2026-05-19 | 61d8cab | Docker login user fix (applepublic, not applepublicdotcom)
2026-05-19 | 75afcd8 | Disable self-signup on all three pools (#72)
2026-05-19 | fd2f5bf | Hardening bundle (DDB lock, drift fix, prod CI, client parity, IAM expansion) (#73)
2026-05-19 | 3b69ac8 | prod-infra-ci contents: write permission for commit-comment step
2026-05-19 | dc9890f | Narrow cognito-idp:Get* to specific actions, satisfy Checkov CKV_AWS_287

www (the reset page itself):

Date | Commit | What
2026-05-19 | 572c0c9 | PR #117 /reset-password page (HTML + zero-dep vanilla JS)

phenom-earth-docs:

Date | PR | What
2026-05-19 | PR #276 | scripts/canonicalize-dossier-graph-urls.py + canonicalized graph.json (slug-rule reconciliation against the manifest)
2026-05-19 | PR #275 | this page

Closed issues

  • phenom-infra#70 Add password-reset email URL via custom_message Lambda (closed via PR #71)
  • phenom-infra#72 Disable self-signup on all three user pools (closed via commit 75afcd8)
  • phenom-infra#73 Hardening bundle (closed via commit fd2f5bf)
  • www#116 /reset-password page (closed via PR #117 merge)

Known follow-ups (outside the work above)

  • Mobile app ForgotPasswordScreen is a stub. PhenomApp/.../Account/ForgotPasswordScreen.tsx:50 has onPress={() => {}} on the Resend button. The Cognito reset email is delivered, but the mobile app does not yet call ForgotPassword or ConfirmForgotPassword. Owner: mobile dev. Once wired, mobile users skip the web reset page entirely.
  • Upstream graph generator for disclosure-dossier-<release>-graph.json should produce canonical S3-matching URLs in the first place, making canonicalize-dossier-graph-urls.py a belt-and-suspenders defence rather than a hot patch. Generator lives outside phenom-earth-docs.
  • Pages Function SigV4 strict encoding. functions/files/disclosure-dossier/[[path]].ts line 182 uses encodeURIComponent for the canonical path. That does not strict-encode ' ( ) * ! (S3 needs %27 %28 %29 %2A %21). Today no canonical S3 key contains any of those, so this is latent rather than active. Worth a small awsUriEncode helper to close the gap.
  • Admin-sandbox reset-password-form.tsx in phenom-backend reads ?email= but not ?code=. Adding ?code= prefill there is redundant now that the live page on www handles both; it stays on this list only in case the admin-sandbox is ever deployed.
  • Cloudflare edge negative-cache. Observed during dossier validation: edge served stale 404 from the int-docs Pages Function despite cache-control: no-store on error paths. ?_=<ts> cache-busting walked around it. Worth confirming whether the no-store header is honoured at the edge.
  • failure_threshold deprecated on aws_service_discovery_service in modules/ecs/services.tf. Provider warning today, breaking in a future provider major.
  • Lambda code duplication between environments/{development,production}/lambda-functions/ for hasura-cognito-trigger, hasura-cognito-sync-users, hasura-action-phenom-handler, cognito-custom-message. Consolidate into modules/lambdas/<name>/.
  • invite_message_template not set on admin_create_user_config. Admin-invite emails use plain Cognito boilerplate; should be branded like the password-reset HTML body (a second custom_message Lambda branch, triggerSource === 'CustomMessage_AdminCreateUser').
  • phenom-mailer Cloudflare Worker still does not exist. Per the 2026-04-18 directive, transactional email should eventually route through workers/phenom-mailer/ for consistency.
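
For the SigV4 strict-encoding item above, the awsUriEncode helper could look roughly like the sketch below. It layers the five missing escapes on top of encodeURIComponent; awsEncodePath is an additional hypothetical helper that applies it per path segment. Neither exists in the Pages Function today.

```javascript
// Percent-encode one path segment strictly: encodeURIComponent leaves
// ! ' ( ) * unescaped, so they are escaped in a second pass.
function awsUriEncode(segment) {
  return encodeURIComponent(segment).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

// Encode a whole path while keeping "/" as the segment separator.
function awsEncodePath(path) {
  return path.split("/").map(awsUriEncode).join("/");
}
```

For example, awsUriEncode("a'b(c)*!") yields "a%27b%28c%29%2A%21", where encodeURIComponent alone would return the input unchanged.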

See also: Cognito Email via SES for the email-delivery layer.

Maintained by infra-on-call. Update this page when Cognito state changes materially.