Deployment Guide

Terraform deployment procedures, Docker image builds, DNS configuration, and post-deployment setup for the Phenom Chat system.

This guide covers deploying and updating the Phenom Chat infrastructure. It assumes familiarity with Terraform, AWS CLI, and Docker.

Prerequisites

| Tool | Version | Purpose |
|------|---------|---------|
| Terraform | >= 1.0 | Infrastructure as code |
| AWS CLI | v2 | AWS resource management |
| Docker | Latest | Image builds (on ai.matthewstevens.org only) |
| SSH access | ai alias configured | Build server for Docker images |

Terraform Modules Overview

The chat system is composed of five Terraform modules, all under modules/ in the phenom-infra repository:

| Module | Purpose | Key Resources |
|--------|---------|---------------|
| chat-shared | Shared Cognito clients and secrets | Cognito OIDC client (for Synapse), Cognito agent client (for MCP), Secrets Manager secret |
| chat-link-preview | Link preview resolution | Lambda function, IAM role, CloudWatch log group |
| chat-hasura-lite | Chat schema and Hasura metadata | SQL migration script, metadata setup script (run as ECS tasks) |
| chat-synapse | Synapse homeserver and Admin UI | ECS services (Synapse + Admin UI), ALB rules, security groups, RDS schema, Synapse config in Secrets Manager |
| chat-mcp-server | MCP tool server | ECS service, ALB rules, target group |

Module Dependency Graph

graph LR
    SHARED["chat-shared
    Cognito clients
    Secrets Manager"]
    LP["chat-link-preview
    Lambda function"]
    HL["chat-hasura-lite
    SQL migration
    Hasura metadata"]
    SYN["chat-synapse
    Synapse + Admin UI
    ECS services"]
    MCP["chat-mcp-server
    MCP server
    ECS service"]
    SHARED --> SYN
    SHARED --> MCP
    LP --> HL
    HL --> MCP
    style SHARED fill:#151515,color:#e0e0e0,rx:30
    style MCP fill:#121010,color:#a5e3e8,rx:30

Database Migrations

SQL Migration (Hasura Lite Schema)

The schema migration file is at modules/chat-hasura-lite/sql/001-chat-schema.sql. It creates all five chat tables, indexes, triggers, and seeds the three rooms.

To run the migration as an ECS task:

# 1. Register a one-shot task definition using postgres:16-alpine.
#    Note: the stock postgres:16-alpine image does not contain
#    /sql/001-chat-schema.sql -- bake the file into a derived image
#    (or fetch it in the container command), and supply the database
#    password as PGPASSWORD via the container's "secrets" or
#    "environment" settings, since psql cannot prompt in an ECS task.
aws ecs register-task-definition \
  --family phenom-dev-chat-migration \
  --network-mode awsvpc \
  --requires-compatibilities FARGATE \
  --cpu 256 --memory 512 \
  --execution-role-arn <ecs-task-execution-role-arn> \
  --container-definitions '[{
    "name": "migration",
    "image": "postgres:16-alpine",
    "command": ["psql", "-h", "<rds-host>", "-U", "<db-user>", "-d", "<db-name>", "-f", "/sql/001-chat-schema.sql"],
    "essential": true,
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "/ecs/phenom-dev-chat-migration",
        "awslogs-region": "us-east-1",
        "awslogs-stream-prefix": "migration"
      }
    }
  }]'

# 2. Run the task
aws ecs run-task \
  --cluster phenom-dev-cluster \
  --task-definition phenom-dev-chat-migration \
  --launch-type FARGATE \
  --network-configuration '{
    "awsvpcConfiguration": {
      "subnets": ["<private-subnet-1>", "<private-subnet-2>"],
      "securityGroups": ["<ecs-tasks-sg>"],
      "assignPublicIp": "DISABLED"
    }
  }'

Monitor the task in CloudWatch logs to confirm successful completion.
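To monitor the run from the CLI instead, one approach (a sketch; assumes the task ARN is captured from the run-task output, and AWS CLI v2 for aws logs tail):

```shell
# Assuming $TASK_ARN was captured when starting the task
# (add --query 'tasks[0].taskArn' --output text to the run-task call)
aws ecs wait tasks-stopped --cluster phenom-dev-cluster --tasks "$TASK_ARN"

# A zero exit code means psql completed the migration successfully
aws ecs describe-tasks --cluster phenom-dev-cluster --tasks "$TASK_ARN" \
  --query 'tasks[0].containers[0].exitCode'

# Tail the migration logs (AWS CLI v2)
aws logs tail /ecs/phenom-dev-chat-migration --since 15m
```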


Docker Image Builds

MCP Server Image

The MCP server Dockerfile is at modules/chat-mcp-server/app/Dockerfile. Build on the AI server:

# SSH to the build server
ssh ai

# Set PATH for Docker/OrbStack
export PATH=/usr/local/bin:/opt/homebrew/bin:$HOME/.orbstack/bin:$PATH

# Navigate to the MCP server app directory
cd /path/to/phenom-infra/modules/chat-mcp-server/app

# Build and push to Docker Hub (private registry)
docker buildx build \
  --platform linux/amd64 \
  --builder multiarch \
  --push \
  -t applepublicdotcom/phenom-chat-mcp:testing \
  .
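To confirm the pushed image and its platform without pulling it (assumes you are logged in to the private registry):

```shell
# Inspect the manifest of the pushed image on Docker Hub
docker buildx imagetools inspect applepublicdotcom/phenom-chat-mcp:testing
```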

Synapse Wrapper Image

The custom Synapse image includes a startup wrapper that injects the homeserver.yaml configuration from the SYNAPSE_CONFIG_YAML environment variable (sourced from Secrets Manager).

ssh ai
export PATH=/usr/local/bin:/opt/homebrew/bin:$HOME/.orbstack/bin:$PATH

cd /path/to/phenom-infra/modules/chat-synapse/docker

docker buildx build \
  --platform linux/amd64 \
  --builder multiarch \
  --push \
  -t applepublicdotcom/phenom-synapse:testing \
  .

Docker Hub Authentication for ECS

ECS tasks pull images from the private applepublicdotcom Docker Hub registry. The task definitions include repositoryCredentials pointing to a Secrets Manager secret with Docker Hub credentials:

repositoryCredentials = {
  credentialsParameter = var.dockerhub_credentials_arn
}

The secret must contain:

{
  "username": "applepublicdotcom",
  "password": "<docker-hub-access-token>"
}
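If the secret does not exist yet, it can be created with the AWS CLI. The secret name below is illustrative; use whatever name dockerhub_credentials_arn points at:

```shell
# Create the Docker Hub credentials secret (name is an assumption)
aws secretsmanager create-secret \
  --name phenom-dev-dockerhub-credentials \
  --secret-string '{"username":"applepublicdotcom","password":"<docker-hub-access-token>"}'
```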

DNS Configuration

Chat uses a Cloudflare CNAME record pointing to the ALB:

| Record | Type | Value | Proxy |
|--------|------|-------|-------|
| chat-testing.thephenom.app | CNAME | phenom-dev-alb-XXXXXXXX.us-east-1.elb.amazonaws.com | Proxied (orange cloud) |

For production, the record would be chat.thephenom.app.
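If the record is managed in Terraform rather than by hand, a sketch using the Cloudflare provider might look like the following. The resource and variable names are assumptions, and the value/proxied attribute names vary by provider version:

```hcl
resource "cloudflare_record" "chat" {
  zone_id = var.cloudflare_zone_id
  name    = "chat-testing"
  type    = "CNAME"
  value   = module.alb.dns_name # the ALB DNS name output (assumed)
  proxied = true                # orange cloud
}
```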


ALB Routing Rules

The ALB listener routes traffic based on host header and path pattern:

| Priority | Host Header | Path Pattern | Target Group | Port |
|----------|-------------|--------------|--------------|------|
| 200 | chat-testing.thephenom.app | /_matrix/*, /_synapse/* | phenom-dev-syn-tg | 8008 |
| 201 | chat-testing.thephenom.app | /chat-admin/* | phenom-dev-syn-admin-tg | 80 |
| 300 | chat-testing.thephenom.app | /mcp/* | phenom-dev-mcp-tg | 3001 |

These rules are defined in:

  • modules/chat-synapse/alb-rules.tf (priorities 200-201)
  • modules/chat-mcp-server/alb-rules.tf (priority 300)

Cognito Client Setup

The chat-shared module creates two Cognito User Pool clients:

OIDC Client (for Synapse SSO)

resource "aws_cognito_user_pool_client" "synapse_oidc" {
  name                    = "${var.project_name}-synapse-oidc"
  generate_secret         = true
  allowed_oauth_flows     = ["code"]
  allowed_oauth_scopes    = ["openid", "email", "profile"]
  callback_urls           = [
    "https://${var.synapse_server_name}/_synapse/client/oidc/callback"
  ]
  supported_identity_providers = ["COGNITO"]
}

This client uses the authorization code flow. Synapse’s OIDC provider configuration references this client’s ID and secret.

Agent Client (for MCP Server)

resource "aws_cognito_user_pool_client" "agent" {
  name            = "${var.project_name}-chat-agent"
  generate_secret = true
  explicit_auth_flows = [
    "ALLOW_USER_PASSWORD_AUTH",
    "ALLOW_REFRESH_TOKEN_AUTH",
  ]
}

This client uses USER_PASSWORD_AUTH for machine-to-machine authentication. The MCP server authenticates using AGENT_USERNAME and AGENT_PASSWORD environment variables.
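Because the client has generate_secret = true, Cognito also requires a SECRET_HASH on USER_PASSWORD_AUTH calls: the base64-encoded HMAC-SHA256 of username + client ID, keyed with the client secret. A sketch of computing it in shell (function and variable names are illustrative):

```shell
# SECRET_HASH = base64( HMAC-SHA256( key=client_secret, msg=username + client_id ) )
cognito_secret_hash() {
  username="$1"; client_id="$2"; client_secret="$3"
  printf '%s%s' "$username" "$client_id" \
    | openssl dgst -sha256 -hmac "$client_secret" -binary \
    | base64
}

# Then authenticate (placeholders for real values):
# aws cognito-idp initiate-auth \
#   --auth-flow USER_PASSWORD_AUTH \
#   --client-id "$AGENT_CLIENT_ID" \
#   --auth-parameters USERNAME="$AGENT_USERNAME",PASSWORD="$AGENT_PASSWORD",SECRET_HASH="$(cognito_secret_hash "$AGENT_USERNAME" "$AGENT_CLIENT_ID" "$AGENT_CLIENT_SECRET")"
```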

Both client credentials are stored in Secrets Manager under phenom-dev-development-chat-clients.


Post-Deployment Steps

After terraform apply completes successfully:

1. Run Database Migration

Execute the SQL migration ECS task (see Database Migrations above) to create the chat tables and seed the three rooms.

2. Configure Hasura Metadata

Track the chat tables in Hasura and set up relationships and permissions:

# Run the metadata setup script as an ECS task (uses curl image)
# The script calls Hasura metadata API endpoints to:
# - Track all 5 chat tables
# - Create foreign key relationships
# - Set up role-based permissions (user, support, admin)
# - Configure event triggers for link preview resolution

3. Create Synapse Admin Bot Account

Register an admin account on Synapse using the registration shared secret:

# Get the registration shared secret from Secrets Manager
aws secretsmanager get-secret-value \
  --secret-id phenom-dev-synapse-homeserver-config \
  --query 'SecretString' \
  --output text | grep registration_shared_secret

# Register admin user via Synapse admin API.
# The request body must also include a "mac": an HMAC-SHA1 over the
# nonce, username, password, and admin flag, keyed with the
# registration shared secret (see the Synapse admin API docs).
curl -X POST "https://chat-testing.thephenom.app/_synapse/admin/v1/register" \
  -H "Content-Type: application/json" \
  -d '{
    "nonce": "<get from GET /_synapse/admin/v1/register>",
    "username": "admin",
    "password": "<strong-password>",
    "admin": true,
    "mac": "<hmac-sha1-mac>"
  }'
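Synapse's shared-secret registration API expects the `mac` to be an HMAC-SHA1 over the nonce, username, password, and the literal string "admin" (or "notadmin"), joined by NUL bytes and keyed with the registration shared secret. A sketch of computing it in shell (the function name is illustrative):

```shell
# Compute the Synapse shared-secret registration MAC:
# HMAC-SHA1 over nonce, user, password, and "admin"/"notadmin",
# NUL-separated, keyed with the registration shared secret.
synapse_register_mac() {
  nonce="$1"; user="$2"; pass="$3"; secret="$4"; admin="${5:-admin}"
  printf '%s\0%s\0%s\0%s' "$nonce" "$user" "$pass" "$admin" \
    | openssl dgst -sha1 -hmac "$secret" \
    | awk '{print $NF}'
}

# Example (placeholders for real values):
# mac=$(synapse_register_mac "$NONCE" admin "$PASSWORD" "$REG_SHARED_SECRET")
```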

4. Create Matrix Rooms

Create the three chat rooms on Synapse:

# Using the admin access token from step 3
curl -X POST "https://chat-testing.thephenom.app/_matrix/client/v3/createRoom" \
  -H "Authorization: Bearer <admin-access-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Phenom Internal",
    "topic": "Internal team chat",
    "visibility": "private",
    "preset": "private_chat"
  }'

# Repeat for Phenom Partners and Phenom Community
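The three createRoom calls can be scripted as a loop (room names taken from this guide; topic omitted for brevity):

```shell
# Create all three chat rooms with the same privacy settings
for room in "Phenom Internal" "Phenom Partners" "Phenom Community"; do
  curl -s -X POST "https://chat-testing.thephenom.app/_matrix/client/v3/createRoom" \
    -H "Authorization: Bearer <admin-access-token>" \
    -H "Content-Type: application/json" \
    -d "{\"name\": \"$room\", \"visibility\": \"private\", \"preset\": \"private_chat\"}"
done
```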

5. Verify Services

# Check all ECS services are running
aws ecs describe-services --region us-east-1 \
  --cluster phenom-dev-cluster \
  --services phenom-dev-synapse phenom-dev-synapse-admin phenom-dev-chat-mcp \
  --query 'services[*].{Name:serviceName,Running:runningCount,Status:status}' \
  --output table

# Verify health endpoints
curl -s https://chat-testing.thephenom.app/mcp/health | jq .
curl -s https://chat-testing.thephenom.app/_matrix/client/v3/login | jq .

Environment Wiring

All chat modules are wired together in environments/development/main.tf. The key variable connections:

module "chat_shared" {
  source               = "../../modules/chat-shared"
  project_name         = local.project_name
  environment          = "development"
  cognito_user_pool_id = module.cognito.user_pool_id
  cognito_domain       = module.cognito.domain
  synapse_server_name  = "chat-testing.thephenom.app"
}

module "chat_synapse" {
  source                     = "../../modules/chat-synapse"
  project_name               = local.project_name
  environment                = "development"
  vpc_id                     = module.networking.vpc_id
  private_subnet_ids         = module.networking.private_subnet_ids
  ecs_cluster_id             = module.ecs.cluster_id
  alb_listener_arn           = module.alb.https_listener_arn
  alb_security_group_id      = module.alb.security_group_id
  ecs_tasks_security_group_id = module.networking.ecs_tasks_sg_id
  server_name                = "chat-testing.thephenom.app"
  rds_host                   = module.rds.endpoint
  rds_port                   = module.rds.port
  rds_master_username        = module.rds.master_username
  rds_master_password        = module.rds.master_password
  cognito_user_pool_id       = module.cognito.user_pool_id
  cognito_oidc_client_id     = module.chat_shared.synapse_oidc_client_id
  cognito_oidc_client_secret = module.chat_shared.synapse_oidc_client_secret
  dockerhub_credentials_arn  = var.dockerhub_credentials_arn
}

module "chat_mcp_server" {
  source                     = "../../modules/chat-mcp-server"
  project_name               = local.project_name
  vpc_id                     = module.networking.vpc_id
  private_subnet_ids         = module.networking.private_subnet_ids
  ecs_cluster_id             = module.ecs.cluster_id
  alb_listener_arn           = module.alb.https_listener_arn
  ecs_tasks_security_group_id = module.networking.ecs_tasks_sg_id
  server_name                = "chat-testing.thephenom.app"
  hasura_endpoint            = "http://${module.ecs.graphql_service_name}:8080/v1/graphql"
  chat_secrets_arn           = module.secrets.chat_secrets_arn
  backend_type               = "hasura"
  dockerhub_credentials_arn  = var.dockerhub_credentials_arn
}

Deployment Pipeline

The standard deployment flow for chat infrastructure changes:

flowchart TD
    A["Code change in phenom-infra"] --> B["terraform plan"]
    B --> C{"Review plan output"}
    C -->|"Approved"| D["terraform apply"]
    C -->|"Rejected"| A
    D --> E{"Docker image changes?"}
    E -->|"Yes"| F["Build on ai.matthewstevens.org
    Push :testing tag to Docker Hub"]
    E -->|"No"| G["Skip image build"]
    F --> H["Force new ECS deployment"]
    G --> H
    H --> I["Verify health endpoints"]
    I --> J{"Schema changes?"}
    J -->|"Yes"| K["Run migration ECS task
    Update Hasura metadata"]
    J -->|"No"| L["Deployment complete"]
    K --> L
    style A fill:#1a1a1a,color:#fff,rx:30
    style L fill:#121010,color:#a5e3e8,rx:30
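Expressed as the typical command sequence (service and cluster names from this guide; the force-new-deployment step applies to whichever service's image changed):

```shell
cd environments/development
terraform plan -out=tfplan
terraform apply tfplan

# If a :testing image was rebuilt, force ECS to pull the new image
aws ecs update-service --cluster phenom-dev-cluster \
  --service phenom-dev-chat-mcp --force-new-deployment

# Then re-verify the health endpoints
curl -s https://chat-testing.thephenom.app/mcp/health | jq .
```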