Multi-Agent Workflows

Standard workflows for common development scenarios, showing how multiple agents coordinate to deliver features, fix bugs, and maintain system quality.

Overview

This guide documents three standard workflows: feature development, bug investigation and fix, and production issue response. Each workflow shows the sequence of agent handoffs, communication patterns, and decision points.

Standard Workflows

1. Feature Development Workflow

This is the primary workflow for implementing new features from concept to production.

Workflow Diagram

Product → Issue Created → Prime Agent Coordinates
    ↓
    └─ Feature Developer: Implement
       ├─ Break down requirements
       ├─ Design implementation
       ├─ Write code
       ├─ Test locally
       └─ Create PR → Code Review
           ├─ Code Reviewer: Review & Approve
           └─ Ready for Testing
               ↓
    ┌────────────────────────────────────────┐
    │ Stakeholder: Review on Test Server     │
    │ (Docker testing server deployment)     │
    └────────────────────────────────────────┘
    │ Approved? Yes
    └─ Workflow Enforcer: Validate & Merge
       ├─ Pre-deployment validation
       ├─ Merge PR
       └─ Deployment Complete → Documentation
           ├─ Technical Writer: Document
           ├─ Release notes
           ├─ Update overviews
           └─ Issue Closed

Step-by-Step Process

Phase 1: Requirement Gathering

  1. Product team identifies feature need
  2. Prime Agent creates GitHub issue with:
    • Clear user story
    • Acceptance criteria
    • Business context
    • Technical considerations
  3. Feature added to project board

Phase 2: Implementation

  1. Prime Agent delegates to Feature Developer
  2. Feature Developer:
    • Reviews requirements and acceptance criteria
    • Breaks down into implementation tasks
    • Designs implementation approach
    • Implements code changes
    • Tests locally (all acceptance criteria)
    • Creates pull request with description

Phase 3: Code Review

  1. Prime Agent delegates to Code Reviewer
  2. Code Reviewer:
    • Reviews code for quality and standards
    • Checks test coverage (>80% target)
    • Validates security implications
    • Checks documentation completeness
    • Approves or requests changes
  3. If changes requested:
    • Feature Developer makes updates
    • Code Reviewer re-reviews
    • Loop until approved

Phase 4: Testing

  1. Feature Developer deploys to Docker testing server
  2. Prime Agent requests stakeholder review
  3. Stakeholder (or QA Engineer):
    • Tests all acceptance criteria
    • Verifies on test server
    • Approves or identifies issues
  4. If issues found:
    • Issues documented
    • Feature Developer fixes
    • Return to Code Review phase
  5. If approved:
    • Ready for production deployment

Phase 5: Deployment

  1. Prime Agent delegates to Workflow Enforcer
  2. Workflow Enforcer:
    • Runs pre-deployment validation
    • Checks all tests passing
    • Verifies no conflicts
    • Merges PR to main
    • Triggers deployment pipeline
  3. Deployment to production

Phase 6: Documentation

  1. Prime Agent delegates to Technical Writer
  2. Technical Writer:
    • Creates release notes
    • Updates website overview
    • Adds user guide if needed
    • Verifies Hugo site builds
  3. Issue marked complete
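
The six phases above, including the two rework loops (changes requested in code review, issues found in testing), can be read as a small state machine. The following is a minimal sketch under that reading, not the project's actual tooling: the `Phase` enum, the outcome strings, and the `advance` and `run` helpers are all hypothetical names introduced for illustration.

```python
from enum import Enum


class Phase(Enum):
    REQUIREMENTS = "requirements"
    IMPLEMENTATION = "implementation"
    CODE_REVIEW = "code_review"
    TESTING = "testing"
    DEPLOYMENT = "deployment"
    DOCUMENTATION = "documentation"
    DONE = "done"


# (current phase, outcome) -> next phase. Outcome names are hypothetical.
TRANSITIONS = {
    (Phase.REQUIREMENTS, "issue_created"): Phase.IMPLEMENTATION,
    (Phase.IMPLEMENTATION, "pr_ready"): Phase.CODE_REVIEW,
    (Phase.CODE_REVIEW, "approved"): Phase.TESTING,
    # Reviewer requests changes: developer updates, then re-review.
    (Phase.CODE_REVIEW, "changes_requested"): Phase.IMPLEMENTATION,
    (Phase.TESTING, "approved"): Phase.DEPLOYMENT,
    # Stakeholder finds issues: developer fixes, then back through review.
    (Phase.TESTING, "issues_found"): Phase.IMPLEMENTATION,
    (Phase.DEPLOYMENT, "deployed"): Phase.DOCUMENTATION,
    (Phase.DOCUMENTATION, "docs_published"): Phase.DONE,
}


def advance(phase: Phase, outcome: str) -> Phase:
    """Return the next phase, or raise if the handoff is not allowed."""
    try:
        return TRANSITIONS[(phase, outcome)]
    except KeyError:
        raise ValueError(f"no transition from {phase.name} on {outcome!r}") from None


def run(outcomes: list[str]) -> Phase:
    """Replay a sequence of outcomes starting from Requirement Gathering."""
    phase = Phase.REQUIREMENTS
    for outcome in outcomes:
        phase = advance(phase, outcome)
    return phase
```

Keeping the transitions in one table makes skipped steps explicit: for example, there is no entry that moves a feature from implementation straight to deployment, so `advance` rejects that handoff.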

Timeline

Phase           Agent               Duration     Dependent On
Requirements    Prime Agent         1-2 hours    Product clarity
Implementation  Feature Developer   4-16 hours   Complexity
Code Review     Code Reviewer       1-2 hours    Code clarity
Testing         Stakeholder         1-2 hours    Feature scope
Deployment      Workflow Enforcer   0.5 hours    Approval
Documentation   Technical Writer    1-2 hours    Completion
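
Summing the per-phase ranges gives the end-to-end effort for one pass through the workflow. The sketch below assumes the phases run strictly in sequence with no waiting time and no rework loops; `PHASE_HOURS` and `total_range` are illustrative names, and the figures are copied from the table above.

```python
# (phase, min_hours, max_hours), copied from the timeline table.
PHASE_HOURS = [
    ("Requirements", 1.0, 2.0),
    ("Implementation", 4.0, 16.0),
    ("Code Review", 1.0, 2.0),
    ("Testing", 1.0, 2.0),
    ("Deployment", 0.5, 0.5),
    ("Documentation", 1.0, 2.0),
]


def total_range(phases):
    """Sum the lower and upper bounds across all phases."""
    low = sum(lo for _, lo, _ in phases)
    high = sum(hi for _, _, hi in phases)
    return low, high


print(total_range(PHASE_HOURS))  # (8.5, 24.5)
```

Each review or testing loop adds another implementation and review pass on top of this range.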

Example: Email Feature

Issue Created:

Title: Add email notification feature
Description:
- Users should receive email when new research available
- Configurable notification frequency
- Unsubscribe support
- HTML email templates

Acceptance Criteria:
- Email sent within 5 minutes of event
- All email templates rendered correctly
- Unsubscribe link works
- User preferences respected

Feature Developer Reports:

Implementation complete:
- Email service integrated
- Templates created (4 types)
- User preferences added
- Tested locally with all scenarios

PR ready for review
Docker deployment: https://phenom.matthewstevens.org/

Code Reviewer Approves:

Code review passed:
- Architecture clean and maintainable
- Test coverage: 85% (good)
- Security: No issues identified
- Documentation: Complete
- Approved ✓

Stakeholder Approves:

Testing on test server passed:
- Email delivery verified (5 min timing OK)
- Templates render correctly (all 4 types)
- Unsubscribe functionality works
- User preferences respected
- Approved for production ✓

Documentation Ready:

Release notes created for v1.2.0:
- Email notification feature description
- User guide for preferences
- Configuration documentation

Hugo site builds successfully.
Ready for deployment.

2. Bug Investigation and Fix Workflow

This workflow handles investigating and fixing bugs discovered in development or production.

Workflow Diagram

Bug Report → Prime Agent Receives
    ↓
    └─ QA Engineer: Investigate
       ├─ Reproduce issue
       ├─ Document steps
       ├─ Identify root cause
       ├─ Assess impact
       └─ Report Findings → Prime Agent Reviews
           ├─ Assess severity
           ├─ Prioritize
           └─ Delegate to Feature Developer
               ├─ Implement fix
               ├─ Test locally
               ├─ Create PR
               └─ Code Review
                   ├─ Code Reviewer: Review & Approve
                   └─ Testing
                       ├─ QA Engineer: Verify Fix
                       ├─ Regression testing
                       └─ Approved?
                           ├─ Yes: Deployment
                           │   └─ Workflow Enforcer: Deploy
                           └─ No: Back to Feature Developer

Step-by-Step Process

Phase 1: Bug Report

  1. Bug reported via:
    • GitHub issue
    • User feedback
    • Internal discovery
    • Monitoring alert
  2. Prime Agent receives report

Phase 2: Investigation

  1. Prime Agent delegates to QA Engineer
  2. QA Engineer:
    • Attempts to reproduce issue
    • Documents exact steps
    • Notes environment details
    • Identifies error patterns
    • Determines root cause
    • Assesses impact (number of users affected, severity)
    • Reports findings with recommendations

Phase 3: Assessment

  1. Prime Agent reviews findings
  2. Prime Agent determines:
    • Severity level (Critical/High/Medium/Low)
    • Priority for fixing
    • Workaround availability
    • Resource allocation
  3. If low priority: Create follow-up issue
  4. If high priority: Proceed to fixing

Phase 4: Fix Implementation

  1. Prime Agent delegates to Feature Developer
  2. Feature Developer:
    • Reviews bug investigation report
    • Implements fix
    • Tests fix locally (verifies issue is gone)
    • Confirms no regressions locally
    • Creates PR with bug fix details

Phase 5: Code Review

  1. Prime Agent delegates to Code Reviewer
  2. Code Reviewer:
    • Reviews fix implementation
    • Ensures minimal, focused changes
    • Checks for edge cases
    • Approves fix

Phase 6: Fix Verification

  1. Prime Agent delegates to QA Engineer
  2. QA Engineer:
    • Tests fix on test server
    • Verifies original issue gone
    • Tests edge cases
    • Runs regression tests
    • Confirms fix is complete
  3. If additional issues found:
    • Report back to Feature Developer
    • Feature Developer makes additional fixes
    • Return to Code Review

Phase 7: Deployment

  1. Prime Agent delegates to Workflow Enforcer
  2. Workflow Enforcer:
    • Reviews fix readiness
    • Runs final validation
    • Merges PR
    • Deploys to production
  3. Production deployment timing depends on severity:
    • Immediate deployment for Critical
    • Scheduled deployment for High/Medium
    • Bundle into release for Low
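
The severity-to-timing rule in step 3 is a simple lookup. The following is a sketch of how it might be encoded as configuration; the policy strings and the `deployment_policy` helper are hypothetical, and only the four severity levels and their timing come from this guide.

```python
# Severity level -> deployment timing, as listed in Phase 7.
DEPLOYMENT_POLICY = {
    "critical": "immediate",
    "high": "scheduled",
    "medium": "scheduled",
    "low": "bundle_into_release",
}


def deployment_policy(severity: str) -> str:
    """Look up the deployment timing for a severity level (case-insensitive)."""
    key = severity.strip().lower()
    if key not in DEPLOYMENT_POLICY:
        raise ValueError(f"unknown severity: {severity!r}")
    return DEPLOYMENT_POLICY[key]
```

Raising on an unknown severity, rather than falling back to a default, keeps a mistyped level from silently delaying a critical fix.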

Phase 8: Post-Deployment

  1. QA Engineer monitors for:
    • Fix effectiveness
    • Side effects
    • User impact
  2. Prime Agent closes issue once verified

Timeline

Scenario          Investigation  Fix       Review    Deploy    Total
Low priority bug  1-2 hrs        2-4 hrs   1-2 hrs   0.5 hrs   4.5-8.5 hrs
Medium priority   1-2 hrs        1-2 hrs   1 hr      0.5 hrs   3.5-5.5 hrs
High priority     0.5-1 hr       1-2 hrs   0.5-1 hr  0.5 hrs   2.5-4.5 hrs
Critical (prod)   0.25 hr        0.5-1 hr  0.25 hr   0.25 hr   1.25-1.75 hrs

Example: Login Bug

QA Engineer Investigation:

Bug Investigation: Login fails for some users

REPRODUCTION:
1. Visit login page
2. Enter valid credentials
3. Click submit
4. Some users: "Invalid credentials" error
5. Other users: Login successful (same credentials)

ROOT CAUSE FOUND:
- Session token validation bug
- Only affects users with certain regional characters in username
- Character encoding issue in validation function

IMPACT:
- Estimated 15% of user base affected
- Regional impact: EU users primarily
- Severity: HIGH - users cannot access accounts

RECOMMENDED FIX:
- Fix character encoding in session validator
- Add unit tests for character handling
- Test with international characters

STATUS: Ready for Feature Developer to fix

Feature Developer Fix:

Fix: Login bug for international characters

Root cause: Character encoding in session validator
Solution: Use proper UTF-8 encoding for all strings
Testing: Added 12 new unit tests for character handling

Changes:
- auth/validator.js - Fixed encoding (3 lines changed)
- tests/auth.test.js - Added character tests (15 new tests)

Local testing: All login scenarios verified
Ready for code review
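
The report above does not show the change to auth/validator.js, so the following is only an illustration of this class of bug, written in Python rather than the project's JavaScript. It assumes the failure came from decoding request bytes with a single-byte codec; the function names are hypothetical.

```python
def usernames_match_buggy(stored: str, raw_request: bytes) -> bool:
    """Illustrative bug: Latin-1 decoding mangles multi-byte UTF-8 input."""
    return stored == raw_request.decode("latin-1")


def usernames_match_fixed(stored: str, raw_request: bytes) -> bool:
    """Decode the request as UTF-8; reject bytes that are not valid UTF-8."""
    try:
        return stored == raw_request.decode("utf-8")
    except UnicodeDecodeError:
        return False


raw = "müller".encode("utf-8")  # b'm\xc3\xbcller'

# The buggy path sees 'mÃ¼ller' and reports a mismatch.
print(usernames_match_buggy("müller", raw))   # False
print(usernames_match_fixed("müller", raw))   # True

# ASCII-only usernames decode identically under both codecs,
# which is consistent with only some users being affected.
print(usernames_match_buggy("miller", b"miller"))  # True
```

A unit test per character class (accented Latin, Cyrillic, CJK, emoji) follows the same pattern: encode a known username as UTF-8 and assert that the fixed comparison accepts it.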

QA Engineer Verification:

Fix Verification: Login bug for international characters
- Test Server: https://phenom.matthewstevens.org/

TESTING COMPLETED:
✓ Login with Latin characters (a-z, A-Z)
✓ Login with accented characters (é, ü, ñ, etc.)
✓ Login with Cyrillic characters (а, б, в, etc.)
✓ Login with CJK characters (中, 日, 한, etc.)
✓ Login with emoji in username (😀😃😄)
✓ Regression: Regular login still works
✓ All credentials accepted/rejected correctly

STATUS: VERIFIED - Fix is working correctly
Ready for production deployment

3. Production Issue Response Workflow

This workflow handles urgent issues discovered in production that need immediate resolution.

Workflow Diagram

Production Issue Detected → Alert/Notification
    ↓
    └─ Prime Agent: Assess Severity
       ├─ Critical (system down): Emergency path
       └─ High/Medium (degraded service): Expedited path
           ├─ Delegate to QA Engineer: Investigate
           │  ├─ Quick root cause analysis
           │  └─ Report findings
           │      ↓
           ├─ Delegate to Feature Developer: Hotfix
           │  ├─ Implement minimal fix
           │  └─ Test thoroughly
           │      ↓
           ├─ Delegate to Code Reviewer: Expedited Review
           │  └─ Quick quality check (no detailed review)
           │      ↓
           ├─ Delegate to Workflow Enforcer: Emergency Deploy
           │  ├─ Deploy immediately to production
           │  └─ Monitor closely
           │      ↓
           ├─ Delegate to QA Engineer: Verify Fix in Production
           │  └─ Confirm issue resolved
           │      ↓
           └─ Delegate to Technical Writer: Document
              └─ Create incident report

Step-by-Step Process

Phase 1: Issue Detection

  1. Issue discovered via:
    • Production monitoring alert
    • User reports
    • Internal observation
    • Error tracking system
  2. Prime Agent notified immediately

Phase 2: Severity Assessment

  1. Prime Agent determines severity:
    • Critical: System unavailable, data loss risk
    • High: Major feature broken, users unable to work
    • Medium: Feature degraded, partial workaround exists
  2. If Critical: Activate emergency response
  3. If High: Activate expedited response
  4. If Medium: Can follow standard fix workflow
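
The severity rules in this phase can be sketched as a classifier plus a routing table. This is a minimal illustration of the definitions above, not an existing implementation: the `Severity` enum, the boolean inputs to `assess`, and the path labels are all assumptions made for the example.

```python
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"


# Severity -> response path, following steps 2-4 of Phase 2.
RESPONSE_PATH = {
    Severity.CRITICAL: "emergency",
    Severity.HIGH: "expedited",
    Severity.MEDIUM: "standard_fix",
}


def assess(system_unavailable: bool, data_loss_risk: bool,
           major_feature_broken: bool) -> Severity:
    """Classify a production issue using the Phase 2 definitions."""
    if system_unavailable or data_loss_risk:
        return Severity.CRITICAL
    if major_feature_broken:
        return Severity.HIGH
    # Degraded feature with a partial workaround.
    return Severity.MEDIUM


def response_path(severity: Severity) -> str:
    return RESPONSE_PATH[severity]
```

Checking the most severe condition first means an issue that is both a data loss risk and a broken feature is always routed to the emergency path.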

Phase 3: Quick Investigation (Emergency Path)

  1. Prime Agent delegates to QA Engineer
  2. QA Engineer performs quick analysis:
    • Reproduce issue in production
    • Identify affected users/features
    • Determine if it’s a rollback candidate
    • Identify likely root cause
    • Report findings ASAP
  3. Report includes:
    • Impact assessment
    • Temporary workaround if available
    • Rollback viability

Phase 4: Immediate Response

  • If rollback viable: Execute rollback immediately
  • If rollback not viable: Proceed to hotfix

Phase 5: Hotfix Implementation (Emergency Path)

  1. Prime Agent delegates to Feature Developer
  2. Feature Developer:
    • Implements minimal fix (not full solution)
    • Tests thoroughly but quickly
    • Creates PR with clear description
    • Flags as EMERGENCY/HOTFIX
    • Ready for immediate review

Phase 6: Expedited Code Review

  1. Prime Agent delegates to Code Reviewer
  2. Code Reviewer does rapid review:
    • Check for obvious issues (not detailed review)
    • Verify minimal scope (not refactoring)
    • Ensure no security holes
    • Approve for production deployment
    • Full code review deferred to post-incident

Phase 7: Emergency Deployment

  1. Prime Agent delegates to Workflow Enforcer
  2. Workflow Enforcer:
    • Skips normal deployment validation
    • Deploys hotfix immediately to production
    • Monitors deployment closely
    • Confirms deployment succeeded
  3. Prime Agent notifies stakeholders: “Fix deployed”

Phase 8: Production Verification

  1. Prime Agent delegates to QA Engineer
  2. QA Engineer verifies in production:
    • Issue is actually resolved
    • No new errors introduced
    • User impact eliminated
    • Monitors for side effects
  3. Reports resolution to team

Phase 9: Post-Incident Activities

  1. Technical Writer:
    • Documents incident (when/what/how)
    • Captures lessons learned
    • Records root cause
  2. Feature Developer:
    • Plans proper fix for later release
    • Addresses temporary hotfix
  3. Team:
    • Discusses prevention measures
    • Updates monitoring/alerts

Timeline

Phase                    Duration    Agent
Detection to assessment  2-5 min     Alert system
Severity determination   5-10 min    Prime Agent
Quick investigation      10-20 min   QA Engineer
Rollback decision        5 min       Prime Agent
Hotfix implementation    15-30 min   Feature Developer
Code review              5-10 min    Code Reviewer
Deployment               5-10 min    Workflow Enforcer
Production verification  5-10 min    QA Engineer
TOTAL                    52-100 min  Emergency path

Example: API Timeout Issue

Alert: API response time > 30s (normally 100ms)

QA Engineer Investigation (10 min):

Production Issue Investigation

ISSUE: API timeouts - customers unable to submit reports
IMPACT: 100% of API requests timing out
AFFECTED USERS: All active users

INVESTIGATION:
- Checked API logs: No error messages
- Checked database: 500K+ pending queries
- Checked database indexes: Missing index on new query

ROOT CAUSE: Database query without index
Missing indexes on: events.created_at, events.user_id
Causing full table scan: 50M rows × multiple queries

IMPACT ASSESSMENT:
- User-facing: All report submissions timing out
- Data loss risk: None (queue processing pending)
- Time to fix: 5 min (add index) OR 15 min (rollback)

RECOMMENDATION: Add missing database index (faster than rollback)
Status: Ready for developer

Decision (2 min): Prime Agent chooses to add the index rather than roll back, because it is faster and lower risk.

Feature Developer Hotfix (15 min):

Hotfix: Add missing database index

Issue: API timeouts due to missing database index
Solution: Add index on events.created_at and events.user_id

Changes:
- migrations/add_event_indexes.sql (2 indexes)
- Zero downtime operation

Testing:
- Index creation tested on test database (5 min)
- Query performance verified: 100ms response time

Status: Ready for review

Code Reviewer (5 min):

Expedited Hotfix Review

Hotfix Review (emergency rules):
✓ Scope minimal - just adding index
✓ No security issues
✓ No obvious errors
✓ Safe for production

Approved for immediate deployment
Note: Full review in post-incident debrief

Workflow Enforcer (5 min):

Emergency Deployment complete
- Migration running...
- Index created successfully
- API response times: Back to 100ms

Status: DEPLOYED - 37 minutes from detection to resolution

QA Engineer Verification (5 min):

Production Verification of Fix:

Testing:
✓ API response times: 95-115ms (normal)
✓ Report submissions: Working
✓ No new errors in logs
✓ Database queries: Using indexes correctly

Conclusion: Issue RESOLVED
Status: All systems normal

Post-Incident:

Incident Report: API Timeouts

Root Cause: Missing database indexes
Time to Resolution: 37 minutes

Prevention:
- Add database migration verification to deployment process
- Create test queries that validate index usage
- Add performance regression tests
- Include DBA review for schema changes

Lessons Learned:
- Current monitoring good: Caught issue immediately
- Process worked well: Quick response despite complexity
- Improvement: Add index creation to standard deployment checks
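
The prevention item "create test queries that validate index usage" can be sketched as an automated check on the query plan. The example below uses Python's built-in sqlite3 module as a stand-in, since this guide does not name the production database, and other engines expose plans through their own EXPLAIN syntax. The table layout, index names, and helper functions are assumptions; only the `events.created_at` and `events.user_id` columns come from the incident report.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row has four columns; the last one is the readable plan detail.
    return " | ".join(row[3] for row in rows)


def uses_index(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> bool:
    """True if the plan mentions an index instead of a plain table scan."""
    return "INDEX" in query_plan(conn, sql, params)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, created_at TEXT)"
)

BY_USER = "SELECT id FROM events WHERE user_id = ?"
BY_DATE = "SELECT id FROM events WHERE created_at >= ?"

# Before the migration: the query can only scan the whole table.
before = uses_index(conn, BY_USER, (1,))

# Hypothetical index names standing in for the hotfix migration.
conn.execute("CREATE INDEX idx_events_user_id ON events (user_id)")
conn.execute("CREATE INDEX idx_events_created_at ON events (created_at)")

after_user = uses_index(conn, BY_USER, (1,))
after_date = uses_index(conn, BY_DATE, ("2024-01-01",))
```

Run as part of pre-deployment validation, a check like this would fail whenever a new query ships without a supporting index, before the query reaches a table with millions of rows.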

Workflow Comparison

Aspect           Feature Dev            Bug Fix                 Production Issue
Duration         8.5-24.5 hours         1.25-8.5 hours          1-2 hours
Planning         Detailed requirements  Investigation focused   Root cause quick scan
Implementation   Complete solution      Minimal fix             Emergency hotfix
Testing          Comprehensive          Thorough                Quick verification
Approval         Full review            Standard review         Expedited review
Deployment       Scheduled              Scheduled or immediate  Immediate
Post-deployment  Standard monitoring    Standard monitoring     Intensive monitoring
Documentation    Release notes          Standard updates        Incident report

Choosing the Right Workflow

Issue Type
  │
  ├─ New feature request?
  │   └─ YES → Feature Development Workflow
  │
  ├─ Bug found in development/test?
  │   └─ YES → Bug Investigation and Fix Workflow
  │
  ├─ Issue in production?
  │   │
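  │   ├─ System down or data loss risk (Critical)?
  │   │   └─ YES → Production Issue Response Workflow (Emergency)
  │   │
  │   ├─ Major feature broken (High)?
  │   │   └─ YES → Production Issue Response Workflow (Expedited)
  │   │
  │   └─ Feature degraded, partial workaround exists (Medium)?
  │       └─ YES → Bug Investigation and Fix Workflow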
  │
  └─ Other?
      └─ Escalate to Prime Agent for custom workflow

Workflow Improvements

Each workflow should be reviewed after completion:

  1. What went well: Document successful patterns
  2. What was difficult: Identify pain points
  3. What took too long: Find bottlenecks
  4. What to do differently: Plan improvements
  5. Success metrics: Did we meet timeline/quality targets?