atBB Deployment Infrastructure Design#
Date: 2026-02-11
Status: Approved for Implementation
Overview#
This document outlines the deployment infrastructure for atBB: Docker containerization, GitHub Actions CI/CD pipeline, and administrator deployment guide.
Goals#
- Package both services in one Docker image for simple deployment
- Automate image builds and publishing to GitHub Container Registry (GHCR)
- Document the complete deployment process
- Support single-instance and production deployments
Non-Goals#
- Redis session storage integration (future work)
- Multi-instance/load-balanced deployments (requires Redis first)
- Kubernetes-specific manifests (docker-compose example sufficient for now)
Architecture Decisions#
Container Architecture#
Decision: Single Docker container with both appview and web services, using nginx for internal routing and a process manager to run both apps.
Rationale:
- Operators deploy one artifact
- Preserves app separation without refactoring
- Simple to understand and debug
- Scales to medium deployments
Structure:
┌─────────────────────────────────────┐
│ atBB Container (Port 80) │
│ ┌──────────────────────────────┐ │
│ │ Nginx (port 80) │ │
│ │ Routes: │ │
│ │ - /api/* → appview:3000 │ │
│ │ - /* → web:3001 │ │
│ └──────────────────────────────┘ │
│ │
│ ┌────────────┐ ┌──────────────┐ │
│ │ appview │ │ web │ │
│ │ (port 3000) │ (port 3001) │ │
│ └────────────┘ └──────────────┘ │
│ │
│ Process Manager: Simple shell │
│ script or npm-run-all │
└─────────────────────────────────────┘
Alternatives Considered:
- Separate containers for appview and web: More complex, overkill for current scale
- Merge apps into single Hono service: Would require refactoring, couples the apps
Reverse Proxy#
Decision: Nginx inside the container for internal routing. Recommend Caddy outside the container for operators' infrastructure.
Rationale:
- Nginx inside: Standard, well-known, simple configuration for routing between two apps
- Caddy outside: Modern, easy automatic HTTPS for operators
- Clear separation: internal routing vs. external TLS/domain management
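The routing described above can be sketched as a minimal nginx config. This is illustrative only: the listen port and upstream ports match the container diagram, but header handling and any real-world tuning are assumptions, not the final configuration.

```nginx
# Sketch of the container's internal routing; ports follow the diagram above.
server {
    listen 80;

    # API traffic goes to the appview service
    location /api/ {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # Everything else goes to the web frontend
    location / {
        proxy_pass http://127.0.0.1:3001;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Since TLS terminates at the operator's external proxy, this internal config stays plain HTTP.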
Database Strategy#
Decision: External PostgreSQL database. Provide docker-compose example for testing.
Rationale:
- Production best practice: managed database services (AWS RDS, DigitalOcean, etc.)
- Easier backups, scaling, monitoring
- Docker compose example lowers barrier for testing/development
Alternatives Considered:
- Include postgres in same container: Bad practice, complicates backups/scaling
- Require docker-compose only: Too rigid, doesn't support managed DB deployments
Session Storage#
Decision: In-memory sessions only (current implementation). Document Redis as future enhancement.
Rationale:
- Application lacks Redis integration
- In-memory storage suffices for single-instance deployments
- Admin guide documents limitations and future plans
Implications:
- Sessions lost on container restart
- No multi-instance deployment support yet
- Operators must be aware of this limitation
Port Exposure#
Decision: Expose only port 80 from the container.
Rationale:
- Clean interface: one port for the entire forum
- Operators map their host port to the container's port 80 (e.g., `-p 8080:80`)
- If debugging is needed, operators can `docker exec` into the container
Alternatives Considered:
- Also expose 3000 and 3001: Adds complexity, rarely needed
Build Strategy#
Decision: Multi-stage Docker build with separate build and runtime stages.
Stage 1 - Builder:
- Base: `node:22-alpine`
- Install pnpm globally
- Copy entire monorepo (respecting `.dockerignore`)
- `pnpm install --frozen-lockfile` (all dependencies including dev)
- `pnpm build` (turbo builds lexicon → appview + web)
Stage 2 - Runtime:
- Base: `node:22-alpine`
- Install nginx and process management tools
- Copy only production files from builder:
  - Workspace configs, `package.json` files
  - `apps/appview/dist/`, `apps/web/dist/`, `packages/*/dist/` (db, lexicon)
- `pnpm install --prod --frozen-lockfile` (production deps only)
- Copy nginx config and entrypoint script
Benefits:
- Small final image (~200-250MB vs ~1GB single-stage)
- Faster deploys, less attack surface
- Production image contains no build tools or dev dependencies
Alternatives Considered:
- Single-stage build: Simple but results in huge images with unnecessary tools
- Use Nix/devenv in container: Matches local dev but significantly larger, unnecessary complexity
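The two stages above can be sketched as a Dockerfile. Treat this as a starting point, not the final file: the workspace file names, `dist/` paths, and the exact set of `COPY` lines depend on the monorepo layout and are assumptions here.

```dockerfile
# Sketch only: paths and workspace files are assumptions about the repo layout.

# Stage 1: build everything with dev dependencies
FROM node:22-alpine AS builder
RUN npm install -g pnpm
WORKDIR /app
COPY . .
RUN pnpm install --frozen-lockfile
RUN pnpm build

# Stage 2: runtime with nginx and production deps only
FROM node:22-alpine
RUN apk add --no-cache nginx
RUN npm install -g pnpm
WORKDIR /app
# Copy workspace metadata plus built output only
COPY --from=builder /app/package.json /app/pnpm-lock.yaml /app/pnpm-workspace.yaml ./
COPY --from=builder /app/apps/appview/dist ./apps/appview/dist
COPY --from=builder /app/apps/web/dist ./apps/web/dist
COPY --from=builder /app/packages ./packages
RUN pnpm install --prod --frozen-lockfile
COPY nginx.conf /etc/nginx/nginx.conf
COPY entrypoint.sh /entrypoint.sh
EXPOSE 80
ENTRYPOINT ["/entrypoint.sh"]
```

In practice the per-app `package.json` files must also be copied before the production install so pnpm can resolve the workspace; the sketch elides that detail.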
CI/CD Pipeline#
Decision: Two GitHub Actions workflows
Workflow 1: Pull Request Checks (.github/workflows/ci.yml)
- Triggers: On PR open/update
- Jobs (parallel after setup):
  - Lint: `pnpm turbo lint`
  - Test: `pnpm test`
  - Build: `pnpm build` (verify compilation)
- Blocks PR merge if any job fails
- Does NOT build Docker image (too expensive for every PR)
Workflow 2: Build and Publish (.github/workflows/publish.yml)
- Triggers:
  - Push to `main` branch (after PR merge)
  - Tag push matching `v*` (e.g., `v1.0.0`)
- Prerequisites: Reuses CI checks, only builds if they pass
- Jobs:
- Build Docker image (multi-stage)
- Push to GHCR: `ghcr.io/<org>/atbb`
- Image tags:
  - On main push: `latest`, `main-<git-sha>`
  - On version tag: `<version>`, `latest`
Rationale:
- Fast PR feedback without slow Docker builds
- Builds only validated code
- Supports bleeding-edge (latest) and versioned releases
- Operators can deploy specific SHAs
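The publish workflow could look roughly like the following. Action versions and the tag expressions are illustrative and unverified against a real repository, and the step that gates publishing on the CI checks is omitted for brevity.

```yaml
# Sketch of .github/workflows/publish.yml; versions and tag rules are illustrative.
name: publish
on:
  push:
    branches: [main]
    tags: ['v*']
jobs:
  publish:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/metadata-action@v5
        id: meta
        with:
          images: ghcr.io/${{ github.repository_owner }}/atbb
          tags: |
            type=raw,value=latest,enable={{is_default_branch}}
            type=sha,prefix=main-,enable={{is_default_branch}}
            type=semver,pattern={{version}}
      - uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
```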
Environment Configuration#
Decision: Document both `.env` file and individual `-e` flags.
Primary method: environment file
`docker run --env-file .env ghcr.io/<org>/atbb:latest`
Alternative: individual flags (for orchestrators)
`docker run -e DATABASE_URL=... -e FORUM_DID=... ghcr.io/<org>/atbb:latest`
Rationale:
- Env file is easier for simple deployments
- Individual flags support Kubernetes/orchestration tools
- Flexibility for different deployment scenarios
Database Migrations#
Decision: Manual migration step before starting container.
Process:
# Run migrations using one-off container
docker run --env-file .env ghcr.io/<org>/atbb:latest \
pnpm --filter @atbb/appview db:migrate
# Then start the main container
docker run -p 80:80 --env-file .env ghcr.io/<org>/atbb:latest
Rationale:
- Database migrations are sensitive operations that should be deliberate
- Explicit control over when migrations happen
- Prevents race conditions with multiple container instances
- Clear failure mode: migration fails → container doesn't start
Alternatives Considered:
- Auto-run on startup: Convenient but risky, concurrent instances conflict
- Separate migration image: More complexity, two images to maintain
Implementation Components#
1. Dockerfile#
Multi-stage build as described above. Key files:
- `/Dockerfile` (at monorepo root)
- `/nginx.conf` (nginx routing configuration)
- `/entrypoint.sh` (starts nginx + both apps)
- `/.dockerignore` (exclude node_modules, .git, .env, tests)
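The entrypoint script mentioned above could be sketched as follows. The entry-point paths to the built apps are assumptions about the build output, and a real script may need more careful process supervision than this.

```sh
#!/bin/sh
# Sketch of /entrypoint.sh: start both Node apps in the background, then run
# nginx in the foreground so the container exits if nginx dies.
set -e

node apps/appview/dist/index.js &   # assumed appview entry point
APPVIEW_PID=$!

node apps/web/dist/index.js &       # assumed web entry point
WEB_PID=$!

# If the container is stopped or nginx exits, take the apps down too.
trap 'kill $APPVIEW_PID $WEB_PID 2>/dev/null' EXIT

nginx -g 'daemon off;'
```

A tool like `npm-run-all` or a lightweight init such as `tini` is a reasonable alternative if restart-on-crash behavior is wanted inside the container.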
2. GitHub Actions Workflows#
.github/workflows/ci.yml:
- Setup job: checkout, Node.js, pnpm, install deps
- Parallel jobs: lint, test, build
- Matrix strategy for Node.js versions? (optional)
.github/workflows/publish.yml:
- Trigger conditions: main push, tag push
- Use Docker buildx for multi-arch builds (optional)
- Login to GHCR with `GITHUB_TOKEN`
- Build and push with appropriate tags
- Output image SHA for traceability
3. Administrator's Guide#
Document: docs/deployment-guide.md
Sections:
- Prerequisites
  - PostgreSQL 14+ (managed service recommended)
  - Domain name + DNS
  - Container runtime
  - AT Protocol Setup:
    - Forum account DID (forum identity)
    - PDS instance to host forum records
    - Forum credentials (`FORUM_HANDLE`, `FORUM_PASSWORD`)
    - Understanding of lexicon namespace (`space.atbb.*`)
- Quick Start
  - Pull image from GHCR
  - Create `.env.production` from template
  - Run migrations (one-time)
  - Start container
  - Configure reverse proxy (Caddy example)
- Environment Configuration
  - Required: `DATABASE_URL`, `FORUM_DID`, `PDS_URL`, `OAUTH_PUBLIC_URL`, `SESSION_SECRET`
  - Optional: `JETSTREAM_URL`, `SESSION_TTL_DAYS`, `PORT` overrides
  - How to generate `SESSION_SECRET`: `openssl rand -hex 32`
  - Future: `REDIS_URL` (not yet implemented)
- Database Setup
  - PostgreSQL provisioning options
  - Running migrations command
  - Migration troubleshooting
- Running the Container
  - Basic: `docker run -p 80:80 --env-file .env ghcr.io/<org>/atbb:latest`
  - With version: `ghcr.io/<org>/atbb:v1.0.0`
  - Health checks: `/api/health`
- Reverse Proxy Setup
  - Caddy (Recommended): `your-forum.com { reverse_proxy localhost:80 }`
  - Automatic HTTPS via Let's Encrypt
  - Alternatives: nginx, Traefik examples
- Monitoring & Logs
  - Container logs: `docker logs <container-id>`
  - Log format: JSON structured logs
  - Health endpoint
  - Future: Metrics/observability
- Upgrading
  - Pull new image
  - Check release notes for migration requirements
  - Run migrations if needed
  - Stop old container, start new
  - Downtime note: sessions will reset (in-memory storage)
- Troubleshooting
  - Database connection issues
  - PDS connectivity problems
  - OAuth misconfiguration
  - Debug mode via environment variable
- Docker Compose Example
  - Full `docker-compose.example.yml` with:
    - PostgreSQL service
    - atBB service
    - Volume mounts for persistence
    - Network configuration
    - Future: Redis service (commented out)
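The operator-side Caddy setup mentioned in the guide outline is essentially a one-liner. The domain name is a placeholder, and the port assumes the operator published the container's port 80 on localhost:80.

```caddyfile
# Sketch of an operator's Caddyfile: Caddy terminates TLS (automatic via
# Let's Encrypt) and proxies to the container's published port.
your-forum.com {
    reverse_proxy localhost:80
}
```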
4. Supporting Files#
.env.production.example:
- Template for production environment variables
- Comments explaining each variable
- Security notes (SESSION_SECRET generation, etc.)
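A `.env.production.example` along these lines would cover the variables listed in the guide outline. All values below are placeholders, and the variable set is taken from this document rather than verified against the application.

```sh
# Sketch of .env.production.example; every value here is a placeholder.

# Required
DATABASE_URL=postgres://atbb:change-me@db.example.com:5432/atbb
FORUM_DID=did:plc:example
PDS_URL=https://pds.example.com
OAUTH_PUBLIC_URL=https://your-forum.com
# Generate with: openssl rand -hex 32 — never reuse across environments
SESSION_SECRET=change-me

# Optional
# JETSTREAM_URL=wss://jetstream.example.com
# SESSION_TTL_DAYS=30

# Future (not yet implemented)
# REDIS_URL=redis://localhost:6379
```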
docker-compose.example.yml:
- Complete working example for testing
- PostgreSQL with volume persistence
- atBB service with proper depends_on
- Health checks configured
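A compose file matching that description might look like this. Image name, credentials, and healthcheck parameters are placeholders for the real example file.

```yaml
# Sketch of docker-compose.example.yml for local testing.
services:
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: atbb
      POSTGRES_PASSWORD: change-me
      POSTGRES_DB: atbb
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U atbb"]
      interval: 5s
      retries: 10

  atbb:
    image: ghcr.io/<org>/atbb:latest
    ports:
      - "8080:80"
    env_file: .env
    depends_on:
      db:
        condition: service_healthy

  # Future: Redis session storage (not yet supported by the app)
  # redis:
  #   image: redis:7-alpine

volumes:
  pgdata:
```

Note that migrations still need to be run as a one-off command (per the Database Migrations decision) before the `atbb` service is first started.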
Testing Strategy#
Manual Testing Checklist#
Before merging:
- Build Dockerfile locally: `docker build -t atbb:test .`
- Verify image size is reasonable (~200-250MB)
- Run container with test database: `docker run -p 8080:80 --env-file .env.test atbb:test`
- Verify nginx routes work:
  - `curl http://localhost:8080/api/health` → appview
  - `curl http://localhost:8080/` → web
- Test migration command works
- Test docker-compose example
- Test migration command works
- Test docker-compose example
CI/CD Testing#
- PR workflow runs on every push to PR
- Verify it fails when tests fail
- Verify publish workflow only runs after merge
- Check GHCR for published images with correct tags
Security Considerations#
- Secrets Management:
  - Never commit `.env` files
  - Document `SESSION_SECRET` generation
  - Warn operators to secure env files
- Image Security:
  - Use official Node.js alpine images
  - No unnecessary tools in runtime image
  - Regular dependency updates via Dependabot
- Network Security:
  - Container only exposes port 80
  - Recommend TLS termination at reverse proxy
  - Document CSP/security headers for Caddy/nginx
- Database Security:
  - Document least-privilege PostgreSQL user setup
  - SSL/TLS for database connections
  - Regular backup procedures
Open Questions#
- Organization name for GHCR path: what should `<org>` be in `ghcr.io/<org>/atbb`?
- Do we want multi-arch builds (amd64 + arm64)?
- Should we add healthcheck to Dockerfile itself?
- Node.js version matrix in CI (test multiple versions)?
Future Enhancements#
- Redis Integration:
  - Implement Redis session storage in application
  - Update Dockerfile to support Redis connection
  - Document multi-instance deployment
  - Update docker-compose with Redis service
- Observability:
  - Prometheus metrics endpoint
  - Structured logging with levels
  - OpenTelemetry tracing
- Kubernetes:
  - Example k8s manifests
  - Helm chart
  - StatefulSet for potential future needs
- Performance:
  - Static asset caching in nginx
  - Gzip compression
  - HTTP/2 support
Success Criteria#
- Dockerfile builds successfully and produces ~200-250MB image
- Both PR checks and publish workflows pass in GitHub Actions
- Image is published to GHCR with correct tags
- Administrator's guide is clear and complete
- docker-compose.example.yml works out of the box
- Manual testing checklist passes
- Operators can deploy with just image + env file + migrations