+name: Deploy Service (reusable)
+
+on:
+  workflow_call:
+    inputs:
+      service:
+        required: true
+        type: string
+        description: "Service name (matches atelier.services.<name>)"
+      host:
+        required: false
+        type: string
+        default: terebithia
+        description: "Tailscale hostname to deploy to"
+      health_url:
+        required: false
+        type: string
+        description: "URL to check after deploy (omit to skip health check)"
+      branch:
+        required: false
+        type: string
+        default: main
+      db_path:
+        required: false
+        type: string
+        description: "SQLite DB path for pre-deploy snapshot (e.g. /var/lib/cachet/data/cachet.db)"
+    secrets:
+      TS_OAUTH_CLIENT_ID:
+        required: true
+      TS_OAUTH_SECRET:
+        required: true
+
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    environment:
+      name: production
+      url: ${{ inputs.health_url }}
+
+    concurrency:
+      group: deploy-${{ inputs.service }}
+      cancel-in-progress: false
+
+    steps:
+      - name: Setup Tailscale
+        uses: tailscale/github-action@v3
+        with:
+          oauth-client-id: ${{ secrets.TS_OAUTH_CLIENT_ID }}
+          oauth-secret: ${{ secrets.TS_OAUTH_SECRET }}
+          tags: tag:deploy
+          use-cache: "true"
+
+      - name: Configure SSH
+        run: |
+          mkdir -p ~/.ssh
+          echo "StrictHostKeyChecking accept-new" >> ~/.ssh/config
+
+      - name: Deploy
+        run: |
+          ssh ${{ inputs.service }}@${{ inputs.host }} << 'EOF'
+            set -e
+            cd ~/app
+
+            git fetch --all
+            git rev-parse HEAD > /tmp/${{ inputs.service }}-prev-commit
+
+            # snapshot SQLite DB before any changes
+            DB_PATH="${{ inputs.db_path }}"
+            if [ -n "$DB_PATH" ] && [ -f "$DB_PATH" ]; then
+              echo ":: snapshotting $DB_PATH"
+              cp "$DB_PATH" "$DB_PATH.pre-deploy"
+            fi
+
+            git reset --hard origin/${{ inputs.branch }}
+            bun install --frozen-lockfile
+            sudo /run/current-system/sw/bin/systemctl restart ${{ inputs.service }}.service
+          EOF
+
+      - name: Health check
+        if: inputs.health_url != ''
+        run: |
+          for i in $(seq 1 12); do
+            echo ":: attempt $i/12"
+            # -w prints the status code even on failure ("000" if no response),
+            # so no fallback echo is needed; "|| true" keeps bash -e from aborting
+            HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" "${{ inputs.health_url }}" 2>/dev/null || true)
+
+            if [ "$HTTP_CODE" = "200" ]; then
+              echo ":: ${{ inputs.service }} is healthy"
+              exit 0
+            fi
+
+            echo ":: HTTP $HTTP_CODE; retrying in 5s"
+            if [ "$i" -lt 12 ]; then sleep 5; fi
+          done
+          echo ":: health check failed after 12 attempts"
+          exit 1
+
+      - name: Check systemd status
+        if: inputs.health_url == ''
+        run: |
+          for i in $(seq 1 6); do
+            echo ":: attempt $i/6"
+            # is-active prints the state and exits non-zero unless it is "active";
+            # default to "unknown" only when ssh produced no output at all
+            STATUS=$(ssh ${{ inputs.service }}@${{ inputs.host }} \
+              "systemctl is-active ${{ inputs.service }}.service" 2>/dev/null || true)
+            STATUS="${STATUS:-unknown}"
+
+            if [ "$STATUS" = "active" ]; then
+              echo ":: ${{ inputs.service }} is active"
+              exit 0
+            fi
+
+            echo ":: status: $STATUS; retrying in 5s"
+            if [ "$i" -lt 6 ]; then sleep 5; fi
+          done
+          echo ":: service not active after 6 attempts"
+          exit 1
+
+      - name: Rollback on failure
+        if: failure()
+        run: |
+          ssh ${{ inputs.service }}@${{ inputs.host }} << 'EOF'
+            set -e
+            cd ~/app
+
+            PREV=$(cat /tmp/${{ inputs.service }}-prev-commit 2>/dev/null || echo "")
+            if [ -z "$PREV" ]; then
+              echo ":: no previous commit recorded, cannot rollback"
+              exit 1
+            fi
+
+            echo ":: rolling back to $PREV"
+
+            # restore DB snapshot if one exists
+            DB_PATH="${{ inputs.db_path }}"
+            if [ -n "$DB_PATH" ] && [ -f "$DB_PATH.pre-deploy" ]; then
+              echo ":: restoring DB snapshot"
+              sudo /run/current-system/sw/bin/systemctl stop ${{ inputs.service }}.service || true
+              cp "$DB_PATH.pre-deploy" "$DB_PATH"
+            fi
+
+            git reset --hard "$PREV"
+            bun install --frozen-lockfile
+            sudo /run/current-system/sw/bin/systemctl restart ${{ inputs.service }}.service
+
+            echo ":: rolled back ${{ inputs.service }} to $PREV"
+          EOF
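
The "Health check" and "Check systemd status" steps above share one shape: run a probe, stop on success, sleep between attempts, fail after a fixed count. A minimal sketch of that loop as a reusable function follows. The name `retry` and its argument order are assumptions for illustration and do not appear in the workflow.

```shell
# Hypothetical helper (not part of the workflow): poll a command until it
# succeeds or the attempts run out.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    echo ":: attempt $i/$attempts"
    # Run the probe; any zero exit status counts as success.
    if "$@"; then
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

With this in place, the HTTP probe could plausibly be written as `retry 12 5 curl -fsS -o /dev/null "$URL"`, where `$URL` stands in for the `health_url` input. Here `-f` is useful because only the exit status is inspected, not the printed status code.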
+69-1
README.md
 atuin sync
 ```
 
+## Deployment
+
+Two deploy paths: **infrastructure** (NixOS config changes in this repo) and **application code** (per-service repos).
+
+### Infrastructure
+
+Pushing to `main` here triggers `.github/workflows/deploy.yaml`, which runs `deploy-rs` over Tailscale to rebuild NixOS on the target machine.
+
+```sh
+# manual deploy
+nix run 'github:serokell/deploy-rs' -- --remote-build --ssh-user kierank .
+```
+
+### Application code
+
+Each service repo has a minimal workflow calling the reusable `.github/workflows/deploy-service.yml`. On push to `main`, it:
+
+1. Connects to Tailscale (`tag:deploy`)
+2. SSHes as the **service user** (e.g., `cachet@terebithia`) via Tailscale SSH
+3. Fetches, records the current commit, and snapshots the SQLite DB (if `db_path` is provided)
+4. Runs `git reset --hard origin/<branch>` + `bun install --frozen-lockfile` + `sudo systemctl restart`
+5. Runs a health check (HTTP URL, or systemd status as a fallback)
+6. Rolls back automatically on failure (restores the DB snapshot and reverts to the previous commit)
+
+Per-app workflow (copy and change the `with:` values):
+
+```yaml
+name: Deploy
+on:
+  push:
+    branches: [main]
+  workflow_dispatch:
+jobs:
+  deploy:
+    uses: taciturnaxolotl/dots/.github/workflows/deploy-service.yml@main
+    with:
+      service: cachet
+      health_url: https://cachet.dunkirk.sh/health
+      db_path: /var/lib/cachet/data/cachet.db
+    secrets:
+      TS_OAUTH_CLIENT_ID: ${{ secrets.TS_OAUTH_CLIENT_ID }}
+      TS_OAUTH_SECRET: ${{ secrets.TS_OAUTH_SECRET }}
+```
+
+Omit `health_url` to fall back to `systemctl is-active`. Omit `db_path` for stateless services.
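
The optional `db_path` behaviour can be tried locally without a server. The sketch below mirrors the snapshot and restore logic from the workflow's deploy and rollback steps. The `.pre-deploy` suffix comes from the workflow; the function names `snapshot_db` and `restore_db` are assumptions made for this example.

```shell
# Hypothetical helpers mirroring the workflow's snapshot/rollback logic.

# Copy the DB aside before a deploy. An empty or missing path is a no-op,
# which is how a stateless service (no db_path) behaves.
snapshot_db() {
  local db_path="${1:-}"
  if [ -n "$db_path" ] && [ -f "$db_path" ]; then
    cp "$db_path" "$db_path.pre-deploy"
  fi
}

# Put the snapshot back if one exists; otherwise do nothing and succeed.
restore_db() {
  local db_path="${1:-}"
  if [ -n "$db_path" ] && [ -f "$db_path.pre-deploy" ]; then
    cp "$db_path.pre-deploy" "$db_path"
  fi
}
```

Like the workflow, this uses a plain `cp`. For a database that may be written to during the copy, the `sqlite3` CLI's `.backup` command is the usual safer choice, since a file copy taken mid-write can be inconsistent.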
+
+### mkService
+
+`modules/lib/mkService.nix` standardizes service modules. A call to `mkService { ... }` provides:
+
+- Systemd service with initial git clone (subsequent deploys via GitHub Actions)
+- Caddy reverse proxy with TLS via Cloudflare DNS and optional rate limiting
+- Data declarations (`sqlite`, `postgres`, `files`) that feed into automatic backups
+- Dedicated system user with passwordless sudo for `systemctl restart`/`stop`/`start`/`status` on its own unit (enables per-user Tailscale ACLs)
+- Port conflict detection, security hardening, agenix secrets
+
+Adding a new service: create a module in `modules/nixos/services/`, enable it in `machines/terebithia/default.nix`, and add a deploy workflow to the app repo. See `modules/nixos/services/cachet.nix` for a minimal example.
+
+### Secrets (agenix)
+
+Secrets are encrypted in `secrets/*.age` and declared in `secrets/secrets.nix`. Reference them as `config.age.secrets.<name>.path`; they are decrypted at activation time to `/run/agenix/`.
+
+```sh
+cd secrets && agenix -e myapp.age # create/edit a secret
+```
+
 ## Backups
 
-Services are automatically backed up nightly using restic to Backblaze B2. The `atelier-backup` CLI provides an interactive TUI for managing backups:
+Services are automatically backed up nightly using restic to Backblaze B2. Backup targets are auto-discovered from `data.sqlite`/`data.postgres`/`data.files` declarations in mkService modules.
+
+The `atelier-backup` CLI provides an interactive TUI for managing backups:
 
 ```bash
 sudo atelier-backup # Interactive menu
+5-7
modules/lib/mkService.nix
 
   users.groups.${name} = {};
 
-  # Allow service user to restart their own service
+  # Allow service user to manage their own service (for CI/CD deploys)
   security.sudo.extraRules = [
     {
       users = [ name ];
-      commands = [
-        {
-          command = "/run/current-system/sw/bin/systemctl restart ${name}.service";
-          options = [ "NOPASSWD" ];
-        }
-      ];
+      commands = map (cmd: {
+        command = "/run/current-system/sw/bin/systemctl ${cmd} ${name}.service";
+        options = [ "NOPASSWD" ];
+      }) [ "restart" "stop" "start" "status" ];
     }
   ];
 