From af03f9aa57cb9409fdbd7ff13fcf133094c1e167 Mon Sep 17 00:00:00 2001 From: you Date: Sun, 5 Apr 2026 07:06:30 +0000 Subject: [PATCH] =?UTF-8?q?docs:=20deployment=20simplification=20spec=20?= =?UTF-8?q?=E2=80=94=20pre-built=20Docker=20images=20+=20one-line=20deploy?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- docs/specs/deployment-simplification.md | 306 ++++++++++++++++++++++++ 1 file changed, 306 insertions(+) create mode 100644 docs/specs/deployment-simplification.md diff --git a/docs/specs/deployment-simplification.md b/docs/specs/deployment-simplification.md new file mode 100644 index 00000000..abee892f --- /dev/null +++ b/docs/specs/deployment-simplification.md @@ -0,0 +1,306 @@ +# Deployment Simplification Spec + +**Status:** Draft +**Author:** Kpa-clawbot +**Date:** 2026-04-05 + +## Current State + +CoreScope deployment today requires: + +1. **Clone the repo** and build from source (`docker compose build`) +2. **Create a config.json** — the example is 100+ lines with MQTT credentials, channel keys, theme colors, regions, cache TTLs, health thresholds, branding, and more. An operator must understand all of this before seeing a single packet. +3. **Set up a Caddyfile** for TLS (separate `caddy-config/` directory, bind-mounted) +4. **Understand the supervisord architecture** — the container runs 4 processes (mosquitto, ingestor, server, caddy) via supervisord. This is opaque to operators. +5. **No pre-built images** — there's no image on Docker Hub or GHCR. Every operator must `git clone` + `docker compose build`. +6. **Updates require rebuilding** — `git pull && docker compose build && docker compose up -d`. No `docker compose pull`. +7. **manage.sh is 100+ lines** of bash wrapping `docker compose` with state files, confirmations, and color output. It's helpful for the maintainer but intimidating for new operators. + +### What works well + +- **Dockerfile is solid** — multi-stage Go build, Alpine runtime, small image +- **Health checks exist** — `wget -qO- http://localhost:3000/api/stats` +- **Environment variable overrides** — ports and data dirs are configurable via `.env` +- **Data persistence** — bind mounts for DB (`~/meshcore-data`), named volume for Caddy certs +- **DISABLE_MOSQUITTO flag** — can use external MQTT broker +- **Graceful shutdown** — `stop_grace_period: 30s`, SIGTERM handling + +### What's painful + +| Pain Point | Impact | +|---|---| +| Must build from source | Blocks anyone without Go/Docker buildx knowledge | +| 100-line config.json required | Operator doesn't know what's optional vs required | +| No sensible defaults for MQTT | Can't connect to public mesh without credentials | +| No pre-built multi-arch images | ARM users (Raspberry Pi) must cross-compile | +| No one-line deploy | Minimum 4 steps: clone, configure, build, start | +| Updates = rebuild | Slow, error-prone, requires git | + +## Goal + +An operator who has never seen the codebase should be able to run CoreScope with: + +```bash +docker run -d -p 80:80 -v corescope-data:/app/data ghcr.io/kpa-clawbot/corescope:latest +``` + +And see live MeshCore packets from the public mesh within 60 seconds. + +## Pre-built Images + +Publish to **GHCR** (`ghcr.io/kpa-clawbot/corescope`) on every release tag. + +- **Tags:** `latest`, `vX.Y.Z`, `vX.Y`, `vX` +- **Architectures:** `linux/amd64`, `linux/arm64` (Raspberry Pi 4/5) +- **Build trigger:** GitHub Actions on `v*` tag push +- **CI workflow:** New job `publish` after existing `deploy`, uses `docker/build-push-action` with QEMU for multi-arch + +```yaml +# .github/workflows/publish.yml (simplified) +on: + push: + tags: ['v*'] +jobs: + publish: + runs-on: ubuntu-latest + permissions: + packages: write + steps: + - uses: actions/checkout@v5 + - uses: docker/setup-qemu-action@v3 + - uses: docker/setup-buildx-action@v3 + - uses: docker/login-action@v3 + with: + registry: ghcr.io + username: ${{ github.actor }} + password: ${{ secrets.GITHUB_TOKEN }} + - uses: docker/build-push-action@v6 + with: + push: true + platforms: linux/amd64,linux/arm64 + tags: | + ghcr.io/kpa-clawbot/corescope:latest + ghcr.io/kpa-clawbot/corescope:${{ github.ref_name }} + build-args: | + APP_VERSION=${{ github.ref_name }} + GIT_COMMIT=${{ github.sha }} + BUILD_TIME=${{ github.event.head_commit.timestamp }} +``` + +## Configuration + +### Hierarchy (highest priority wins) + +1. **Environment variables** — `CORESCOPE_MQTT_BROKER`, `CORESCOPE_PORT`, etc. +2. **`/app/data/config.json`** — full config file (volume-mounted) +3. **Built-in defaults** — work out of the box + +### Environment variables for common settings + +| Variable | Default | Description | +|---|---|---| +| `CORESCOPE_MQTT_BROKER` | `mqtt://localhost:1883` | Primary MQTT broker URL | +| `CORESCOPE_MQTT_TOPIC` | `meshcore/+/+/packets` | MQTT topic pattern | +| `CORESCOPE_PORT` | `3000` | HTTP server port (internal) | +| `CORESCOPE_DB_PATH` | `/app/data/meshcore.db` | SQLite database path | +| `CORESCOPE_SITE_NAME` | `CoreScope` | Branding site name | +| `CORESCOPE_DEFAULT_REGION` | (none) | Default region filter | +| `DISABLE_MOSQUITTO` | `false` | Skip internal MQTT broker | +| `DISABLE_CADDY` | `false` | Skip internal Caddy (when behind reverse proxy) | + +### Built-in defaults that work out of the box + +The Go server and ingestor already have reasonable defaults compiled in. The only missing piece is **a default public MQTT source** so a fresh instance can see packets immediately. Options: + +- **Option A:** Ship with the internal Mosquitto broker only (no external sources). Operator sees an empty dashboard and must configure MQTT. Safe but unhelpful. +- **Option B:** Ship with a public read-only MQTT source pre-configured (e.g., `mqtt.meshtastic.org` or equivalent if one exists for MeshCore). Operator sees live data immediately. Better UX. + +**Recommendation:** Option A as default (safe), with a documented one-liner to add a public source. The config.example.json already shows how to add `mqttSources`. + +## Compose Profiles + +A single `docker-compose.yml` with profiles: + +```yaml +services: + corescope: + image: ghcr.io/kpa-clawbot/corescope:latest + profiles: ["", "standard", "full"] # runs in all profiles + ports: + - "${HTTP_PORT:-80}:80" + volumes: + - ${DATA_DIR:-./data}:/app/data + environment: + - DISABLE_MOSQUITTO=${DISABLE_MOSQUITTO:-false} + - DISABLE_CADDY=${DISABLE_CADDY:-false} + healthcheck: + test: ["CMD", "wget", "-qO-", "http://localhost:3000/api/stats"] + interval: 30s + timeout: 5s + retries: 3 + restart: unless-stopped +``` + +**Note:** Since the container already bundles mosquitto + caddy + server + ingestor via supervisord, "profiles" are really just env var toggles: + +| Profile | DISABLE_MOSQUITTO | DISABLE_CADDY | Use case | +|---|---|---|---| +| **minimal** | `true` | `true` | External MQTT + external reverse proxy | +| **standard** (default) | `false` | `true` | Internal MQTT, no TLS (behind nginx/traefik) | +| **full** | `false` | `false` | Everything including Caddy auto-TLS | + +This avoids splitting into separate compose services. The monolithic container is actually fine for this use case — it's a single-purpose appliance. + +## One-Line Deploy + +### Simplest (Docker run, no TLS) + +```bash +docker run -d --name corescope \ + -p 80:80 \ + -v corescope-data:/app/data \ + -e DISABLE_CADDY=true \ + ghcr.io/kpa-clawbot/corescope:latest +``` + +### With Docker Compose + +```bash +curl -sL https://raw.githubusercontent.com/Kpa-clawbot/CoreScope/master/docker-compose.simple.yml -o docker-compose.yml +docker compose up -d +``` + +Where `docker-compose.simple.yml` is a minimal 15-line file shipped in the repo. + +## Update Path + +```bash +docker compose pull +docker compose up -d +``` + +Or for `docker run` users: + +```bash +docker pull ghcr.io/kpa-clawbot/corescope:latest +docker stop corescope && docker rm corescope +docker run -d --name corescope ... # same args as before +``` + +No rebuild. No git pull. No source code needed. + +## Data Persistence + +| Path | Content | Mount | +|---|---|---| +| `/app/data/meshcore.db` | SQLite database (all packets, nodes) | Required volume | +| `/app/data/config.json` | Custom configuration (optional) | Same volume | +| `/app/data/theme.json` | Custom theme (optional) | Same volume | +| `/data/caddy` | TLS certificates (Caddy-managed) | Named volume (automatic) | + +**Backup:** `cp ~/corescope-data/meshcore.db ~/backup/` — it's just a SQLite file. + +**Migration:** Existing `~/meshcore-data` directories work unchanged. Just point the volume at the same path. + +## TLS/HTTPS + +### Option 1: Caddy auto-TLS (built-in) + +The container ships Caddy. To enable auto-TLS: + +1. Mount a custom Caddyfile: + ```bash + docker run -d \ + -p 80:80 -p 443:443 \ + -v corescope-data:/app/data \ + -v caddy-certs:/data/caddy \ + -v ./Caddyfile:/etc/caddy/Caddyfile:ro \ + ghcr.io/kpa-clawbot/corescope:latest + ``` + +2. Caddyfile: + ``` + your-domain.com { + reverse_proxy localhost:3000 + } + ``` + +### Option 2: External reverse proxy (recommended for production) + +Run with `DISABLE_CADDY=true` and put nginx/traefik/cloudflare in front. This is the standard approach and what most operators already have. + +## Health Checks + +Already implemented. The container health check hits `/api/stats`: + +```bash +# From outside the container +curl -f http://localhost/api/stats + +# Response includes packet counts, node counts, uptime +``` + +Docker will mark the container as `healthy`/`unhealthy` automatically. + +## Monitoring + +**Future (M5 from RF health spec):** Expose a `/metrics` Prometheus endpoint with: + +- `corescope_packets_total` — total packets ingested +- `corescope_nodes_active` — currently active nodes +- `corescope_mqtt_connected` — MQTT connection status +- `corescope_ingestor_lag_seconds` — time since last packet + +This is not required for the deployment simplification work but should be designed alongside it. + +## Migration from Current Setup + +For existing operators using `manage.sh` + build-from-source: + +1. **Keep your data directory** — the bind mount path is the same +2. **Keep your config.json** — it goes in the data directory as before +3. **Replace `docker compose build`** with `docker compose pull` +4. **Update docker-compose.yml** — change `build:` to `image: ghcr.io/kpa-clawbot/corescope:latest` +5. **manage.sh continues to work** — it wraps `docker compose` and will work with pre-built images + +**Breaking changes:** None expected. The container interface (ports, volumes, env vars) stays the same. + +## Milestones + +### M1: Pre-built images (1-2 days) +- [ ] Create `.github/workflows/publish.yml` for multi-arch builds +- [ ] Push a test `v0.x.0` tag and verify image on GHCR +- [ ] Update README with `docker run` quickstart +- [ ] Create `docker-compose.simple.yml` (minimal compose file using pre-built image) + +### M2: Environment variable configuration (1 day) +- [ ] Add env var parsing to Go server `config.go` (overlay on config.json) +- [ ] Add env var parsing to Go ingestor +- [ ] Add `DISABLE_CADDY` support to `entrypoint-go.sh` +- [ ] Document all env vars in README + +### M3: Sensible defaults (0.5 day) +- [ ] Ensure server starts with zero config (no config.json required) +- [ ] Verify ingestor connects to localhost MQTT by default +- [ ] Test: `docker run` with no config produces a working (empty) dashboard + +### M4: Documentation + migration guide (0.5 day) +- [ ] Write operator-facing deployment docs in `docs/deployment.md` +- [ ] Migration guide for existing users +- [ ] One-page quickstart + +**Total estimate:** 3-4 days of work. + +## Torvalds Review + +> "Is this over-engineered?" + +The spec is intentionally simple. Key decisions: + +1. **No Kubernetes manifests, Helm charts, or Terraform.** Just Docker. +2. **No config management system.** Env vars + optional JSON file. +3. **Keep the monolithic container.** Splitting into 4 separate services (server, ingestor, mosquitto, caddy) would be "proper" microservices but is worse for operators who just want one thing to run. The supervisord approach is fine for an appliance. +4. **No custom CLI tool.** `docker compose` is the interface. +5. **Profiles are just env vars**, not separate compose files or services. + +The simplest version is literally just M1: publish the existing image to GHCR. Everything else is polish. An operator can already `docker run` the image — they just can't `docker pull` it because it's not published anywhere.