mirror of
https://github.com/Kpa-clawbot/meshcore-analyzer.git
synced 2026-05-12 09:24:44 +00:00
57e272494d
## Summary Fixes RCA #2 from #955: the HTTP listener and `/api/stats` go live before background goroutines (pickBestObservation, neighbor graph build) finish, causing CI readiness checks to pass prematurely. ## Changes 1. **`cmd/server/healthz.go`** — New `GET /api/healthz` endpoint: - Returns `503 {"ready":false,"reason":"loading"}` while background init is running - Returns `200 {"ready":true,"loadedTx":N,"loadedObs":N}` once ready 2. **`cmd/server/main.go`** — Added `sync.WaitGroup` tracking pickBestObservation and neighbor graph build goroutines. A coordinator goroutine sets `readiness.Store(1)` when all complete. `backfillResolvedPathsAsync` is NOT gated (async by design, can take 20+ min). 3. **`cmd/server/routes.go`** — Wired `/api/healthz` before system endpoints. 4. **`.github/workflows/deploy.yml`** — CI wait-for-ready loop now polls `/api/healthz` instead of `/api/stats`. 5. **`cmd/server/healthz_test.go`** — Tests for 503-before-ready, 200-after-ready, JSON shape, and anti-tautology gate. ## Rule 18 Verification Built and ran against `test-fixtures/e2e-fixture.db` (499 tx): - With the small fixture DB, init completes in <300ms so both immediate and delayed curls return 200 - Unit tests confirm 503 behavior when `readiness=0` (simulating slow init) - On production DBs with 100K+ txs, the 503 window would be 5-15s (pickBestObservation processes in 5000-tx chunks with 10ms yields) ## Test Results ``` === RUN TestHealthzNotReady --- PASS === RUN TestHealthzReady --- PASS === RUN TestHealthzAntiTautology --- PASS ok github.com/corescope/server 19.662s (full suite) ``` Co-authored-by: you <you@example.com>