mirror of
https://github.com/Kpa-clawbot/meshcore-analyzer.git
synced 2026-07-02 18:11:43 +00:00
5629a489b2
## Summary Build the distance analytics index lazily on the first `/api/analytics/distance` request instead of eagerly inside `Load()` (and its background-load chunked merge). Per the triage Fix path on the issue: - Eager startup build removed from `Load()` and from `loadAllPacketsBackground()`'s post-merge pass. - First request returns `202 Accepted` + `Retry-After: 5` and kicks off the build in a background goroutine, gated by `sync.Once` so concurrent first-window requests all observe 202 (single build, not N parallel O(n²) computations). - Once built, subsequent requests fall through to the existing analytics-recomputer / TTL cache and serve 200 as before. - Debounced rebuild policy: refire only when `Δobs > 5%` since last build OR `>5 min` elapsed, whichever is more restrictive. Background loader also resets the gate so the next request rebuilds against the larger dataset. Effect: operators who never visit distance analytics no longer pay the O(n²) construction at startup. Acceptance criteria (a) no startup build, (b) first request triggers build, (c) concurrent in-flight requests get 202 are encoded as failing-first tests. ## Red → green - Red: `bc947ad1` — 3 assertion failures (`expected ... empty, got 3`, `expected 202, got 200`, `expected all 10 ... got 0`). - Green: `5264b68a` — production change makes them pass, no other tests regress. ## Files changed - `cmd/server/store.go` — lazy-build state (`distLazyMu`/`Once`/`Built`/`Building`/`LastBuilt`/`LastObs`), `TriggerDistanceIndexBuild`, `DistanceIndexBuilt`, `DistanceIndexBuilding`; eager `buildDistanceIndex` calls in `Load()` post-pass and chunked-background-load post-pass removed (Once reset instead so the next request rebuilds against the full dataset). - `cmd/server/routes.go` — `/api/analytics/distance` returns 202 + `Retry-After` until built. - `cmd/server/distance_lazy_index_test.go` — new tests (the three triage acceptance criteria). - `cmd/server/coverage_test.go`, `cmd/server/parity_test.go`, `cmd/server/routes_test.go`, `cmd/server/hop_disambig_e2e_test.go` — pre-warm the index via `TriggerDistanceIndexBuild()` + `DistanceIndexBuilt()` poll where the test asserts the 200 JSON shape. ## Perf justification Startup cost on a 500K-obs / 2K-node dataset: previously O(n²) hop scan during `Load()` post-pass and again during the background-load merge — measured at 10–20s in `specs/startup-audit.md`. New code: zero work at startup, the same O(n²) work runs at most once per HTTP request cycle (and only when the index is stale per debounce policy). Cold-path concurrency is bounded by `sync.Once`, so N parallel first-window requests never produce N parallel builds. ## Scope No config field added (debounce thresholds are hardcoded constants per the triage Fix path — `5%` / `5min`). No public API signature changes. No DB-side migration. Tests cover the lazy invariant, the 202+Retry-After contract, and concurrent first-request behavior. Closes #1011 --------- Co-authored-by: Kpa-clawbot <bot@corescope.local>