Files
meshcore-analyzer/cmd/server
Kpa-clawbot 5629a489b2 perf(distance): lazy build distance index on first request (#1011) (#1597)
## Summary

Build the distance analytics index lazily on the first
`/api/analytics/distance` request instead of eagerly inside `Load()`
(and its background-load chunked merge). Per the triage Fix path on the
issue:

- Eager startup build removed from `Load()` and from
`loadAllPacketsBackground()`'s post-merge pass.
- First request returns `202 Accepted` + `Retry-After: 5` and kicks off
the build in a background goroutine, gated by `sync.Once` so concurrent
first-window requests all observe 202 (single build, not N parallel
O(n²) computations).
- Once built, subsequent requests fall through to the existing
analytics-recomputer / TTL cache and serve 200 as before.
- Debounced rebuild policy: refire only when `Δobs > 5%` since last
build OR `>5 min` elapsed, whichever is more restrictive. Background
loader also resets the gate so the next request rebuilds against the
larger dataset.

Effect: operators who never visit distance analytics no longer pay the
O(n²) construction at startup. Acceptance criteria (a) no startup build,
(b) first request triggers build, (c) concurrent in-flight requests get
202 are encoded as failing-first tests.

## Red → green

- Red: `bc947ad1` — 3 assertion failures (`expected ... empty, got 3`,
`expected 202, got 200`, `expected all 10 ... got 0`).
- Green: `5264b68a` — production change makes them pass, no other tests
regress.

## Files changed

- `cmd/server/store.go` — lazy-build state
(`distLazyMu`/`Once`/`Built`/`Building`/`LastBuilt`/`LastObs`),
`TriggerDistanceIndexBuild`, `DistanceIndexBuilt`,
`DistanceIndexBuilding`; eager `buildDistanceIndex` calls in `Load()`
post-pass and chunked-background-load post-pass removed (Once reset
instead so the next request rebuilds against the full dataset).
- `cmd/server/routes.go` — `/api/analytics/distance` returns 202 +
`Retry-After` until built.
- `cmd/server/distance_lazy_index_test.go` — new tests (the three triage
acceptance criteria).
- `cmd/server/coverage_test.go`, `cmd/server/parity_test.go`,
`cmd/server/routes_test.go`, `cmd/server/hop_disambig_e2e_test.go` —
pre-warm the index via `TriggerDistanceIndexBuild()` +
`DistanceIndexBuilt()` poll where the test asserts the 200 JSON shape.

## Perf justification

Startup cost on a 500K-obs / 2K-node dataset: previously O(n²) hop scan
during `Load()` post-pass and again during the background-load merge —
measured at 10–20s in `specs/startup-audit.md`. New code: zero work at
startup, the same O(n²) work runs at most once per HTTP request cycle
(and only when the index is stale per debounce policy). Cold-path
concurrency is bounded by `sync.Once`, so N parallel first-window
requests never produce N parallel builds.

## Scope

No config field added (debounce thresholds are hardcoded constants per
the triage Fix path — `5%` / `5min`). No public API signature changes.
No DB-side migration. Tests cover the lazy invariant, the
202+Retry-After contract, and concurrent first-request behavior.

Closes #1011

---------

Co-authored-by: Kpa-clawbot <bot@corescope.local>
2026-06-04 23:48:47 -07:00
..