Commit Graph

11 Commits

Author SHA1 Message Date
Kpa-clawbot 2e28aa3e04 fix(#1229): source-diversity confidence weighting in neighbor-graph tier-1 resolver (#1235)
RED 235b65b4 (CI will surface URL after PR open) — `test(#1229): tier-1
must prefer multi-observer edges`. Green: 841fc5de.

## Summary
Implements **Option C** from issue #1229: edge source-diversity
confidence weighting. Each neighbor-graph edge already tracks the set of
distinct observers that contributed to it (`NeighborEdge.Observers`).
This PR is the first to consume that signal in the disambiguator.

Tier-1 score in `pm.resolveWithContext` becomes `Score(now) ×
Confidence()` where:

```
Confidence() = min(1.0, max(1, |Observers|) / 3.0)
```

- 1 observer → 1/3 weight (single-source, suspect)
- 2 observers → 2/3 weight
- ≥3 observers → 1.0 (saturated, full historical weight)

A 6-observer edge (30 obs) now beats a 1-observer edge (25 obs) by 3.6×
(vs. 1.2× before) — enough to clear `affinityConfidenceRatio` and skip
the tier-2 geo fallback that was misresolving in cross-region cases.
Stacks with the geo-rejection filter merged in #1228/#1230 to give two
independent defenses against cross-region prefix-collision pollution.

## Why C over A/B
- **A (per-observer graphs):** N×memory cost, biggest refactor surface.
- **B (per-region/IATA segmented):** requires region attribution on
every packet + per-region cache plumbing; deferred follow-up.
- **C:** smallest diff (~30 lines), no schema migration, leverages an
existing field, composes additively with #1228.

A and B remain valid follow-ups if C proves insufficient.

## Backward compatibility (persistence)
`neighbor_edges` schema is **unchanged**. `Observers` is rebuilt by
`BuildFromStoreWithOptions` from live observations on every graph
refresh (5-min TTL). Persisted rows carry an empty set only during the
post-restart warm-up; `Confidence()` defaults n→1 when `|Observers|==0`,
so legacy rows resolve as single-observer (degraded but non-zero)
confidence rather than disappearing. Defensive.

## Tests
- `cmd/server/hop_disambig_confidence_test.go:48` — RED-then-GREEN E2E:
two `8a` candidates from the same anchor, candX placed geo-near with 1
observer × 25 obs, candY placed geo-far with 6 observers × 5 obs.
Without confidence weighting tier-1 falls through (1.2× ratio) and
tier-2 picks the wrong (geo-near) candX. With confidence weighting
tier-1 fires and picks candY. Asserts `method == "neighbor_affinity"` to
pin the resolver path.
- `TestNeighborEdge_ObserverSetIsDistinct` — guards the source-diversity
counter against double-counting same-observer contributions and pins the
`Confidence()` formula at both endpoints (single → fractional, ≥3 →
1.0).

All existing tier-1 tests (`hop_disambig_tier1_test.go`) continue to
pass — they seed with a single observer, so their weights drop from 1.0
to 1/3 uniformly across candidates, preserving the ratio guard outcome.

Fixes #1229

---------

Co-authored-by: bot <bot@corescope.local>
2026-05-16 19:55:00 +00:00
Kpa-clawbot 7179afcfde feat(#1228): reject geo-implausible neighbor-graph edges at build time (#1230)
Fixes #1228 — geo-implausible neighbor-graph edges are rejected at build
time.

Red commit: `5a6d9660` — failing tests for 4 cases (reject SF↔Berlin,
accept local CA, accept no-GPS endpoint, counter increments). Live CI
run (latest commit):
https://github.com/Kpa-clawbot/CoreScope/actions?query=branch%3Afix%2Fissue-1228

## Why

The disambiguator's tier-1 affinity graph is built blindly from path
co-occurrence. On wide-geo MQTT deployments, a single bad hop
disambiguation seeds an edge across geographically impossible distances
(e.g. Bay Area ↔ Berlin), which then reinforces the same wrong
resolution next time. Self-poisoning spiral.

## What changed

- `upsertEdge` now consults a per-graph GPS index. When **both**
endpoints have known GPS and their haversine distance exceeds the
threshold, the edge is dropped and `NeighborGraph.RejectedEdgesGeoFar`
(atomic) is incremented.
- Either endpoint missing GPS ⇒ accept (no signal to reject), per
acceptance criteria.
- Threshold is configurable via `neighborGraph.maxEdgeKm` (default **500
km** — well above any plausible terrestrial LoRa hop, including
satellite-assisted). 0 ⇒ use default; negative ⇒ disable the filter.
Exposed via `Config.NeighborMaxEdgeKm()`.
- New `BuildFromStoreWithOptions` carrying the threshold;
`BuildFromStore` and `BuildFromStoreWithLog` are kept as thin wrappers.
- Stats are surfaced under `GET /api/analytics/neighbor-graph` as
`stats.rejected_edges_geo_far`.
- All rejection logs PII-truncate pubkeys to 8 hex chars (public repo
discipline).
- `config.example.json` updated with the new field + comment.

## Follow-up

#1229 (per-region scoped affinity graphs) depends on this landing first.

---------

Co-authored-by: corescope-bot <bot@corescope.local>
2026-05-16 10:14:44 -07:00
Kpa-clawbot dbb013a6bf test(#1201): regression coverage for hop disambiguator tier-1 + end-to-end top-hops fixture (#1202)
Mutation test confirmed: reverting cmd/server/store.go:2975
(`setContext(buildHopContextPubkeys(tx, pm))` → `setContext(nil)`) in
`buildDistanceIndex` produces failing assertion in
`TestTopHopsRespectsContextAcrossAllCallSites`: top-hops ranking flips
to `72dddd→8acccc@13.0km` (Berlin↔Berlin misresolution), CA↔CA pair
absent. After reverting the mutation, the test passes again.

Fixes #1201

## Summary
Pure test addition. No production code changed. Adds regression coverage
for the hop disambiguator's tier-1 (neighbor affinity) path and an
end-to-end fixture that catches revert-to-nil-context regressions across
all 9 call sites of `pm.resolveWithContext`.

## Sub-tasks (all 4 landed)

1. **Tier-1 explicit** — `hop_disambig_tier1_test.go`:
   - `Tier1_StrongAffinityPicksX` (strong-X edge wins)
- `Tier1_StrongAffinityPicksY` (reverse weights — proves score is read)
   - `Tier1_AmbiguousEdgeSkipsToTier2` (`Ambiguous=true` → skip)
2. **Tier ordering** — `Tier1_BeatsTier2WhenBothSignal` (tier 1 wins
when both signal)
3. **Tier-1 fallback** —
   - `Tier1_EmptyGraphFallsThrough` (graph has no edges for context)
   - `Tier1_NilGraphFallsThrough` (graph is nil)
- `Tier1_ScoresTooCloseFallsThrough` (best < `affinityConfidenceRatio` ×
runner-up)
4. **End-to-end fixture** — `hop_disambig_e2e_test.go`:
- 9 nodes with intentional prefix collisions across SLO/LA/NYC/Berlin
(prefix `72`) and SF/CA/Berlin (prefix `8a`); Berlin candidates have
`obsCount=200` so they'd win tier-3 absent context.
   - 50 transmissions path `["72","8a"]`, sender + observer in CA.
- Affinity graph seeded with strong `sender↔72aa` and `sender↔8aaa`
edges.
- Asserts: CA↔CA hop present, no Berlin pubkeys in `distHops`, max
distance < 300 km cap.

## TDD exemption
Net-new regression-sentinel tests for behavior already correct on master
post-#1198. Each test passed on first run (no production bug surfaced).
The mutation test on sub-task 4 is the gating proof: forcing
`setContext(nil)` at `store.go:2975` makes the test fail with the exact
misresolution class the issue describes (Berlin↔Berlin leaks into
top-hops).

## Acceptance criteria
- [x] Tier-1 affinity test added with 3 cases
- [x] Tier-ordering test added
- [x] Tier-1 fallback tests added (nil / empty / scores-too-close)
- [x] End-to-end fixture added with multi-candidate-prefix nodes
- [x] End-to-end fixture fails if any call site reverts to `nil` context
(mutation-verified)
- [x] Test files live in `cmd/server/` alongside
`prefix_map_role_test.go`

---------

Co-authored-by: openclaw-bot <bot@openclaw.local>
Co-authored-by: corescope-bot <bot@corescope.local>
2026-05-15 20:24:55 -07:00
Kpa-clawbot 353c5264ad fix(#1197): plumb hop-context + observation-count tiebreak to disambiguator (#1198)
Red commit: 5ffdf6b07c (CI run: pending —
see PR Checks tab)

Fixes #1197

## What this changes

Two-part fix matching the issue spec:

1. **Tier-3/4 tiebreak by observation count, not slice order**
(`store.go` resolver + `getAllNodes`).
- Plumbs `nodes.advert_count` → new `nodeInfo.ObservationCount` field
via the existing `getAllNodes` query (graceful fallback when the column
is absent on legacy DBs).
- `resolveWithContext` tier 3 (GPS preference) now picks the GPS-having
candidate with the highest observation count.
- Tier 4 (no-GPS fallback) likewise picks by observation count instead
of `candidates[0]`.
2. **Plumb hop-context to the resolver** at all four call sites called
out in the issue.
- New `buildHopContextPubkeys(tx, pm)` collects: sender pubkey from
`tx.DecodedJSON.pubKey`, observer pubkey from `tx.ObserverID`, plus
unambiguous-prefix anchors (single-candidate prefixes in the path).
- Wired into the four sites: broadcast distance compute (~1707),
recompute-on-path-change (~2944), `buildDistanceIndex` (~2982),
`computeAnalyticsTopology` (~5125).
- Per-tx hop caches were moved inside the per-tx loop on the distance
paths since context now varies per tx (was safely shared before only
because every caller passed `nil`).
- `computeAnalyticsTopology` aggregates context across the analytics
scan rather than per-tx because `resolveHop` is called outside the scan
loop downstream.

## Tests

Red→green pairs visible in the commit history:

- Pair A — tier-3 observation-count tiebreak
(`TestResolveWithContext_Tier3_PicksHigherObservationCount`).
- Pair B — context plumbing
(`TestBuildHopContextPubkeys_IncludesSenderAndUnambiguousAnchors`) +
tier-2 geo-proximity
(`TestResolveWithContext_Tier2_PicksGeographicallyCloserCandidate`).

`go test ./...` green on `cmd/server`.

## Out of scope (per issue)

300 km hop cap, API confidence/alternative-count surfacing, firmware
prefix-collision space — all explicitly excluded in #1197.

---------

Co-authored-by: openclaw-bot <bot@openclaw.local>
Co-authored-by: corescope-bot <bot@corescope.local>
Co-authored-by: Kpa-clawbot <bot@kpa-clawbot.local>
2026-05-15 09:16:39 -07:00
Kpa-clawbot a815e70975 feat: Clock skew detection — backend computation (M1) (#746)
## Summary

Implements **Milestone 1** of #690 — backend clock skew computation for
nodes and observers.

## What's New

### Clock Skew Engine (`clock_skew.go`)

**Phase 1 — Raw Skew Calculation:**
For every ADVERT observation: `raw_skew = advert_timestamp -
observation_timestamp`

**Phase 2 — Observer Calibration:**
Same packet seen by multiple observers → compute each observer's clock
offset as the median deviation from the per-packet median observation
timestamp. This identifies observers with their own clock drift.

**Phase 3 — Corrected Node Skew:**
`corrected_skew = raw_skew + observer_offset` — compensates for observer
clock error.

**Phase 4 — Trend Analysis:**
Linear regression over time-ordered skew samples estimates drift rate in
seconds/day. Detects crystal drift vs stable offset vs sudden jumps.

### Severity Classification

| Level | Threshold | Meaning |
|-------|-----------|---------|
|  OK | < 5 min | Normal |
| ⚠️ Warning | 5 min – 1 hour | Clock drifting |
| 🔴 Critical | 1 hour – 30 days | Likely no time source |
| 🟣 Absurd | > 30 days | Firmware default or epoch 0 |

### New API Endpoints

- `GET /api/nodes/{pubkey}/clock-skew` — per-node skew data (mean,
median, last, drift, severity)
- `GET /api/observers/clock-skew` — observer calibration offsets
- Clock skew also included in `GET /api/nodes/{pubkey}/analytics`
response as `clockSkew` field

### Performance

- 30-second compute cache avoids reprocessing on every request
- Operates on in-memory `byPayloadType[ADVERT]` index — no DB queries
- O(n) in total ADVERT observations, O(m log m) for median calculations

## Tests

15 unit tests covering:
- Severity classification at all thresholds
- Median/mean math helpers
- ISO timestamp parsing
- Timestamp extraction from decoded JSON (nested and top-level)
- Observer calibration with single and multi-observer scenarios
- Observer offset correction direction (verified the sign is
`+obsOffset`)
- Drift estimation: stable, linear, insufficient data, short time span
- JSON number extraction edge cases

## What's NOT in This PR

- No UI changes (M2–M4)
- No customizer integration (M5)
- Thresholds are hardcoded constants (will be configurable in M5)

Implements #690 M1.

---------

Co-authored-by: you <you@example.com>
2026-04-14 23:22:35 -07:00
Kpa-clawbot 9917d50622 fix: resolve neighbor graph duplicate entries from different prefix lengths (#699)
## Problem

The neighbor graph creates separate entries for the same physical node
when observed with different prefix lengths. For example, a 1-byte
prefix `B0` (ambiguous, unresolved) and a 2-byte prefix `B05B` (resolved
to Busbee) appear as two separate neighbors of the same node.

Fixes #698

## Solution

### Part 1: Post-build resolution pass (Phase 1.5)

New function `resolveAmbiguousEdges(pm, graph)` in `neighbor_graph.go`:
- Called after `BuildFromStore()` completes the full graph, before any
API use
- Iterates all ambiguous edges and attempts resolution via
`resolveWithContext` with full graph context
- Only accepts high-confidence resolutions (`neighbor_affinity`,
`geo_proximity`, `unique_prefix`) — rejects
`first_match`/`gps_preference` fallbacks to avoid false positives
- Merges with existing resolved edges (count accumulation, max LastSeen)
or updates in-place
- Phase 1 edge collection loop is **unchanged**

### Part 2: API-layer dedup (defense-in-depth)

New function `dedupPrefixEntries()` in `neighbor_api.go`:
- Scans neighbor response for unresolved prefix entries matching
resolved pubkey entries
- Merges counts, timestamps, and observers; removes the unresolved entry
- O(n²) on ~50 neighbors per node — negligible cost

### Performance

Phase 1.5 runs O(ambiguous_edges × candidates). Per Carmack's analysis:
~50ms at 2K nodes on the 5-min rebuild cycle. Hot ingest path untouched.

## Tests

9 new tests in `neighbor_dedup_test.go`:

1. **Geo proximity resolution** — ambiguous edge resolved when candidate
has GPS near context node
2. **Merge with existing** — ambiguous edge merged into existing
resolved edge (count accumulation)
3. **No-match preservation** — ambiguous edge left as-is when prefix has
no candidates
4. **API dedup** — unresolved prefix merged with resolved pubkey in
response
5. **Integration** — node with both 1-byte and 2-byte prefix
observations shows single neighbor entry
6. **Phase 1 regression** — non-ambiguous edge collection unchanged
7. **LastSeen preservation** — merge keeps higher LastSeen timestamp
8. **No-match dedup** — API dedup doesn't merge non-matching prefixes
9. **Benchmark** — Phase 1.5 with 500+ edges

All existing tests pass (server + ingestor).

---------

Co-authored-by: you <you@example.com>
2026-04-10 11:19:54 -07:00
Kpa-clawbot 767c8a5a3e perf: async chunked backfill — HTTP serves within 2 minutes (#612) (#614)
## Summary

Adds two config knobs for controlling backfill scope and neighbor graph
data retention, plus removes the dead synchronous backfill function.

## Changes

### Config knobs

#### `resolvedPath.backfillHours` (default: 24)
Controls how far back (in hours) the async backfill scans for
observations with NULL `resolved_path`. Transmissions with `first_seen`
older than this window are skipped, reducing startup time for instances
with large historical datasets.

#### `neighborGraph.maxAgeDays` (default: 30)
Controls the maximum age of `neighbor_edges` entries. Edges with
`last_seen` older than this are pruned from both SQLite and the
in-memory graph. Pruning runs on startup (after a 4-minute stagger) and
every 24 hours thereafter.

### Dead code removal
- Removed the synchronous `backfillResolvedPaths` function that was
replaced by the async version.

### Implementation details
- `backfillResolvedPathsAsync` now accepts a `backfillHours` parameter
and filters by `tx.FirstSeen`
- `NeighborGraph.PruneOlderThan(cutoff)` removes stale edges from the
in-memory graph
- `PruneNeighborEdges(conn, graph, maxAgeDays)` prunes both DB and
in-memory graph
- Periodic pruning ticker follows the same pattern as metrics pruning
(24h interval, staggered start)
- Graceful shutdown stops the edge prune ticker

### Config example
Both knobs added to `config.example.json` with `_comment` fields.

## Tests
- Config default/override tests for both knobs
- `TestGraphPruneOlderThan` — in-memory edge pruning
- `TestPruneNeighborEdgesDB` — SQLite + in-memory pruning together
- `TestBackfillRespectsHourWindow` — verifies old transmissions are
excluded by backfill window

---------

Co-authored-by: you <you@example.com>
2026-04-05 09:49:39 -07:00
Kpa-clawbot cbfce41d7e perf: optimize neighbor graph build (3 fixes for 30s+ CPU) (#562)
## Summary

Fixes critical performance issue in neighbor graph computation that
consumed 65% of CPU (30+ seconds) on a 325K packet dataset.

## Changes

### Fix 1: Cache strings.ToLower results
- Added cachedToLower() helper that caches lowercased strings in a local
map
- Pubkeys repeat across hundreds of thousands of observations
- Pre-computes fromLower once per transaction instead of once per
observation
- **Impact:** Eliminates ~8.4s (25.3% CPU)

### Fix 2: Cache parsed DecodedJSON via StoreTx.ParsedDecoded()
- Added ParsedDecoded() method on StoreTx using sync.Once for
thread-safe lazy caching
- json.Unmarshal on decoded_json now runs at most once per packet
lifetime
- Result reused by extractFromNode, indexByNode, trackAdvertPubkey
- **Impact:** Eliminates ~8.8s (26.3% CPU)

### Fix 3: Extend neighbor graph TTL from 60s to 5 minutes
- The graph depends on traffic patterns, not individual packets
- Reduces rebuild frequency 5x
- **Impact:** ~80% reduction in sustained CPU from graph rebuilds

## Tests

- 7 new tests added, all 26+ existing neighbor graph tests pass
- BenchmarkBuildFromStore: 727us/op, 237KB/op, 6030 allocs/op

Related: #559

---------

Co-authored-by: Kpa-clawbot <259247574+Kpa-clawbot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: you <you@example.com>
2026-04-04 01:25:51 -07:00
Kpa-clawbot 0e1beac52f fix: neighbor affinity graph empty results + performance + accessibility (#523) (#524)
## Summary

Fixes the neighbor affinity graph returning empty results despite
abundant ADVERT data in the store.

**Root cause:** `extractFromNode()` in `neighbor_graph.go` only checked
for `"from_node"` and `"from"` fields in the decoded JSON, but real
ADVERT packets store the originator public key as `"pubKey"`. This meant
`fromNode` was always empty, so:
- Zero-hop edges (originator↔observer) were never created
- Originator↔path[0] edges were never created
- Only observer↔path[last] edges could be created (and only for
non-empty paths)

**Fix:** Check `"pubKey"` first in `extractFromNode()`, then fall
through to `"from_node"` and `"from"` for other packet types.

## Bugs Fixed

| Bug | Issue | Fix |
|-----|-------|-----|
| Empty graph results | #522 | `extractFromNode()` now reads `pubKey`
field from ADVERTs |
| 3-4s response time | #523 comment | Graph was rebuilding correctly
with 60s TTL cache — the slow response was due to iterating all packets
finding zero matches. With edges now being found, the cache works as
designed. |
| Incomplete visualization | #523 comment | Downstream of bug 1+2 —
fixed by fixing the builder |
| Accessibility | #523 comment | Added text-based neighbor list, dynamic
aria-label, keyboard focus CSS, dashed lines for ambiguous edges,
confidence symbols |

## Changes

- **`cmd/server/neighbor_graph.go`** — Fixed `extractFromNode()` to
check `pubKey` field (real ADVERT format)
- **`cmd/server/neighbor_graph_test.go`** — Added 2 new tests:
`TestBuildNeighborGraph_AdvertPubKeyField` (real ADVERT format) and
`TestBuildNeighborGraph_OneByteHashPrefixes` (1-byte prefix collision
scenario)
- **`public/analytics.js`** — Added accessible text-based neighbor list,
dynamic aria-label, dashed line pattern for ambiguous edges
- **`public/style.css`** — Added `:focus-visible` keyboard focus
indicator for canvas

## Testing

All Go tests pass (`go test ./... -count=1`). New tests verify the fix
prevents regression.

Fixes #523, Fixes #522

---------

Co-authored-by: you <you@example.com>
2026-04-03 00:30:39 -07:00
Kpa-clawbot 58f791266d feat: affinity debugging tools (#482) — milestone 6 (#521)
## Summary

Milestone 6 of #482: Observability & Debugging tools for the neighbor
affinity system.

These tools exist because someone will need them at 3 AM when "Show
Neighbors is showing the wrong node for C0DE" and they have 5 minutes to
diagnose it.

## Changes

### 1. Debug API — `GET /api/debug/affinity`
- Full graph state dump: all edges with weights, observation counts,
last-seen timestamps
- Per-prefix resolution log with disambiguation reasoning (Jaccard
scores, ratios, thresholds)
- Query params: `?prefix=C0DE` filter to specific prefix,
`?node=<pubkey>` for specific node's edges
- Protected by API key (same auth as `/api/admin/prune`)
- Response includes: edge count, node count, cache age, last rebuild
time

### 2. Debug Overlay on Map
- Toggle-able checkbox "🔍 Affinity Debug" in map controls
- Draws lines between nodes showing affinity edges with color coding:
  - Green = high confidence (score ≥ 0.6)
  - Yellow = medium (0.3–0.6)
  - Red = ambiguous (< 0.3)
- Line thickness proportional to weight, dashed for ambiguous
- Unresolved prefixes shown as  markers
- Click edge → popup with observation count, last seen, score, observers
- Hidden behind `debugAffinity` config flag or
`localStorage.setItem('meshcore-affinity-debug', 'true')`

### 3. Per-Node Debug Panel
- Expandable "🔍 Affinity Debug" section in node detail page (collapsed
by default)
- Shows: neighbor edges table with scores, prefix resolutions with
reasoning trace
- Candidates table with Jaccard scores, highlighting the chosen
candidate
- Graph-level stats summary

### 4. Server-Side Structured Logging
- Integrated into `disambiguate()` — logs every resolution decision
during graph build
- Format: `[affinity] resolve C0DE: c0dedad4 score=47 Jaccard=0.82 vs
c0dedad9 score=3 Jaccard=0.11 → neighbor_affinity (ratio 15.7×)`
- Logs ambiguous decisions: `scores too close (12 vs 9, ratio 1.3×) →
ambiguous`
- Gated by `debugAffinity` config flag

### 5. Dashboard Stats Widget
- Added to analytics overview tab when debug mode is enabled
- Metrics: total edges/nodes, resolved/ambiguous counts (%), avg
confidence, cold-start coverage, cache age, last rebuild

## Files Changed
- `cmd/server/neighbor_debug.go` — new: debug API handler, resolution
builder, cold-start coverage
- `cmd/server/neighbor_debug_test.go` — new: 7 tests for debug API
- `cmd/server/neighbor_graph.go` — added structured logging to
disambiguate(), `logFn` field, `BuildFromStoreWithLog`
- `cmd/server/neighbor_api.go` — pass debug flag through
`BuildFromStoreWithLog`
- `cmd/server/config.go` — added `DebugAffinity` config field
- `cmd/server/routes.go` — registered `/api/debug/affinity` route,
exposed `debugAffinity` in client config
- `cmd/server/types.go` — added `DebugAffinity` to
`ClientConfigResponse`
- `public/map.js` — affinity debug overlay layer with edge visualization
- `public/nodes.js` — per-node affinity debug panel
- `public/analytics.js` — dashboard stats widget
- `test-e2e-playwright.js` — 3 Playwright tests for debug UI

## Tests
-  7 Go unit tests (API shape, prefix/node filters, auth, structured
logging, cold-start coverage)
-  3 Playwright E2E tests (overlay checkbox, toggle without crash,
panel expansion)
-  All existing tests pass (`go test ./cmd/server/... -count=1`)

Part of #482

---------

Co-authored-by: you <you@example.com>
2026-04-02 23:45:03 -07:00
Kpa-clawbot 4a56be0b48 feat: neighbor affinity graph builder (#482) — milestone 1 (#507)
## Summary

Milestone 1 of 7 for the neighbor affinity graph feature (#482).
Implements the core `NeighborGraph` data structure and
`BuildFromStore()` algorithm.

**Spec:** `docs/specs/neighbor-affinity-graph.md` on
`spec/482-neighbor-affinity` branch.

## What's Built

### `cmd/server/neighbor_graph.go`
- **`NeighborGraph` struct** — thread-safe (sync.RWMutex) in-memory
graph with edge map and per-node index
- **`BuildFromStore(*PacketStore)`** — iterates all packets/observations
to extract first-hop edges:
- `originator ↔ path[0]` for ADVERT packets only (originator identity
known)
  - `observer ↔ path[last]` for ALL packet types
  - Zero-hop ADVERTs: `originator ↔ observer` direct edge
- **Affinity scoring** — `score = min(1.0, count/100) × exp(-λ × hours)`
with 7-day half-life
- **Jaccard disambiguation** — resolves ambiguous hash prefixes using
mutual-neighbor overlap
- **Confidence threshold** — auto-resolve only when best ≥ 3×
second-best AND ≥ 3 observations
- **Transitivity poisoning guard** — only fully-resolved edges used as
evidence
- **Orphan prefix handling** — unknown prefixes stored as unresolved
markers
- **Cache management** — 60s TTL, `IsStale()` check for rebuild
triggering

### `cmd/server/neighbor_graph_test.go`
22 unit tests covering all spec requirements:

| Test | What it validates |
|------|-------------------|
| EmptyStore | Empty graph from empty store |
| AdvertSingleHopPath | Both edge types from single-hop ADVERT |
| AdvertMultiHopPath | originator↔path[0] + observer↔path[last] |
| AdvertZeroHop | Direct originator↔observer edge |
| NonAdvertEmptyPath | No edges from non-ADVERT empty path |
| NonAdvertOnlyObserverEdge | Only observer↔last_hop for non-ADVERTs |
| NonAdvertSingleHop | observer↔path[0] only |
| HashCollision | Ambiguous edge with candidates |
| JaccardScoring | Jaccard coefficient computation |
| ConfidenceAutoResolve | Auto-resolve when ratio ≥ 3× |
| EqualScoresAmbiguous | Remains ambiguous with equal scores |
| ObserverSelfEdgeGuard | No self-edges |
| OrphanPrefix | Unresolved prefix handling |
| AffinityScore_Fresh | Score ≈ 1.0 for fresh high-count |
| AffinityScore_Decayed | Score ≈ 0.5 at 7-day half-life |
| AffinityScore_LowCount | Score ≈ 0.05 for count=5 |
| AffinityScore_StaleAndLow | Score ≈ 0 for old low-count |
| CountAccumulation | 5 observations → count=5 |
| MultipleObservers | Observer set tracks all witnesses |
| TimeDecayOldObservations | Month-old edge scores very low |
| ADVERTOnlyConstraint | Non-ADVERTs don't create originator edges |
| CacheTTL | Stale detection works correctly |

## Not in scope (future milestones)
- API endpoints (M2)
- Frontend integration (M3-M5)
- Debug tools (M6)
- Analytics visualization (M7)

Part of #482

---------

Co-authored-by: you <you@example.com>
2026-04-02 21:14:58 -07:00