meshcore-analyzer

dandri/meshcore-analyzer

mirror of https://github.com/Kpa-clawbot/meshcore-analyzer.git synced 2026-07-20 14:20:55 +00:00

b21badbcbd fix(#1225 ): paginate channel messages at SQL level — 30s → <500ms (#1226 )

## Summary
Fixes #1225 — channel messages endpoint took ~30s on staging.

## Root cause
`(*DB).GetChannelMessages` SELECTed every observation row for the
channel (one row per observation, not per transmission),
JSON-unmarshalled each row into a Go map, dedupe-folded by `(sender,
packetHash)`, then sliced the tail in Go for pagination.

On staging `#wardriving`:
- `transmissions` rows with `channel_hash='#wardriving' AND
payload_type=5`: **5,703**
- `observations` joined to those: **274,632** (~48× amplification)
- `time curl /api/channels/%23wardriving/messages?limit=50`: **30.04s /
31.41s / 31.48s / 35.33s / 34.05s** (5 calls before I killed the loop)
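The amplification figure follows from the two row counts above. A minimal sketch of the arithmetic (the function name is illustrative, not taken from the codebase):

```go
package main

import "fmt"

// amplification returns how many observation rows exist per transmission row.
func amplification(observations, transmissions int) float64 {
	return float64(observations) / float64(transmissions)
}

func main() {
	// Row counts quoted above for staging `#wardriving`.
	const transmissions = 5703
	const observations = 274632

	// Prints: observations per transmission: 48.2
	fmt.Printf("observations per transmission: %.1f\n",
		amplification(observations, transmissions))
}
```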

`EXPLAIN QUERY PLAN` showed the index `idx_tx_channel_hash` was being
used — the cost was entirely in fetching, unmarshalling, and folding the
full observation set per request even for `limit=50`.

Hypothesis #1 from the issue (full table scan on `messages/decoded`) is
rejected; #2 (missing index) is rejected; the actual cause was
**pagination in Go instead of SQL** — request cost was O(observations)
not O(limit).
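To make the cost shape concrete, here is a rough in-memory sketch of the old dedupe-fold plus tail slice. It is not the code from `db.go`: the `obsRow` and `msgKey` types and the `foldAndTail` function are hypothetical, and the real path also read rows from SQLite and JSON-unmarshalled each one. The point is only that every row is visited regardless of `limit`.

```go
package main

import "fmt"

// obsRow is a hypothetical stand-in for one observation row joined to its
// transmission; the real rows also carry a decoded_json payload.
type obsRow struct {
	Sender     string
	PacketHash string
}

// msgKey is the dedup key described above: (sender, packetHash).
type msgKey struct{ sender, hash string }

// foldAndTail mimics the old shape: visit every row, dedupe by
// (sender, packetHash), then keep only the last `limit` unique messages.
// It returns the page and the number of rows it had to visit.
func foldAndTail(rows []obsRow, limit int) (page []msgKey, visited int) {
	seen := make(map[msgKey]bool)
	var msgs []msgKey
	for _, r := range rows {
		visited++
		k := msgKey{r.Sender, r.PacketHash}
		if !seen[k] {
			seen[k] = true
			msgs = append(msgs, k)
		}
	}
	if len(msgs) > limit {
		msgs = msgs[len(msgs)-limit:]
	}
	return msgs, visited
}

func main() {
	// 100 transmissions x 48 observations each, roughly the staging ratio.
	var rows []obsRow
	for tx := 0; tx < 100; tx++ {
		for o := 0; o < 48; o++ {
			rows = append(rows, obsRow{"alice", fmt.Sprintf("h%03d", tx)})
		}
	}
	page, visited := foldAndTail(rows, 5)
	fmt.Println(len(page), visited) // 5 4800
}
```

A page of 5 messages still costs 4800 row visits here, which is the O(observations) behaviour described above.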

## Fix
Move pagination into SQL on the `transmissions` table. Because
`transmissions.hash` is `UNIQUE` and the original dedup key was
`(sender, hash)`, each transmission collapses to exactly one logical
message — paginating on transmissions is semantically equivalent to the
prior in-Go dedup + tail slice.

New shape:
1. `COUNT(*)` on transmissions for total (uses `idx_tx_channel_hash`).
2. `SELECT id FROM transmissions … ORDER BY first_seen DESC LIMIT ?
OFFSET ?` to pick the page of newest transmissions.
3. `SELECT … FROM observations WHERE transmission_id IN (…page ids…)` —
typically 50 ids → a few hundred observation rows.
4. Reassemble in pageIDs order, preserving the ASC-by-`first_seen` API
contract.
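Steps 2 to 4 can be sketched with the standard library alone. The table and column names come from the description above; the selected column list, the function names, and the assumption that reassembly is a simple reversal of the newest-first page are illustrative, and the actual statements in `db.go` (region filter included) may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// pageQuery picks the newest transmissions for one page (step 2).
// The WHERE clause is an assumption based on the staging counts quoted above.
const pageQuery = `SELECT id FROM transmissions
WHERE channel_hash = ? AND payload_type = 5
ORDER BY first_seen DESC LIMIT ? OFFSET ?`

// observationsQuery builds the step-3 statement with one placeholder per
// page id, plus the matching argument slice for database/sql.
func observationsQuery(pageIDs []int64) (string, []interface{}) {
	if len(pageIDs) == 0 {
		return "", nil
	}
	placeholders := strings.TrimSuffix(strings.Repeat("?,", len(pageIDs)), ",")
	args := make([]interface{}, len(pageIDs))
	for i, id := range pageIDs {
		args[i] = id
	}
	q := "SELECT o.id, o.transmission_id FROM observations o " +
		"WHERE o.transmission_id IN (" + placeholders + ") ORDER BY o.id ASC"
	return q, args
}

// ascending reverses a newest-first page so results are oldest-first (step 4).
func ascending(pageIDs []int64) []int64 {
	out := make([]int64, len(pageIDs))
	for i, id := range pageIDs {
		out[len(pageIDs)-1-i] = id
	}
	return out
}

func main() {
	page := []int64{42, 41, 40} // newest first, as pageQuery would return them
	q, args := observationsQuery(page)
	fmt.Println(pageQuery)
	fmt.Println(q, args)
	fmt.Println(ascending(page))
}
```

The query text and argument slice can be passed to `(*sql.DB).Query` as `db.Query(q, args...)`. With `limit=50` the `IN` list holds at most 50 placeholders, well below SQLite's bound-parameter limit.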

Region filtering, observation-count-as-`repeats`, and "first observation
wins for hops/snr/observer" semantics are preserved (observations are
scanned `ORDER BY o.id ASC`).
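A minimal sketch of how those semantics fall out of scanning observations in `o.id` order. The `observation` and `message` types and their field names are hypothetical, not the project's actual structs, and skipping transmissions with no surviving observations is an assumption about how a region filter would interact with the page.

```go
package main

import "fmt"

// observation is a hypothetical, trimmed-down row; field names are
// illustrative and not taken from the schema.
type observation struct {
	ID             int64
	TransmissionID int64
	Observer       string
	SNR            float64
}

// message is the per-transmission result: Repeats counts observations, and
// Observer/SNR come from the first observation seen.
type message struct {
	TransmissionID int64
	Observer       string
	SNR            float64
	Repeats        int
}

// fold expects obs sorted by ID ascending, so the first row encountered for a
// transmission is its earliest observation. Results follow the order of ids;
// transmissions with no observations are skipped in this sketch.
func fold(ids []int64, obs []observation) []message {
	byTx := make(map[int64]*message, len(ids))
	for _, o := range obs {
		m, ok := byTx[o.TransmissionID]
		if !ok {
			// First observation wins for the per-message fields.
			m = &message{TransmissionID: o.TransmissionID, Observer: o.Observer, SNR: o.SNR}
			byTx[o.TransmissionID] = m
		}
		m.Repeats++
	}
	out := make([]message, 0, len(ids))
	for _, id := range ids {
		if m, ok := byTx[id]; ok {
			out = append(out, *m)
		}
	}
	return out
}

func main() {
	obs := []observation{
		{1, 10, "obsA", -4.5},
		{2, 11, "obsB", 7.0},
		{3, 10, "obsC", 9.25},
	}
	for _, m := range fold([]int64{10, 11}, obs) {
		fmt.Printf("%+v\n", m)
	}
}
```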

## Perf measurements
**Before** (staging `#wardriving`, limit=50, 5 samples, loop killed after the fifth):
30.04s, 31.41s, 31.48s, 35.33s, 34.05s.

**Synthetic regression test**
(`TestGetChannelMessagesPerfLargeChannel`): 3000 tx × 50 obs.
- Broken impl: ~4.5s (test fails the 500ms budget; this is the RED commit).
- Fixed impl: well under 500ms (test passes).

**After (staging)**: not yet measured. Numbers will be taken after deploy and
posted as a comment on the issue. Expected scaling: staging has ~2× the test's
transmission count, and the fixed path's cost scales with `limit` (50) plus a
`COUNT(*)` over ~5k index rows, so the expectation is <100ms p99.

## TDD
- RED: `697c290d` — perf test asserts <500ms on 3k×50 dataset; fails at
~4.5s.
- GREEN: `3f1f82d3` — fix; full suite green, perf test passes.
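The RED/GREEN pair hinges on a wall-clock budget assertion. A minimal sketch of that pattern, assuming a hypothetical `withinBudget` helper (the real test presumably seeds a database and calls `t.Fatalf`; neither is shown here):

```go
package main

import (
	"fmt"
	"time"
)

// withinBudget runs fn once and reports the elapsed time and whether it
// finished inside the budget. A real test would call t.Fatalf on failure.
func withinBudget(budget time.Duration, fn func()) (time.Duration, bool) {
	start := time.Now()
	fn()
	elapsed := time.Since(start)
	return elapsed, elapsed < budget
}

func main() {
	elapsed, ok := withinBudget(500*time.Millisecond, func() {
		// Placeholder for the call under test, e.g. fetching one page of
		// channel messages from a seeded 3000 x 50 dataset.
		time.Sleep(10 * time.Millisecond)
	})
	fmt.Printf("elapsed=%v withinBudget=%v\n", elapsed, ok)
}
```

Wall-clock budgets can be noisy on shared CI runners, so the gap between the ~4.5s failure and the 500ms threshold is what keeps a check like this stable.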

## Hypotheses status
| # | Hypothesis | Verdict |
|---|---|---|
| 1 | Endpoint slow on prod-sized data | **CONFIRMED** (different mechanism, see root cause) |
| 2 | Missing channel_hash index | Rejected (`idx_tx_channel_hash` exists & used) |
| 3 | Frontend re-render storm | Not investigated (backend was clearly the bottleneck) |
| 4 | Decode in request path | Rejected (decode is at ingest time; JSON unmarshal of cached `decoded_json` is the cost, addressed by reducing row count) |
| 5 | WS subscription failure | Rejected |
| 6 | Staging artifact | Rejected (reproducible) |

## Out of scope
- The in-memory `(*PacketStore).GetChannelMessages` path (used when
`s.db == nil`) has the same shape but operates on bounded in-memory
data; not touched. If we ever fall back to it in production we'll
revisit.

---------

Co-authored-by: clawbot <bot@corescope>

2026-05-16 17:28:40 +00:00

testdata/golden

feat(#847 ): dedupe Top Longest Hops by pair + add obs count and SNR cues (#848 )

2026-04-21 09:09:39 -07:00

advert_pubkey_test.go

perf: track advert pubkeys incrementally, eliminate per-request JSON parsing (#360 ) (#544 )

2026-04-03 13:51:13 -07:00

apikey_security_test.go

fix: reject weak/default API keys + startup warning (#532 ) (#628 )

2026-04-05 14:50:40 -07:00

backfill_async_test.go

perf: async chunked backfill — HTTP serves within 2 minutes (#612 ) (#614 )

2026-04-05 09:49:39 -07:00

backup_test.go

feat: /api/backup — one-click SQLite database export (#474 ) (#1022 )

2026-05-03 17:56:42 -07:00

backup.go

feat: /api/backup — one-click SQLite database export (#474 ) (#1022 )

2026-05-03 17:56:42 -07:00

bounded_load_test.go

feat(startup): hot startup — load hotStartupHours synchronously, fill retentionHours in background (#1187 )

2026-05-15 22:46:25 -07:00

cache_invalidation_test.go

fix: cache invalidation tuning — 7% → 50-80% hit rate (#721 )

2026-04-12 18:09:23 -07:00

channel_analytics_test.go

feat(analytics): selectable timeframes via ?window/?from/?to (#842 ) (#1018 )

2026-05-03 17:41:22 -07:00

channel_filter_test.go

Fix channel filter on Packets page (UI + API) — #812 (#816 )

2026-04-20 21:46:34 -07:00

clock_skew_test.go

feat(#690 ): expose observer skew + per-hash evidence in clock UI (#906 )

2026-05-02 10:30:54 -07:00

clock_skew.go

feat(#690 ): expose observer skew + per-hash evidence in clock UI (#906 )

2026-05-02 10:30:54 -07:00

collision_details_test.go

feat: show collision details in Hash Usage Matrix for all hash sizes (#758 )

2026-04-16 00:18:25 -07:00

config_knobs_test.go

perf: async chunked backfill — HTTP serves within 2 minutes (#612 ) (#614 )

2026-04-05 09:49:39 -07:00

config_test.go

feat: add observer retention — remove stale observers after configurable days (#764 )

2026-04-17 09:24:40 -07:00

config.go

feat(#1228 ): reject geo-implausible neighbor-graph edges at build time (#1230 )

2026-05-16 10:14:44 -07:00

cors_test.go

feat(server): explicit CORS policy with configurable origin allowlist (#883 ) (#971 )

2026-05-02 12:04:37 -07:00

cors.go

feat(server): explicit CORS policy with configurable origin allowlist (#883 ) (#971 )

2026-05-02 12:04:37 -07:00

coverage_test.go

feat(startup): hot startup — load hotStartupHours synchronously, fill retentionHours in background (#1187 )

2026-05-15 22:46:25 -07:00

db_channel_messages_perf_test.go

fix(#1225 ): paginate channel messages at SQL level — 30s → <500ms (#1226 )

2026-05-16 17:28:40 +00:00

db_test.go

fix(#1143 ): structural pubkey attribution via from_pubkey column (#1152 )

2026-05-06 23:50:44 -07:00

db_vacuum_test.go

fix: enable SQLite incremental auto-vacuum so DB shrinks after retention (#919 ) (#920 )

2026-04-30 23:45:00 -07:00

db.go

fix(#1225 ): paginate channel messages at SQL level — 30s → <500ms (#1226 )

2026-05-16 17:28:40 +00:00

decoder_bounds_test.go

fix(#1211 ): bounds-check path length to prevent slice [218:15] panic in MQTT decode (#1214 )

2026-05-15 22:34:21 -07:00

decoder_test.go

fix(#1211 ): bounds-check path length to prevent slice [218:15] panic in MQTT decode (#1214 )

2026-05-15 22:34:21 -07:00

decoder.go

fix(#1211 ): bounds-check path length to prevent slice [218:15] panic in MQTT decode (#1214 )

2026-05-15 22:34:21 -07:00

discovered_channels_test.go

fix(#688 ): auto-discover hashtag channels from message text (#1071 )

2026-05-05 01:16:57 -07:00

discovered_channels.go

fix(#688 ): auto-discover hashtag channels from message text (#1071 )

2026-05-05 01:16:57 -07:00

encrypted_channels_test.go

fix: channel query performance — add channel_hash column, SQL-level filtering (#762 ) (#763 )

2026-04-16 00:09:36 -07:00

ensure_indexes_test.go

feat(startup): hot startup — load hotStartupHours synchronously, fill retentionHours in background (#1187 )

2026-05-15 22:46:25 -07:00

ensure_indexes.go

feat(startup): hot startup — load hotStartupHours synchronously, fill retentionHours in background (#1187 )

2026-05-15 22:46:25 -07:00

eviction_test.go

perf(#800 ): remove per-StoreTx ResolvedPath, replace with membership index + on-demand decode (#806 )

2026-04-20 19:55:00 -07:00

foreign_advert_test.go

feat(#730 ): foreign-advert detection — flag instead of silent drop (#1084 )

2026-05-05 01:58:52 -07:00

from_pubkey_attribution_test.go

fix(#1143 ): structural pubkey attribution via from_pubkey column (#1152 )

2026-05-06 23:50:44 -07:00

from_pubkey_migration.go

fix(#1143 ): structural pubkey attribution via from_pubkey column (#1152 )

2026-05-06 23:50:44 -07:00

geo_filter.go

feat: geo_filter enforcement, DB pruning, geofilter-builder tool, HB column (#215 )

2026-03-31 01:10:56 -07:00

go.mod

perf: cancelled writes + ingestor I/O + threshold tests (#1120 follow-up) (#1167 )

2026-05-08 16:29:23 -07:00

go.sum

feat: add Go web server (cmd/server/) — full API + WebSocket + static files

2026-03-27 01:16:59 -07:00

hash_migrate_test.go

fix: use payload type bits only in content hash (not full header byte) (#787 )

2026-04-18 11:52:22 -07:00

hash_migrate.go

fix: use payload type bits only in content hash (not full header byte) (#787 )

2026-04-18 11:52:22 -07:00

healthz_test.go

fix(#1143 ): structural pubkey attribution via from_pubkey column (#1152 )

2026-05-06 23:50:44 -07:00

healthz.go

fix(#1143 ): structural pubkey attribution via from_pubkey column (#1152 )

2026-05-06 23:50:44 -07:00

helpers_test.go

perf: replace O(n²) selection sort with sort.Slice (#354 ) (#542 )

2026-04-03 13:11:59 -07:00

hop_context_bench_test.go

fix(#1199 ): 6 deferred quality items from PR #1198 r2 review (#1200 )

2026-05-15 16:21:14 +00:00

hop_disambig_e2e_test.go

fix(#1203 ): path-inspector — singleflight + stale-while-revalidate (#1208 )

2026-05-15 22:46:28 -07:00

hop_disambig_tier1_test.go

test(#1201 ): regression coverage for hop disambiguator tier-1 + end-to-end top-hops fixture (#1202 )

2026-05-15 20:24:55 -07:00

hot_startup_consistency_test.go

feat(startup): hot startup — load hotStartupHours synchronously, fill retentionHours in background (#1187 )

2026-05-15 22:46:25 -07:00

hot_startup_test.go

feat(startup): hot startup — load hotStartupHours synchronously, fill retentionHours in background (#1187 )

2026-05-15 22:46:25 -07:00

issue673_test.go

fix(#673 ): replace raw JSON text search with byNode index for node packet queries (#803 )

2026-04-20 22:15:02 -07:00

issue804_repeater_region_test.go

fix(#804 ): attribute analytics by repeater home region, not observer (#1025 )

2026-05-03 20:10:02 -07:00

issue810_repro_test.go

fix(#810 ): /health.recentPackets resolved_path falls back to longest sibling obs (#821 )

2026-04-21 04:51:24 +00:00

issue871_test.go

fix: drop/filter packets with null hash or timestamp (closes #871 ) (#993 )

2026-05-02 20:35:15 -07:00

main.go

fix(#1203 ): path-inspector — singleflight + stale-while-revalidate (#1208 )

2026-05-15 22:46:28 -07:00

memlimit_test.go

feat(memlimit): GOMEMLIMIT support, derive from packetStore.maxMemoryMB (#836 ) (#1077 )

2026-05-05 01:33:23 -07:00

memlimit.go

feat(memlimit): GOMEMLIMIT support, derive from packetStore.maxMemoryMB (#836 ) (#1077 )

2026-05-05 01:33:23 -07:00

memory.go

obs: surface real RSS alongside tracked store bytes in /api/stats (#832 ) (#835 )

2026-04-20 23:10:33 -07:00

multibyte_capability_test.go

feat: validate advert signatures on ingest, reject corrupt packets (#794 )

2026-04-18 11:39:13 -07:00

multibyte_enrich_test.go

feat: show multi-byte hash support indicator on map markers (#1002 )

2026-05-03 08:56:09 -07:00

multibyte_region_filter_test.go

fix(analytics): multiByteCapability missing under region filter → all rows 'unknown' (#1049 )

2026-05-05 06:42:58 +00:00

neighbor_api_test.go

feat: observer graph representation (M1+M2) (#774 )

2026-04-16 21:35:14 -07:00

neighbor_api.go

feat(#1228 ): reject geo-implausible neighbor-graph edges at build time (#1230 )

2026-05-16 10:14:44 -07:00

neighbor_debug_test.go

feat: affinity debugging tools (#482 ) — milestone 6 (#521 )

2026-04-02 23:45:03 -07:00

neighbor_debug.go

fix(#1197 ): plumb hop-context + observation-count tiebreak to disambiguator (#1198 )

2026-05-15 09:16:39 -07:00

neighbor_dedup_test.go

fix(#1197 ): plumb hop-context + observation-count tiebreak to disambiguator (#1198 )

2026-05-15 09:16:39 -07:00

neighbor_graph_geo_test.go

feat(#1228 ): reject geo-implausible neighbor-graph edges at build time (#1230 )

2026-05-16 10:14:44 -07:00

neighbor_graph_test.go

fix: exclude non-repeater nodes from path-hop resolution (#935 ) (#936 )

2026-04-30 09:25:51 -07:00

neighbor_graph.go

feat(#1228 ): reject geo-implausible neighbor-graph edges at build time (#1230 )

2026-05-16 10:14:44 -07:00

neighbor_persist_test.go

feat: separate "Last Status Update" from "Last Packet Observation" for observers (v3 rebase) (#969 )

2026-05-02 12:03:42 -07:00

neighbor_persist.go

fix(#1203 ): path-inspector — singleflight + stale-while-revalidate (#1208 )

2026-05-15 22:46:28 -07:00

node_battery_test.go

feat(node-battery): voltage trend chart + /api/nodes/{pubkey}/battery (#663 ) (#1082 )

2026-05-05 01:41:00 -07:00

node_battery.go

feat(node-battery): voltage trend chart + /api/nodes/{pubkey}/battery (#663 ) (#1082 )

2026-05-05 01:41:00 -07:00

node_blacklist_test.go

feat: add nodeBlacklist config to hide abusive/troll nodes (#742 )

2026-04-17 23:43:05 +00:00

obs_dedup_test.go

perf: replace O(n²) observation dedup with map-based O(n) (#355 ) (#543 )

2026-04-03 13:33:26 -07:00

observer_blacklist_test.go

feat(ingestor + server): observerBlacklist config (#962 ) (#963 )

2026-05-01 23:11:27 -07:00

openapi_test.go

feat: auto-generated OpenAPI 3.0 spec endpoint + Swagger UI (#530 ) (#632 )

2026-04-05 15:05:20 -07:00

openapi.go

feat: /api/backup — one-click SQLite database export (#474 ) (#1022 )

2026-05-03 17:56:42 -07:00

parity_test.go

feat: implement packet store eviction/aging to prevent OOM (#273 )

2026-03-30 03:42:11 +00:00

path_inspect_atomic_race_test.go

fix(#1203 ): path-inspector — singleflight + stale-while-revalidate (#1208 )

2026-05-15 22:46:28 -07:00

path_inspect_coldstart_test.go

fix(#1203 ): path-inspector — singleflight + stale-while-revalidate (#1208 )

2026-05-15 22:46:28 -07:00

path_inspect_panic_safety_test.go

fix(#1203 ): path-inspector — singleflight + stale-while-revalidate (#1208 )

2026-05-15 22:46:28 -07:00

path_inspect_singleflight_test.go

fix(#1203 ): path-inspector — singleflight + stale-while-revalidate (#1208 )

2026-05-15 22:46:28 -07:00

path_inspect_swr_test.go

fix(#1203 ): path-inspector — singleflight + stale-while-revalidate (#1208 )

2026-05-15 22:46:28 -07:00

path_inspect_test.go

fix(#1203 ): path-inspector — singleflight + stale-while-revalidate (#1208 )

2026-05-15 22:46:28 -07:00

path_inspect.go

fix(#1203 ): path-inspector — singleflight + stale-while-revalidate (#1208 )

2026-05-15 22:46:28 -07:00

paths_through_test.go

fix(paths): exclude false-positive paths from short-prefix collisions (#930 )

2026-05-02 11:15:25 -07:00

perf_io_bench_test.go

perf: cancelled writes + ingestor I/O + threshold tests (#1120 follow-up) (#1167 )

2026-05-08 16:29:23 -07:00

perf_io_carmack_test.go

perf: cancelled writes + ingestor I/O + threshold tests (#1120 follow-up) (#1167 )

2026-05-08 16:29:23 -07:00

perf_io_followup_test.go

perf: cancelled writes + ingestor I/O + threshold tests (#1120 follow-up) (#1167 )

2026-05-08 16:29:23 -07:00

perf_io_freshness_test.go

perf: cancelled writes + ingestor I/O + threshold tests (#1120 follow-up) (#1167 )

2026-05-08 16:29:23 -07:00

perf_io_test.go

feat(perf): per-component disk I/O + write source metrics on Perf page (#1120 ) (#1123 )

2026-05-05 17:56:56 -07:00

perf_io.go

perf: cancelled writes + ingestor I/O + threshold tests (#1120 follow-up) (#1167 )

2026-05-08 16:29:23 -07:00

perfstats_race_test.go

fix: add mutex synchronization to PerfStats to eliminate data races (#469 )

2026-04-01 19:26:11 -07:00

prefix_map_role_test.go

fix(#1197 ): plumb hop-context + observation-count tiebreak to disambiguator (#1198 )

2026-05-15 09:16:39 -07:00

region_filter_test.go

fix(#770 ): treat region 'All' as no-filter + document region behavior (#1026 )

2026-05-03 19:50:01 -07:00

repeater_liveness_test.go

fix(#662 ): GetRepeaterRelayInfo also looks up byPathHop by 1-byte prefix (#1086 )

2026-05-05 02:33:27 -07:00

repeater_liveness.go

fix(#662 ): GetRepeaterRelayInfo also looks up byPathHop by 1-byte prefix (#1086 )

2026-05-05 02:33:27 -07:00

repeater_usefulness_test.go

feat(repeater): usefulness score — traffic axis (#672 ) (#1079 )

2026-05-05 01:34:08 -07:00

repeater_usefulness.go

feat(repeater): usefulness score — traffic axis (#672 ) (#1079 )

2026-05-05 01:34:08 -07:00

resolve_context_callsites_test.go

fix(#1199 ): 6 deferred quality items from PR #1198 r2 review (#1200 )

2026-05-15 16:21:14 +00:00

resolve_context_test.go

fix(#1197 ): plumb hop-context + observation-count tiebreak to disambiguator (#1198 )

2026-05-15 09:16:39 -07:00

resolved_index_test.go

perf(#800 ): remove per-StoreTx ResolvedPath, replace with membership index + on-demand decode (#806 )

2026-04-20 19:55:00 -07:00

resolved_index.go

fix(#810 ): /health.recentPackets resolved_path falls back to longest sibling obs (#821 )

2026-04-21 04:51:24 +00:00

role_analytics_test.go

feat(roles): /#/roles page + /api/analytics/roles endpoint (Fixes #818 ) (#1023 )

2026-05-03 17:56:12 -07:00

role_analytics.go

feat(roles): /#/roles page + /api/analytics/roles endpoint (Fixes #818 ) (#1023 )

2026-05-03 17:56:12 -07:00

routes_test.go

fix(#827 ): /api/packets/{hash} falls back to DB when in-memory store misses (#831 )

2026-04-20 22:50:01 -07:00

routes.go

fix(#1203 ): path-inspector — singleflight + stale-while-revalidate (#1208 )

2026-05-15 22:46:28 -07:00

rw_cache_test.go

fix: cache RW SQLite connection + dedup DBConfig (closes #921 ) (#982 )

2026-05-02 20:15:30 -07:00

rw_cache.go

fix: cache RW SQLite connection + dedup DBConfig (closes #921 ) (#982 )

2026-05-02 20:15:30 -07:00

schema_degradation_per_store_test.go

fix(#1199 ): 6 deferred quality items from PR #1198 r2 review (#1200 )

2026-05-15 16:21:14 +00:00

short_url_test.go

feat(#772 ): short pubkey-prefix URLs for mesh sharing (#1016 )

2026-05-03 17:40:54 -07:00

stats_memory_test.go

obs: surface real RSS alongside tracked store bytes in /api/stats (#832 ) (#835 )

2026-04-20 23:10:33 -07:00

store_tophops_test.go

feat(#847 ): dedupe Top Longest Hops by pair + add obs count and SNR cues (#848 )

2026-04-21 09:09:39 -07:00

store.go

fix(#1203 ): path-inspector — singleflight + stale-while-revalidate (#1208 )

2026-05-15 22:46:28 -07:00

time_window_test.go

feat(analytics): selectable timeframes via ?window/?from/?to (#842 ) (#1018 )

2026-05-03 17:41:22 -07:00

time_window.go

feat(analytics): selectable timeframes via ?window/?from/?to (#842 ) (#1018 )

2026-05-03 17:41:22 -07:00

topology_dedup_test.go

feat(analytics): selectable timeframes via ?window/?from/?to (#842 ) (#1018 )

2026-05-03 17:41:22 -07:00

touch_last_seen_test.go

perf(#800 ): remove per-StoreTx ResolvedPath, replace with membership index + on-demand decode (#806 )

2026-04-20 19:55:00 -07:00

tracked_bytes_test.go

perf(#800 ): remove per-StoreTx ResolvedPath, replace with membership index + on-demand decode (#806 )

2026-04-20 19:55:00 -07:00

types.go

feat(startup): hot startup — load hotStartupHours synchronously, fill retentionHours in background (#1187 )

2026-05-15 22:46:25 -07:00

vacuum.go

fix: cache RW SQLite connection + dedup DBConfig (closes #921 ) (#982 )

2026-05-02 20:15:30 -07:00

websocket_test.go

feat: implement packet store eviction/aging to prevent OOM (#273 )

2026-03-30 03:42:11 +00:00

websocket.go

fix: graceful container shutdown for reliable deployments (#453 )

2026-04-01 12:19:20 -07:00