Observability:
- Add DBStats struct with atomic counters for tx_inserted, tx_dupes,
obs_inserted, node_upserts, observer_upserts, write_errors
- Log SQLite config on startup (busy_timeout, max_open_conns, journal)
- Periodic stats logging every 5 minutes + final stats on shutdown
- Instrument all write paths with counter increments
Tests:
- TestConcurrentWrites: 20 goroutines × 50 writes (1000 total) with
interleaved InsertTransmission + UpsertNode + UpsertObserver calls.
Verifies zero errors and data integrity under concurrent load.
- TestDBStats: verifies counter accuracy for inserts, duplicates,
upserts, and that LogStats does not panic
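A minimal sketch of the counter idea, assuming Go 1.19+ for atomic.Int64.
The struct layout and the Summary helper are illustrative assumptions; only
the counter names come from the list above.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// DBStats holds write-path counters. The field set mirrors the counters
// named in the commit message; the exact layout is an assumption.
type DBStats struct {
	TxInserted      atomic.Int64
	TxDupes         atomic.Int64
	ObsInserted     atomic.Int64
	NodeUpserts     atomic.Int64
	ObserverUpserts atomic.Int64
	WriteErrors     atomic.Int64
}

// Summary renders a one-line snapshot suitable for periodic logging.
// (Hypothetical helper, not the LogStats method from the commit.)
func (s *DBStats) Summary() string {
	return fmt.Sprintf(
		"tx_inserted=%d tx_dupes=%d obs_inserted=%d node_upserts=%d observer_upserts=%d write_errors=%d",
		s.TxInserted.Load(), s.TxDupes.Load(), s.ObsInserted.Load(),
		s.NodeUpserts.Load(), s.ObserverUpserts.Load(), s.WriteErrors.Load(),
	)
}

func main() {
	var stats DBStats
	var wg sync.WaitGroup
	// 20 goroutines x 50 increments, echoing the concurrency test shape.
	for g := 0; g < 20; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 50; i++ {
				stats.TxInserted.Add(1)
			}
		}()
	}
	wg.Wait()
	fmt.Println(stats.Summary())
}
```

Atomic counters keep the write paths lock-free, so instrumentation adds no
contention on top of the single SQLite writer.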
Three changes to eliminate concurrent write collisions:
1. Add _busy_timeout=5000 to ingestor SQLite DSN (matches server)
- SQLite will wait up to 5s for the write lock instead of
immediately returning SQLITE_BUSY
2. Set SetMaxOpenConns(1) on ingestor DB connection pool
- Serializes all DB access at the Go sql.DB level
- Prevents multiple goroutines from opening overlapping writes
3. Change SetOrderMatters(false) to SetOrderMatters(true)
- MQTT handlers now run sequentially per client
- Eliminates concurrent handler execution that caused
overlapping multi-statement write flows
Root cause: concurrent MQTT handlers (SetOrderMatters=false) each
performed multiple separate writes (transmission lookup/insert,
observation insert, node upsert, observer upsert) without transactions
or connection limits. SQLite only permits one writer at a time, so
under bursty MQTT traffic the ingestor was competing with itself.
#210: Add role="img" aria-label to 9 Chart.js canvases in node-analytics.js
and observer-detail.js with descriptive labels.
#211: Add scope="col" to all <th> elements across analytics.js, audio-lab.js,
compare.js, node-analytics.js, nodes.js, observer-detail.js, observers.js,
and packets.js (40+ headers).
#212: Add aria-label to packet filter input and time window select in
packets.js. Add for/id associations to all customize.js inputs: branding,
theme colors, node/type colors, heatmap sliders, onboarding fields, and
export controls.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
#203: Live page node detail panel becomes a bottom-sheet on mobile
(width:100%, bottom:0, max-height:60vh, rounded top corners).
#204: Perf page reduces padding to 12px, perf-cards stack in 2-col
grid, tables get smaller font/padding on mobile.
#205: Nodes table hides Public Key column on mobile via .col-pubkey
class (same pattern as packets page .col-region/.col-rpt).
Cache busters bumped in index.html.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Build args ensure version badge shows correctly. Health timeout
bumped from 20s to 90s for Go store loading time.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Changed all 'Node.js' references to generic 'Server' in:
- verify_health() - health check messages
- show_container_status() - stats display comment
- cmd_status() - service health output
The Go backend runs behind Caddy just like the Node version did,
so the health checks via docker exec localhost:3000 remain correct.
Only the messaging needed updating.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- app.js: render engine badge with .engine-badge span (was plain text)
- test: fix #pktRight waitForSelector to use state:'attached' (hidden by detail-collapsed)
- test: fix map heat persist race — wait for async init to restore checkbox state
- test: fix live heat persist race — test via localStorage set+reload instead of click
- test: fix live matrix toggle race — wait for Leaflet tiles before clicking
- test: increase packet detail timeouts for remote server resilience
- test: make close-button test self-contained (navigate if #pktRight missing)
- bump cache busters
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem 1 (Go staging timeout): Increased healthcheck from 60s to 120s to allow 50K+ packets to load into memory.
Problem 2 (Node staging timeout): Added forced cleanup of stale containers, volumes, and ports before starting staging containers to prevent conflicts.
Problem 3 (Proto validation WS timeout): Made WebSocket message capture non-blocking using timeout command. If no live packets are available, it now skips with a warning instead of failing the entire proto validation pipeline.
Problem 4 (Playwright E2E failures): Added forced cleanup of stale server on port 13581 before starting test server, plus better diagnostics on failure.
All health checks now include better logging (tail 50 instead of 30 lines) for debugging.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
UpsertNode only updates name/role/lat/lon/last_seen. The advert_count
field is modified exclusively by IncrementAdvertCount, which is called
separately in the MQTT handler. The test incorrectly expected count=2
after two UpsertNode calls; the correct value is 0 (the schema default).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Lead with performance stats and Go architecture. Update project
structure to reflect two-process model (Go server + Go ingestor).
Remove Node.js-specific sections (npm install, node server.js).
Keep screenshots, features, quick start, and deployment docs.
Add developer section with 380 Go tests + 150+ Node tests + E2E.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Go server is production-ready. Users upgrading via git pull + manage.sh
get Go automatically. No flags, no engine selection, no decision needed.
- Dockerfile (was Dockerfile.go) — Go multi-stage build
- Dockerfile.node — archived Node.js build for rollback
- docker-compose staging-go now builds from Dockerfile
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- User directive: Soft-delete nodes (inactive flag instead of deletion)
- Merged copilot-directive-soft-delete-nodes.md into Active Decisions section
- Removed processed inbox file
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace expensive per-request distance computation (1.2s cold) with
precomputed distance index built during Load() and incrementally
updated on IngestNewFromDB/IngestNewObservations.
- Add distHopRecord/distPathRecord types for precomputed hop distances
- buildDistanceIndex() iterates all packets once during Load(), computing
haversine distances and storing results in distHops/distPaths slices
- computeDistancesForTx() handles per-packet distance computation,
shared between full rebuild and incremental ingest
- IngestNewFromDB appends distance records for new packets (no rebuild)
- IngestNewObservations triggers full rebuild only if paths changed
- computeAnalyticsDistance() now aggregates from precomputed records
instead of re-iterating all packets with JSON parsing + haversine
Cold request path: ~10-20ms (filter + sort precomputed records)
vs previous: ~1.2s (iterate 30K+ packets, parse JSON, resolve hops,
compute haversine for each).
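Rough sketch of the precompute-then-aggregate pattern. The record fields,
the point type, hopsForPath, and topHops are hypothetical names, and the
6371 km Earth radius is an assumption; the real distHopRecord may differ.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// distHopRecord is a hypothetical shape for one precomputed hop distance.
type distHopRecord struct {
	From, To string
	Km       float64
}

type point struct{ Lat, Lon float64 }

// haversineKm returns the great-circle distance in km,
// assuming a mean Earth radius of 6371 km.
func haversineKm(a, b point) float64 {
	const earthRadiusKm = 6371.0
	rad := func(deg float64) float64 { return deg * math.Pi / 180 }
	dLat := rad(b.Lat - a.Lat)
	dLon := rad(b.Lon - a.Lon)
	h := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(rad(a.Lat))*math.Cos(rad(b.Lat))*math.Sin(dLon/2)*math.Sin(dLon/2)
	return 2 * earthRadiusKm * math.Asin(math.Sqrt(h))
}

// hopsForPath computes one record per consecutive pair of nodes with known
// positions. It is shared by the full build and the incremental append.
func hopsForPath(path []string, pos map[string]point) []distHopRecord {
	var out []distHopRecord
	for i := 0; i+1 < len(path); i++ {
		a, okA := pos[path[i]]
		b, okB := pos[path[i+1]]
		if !okA || !okB {
			continue // unresolved hop: skip rather than guess
		}
		out = append(out, distHopRecord{path[i], path[i+1], haversineKm(a, b)})
	}
	return out
}

// topHops sorts a copy of the index by distance (descending) and returns
// the first n records. No JSON parsing or trigonometry happens at query time.
func topHops(index []distHopRecord, n int) []distHopRecord {
	sorted := append([]distHopRecord(nil), index...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Km > sorted[j].Km })
	if n > len(sorted) {
		n = len(sorted)
	}
	return sorted[:n]
}

func main() {
	pos := map[string]point{"A": {0, 0}, "B": {0, 1}, "C": {0, 3}}
	var index []distHopRecord
	index = append(index, hopsForPath([]string{"A", "B"}, pos)...) // initial load
	index = append(index, hopsForPath([]string{"B", "C"}, pos)...) // incremental ingest
	for _, r := range topHops(index, 1) {
		fmt.Printf("%s -> %s: %.1f km\n", r.From, r.To, r.Km)
	}
}
```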
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Channels with garbage-decrypted names (pre-#197 data still in DB) are now
filtered at the API level using the same non-printable character heuristic
from #197. Applied in both Node.js server.js and Go server (store.go, db.go).
No data is deleted — only filtered from API responses.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The subpaths analytics endpoint iterated ALL packets on every cold query,
taking ~900ms. The TTL cache only masked the problem.
Fix: maintain a precomputed raw-hop subpath index (map[string]int) that
is built once during Load() and incrementally updated during
IngestNewFromDB() and IngestNewObservations().
At query time the fast path iterates only unique raw subpaths (typically
a few thousand entries) instead of all packets (30K+), resolves hop
prefixes to names, and merges counts. Region-filtered queries still
fall back to the O(N) path since they require per-transmission observer
checks.
Expected cold-hit improvement: ~900ms → <5ms for the common no-region
case.
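Illustrative sketch of a raw-hop subpath index. The comma-joined key
format, the length bounds, and the addPath/resolved helpers are
assumptions made for the example, not the store's actual layout.

```go
package main

import (
	"fmt"
	"strings"
)

// subpathIndex maps a comma-joined raw-hop subpath to its occurrence count.
type subpathIndex map[string]int

// addPath records every contiguous subpath of length minLen..maxLen.
// Calling it once per packet works for both the initial build and
// incremental ingest, since counts simply accumulate.
func (idx subpathIndex) addPath(hops []string, minLen, maxLen int) {
	for n := minLen; n <= maxLen && n <= len(hops); n++ {
		for start := 0; start+n <= len(hops); start++ {
			idx[strings.Join(hops[start:start+n], ",")]++
		}
	}
}

// resolved maps hop prefixes to names at query time and merges counts for
// raw subpaths that resolve to the same named path. Unknown prefixes are
// kept as-is.
func (idx subpathIndex) resolved(names map[string]string) map[string]int {
	out := make(map[string]int, len(idx))
	for raw, count := range idx {
		parts := strings.Split(raw, ",")
		for i, p := range parts {
			if name, ok := names[p]; ok {
				parts[i] = name
			}
		}
		out[strings.Join(parts, " -> ")] += count
	}
	return out
}

func main() {
	idx := subpathIndex{}
	idx.addPath([]string{"a1", "b2", "c3"}, 2, 3)
	idx.addPath([]string{"a1", "b2"}, 2, 3)
	names := map[string]string{"a1": "Alpha", "b2": "Bravo"}
	for path, count := range idx.resolved(names) {
		fmt.Println(path, count)
	}
}
```

Query cost then scales with the number of unique subpaths rather than the
number of packets.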
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Create inactive_nodes table with identical schema to nodes
- Add retention.nodeDays config (default 7) in Node.js and Go
- On startup: move nodes not seen in N days to inactive_nodes
- Daily timer (24h setInterval / goroutine ticker) repeats the move
- Log 'Moved X nodes to inactive_nodes (not seen in N days)'
- All existing queries unchanged — they only read nodes table
- Add 14 new tests for moveStaleNodes in test-db.js
- Both Node (db.js/server.js) and Go (ingestor/server) implemented
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
advert_count was incremented on every upsertNode call, meaning each
observation of the same ADVERT packet inflated the count. Node N6NU
showed 4191 'adverts' but only had 77 unique ADVERT transmissions.
Changes:
- db.js: Remove advert_count increment from upsertNode SQL. Add
separate incrementAdvertCount() called only for new transmissions.
insertTransmission() now returns isNew flag.
- server.js: All three ADVERT processing paths (MQTT format 1,
companion bridge, API) now check isNew before incrementing.
- cmd/ingestor/db.go: Same fix in Go — UpsertNode no longer
increments, new IncrementAdvertCount method added.
InsertTransmission returns (bool, error) with isNew flag.
- cmd/ingestor/main.go: Check isNew before calling IncrementAdvertCount.
- One-time startup migration recalculates advert_count from
transmissions table (payload_type=4 matching node public_key).
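The isNew gating can be sketched with an in-memory stand-in for the
database. The store type and handleAdvert are hypothetical; only the
InsertTransmission (bool, error) shape and the IncrementAdvertCount name
come from the description above.

```go
package main

import "fmt"

// store is an in-memory stand-in for the SQLite-backed DB (assumption).
type store struct {
	seen        map[string]bool // transmission hash -> already stored
	advertCount map[string]int  // node public key -> unique ADVERT count
}

func newStore() *store {
	return &store{seen: map[string]bool{}, advertCount: map[string]int{}}
}

// InsertTransmission reports whether the hash was new. The error is always
// nil here; it is kept only to mirror the (bool, error) return shape.
func (s *store) InsertTransmission(hash string) (bool, error) {
	if s.seen[hash] {
		return false, nil
	}
	s.seen[hash] = true
	return true, nil
}

// IncrementAdvertCount bumps the per-node counter by one.
func (s *store) IncrementAdvertCount(pubKey string) {
	s.advertCount[pubKey]++
}

// handleAdvert counts an ADVERT once per unique transmission, no matter how
// many observers report it.
func (s *store) handleAdvert(hash, pubKey string) error {
	isNew, err := s.InsertTransmission(hash)
	if err != nil {
		return err
	}
	if isNew {
		s.IncrementAdvertCount(pubKey)
	}
	return nil
}

func main() {
	s := newStore()
	// Three observations of one transmission, then one new transmission.
	for _, hash := range []string{"h1", "h1", "h1", "h2"} {
		if err := s.handleAdvert(hash, "N6NU"); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println("advert_count:", s.advertCount["N6NU"])
}
```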
Fixes #200
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The #197 decryption fix added channelKeys parameter to decodePayload and
DecodePacket, but the test call sites were malformed:
- DecodePacket(hex, nil + stringExpr) → nil concatenated with string (type error)
- decodePayload(type, make([]byte, N, nil)) → nil used as make capacity (type error)
Fixed to:
- DecodePacket(hex + stringExpr, nil) → string concat then nil channelKeys
- decodePayload(type, make([]byte, N), nil) → proper 3-arg call
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1. Update golden shapes.json goRuntime keys to match new struct fields
(goroutines, heapAllocMB, heapSysMB, etc. replacing heapMB, sysMB, etc.)
2. Fix analytics_hash_sizes hourly element shape — use explicit keys instead
of dynamicKeys to avoid flaky validation when map iteration picks 'hour'
string value against number valueShape
3. Update TestPerfEndpoint to check new goRuntime field names
4. Guard +Inf in handlePerf: use safeAvg() instead of raw division that
produces infinity when endpoint count is 0
5. Fix TestBroadcastMarshalError: use func(){} in map instead of chan int
to avoid channel-related marshal errors in test output
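Sketch of the guard from item 4. The safeAvg signature is an assumption;
the point is that encoding/json refuses to marshal +Inf, so the division
has to be guarded before the value reaches the response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// safeAvg returns total/count, or 0 when count is 0, so the result is
// always a finite number that encoding/json can marshal.
// (Hypothetical signature; the real helper may take other types.)
func safeAvg(total float64, count int) float64 {
	if count == 0 {
		return 0
	}
	return total / float64(count)
}

func main() {
	// json.Marshal returns an error for infinite float values.
	_, err := json.Marshal(map[string]float64{"avgMs": math.Inf(1)})
	fmt.Println("raw +Inf marshal error:", err)

	out, err := json.Marshal(map[string]float64{"avgMs": safeAvg(0, 0)})
	fmt.Println(string(out), err)
}
```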
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Added 'set -e -o pipefail' to both Go test steps. Without pipefail, the exit code from 'go test' was being lost when piped to tee, causing test failures to appear as successes.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
IngestNewFromDB was appending new transmissions to byPayloadType slices,
breaking the newest-first ordering established by Load(). This caused
GetChannelMessages (which iterates backwards assuming newest-first) to
place newly ingested messages at the wrong position, making them invisible
when returning the latest messages from the tail.
Changed append to prepend, matching the existing s.packets prepend pattern
on line 881. Added regression test.
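Minimal illustration of the ordering invariant. The msg type and the
prepend/latest helpers are hypothetical and do not reproduce
GetChannelMessages; they only show why append breaks a newest-first slice
while prepend keeps it intact.

```go
package main

import "fmt"

type msg struct {
	ID   int
	Text string
}

// prepend inserts m at the front, preserving newest-first order.
func prepend(list []msg, m msg) []msg {
	return append([]msg{m}, list...)
}

// latest returns up to n newest messages, assuming list is newest-first.
func latest(list []msg, n int) []msg {
	if n > len(list) {
		n = len(list)
	}
	return list[:n]
}

func main() {
	loaded := []msg{{2, "second"}, {1, "first"}} // newest-first, as after a load

	broken := append(append([]msg(nil), loaded...), msg{3, "third"})
	fixed := prepend(loaded, msg{3, "third"})

	fmt.Println("append :", latest(broken, 1)) // newest message is missed
	fmt.Println("prepend:", latest(fixed, 1))
}
```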
fixes #198
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
#195 — /api/nodes/:pubkey/analytics was hitting SQL (packets_v view)
for all queries. Added store.GetNodeAnalytics(pubkey, days) that uses
the byNode[pubkey] index + text search through decoded_json, computing
all analytics (timeline, SNR trend, type breakdown, observer coverage,
hop distribution, peer interactions, uptime heatmap, computed stats)
entirely in-memory. Route handler now uses store path when available,
falling back to SQL only when store is nil.
#196 — recentPackets from /api/nodes/:pubkey/health were missing the
_parsedPath field that Node.js includes (lazy-cached parsed path_json
array). Added _parsedPath to txToMap() output using txGetParsedPath(),
matching the Node.js packet shape.
fixes #195, fixes #196
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
After decryption produces text, validate it's printable UTF-8.
If it contains more than 2 non-printable characters (excluding
newline/tab), mark as decryption_failed with text: null.
Applied to both Node (decoder.js) and Go (cmd/ingestor/decoder.go)
decoders. Added tests for garbage and valid text in both.
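One way the printable-text check could look. Using unicode.IsPrint as the
definition of "printable" and the looksLikeText name are assumptions; the
threshold of 2 and the newline/tab exemption come from the description
above.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// maxNonPrintable is the number of non-printable characters tolerated
// before the text is treated as a failed decryption.
const maxNonPrintable = 2

// looksLikeText reports whether s is valid UTF-8 with at most
// maxNonPrintable non-printable runes, not counting newline and tab.
func looksLikeText(s string) bool {
	if !utf8.ValidString(s) {
		return false
	}
	bad := 0
	for _, r := range s {
		if r == '\n' || r == '\t' {
			continue
		}
		if !unicode.IsPrint(r) {
			bad++
			if bad > maxNonPrintable {
				return false
			}
		}
	}
	return true
}

func main() {
	for _, s := range []string{"hello\nworld", "a\x01\x02\x03b", "\xff\xfe"} {
		fmt.Printf("%q -> %v\n", s, looksLikeText(s))
	}
}
```

A caller would set decryption_failed and a null text whenever the check
returns false.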
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Frontend reads goroutines/pauseTotalMs/lastPauseMs/heapAllocMB/heapSysMB/
heapInuseMB/heapIdleMB/numCPU but Go was returning heapMB/sysMB/numGoroutine/
gcPauseMs. All showed as undefined.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add defensive type safety to node detail page rendering:
- Wrap all .toFixed() calls with Number() to handle string values from Go backend
- Use Array.isArray() for hash_sizes_seen instead of || [] fallback
- Apply same fixes to both full-screen and side-panel views
- Add 9 new tests for renderHashInconsistencyWarning and renderNodeBadges
with hash_size_inconsistent data (including non-array edge cases)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
#191: Hash collision matrix now filters to role=repeater only (routing-relevant)
#192: expand=observations in /api/packets now returns full observation details (txToMap includes observations, stripped by default)
#193: /api/nodes/:pubkey/health uses in-memory PacketStore when available instead of slow SQL queries
#194: goRuntime (heapMB, sysMB, numGoroutine, numGC, gcPauseMs) restored in /api/perf response
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
#184: Strip non-printable chars (<0x20 except tab/newline) from ADVERT
names in Go server decoder, Go ingestor decoder, and Node decoder.js.
#185: Add visual (N) badge next to node names when multiple nodes share
the same display name (case-insensitive). Shows in list, side pane, and
full detail page with 'also known as' links to other keys.
#186: Add packetsLast24h field to /api/stats response.
#187, #188: Cache runtime.ReadMemStats() with 5s TTL in Go server.
#189: Temporarily patch HTMLCanvasElement.prototype.getContext during
L.heatLayer().addTo(map) to pass { willReadFrequently: true }, preventing
Chrome console warning about canvas readback performance.
Tests: 10 new tests for buildDupNameMap + dupNameBadge (143 total frontend).
Cache busters bumped.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
decodePath() trusted the pathByte hop count without checking available
buffer space. Corrupt packet cbecda1c7d37d4c0 (route_type=3, pathByte
0xAD) claimed 45 hops × 3 bytes = 135 bytes, but only 65 bytes existed
past the header. Node's Buffer.subarray silently returns empty buffers
for out-of-range slices, producing 23 empty-string hops in the output.
Fix: clamp hashCount to floor(available / hashSize). Add a 'truncated'
flag so consumers know the path was incomplete. No empty hops are ever
returned now.
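The clamp can be sketched as follows (shown in Go for illustration). The
decodeHops name and signature are hypothetical; hashCount and hashSize are
taken as already-parsed inputs rather than derived from pathByte.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// decodeHops reads up to hashCount hops of hashSize bytes each from buf.
// The count is clamped to floor(len(buf) / hashSize); truncated reports
// whether the claimed count exceeded what the buffer can hold.
func decodeHops(buf []byte, hashCount, hashSize int) (hops []string, truncated bool) {
	if hashSize <= 0 || hashCount <= 0 {
		return nil, false
	}
	if fit := len(buf) / hashSize; hashCount > fit {
		hashCount = fit
		truncated = true
	}
	hops = make([]string, 0, hashCount)
	for i := 0; i < hashCount; i++ {
		hops = append(hops, hex.EncodeToString(buf[i*hashSize:(i+1)*hashSize]))
	}
	return hops, truncated
}

func main() {
	// Same proportions as the corrupt packet: 45 claimed hops of 3 bytes,
	// but only 65 bytes available.
	hops, truncated := decodeHops(make([]byte, 65), 45, 3)
	fmt.Println(len(hops), truncated)
}
```

With 65 bytes and 3-byte hashes the clamp yields floor(65 / 3) = 21 full
hops, and every returned hop is non-empty.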
fixes #183
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add per-payload-type packet detail fixtures captured from production:
- packet-type-advert.json (payload_type=4, ADVERT)
- packet-type-grptxt-decrypted.json (payload_type=5, decrypted GRP_TXT)
- packet-type-grptxt-undecrypted.json (payload_type=5, decryption_failed GRP_TXT)
- packet-type-txtmsg.json (payload_type=1, TXT_MSG)
- packet-type-req.json (payload_type=0, REQ)
Update validate-protos.py to validate all 5 new fixtures against
PacketDetailResponse proto message.
Update CI deploy workflow to automatically capture per-type fixtures
on each deploy, including both decrypted and undecrypted GRP_TXT.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The /api/observers handler assumed byObserver arrays were sorted
newest-first and used an early break when hitting an old timestamp.
In reality, byObserver is only roughly DESC from the initial DB load;
live-ingested observations are appended at the end (oldest-to-newest).
After ~1 hour of uptime, the first element is old, the break fires
immediately, and every observer returns packetsLastHour=0.
Fix: full scan without break — the array is not uniformly sorted.
The endpoint is cached so performance is unaffected.
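Sketch of the full scan, using Unix seconds for timestamps. The countSince
helper is hypothetical; the mixed ordering in main mimics an initial
newest-first load followed by live appends.

```go
package main

import (
	"fmt"
	"time"
)

// countSince counts timestamps at or after cutoff. It deliberately has no
// early break, so it stays correct for slices that are not uniformly sorted.
func countSince(unixTimes []int64, cutoff int64) int {
	n := 0
	for _, ts := range unixTimes {
		if ts >= cutoff {
			n++
		}
	}
	return n
}

func main() {
	now := time.Now().Unix()
	obs := []int64{
		now - 3*3600, // from the initial load (newest-first)
		now - 5*3600,
		now - 20*60, // live-ingested, appended oldest-to-newest
		now - 60,
	}
	fmt.Println("packetsLastHour:", countSince(obs, now-3600))
}
```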
fixes #182
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- #178: Use strftime ISO 8601 format instead of datetime() for observation
timestamps in all SQL queries (v3 + v2 views). Add normalizeTimestamp()
helper for non-v3 paths that may store space-separated timestamps.
- #179: Strip internal fields (decoded_json, direction, payload_type,
raw_hex, route_type, score, created_at) from ObservationResp. Only
expose id, transmission_id, observer_id, observer_name, snr, rssi,
path_json, timestamp — matching Node.js parity.
- #180: Remove _parsedDecoded and _parsedPath from node detail
recentAdverts response. These internal/computed fields were leaking
to the API. Updated golden shapes.json accordingly.
- #181: Use mux route template (GetPathTemplate) for perf stats path
normalization, converting {param} to :param for Node.js parity.
Fallback to hex regex for unmatched routes. Compile regexes once at
package level instead of per-request.
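Sketch of the #181 normalization without the router dependency. The
template string is passed in directly (in the server it would come from
the matched route); the 8-character hex threshold and the ":id"
placeholder are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Compiled once at package level rather than per request.
var (
	braceParam = regexp.MustCompile(`\{(\w+)\}`)
	hexSegment = regexp.MustCompile(`^[0-9a-fA-F]{8,}$`)
)

// normalizePath converts a route template's {param} placeholders to :param.
// An empty template means no route matched; in that case long hex path
// segments are replaced with a generic placeholder.
func normalizePath(template, rawPath string) string {
	if template != "" {
		return braceParam.ReplaceAllString(template, ":$1")
	}
	parts := strings.Split(rawPath, "/")
	for i, p := range parts {
		if hexSegment.MatchString(p) {
			parts[i] = ":id"
		}
	}
	return strings.Join(parts, "/")
}

func main() {
	fmt.Println(normalizePath("/api/nodes/{pubkey}/health", ""))
	fmt.Println(normalizePath("", "/api/nodes/deadbeef01/health"))
}
```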
fixes #178, fixes #179, fixes #180, fixes #181
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Deduplicated protobuf contract + fixture directive into single entry.
Protobuf API contract is now single source of truth for all frontend/backend interfaces, with fixture capture running against prod (stable).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Go ingestor never had channel decryption — GRP_TXT packets were stored
with raw encrypted data while Node.js decoded them successfully.
Changes:
- decoder.go: Add decryptChannelMessage() implementing MeshCore channel
crypto (HMAC-SHA256 MAC verification + AES-128-ECB decryption), matching
the algorithm in @michaelhart/meshcore-decoder. Update decodeGrpTxt(),
decodePayload(), and DecodePacket() to accept and pass channel keys.
Add Payload fields: ChannelHashHex, DecryptionStatus, Channel, Text,
Sender, SenderTimestamp.
- config.go: Add ChannelKeysPath and ChannelKeys fields to Config struct.
- main.go: Add loadChannelKeys() that loads channel-rainbow.json (same
file used by Node.js server) from beside the config file, with env var
and config overrides. Pass loaded keys through the decoder pipeline.
- decoder_test.go: Add 14 channel decryption tests covering valid
decryption, MAC failure, wrong key, no-sender messages, bracket
sender exclusion, key iteration, channelHashHex formatting, and
decryption status states. Cross-validated against Node.js output.
- Update all DecodePacket/decodePayload/decodeGrpTxt/handleMessage call
sites in test files to pass the new channelKeys parameter.
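A generic verify-then-decrypt sketch using only the standard library. It
is not the MeshCore wire format: the 2-byte MAC length, using the AES key
directly as the HMAC key, and zero-byte padding are all assumptions, and
the function names are hypothetical.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/hmac"
	"crypto/sha256"
	"errors"
	"fmt"
)

const macLen = 2 // assumption: MAC is a truncated HMAC-SHA256

// truncatedMAC returns the first macLen bytes of HMAC-SHA256(key, ciphertext).
func truncatedMAC(key, ciphertext []byte) []byte {
	h := hmac.New(sha256.New, key)
	h.Write(ciphertext)
	return h.Sum(nil)[:macLen]
}

// ecbCrypt applies AES in ECB mode block by block. The standard library has
// no ECB mode, so each 16-byte block is processed independently.
func ecbCrypt(key, in []byte, decrypt bool) ([]byte, error) {
	block, err := aes.NewCipher(key) // a 16-byte key selects AES-128
	if err != nil {
		return nil, err
	}
	if len(in)%aes.BlockSize != 0 {
		return nil, errors.New("input is not a multiple of the block size")
	}
	out := make([]byte, len(in))
	for i := 0; i < len(in); i += aes.BlockSize {
		if decrypt {
			block.Decrypt(out[i:i+aes.BlockSize], in[i:i+aes.BlockSize])
		} else {
			block.Encrypt(out[i:i+aes.BlockSize], in[i:i+aes.BlockSize])
		}
	}
	return out, nil
}

// openMessage verifies the MAC first and decrypts only on a match, then
// strips trailing zero padding (padding scheme is an assumption).
func openMessage(key, mac, ciphertext []byte) ([]byte, error) {
	if !hmac.Equal(mac, truncatedMAC(key, ciphertext)) {
		return nil, errors.New("MAC mismatch")
	}
	plain, err := ecbCrypt(key, ciphertext, true)
	if err != nil {
		return nil, err
	}
	return bytes.TrimRight(plain, "\x00"), nil
}

func main() {
	key := []byte("0123456789abcdef")
	padded := make([]byte, aes.BlockSize)
	copy(padded, "hello mesh")
	ct, err := ecbCrypt(key, padded, false)
	if err != nil {
		fmt.Println("encrypt error:", err)
		return
	}
	plain, err := openMessage(key, truncatedMAC(key, ct), ct)
	fmt.Printf("%q %v\n", plain, err)
}
```

Checking the MAC before decrypting is what lets a decoder iterate over
candidate channel keys and report a distinct failure status when none
match.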
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Automated script that compares all 13 major API endpoints between
Go staging (meshcore-staging-go) and Node prod (meshcore-prod)
containers. Uses python3 for JSON field diffing and reports
MATCH/PARTIAL/MISMATCH per endpoint.
Usage: scp to server then run, or pipe via ssh.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The env var was overriding the config and forcing Go staging to only
connect to its own empty local mosquitto, missing all external data.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The poller only queried WHERE t.id > sinceID, which missed new
observations added to transmissions already in the store. The trace
page was correct because it always queries the DB directly.
Add IngestNewObservations() that polls observations by o.id watermark,
adds them to existing StoreTx entries, re-picks best observation, and
invalidates analytics caches. The Poller now tracks both lastTxID and
lastObsID watermarks.
Includes tests for v3, v2, dedup, best-path re-pick, and
GetMaxObservationID.
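Sketch of the two-watermark idea with in-memory types. The poller,
storeTx, and observation shapes are hypothetical, rows are assumed to
arrive in ascending id order (as an ORDER BY id query would return them),
and best-observation re-picking and cache invalidation are omitted.

```go
package main

import "fmt"

type observation struct {
	ID   int
	TxID int
	SNR  float64
}

type storeTx struct {
	ID           int
	Observations []observation
}

// poller tracks separate watermarks for transmissions and observations.
type poller struct {
	lastTxID  int
	lastObsID int
	byTx      map[int]*storeTx
}

// ingestTx adds transmissions with id above the transmission watermark.
func (p *poller) ingestTx(rows []storeTx) {
	for _, r := range rows {
		if r.ID <= p.lastTxID {
			continue
		}
		tx := r
		p.byTx[tx.ID] = &tx
		p.lastTxID = tx.ID
	}
}

// ingestObs attaches observations with id above the observation watermark
// to transmissions already in the store and returns how many were attached.
func (p *poller) ingestObs(rows []observation) int {
	added := 0
	for _, o := range rows {
		if o.ID <= p.lastObsID {
			continue // already seen: dedup by watermark
		}
		p.lastObsID = o.ID
		if tx, ok := p.byTx[o.TxID]; ok {
			tx.Observations = append(tx.Observations, o)
			added++
		}
	}
	return added
}

func main() {
	p := &poller{byTx: map[int]*storeTx{}}
	p.ingestTx([]storeTx{{ID: 1}})
	// A later observation of transmission 1: the tx watermark alone would miss it.
	n := p.ingestObs([]observation{{ID: 1, TxID: 1, SNR: 5}, {ID: 2, TxID: 1, SNR: 9}})
	fmt.Println(n, len(p.byTx[1].Observations), p.lastObsID)
}
```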
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>