## Summary
Closes #871
The `/api/packets` endpoint could return packets with `null` hash or
timestamp fields. This was caused by legacy data in SQLite (rows with
empty `hash` or `NULL`/empty `first_seen`) predating the ingestor's
existing validation guard (`if hash == "" { return false, nil }` at
`cmd/ingestor/db.go:610`).
## Root Cause
`cmd/server/store.go` `filterPackets()` had no data-integrity guard.
Legacy rows with empty `hash` or `first_seen` were loaded into the
in-memory store and returned verbatim. The `strOrNil("")` helper then
serialized these as JSON `null`.
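For reference, a helper of this shape reproduces the failure mode (a sketch only; the actual `strOrNil` in `store.go` may differ):
```go
package main

import (
	"encoding/json"
	"fmt"
)

// strOrNil-style helper: a nil *string marshals to JSON null, which is how
// empty legacy hashes surfaced as "hash": null in the API response.
func strOrNil(s string) *string {
	if s == "" {
		return nil
	}
	return &s
}

func main() {
	out, _ := json.Marshal(struct {
		Hash *string `json:"hash"`
	}{Hash: strOrNil("")})
	fmt.Println(string(out)) // prints {"hash":null}
}
```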
## Fix
Added a data-integrity predicate at the top of `filterPackets`'s scan
callback (`cmd/server/store.go:2278`):
```go
if tx.Hash == "" || tx.FirstSeen == "" {
    return false
}
```
This filters bad legacy rows at query time. The write path (ingestor)
already rejects empty hashes, so no new bad data enters.
## TDD Evidence
- **Red commit:** `15774c3` — test `TestIssue871_NoNullHashOrTimestamp`
asserts no packet in API response has null/empty hash or timestamp
- **Green commit:** `281fd6f` — adds the filter guard, test passes
## Testing
- `go test ./...` in `cmd/server` passes (full suite)
- Client-side defensive filter from PR #868 remains as defense-in-depth
---------
Co-authored-by: you <you@example.com>
## Summary
Closes #889.
When a TRACE packet's payload is too short to decode (< 9 bytes),
`decodeTrace` returns an error in `Payload.Error` but the observation is
still stored with empty `Path.Hops`. Previously this was completely
silent — no log, no anomaly flag, no indication the row is degraded.
This fix populates `DecodedPacket.Anomaly` with the decode error message
(e.g., `"TRACE payload decode failed: too short"`) so operators and
downstream consumers can identify degraded observations.
## TDD Commit History
1. **Red commit** `04e0165` — failing test asserting `Anomaly` is set
when TRACE payload decode fails
2. **Green commit** `d3e72d1` — 3-line fix in `decoder.go` lines 601-603:
check `payload.Error != ""` for TRACE packets and set anomaly
## What Changed
`cmd/ingestor/decoder.go` (lines 601-603): Added a check before the
existing TRACE path-parsing block. If `payload.Error` is non-empty for a
TRACE packet, `anomaly` is set to `"TRACE payload decode failed:
<error>"`.
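For reference, the guard amounts to something like this (an illustrative sketch; identifiers are assumptions based on the description above, not the exact `decoder.go` code):
```go
package main

import "fmt"

// annotateTraceDecodeFailure mirrors the described guard: when a TRACE
// payload failed to decode, the observation is kept but an anomaly string
// is returned for DecodedPacket.Anomaly. Names here are illustrative.
func annotateTraceDecodeFailure(isTrace bool, decodeErr string) string {
	if isTrace && decodeErr != "" {
		return "TRACE payload decode failed: " + decodeErr
	}
	return ""
}

func main() {
	fmt.Println(annotateTraceDecodeFailure(true, "too short"))
	// Output: TRACE payload decode failed: too short
}
```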
`cmd/ingestor/decoder_test.go`: Added
`TestDecodeTracePayloadFailSetsAnomaly` — constructs a TRACE packet with
a 4-byte payload (too short), asserts the packet is still returned
(observation stored) and `Anomaly` is populated.
## Verification
- `go build ./...` ✓
- `go test ./...` ✓ (all pass including new test)
- Anti-tautology: reverting the fix causes the new test to fail
(`pkt.Anomaly` comes back empty, which the test's assertion rejects)
---------
Co-authored-by: you <you@example.com>
Closes #921
## Summary
Follow-up to #920 (incremental auto-vacuum). Addresses both items from
the adversarial review:
### 1. RW connection caching
Previously, every call to `openRW(dbPath)` opened a new SQLite RW
connection and closed it after use. This happened in:
- `runIncrementalVacuum` (~4x/hour)
- `PruneOldPackets`, `PruneOldMetrics`, `RemoveStaleObservers`
- `buildAndPersistEdges`, `PruneNeighborEdges`
- All neighbor persist operations
Now a single `*sql.DB` handle (with `MaxOpenConns(1)`) is cached
process-wide via `cachedRW(dbPath)`. The underlying connection pool
manages serialization. The original `openRW()` function is retained for
one-shot test usage.
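For reference, a minimal sketch of the caching shape (the driver import, package name, and locking details are assumptions; the real `cachedRW` may differ):
```go
package store // illustrative package name

import (
	"database/sql"
	"sync"

	_ "modernc.org/sqlite" // driver choice is an assumption; the project may use another SQLite driver
)

// Process-wide cache of RW handles keyed by DB path. A mutex-guarded map is
// one way to get "open once, reuse everywhere".
var (
	rwMu    sync.Mutex
	rwCache = map[string]*sql.DB{}
)

// cachedRW returns a shared read-write handle for dbPath, opening it on first
// use. MaxOpenConns(1) funnels all writers through a single connection,
// matching SQLite's single-writer model, and the pool serializes callers.
func cachedRW(dbPath string) (*sql.DB, error) {
	rwMu.Lock()
	defer rwMu.Unlock()
	if db, ok := rwCache[dbPath]; ok {
		return db, nil
	}
	db, err := sql.Open("sqlite", dbPath)
	if err != nil {
		return nil, err
	}
	db.SetMaxOpenConns(1)
	rwCache[dbPath] = db
	return db, nil
}
```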
### 2. DBConfig dedup
`DBConfig` was defined identically in both `cmd/server/config.go` and
`cmd/ingestor/config.go`. Extracted to `internal/dbconfig/` as a shared
package; both binaries now use a type alias (`type DBConfig =
dbconfig.DBConfig`).
## Tests added
| Test | File |
|------|------|
| `TestCachedRW_ReturnsSameHandle` | `cmd/server/rw_cache_test.go` |
| `TestCachedRW_100Calls_SingleConnection` | `cmd/server/rw_cache_test.go` |
| `TestGetIncrementalVacuumPages_Default` | `internal/dbconfig/dbconfig_test.go` |
| `TestGetIncrementalVacuumPages_Configured` | `internal/dbconfig/dbconfig_test.go` |
## Verification
```
ok github.com/corescope/server 20.069s
ok github.com/corescope/ingestor 47.117s
ok github.com/meshcore-analyzer/dbconfig 0.003s
```
Both binaries build cleanly. 100 sequential `cachedRW()` calls return
the same handle with exactly 1 entry in the cache map.
---------
Co-authored-by: you <you@example.com>
## Summary
Fixes #756 — the customizer timestamp format setting (ISO/ISO+ms/locale)
and timezone (UTC/local) were not applied to chart X-axis labels,
tooltips, or certain inline timestamps in the analytics pages.
## Changes
### `public/app.js`
- Added `formatChartAxisLabel(date, shortForm)` — a shared helper that
reads the customizer's `timestampFormat` and `timestampTimezone`
preferences and formats dates for chart axes accordingly.
`shortForm=true` returns time-only (for intra-day charts),
`shortForm=false` returns date+time (for multi-day ranges).
### `public/analytics.js`
- `rfXAxisLabels()`: now calls `formatChartAxisLabel()` instead of
hardcoded `toLocaleTimeString()`
- `rfTooltipCircles()`: tooltip timestamps now use
`formatAbsoluteTimestamp()` instead of raw ISO
- Subpath detail first/last seen: now uses `formatAbsoluteTimestamp()`
- Neighbor graph last_seen: now uses `formatAbsoluteTimestamp()`
### `public/node-analytics.js`
- Packet timeline chart labels: now use `formatChartAxisLabel()`
(respects short vs long form based on time range)
- SNR over time chart labels: now use `formatChartAxisLabel()`
## Behavior by setting
| Setting | Chart axis (short) | Chart axis (long) |
|---------|-------------------|-------------------|
| ISO | `14:30` | `05-03 14:30` |
| ISO+ms | `14:30:05` | `05-03 14:30:05` |
| Locale | `2:30 PM` | `May 3, 2:30 PM` |
All respect the UTC/local timezone toggle.
## Testing
- Server builds cleanly (`go build`)
- Served `app.js` contains `formatChartAxisLabel` (verified via curl)
- Graceful fallback: all call sites check `typeof formatChartAxisLabel
=== 'function'` before calling, preserving backward compatibility if the
script load order changes
---------
Co-authored-by: you <you@example.com>
## Summary
Adds an idempotent startup migration to the ingestor that backfills
`observations.path_json` from per-observation `raw_hex` (added in #882).
**Approach: Server-side migration (Option B)** — runs automatically at
startup, chunked in batches of 1000, tracked via `_migrations` table.
Chosen over a standalone script because:
1. Follows existing migration pattern (channel_hash, last_packet_at,
etc.)
2. Zero operator action required — just deploy
3. Idempotent — safe to restart mid-migration (uncommitted rows get
picked up next run)
## What it does
- Selects observations where `raw_hex` is populated but `path_json` is
NULL/empty/`[]`
- Excludes TRACE packets (`payload_type = 9`) at the SQL level — their
header bytes are SNR values, not hops
- Decodes hops via `packetpath.DecodePathFromRawHex` (reuses existing
helper)
- Updates `path_json` with the decoded JSON array
- Marks rows with undecoded/empty hops as `'[]'` to prevent infinite
re-scanning
- Records `backfill_path_json_from_raw_hex_v1` in `_migrations` when
complete
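For reference, the chunked loop is roughly this shape (a sketch: table/column names and `packetpath.DecodePathFromRawHex` come from the description above; the cursor strategy, import path, signatures, and error handling are illustrative):
```go
package ingestor // illustrative package name

import (
	"database/sql"
	"encoding/json"

	"example.com/internal/packetpath" // import path is an assumption
)

// backfillPathJSON sketches the chunked migration: walk candidate
// observations by id, decode hops from raw_hex, write path_json.
func backfillPathJSON(db *sql.DB) error {
	lastID := int64(0)
	for {
		rows, err := db.Query(`
			SELECT id, raw_hex FROM observations
			WHERE id > ?
			  AND raw_hex IS NOT NULL AND raw_hex != ''
			  AND (path_json IS NULL OR path_json = '' OR path_json = '[]')
			  AND payload_type != 9 -- TRACE: header bytes are SNR values, not hops
			ORDER BY id LIMIT 1000`, lastID)
		if err != nil {
			return err
		}
		type cand struct {
			id  int64
			raw string
		}
		var batch []cand
		for rows.Next() {
			var c cand
			if err := rows.Scan(&c.id, &c.raw); err != nil {
				rows.Close()
				return err
			}
			batch = append(batch, c)
		}
		rows.Close()
		if len(batch) == 0 {
			return nil // nothing left; caller records the _migrations row
		}
		for _, c := range batch {
			lastID = c.id
			hops := packetpath.DecodePathFromRawHex(c.raw) // assumed to return []string
			js := []byte("[]")                             // '[]' for undecodable/empty hops, per above
			if len(hops) > 0 {
				js, _ = json.Marshal(hops)
			}
			if _, err := db.Exec(
				`UPDATE observations SET path_json = ? WHERE id = ?`, string(js), c.id); err != nil {
				return err
			}
		}
	}
}
```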
## Safety
- **Never overwrites** existing non-empty `path_json` — only fills where
missing
- **Batched** (1000 rows per iteration) — won't OOM on large DBs
- **TRACE-safe** — excluded at query level per
`packetpath.PathBytesAreHops` semantics
## Test
`TestBackfillPathJsonFromRawHex` — creates synthetic observations with:
- Empty path_json + valid raw_hex → verifies backfill populates
correctly
- NULL path_json → verifies backfill populates
- Existing path_json → verifies NO overwrite
- TRACE packet → verifies skip
Anti-tautology: test asserts specific decoded values (`["AABB","CCDD"]`)
from known raw_hex input, not just "something changed."
Closes #888
Co-authored-by: you <you@example.com>
## Summary
Closes #978 — analytics channels duplicated by the encrypted/decrypted
split and by rainbow-table collisions.
## Root cause
Two distinct bugs in `computeAnalyticsChannels` (`cmd/server/store.go`):
1. **Encrypted/decrypted split**: The grouping key included the decoded
channel name (`hash + "_" + channel`), so packets from observers that
could decrypt a channel created a separate bucket from packets where
decryption failed. Same physical channel, two entries.
2. **Rainbow-table collisions**: Some observers' lookup tables map hash
bytes to wrong channel names. E.g., hash `72` incorrectly claimed to be
`#wardriving` (real hash is `129`). This created ghost 1-message
entries.
## Fix
1. **Always group by hash byte alone** (drop `_channel` suffix from
`chKey`). When any packet decrypts successfully, upgrade the bucket's
display name from placeholder (`chN`) to the real name
(first-decrypter-wins for stability).
2. **Validate channel names** against the firmware hash invariant:
`SHA256(SHA256("#name")[:16])[0] == channelHash`. Mismatches are treated
as encrypted (placeholder name, no trust in decoded channel). Guard is
in the analytics handler (not the ingestor) to avoid breaking other
surfaces that use the decoded field for display.
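The hash-validation helper amounts to this (a sketch of the stated invariant; the real helper in `cmd/server` may differ in details such as how the leading `#` is handled):
```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// channelNameMatchesHash checks the firmware invariant quoted above: the
// first byte of SHA256(SHA256(name)[:16]) must equal the packet's channel
// hash byte. The function name mirrors the test name; details are assumed.
func channelNameMatchesHash(name string, channelHash byte) bool {
	inner := sha256.Sum256([]byte(name)) // name includes the leading "#", per the invariant
	outer := sha256.Sum256(inner[:16])
	return outer[0] == channelHash
}

func main() {
	// The PR's example: hash byte 72 claiming "#wardriving" (real hash 129)
	// should fail this check and keep the ch72 placeholder.
	fmt.Println(channelNameMatchesHash("#wardriving", 72))
}
```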
## Verification (e2e-fixture.db)
| Metric | BEFORE | AFTER |
|--------|--------|-------|
| Total channels | 22 | 19 |
| Duplicate hash bytes | 3 (hashes 217, 202, 17) | 0 |
## Tests added
- `TestComputeAnalyticsChannels_MergesEncryptedAndDecrypted` — same
hash, mixed encrypted/decrypted → ONE bucket
- `TestComputeAnalyticsChannels_RejectsRainbowTableMismatch` — hash 72
claimed as `#wardriving` (real=129) → rejected, stays `ch72`
- `TestChannelNameMatchesHash` — unit test for hash validation helper
- `TestIsPlaceholderName` — unit test for placeholder detection
Anti-tautology gate: both main tests fail when their respective fix
lines are reverted.
Co-authored-by: you <you@example.com>
Rebased version of #973 onto current master, with greybeard review
fixes.
## Changes from #973
- **Stowaway revert dropped**: The original PR branched from older
master and inadvertently reverted PR #926's MQTT connect-retry fix
(`cmd/ingestor/main.go` + `cmd/ingestor/main_test.go`). After rebasing
onto current master (which includes #926 + #970), these files no longer
appear in the diff.
- **Greybeard M1 fixed**: Time-window filter (`savedTimeWindowMin`,
`fTimeWindow` dropdown, `localStorage 'meshcore-time-window'`) is now
reset by the clear-filters button. The clear-button visibility predicate
also accounts for non-default time window.
- **Greybeard m1 fixed**: Replaced 7 tautological source-grep tests with
8 behavioral vm-sandbox tests that extract and execute the actual clear
handler + `updatePacketsUrl`, asserting real state transitions.
## Original feature (from #973)
Clear-filters button for the packets page — resets all filter state
(hash, node, observer, channel, type, expression, myNodes, time window,
region) and refreshes. Button visibility auto-toggles based on active
filter state.
Closes #964
Supersedes #973
## Tests
- `node test-clear-filters.js` — 8 behavioral tests ✅
- `node test-packets.js` — 82 tests ✅
- `cd cmd/ingestor && go test ./...` — ✅
---------
Co-authored-by: you <you@example.com>
## Summary
Per-source MQTT connect timeout, correctly targeting the `WaitTimeout`
startup gate (#931).
## What changed
- Added `connectTimeoutSec` field to `MQTTSource` struct (per-source,
not global) — `config.go:24`
- Added `ConnectTimeoutOrDefault()` helper returning configured value or
30 (default from #926) — `config.go:29`
- Replaced hardcoded `WaitTimeout(30 * time.Second)` with
`WaitTimeout(time.Duration(connectTimeout) * time.Second)` —
`main.go:173`
- Updated `config.example.json` with field at source level
- Unit tests for default (30) and custom values
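For reference, a minimal sketch of the per-source plumbing (field and function names follow this PR; the JSON tag and surrounding struct are assumptions):
```go
package ingestor // illustrative package name

// MQTTSource sketch: only the field added by this PR is shown; the real
// struct in cmd/ingestor/config.go has more fields.
type MQTTSource struct {
	ConnectTimeoutSec int `json:"connectTimeoutSec"`
}

// ConnectTimeoutOrDefault returns the per-source timeout in seconds,
// falling back to the 30s default carried over from #926. The startup gate
// then becomes:
//
//	token.WaitTimeout(time.Duration(src.ConnectTimeoutOrDefault()) * time.Second)
func (s MQTTSource) ConnectTimeoutOrDefault() int {
	if s.ConnectTimeoutSec > 0 {
		return s.ConnectTimeoutSec
	}
	return 30
}
```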
## Why this supersedes #976
PR #976 made paho's `SetConnectTimeout` (per-TCP-dial, was 10s)
configurable via a **global** `mqttConnectTimeoutSeconds` field. Issue
#931 explicitly references the **30s timeout** — which is
`WaitTimeout(30s)`, the startup gate from #926. It also requests
**per-source** config, not global.
This PR targets the correct timeout at the correct granularity.
## Live verification (Rule 18)
Two sources pointed at unreachable brokers:
- `fast` (`connectTimeoutSec: 5`): timed out in 5s ✅
- `default` (unset): timed out in 30s ✅
```
19:00:35 MQTT [fast] connect timeout: 5s
19:00:40 MQTT [fast] initial connection timed out — retrying in background
19:00:40 MQTT [default] connect timeout: 30s
19:01:10 MQTT [default] initial connection timed out — retrying in background
```
Closes #931
Supersedes #976
Co-authored-by: you <you@example.com>
## fix(ingestor): address review BLOCKERs from PR #926 (goroutine leak +
guard semantics)
Supersedes #970. Rebased onto current master to resolve merge conflicts.
### Changes (same as #970)
- **BL1 (goroutine leak):** Call `client.Disconnect(0)` on the error
path after `Connect()` fails with `ConnectRetry=true`, preventing Paho's
internal retry goroutines from leaking.
- **BL2 (guard semantics):** Use `connectedCount == 0` instead of
`len(clients) == 0` to detect zero-connected state, since timed-out
clients are appended to the slice.
- **Tests:** `TestBL1_GoroutineLeakOnHardFailure` and
`TestBL2_ZeroConnectedFatals` covering both blockers.
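For reference, an illustrative sketch of the startup loop with both fixes applied (the real `main.go` differs in structure, and the zero-connected handling shown here is simplified):
```go
package ingestor // illustrative package name

import (
	"log"
	"time"

	mqtt "github.com/eclipse/paho.mqtt.golang"
)

// connectSources sketches the per-source connect loop after BL1/BL2.
func connectSources(optsBySource map[string]*mqtt.ClientOptions, timeout time.Duration) []mqtt.Client {
	var clients []mqtt.Client
	connectedCount := 0
	for name, opts := range optsBySource {
		client := mqtt.NewClient(opts)
		token := client.Connect()
		if !token.WaitTimeout(timeout) {
			// Timed out: keep the client so ConnectRetry keeps trying in the
			// background, but do not count it as connected (relevant to BL2).
			log.Printf("MQTT [%s] initial connection timed out", name)
			clients = append(clients, client)
			continue
		}
		if err := token.Error(); err != nil {
			// BL1: hard connect failure with ConnectRetry=true. Disconnect(0)
			// stops paho's internal retry goroutines from leaking.
			client.Disconnect(0)
			log.Printf("MQTT [%s] connect failed: %v", name, err)
			continue
		}
		clients = append(clients, client)
		connectedCount++
	}
	// BL2: timed-out clients sit in the slice, so len(clients) == 0 is the
	// wrong zero-connected signal; use the explicit success count.
	if connectedCount == 0 {
		log.Printf("no MQTT source connected at startup")
	}
	return clients
}
```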
### Context
- Fixes blockers raised in review of #926
- Related: #910 (original hang bug)
Co-authored-by: you <you@example.com>
Rebased version of #968 (which was itself a rebase of #905) — resolves
merge conflict with #906 (clock-skew UI) that landed on master.
## Conflict resolution
**`public/observers.js`** — master (#906) added "Clock Offset" column to
observer table; #968 split "Last Seen" into "Last Status" + "Last
Packet" columns. Combined both: the table now has Status | Name | Region
| Last Status | Last Packet | Packets | Packets/Hour | Clock Offset |
Uptime.
## What this PR adds (unchanged from #968/#905)
- `last_packet_at` column in observers DB table
- Separate "Last Status Update" and "Last Packet Observation" display in
observers list and detail page
- Server-side migration to add the column automatically
- Backfill heuristic for existing data
- Tests for ingestor and server
## Verification
- All Go tests pass (`cmd/server`, `cmd/ingestor`)
- Frontend tests pass (`test-packets.js`, `test-hash-color.js`)
- Built server, hit `/api/observers` — `last_packet_at` field present in
JSON
- Observer table header has all 9 columns including both Last Packet and
Clock Offset
## Prior PRs
- #905 — original (conflicts with master)
- #968 — first rebase (conflicts after #906 landed)
- This PR — second rebase, resolves the #906 conflict
Supersedes #968. Closes #905.
---------
Co-authored-by: you <you@example.com>
## Summary
- With `ConnectRetry=true`, paho's `token.Wait()` only returns on
success — it blocks forever for unreachable brokers, stalling the entire
startup loop before any other source connects
- Switches to `token.WaitTimeout(30s)`: on timeout the client is still
tracked so `ConnectRetry` keeps retrying in background; `OnConnect`
fires and subscribes when it eventually connects
- Adds `TestMQTTConnectRetryTimeoutDoesNotBlock` to confirm
`WaitTimeout` returns within deadline for unreachable brokers
(regression guard for this exact failure mode)
Fixes #910
## Test plan
- [x] Two MQTT sources configured, one unreachable: ingestor reaches
`Running` status and ingests from the reachable source immediately on
startup
- [x] Unreachable source logs `initial connection timed out — retrying
in background` and reconnects automatically when the broker comes back
- [x] Single source, reachable: behaviour unchanged (`Running — 1 MQTT
source(s) connected`)
- [x] Single source, unreachable: `Running — 0 MQTT source(s) connected,
1 retrying in background`; ingestion starts once broker is available
- [x] `go test ./...` passes (excluding pre-existing
`TestOpenStoreInvalidPath` failure on master)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
## Summary
Fill the remaining gaps in the payload-type lookup tables that were noted
as out of scope in #965. Every firmware-defined payload type (0–11, 15) now has entries
in all four frontend tables.
## Changes
Three types were missing from one or more tables:
| Type | Name | `PAYLOAD_COLORS` (app.js) | `TYPE_NAMES` (packets.js) | `TYPE_COLORS` (roles.js) | `TYPE_BADGE_MAP` (roles.js) |
|------|------|--------------------------|--------------------------|-------------------------|---------------------------|
| 10 | Multipart | added | added | added `#0d9488` | added |
| 11 | Control | added | ✅ (already) | added `#b45309` | added |
| 15 | Raw Custom | added | added | added `#c026d3` | added |
## Color choices
- **MULTIPART** `#0d9488` (teal) — multi-fragment stitching, distinct
from PATH's `#14b8a6`
- **CONTROL** `#b45309` (amber) — warm brown, distinct hue from ACK's
grey `#6b7280`
- **RAW_CUSTOM** `#c026d3` (fuchsia) — magenta, distinct from TRACE's
pink `#ec4899`
All pass WCAG 3:1 contrast against both white and dark (#1e1e1e)
backgrounds.
## Tests
- `test-packets.js`: 82/82 ✅
- `test-hash-color.js`: 32/32 ✅
Badge CSS auto-generation: `syncBadgeColors()` in `roles.js` iterates
`TYPE_BADGE_MAP` keyed against `TYPE_COLORS`, so the three new entries
automatically get `.type-badge.multipart`, `.type-badge.control`, and
`.type-badge.raw-custom` CSS rules injected at page load.
Firmware source: `firmware/src/Packet.h:19-32` — types 0x00–0x0B and
0x0F. Types 0x0C–0x0E are not defined.
Follows up on #965.
---------
Co-authored-by: you <you@example.com>
Fixes #929
## Summary
- `handleNodePaths` pulls candidates from `byPathHop` using 2-char and
4-char prefix keys (e.g. `"7a"` for a node using 1-byte adverts)
- When two nodes share the same short prefix, paths through the *other*
node are included as candidates
- The `resolved_path` post-filter covers decoded packets but falls
through conservatively (`inIndex = true`) when `resolved_path` is NULL,
letting false positives reach the response
**Fix:** during the aggregation phase (which already calls `resolveHop`
per hop), add a `containsTarget` check. If every hop resolves to a
different node's pubkey, skip the path. Packets confirmed via the
full-pubkey index key or via SQL bypass the check. Unresolvable hops are
kept conservatively.
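The guard amounts to something like this (a sketch; `resolveHop` exists per the description above, while the function name, signature, and parameters here are assumptions):
```go
package server // illustrative package name

// pathContainsTarget reports whether a candidate path should be kept for the
// target node. resolve stands in for the existing resolveHop.
func pathContainsTarget(hops []string, targetPubkey string, sqlConfirmed bool,
	resolve func(hop string) (pubkey string, ok bool)) bool {
	if sqlConfirmed {
		return true // confirmed via the full-pubkey index key or SQL: bypass the check
	}
	allResolved := true
	for _, hop := range hops {
		pub, ok := resolve(hop)
		if !ok {
			allResolved = false // unresolvable hop: stay conservative
			continue
		}
		if pub == targetPubkey {
			return true // the target really appears on this path
		}
	}
	// Exclude only when every hop resolved and none was the target: a pure
	// prefix-collision false positive.
	return !allResolved
}
```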
## Test plan
- [x] `TestHandleNodePaths_PrefixCollisionExclusion`: two nodes sharing
`"7a"` prefix; verifies the path with no `resolved_path` (false
positive) is excluded and the SQL-confirmed path (true positive) is
included
- [x] Full test suite: `go test github.com/corescope/server` — all pass
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
## Bug
Packet type 6 (`PAYLOAD_TYPE_GRP_DATA` per `firmware/src/Packet.h:25`)
was missing from three frontend lookup tables:
- `public/app.js:7` — `PAYLOAD_COLORS` had no entry for 6 → badge color
fell back to `unknown` (grey)
- `public/packets.js:29` — `TYPE_NAMES` (used by the Packets page
type-filter dropdown) had no entry for 6 → "Group Data" missing from the
menu
- `public/roles.js:17,24` — `TYPE_COLORS` and `TYPE_BADGE_MAP` had no
`GRP_DATA` entry → no dedicated CSS class
The packet detail page already handled it (via `PAYLOAD_TYPES` in
`app.js:6` which had `6: 'Group Data'`) so individual GRP_DATA packets
render correctly. The gap was only in the filter UI + badge styling.
## Fix
Add the missing entry in each table. 4 lines across 3 files.
- `app.js`: add `6: 'grp-data'` to `PAYLOAD_COLORS`
- `packets.js`: add `6:'Group Data'` to `TYPE_NAMES`
- `roles.js`: add `GRP_DATA: '#8b5cf6'` to `TYPE_COLORS` and `GRP_DATA:
'grp-data'` to `TYPE_BADGE_MAP`
Color choice `#8b5cf6` (violet) — distinct from GRP_TXT's blue but
visually adjacent so operators read them as related types.
## Verification (rule 18 + 19)
Built server locally, served the JS files, grepped the rendered output:
```
$ curl -s http://localhost:13900/packets.js | grep TYPE_NAMES
const TYPE_NAMES = { ... 5:'Channel Msg', 6:'Group Data', 7:'Anon Req' ... };
$ curl -s http://localhost:13900/app.js | grep PAYLOAD_TYPES
const PAYLOAD_TYPES = { ... 5: 'Channel Msg', 6: 'Group Data', 7: 'Anon Req' ... };
$ curl -s http://localhost:13900/roles.js | grep GRP_DATA
ADVERT: '#22c55e', GRP_TXT: '#3b82f6', GRP_DATA: '#8b5cf6', ...
ADVERT: 'advert', GRP_TXT: 'grp-txt', GRP_DATA: 'grp-data', ...
```
Frontend tests pass: `test-packets.js` 82/82, `test-hash-color.js`
32/32.
## Out of scope
Consolidating the duplicated PAYLOAD_TYPES / TYPE_NAMES tables into a
single source of truth is a separate cleanup. Two parallel name maps
remain a footgun (this is the second time a new type has been
added to one but not the other).
Co-authored-by: Kpa-clawbot <bot@example.invalid>
## Summary
Implements `observerBlacklist` config — mirrors the existing
`nodeBlacklist` pattern for observers. Drop observers by pubkey at
ingest, with defense-in-depth filtering on the server side.
Closes #962
## Changes
### Ingestor (`cmd/ingestor/`)
- **`config.go`**: Added `ObserverBlacklist []string` field +
`IsObserverBlacklisted()` method (case-insensitive, whitespace-trimmed)
- **`main.go`**: Early return in `handleMessage` when `parts[2]`
(observer ID from MQTT topic) matches blacklist — before status
handling, before IATA filter. No UpsertObserver, no observations, no
metrics insert. Log line: `observer <pubkey-short> blacklisted,
dropping`
### Server (`cmd/server/`)
- **`config.go`**: Same `ObserverBlacklist` field +
`IsObserverBlacklisted()` with `sync.Once` cached set (same pattern as
`nodeBlacklist`)
- **`routes.go`**: Defense-in-depth filtering in `handleObservers` (skip
blacklisted in list) and `handleObserverDetail` (404 for blacklisted ID)
- **`main.go`**: Startup `softDeleteBlacklistedObservers()` marks
matching rows `inactive=1` so historical data is hidden
- **`neighbor_persist.go`**: `softDeleteBlacklistedObservers()`
implementation
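For reference, the cached lookup is roughly this shape (a sketch mirroring the described `nodeBlacklist` pattern; only the relevant field is shown and the surrounding struct is illustrative):
```go
package server // illustrative package name

import (
	"strings"
	"sync"
)

// Config sketch: only the blacklist field added by this PR is shown.
type Config struct {
	ObserverBlacklist []string `json:"observerBlacklist"`

	obsOnce sync.Once
	obsSet  map[string]struct{}
}

// IsObserverBlacklisted normalizes the pubkey (trim + lowercase) and checks
// it against a set built once from the config slice.
func (c *Config) IsObserverBlacklisted(pubkey string) bool {
	c.obsOnce.Do(func() {
		c.obsSet = make(map[string]struct{}, len(c.ObserverBlacklist))
		for _, p := range c.ObserverBlacklist {
			c.obsSet[strings.ToLower(strings.TrimSpace(p))] = struct{}{}
		}
	})
	_, hit := c.obsSet[strings.ToLower(strings.TrimSpace(pubkey))]
	return hit
}
```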
### Tests
- `cmd/ingestor/observer_blacklist_test.go`: config method tests
(case-insensitive, empty, nil)
- `cmd/server/observer_blacklist_test.go`: config tests + HTTP handler
tests (list excludes blacklisted, detail returns 404, no-blacklist
passes all, concurrent safety)
## Config
```json
{
"observerBlacklist": [
"EE550DE547D7B94848A952C98F585881FCF946A128E72905E95517475F83CFB1"
]
}
```
## Verification (Rule 18 — actual server output)
**Before blacklist** (no config):
```
Total: 31
DUBLIN in list: True
```
**After blacklist** (DUBLIN Observer pubkey in `observerBlacklist`):
```
[observer-blacklist] soft-deleted 1 blacklisted observer(s)
Total: 30
DUBLIN in list: False
```
Detail endpoint for blacklisted observer returns **404**.
All existing tests pass (`go test ./...` for both server and ingestor).
---------
Co-authored-by: you <you@example.com>
## The actual root cause
PR #954 added `WHERE inactive IS NULL OR inactive = 0` to the server's
observer queries, but the `inactive` column is only added by the
**ingestor** migration (`cmd/ingestor/db.go:344-354`). When the server
runs against a DB the ingestor never touched (e.g. the e2e fixture), the
column doesn't exist:
```
$ sqlite3 test-fixtures/e2e-fixture.db "SELECT COUNT(*) FROM observers WHERE inactive IS NULL OR inactive = 0;"
Error: no such column: inactive
```
The server's `db.QueryRow().Scan()` swallows that error →
`totalObservers` stays 0 → `/api/observers` returns empty → map test
fails with "No map markers/overlays found".
This explains all the failing CI runs since #954 merged. PR #957
(freshen fixture) helped with the `nodes` time-rot but couldn't fix the
missing-column problem. PR #960 (freshen observers) added the right
timestamps but the column was still missing. PR #959 (data-loaded in
finally) fixed a different real bug. None of those touched the actual
mechanism.
## Fix
Mirror the existing `ensureResolvedPathColumn` pattern: add
`ensureObserverInactiveColumn` that runs at server startup, checks if
the column exists via `PRAGMA table_info`, adds it with `ALTER TABLE
observers ADD COLUMN inactive INTEGER DEFAULT 0` if missing.
Wired into `cmd/server/main.go` immediately after
`ensureResolvedPathColumn`.
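The migration helper is roughly this shape (a sketch following the described pattern; logging and exact error handling in the real code may differ):
```go
package server // illustrative package name

import "database/sql"

// ensureObserverInactiveColumn mirrors the ensureResolvedPathColumn pattern:
// check PRAGMA table_info(observers), add the column if it is missing.
func ensureObserverInactiveColumn(db *sql.DB) error {
	rows, err := db.Query(`PRAGMA table_info(observers)`)
	if err != nil {
		return err
	}
	defer rows.Close()
	for rows.Next() {
		// table_info columns: cid, name, type, notnull, dflt_value, pk
		var cid, notNull, pk int
		var name, colType string
		var dflt sql.NullString
		if err := rows.Scan(&cid, &name, &colType, &notNull, &dflt, &pk); err != nil {
			return err
		}
		if name == "inactive" {
			return nil // column already present
		}
	}
	_, err = db.Exec(`ALTER TABLE observers ADD COLUMN inactive INTEGER DEFAULT 0`)
	return err
}
```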
## Verification
End-to-end on a freshened fixture:
```
$ sqlite3 /tmp/e2e-verify.db "PRAGMA table_info(observers);" | grep inactive
(no output — column absent)
$ ./cs-fixed -port 13702 -db /tmp/e2e-verify.db -public public &
[store] Added inactive column to observers
$ curl 'http://localhost:13702/api/observers'
returned=31 # was 0 before fix
```
`go test ./...` passes (19.8s).
## Lessons
I should have run `sqlite3 fixture "SELECT ... WHERE inactive ..."`
directly the first time the map test failed after #954 instead of
writing four "fix" PRs that didn't address the actual mechanism.
Apologies for the wild goose chase.
Co-authored-by: Kpa-clawbot <bot@example.invalid>
## Bug
Master CI has been failing on `Map page loads with markers: No map
markers/overlays found` since #954 (observer filter) merged.
## Root cause chain
1. Fixture has 31 observers, all dated `2026-03-26` to `2026-03-29` (33+
days old)
2. PR #957's `tools/freshen-fixture.sh` shifts `nodes`, `transmissions`,
`neighbor_edges` timestamps but NOT `observers.last_seen`
3. Server startup runs `RemoveStaleObservers(14)` per
`cmd/server/main.go:382` — marks all 33-day-old observers `inactive=1`
4. PR #954's `GetObservers` filter then excludes them
5. `/api/observers` returns 0 → map has no observer markers → test
asserts >0 → fails
Server log line confirms: `[db] transmissions=499 observations=500
nodes=200 observers=0`
## Fix
Extend `freshen-fixture.sh` to also shift `observers.last_seen` (same
algorithm — preserve relative ordering, max anchored to now). Also
defensively clear any stale `inactive=1` flags from prior failed runs.
The `inactive` column may not exist on a fresh fixture (the server adds
it via migration); the script silently no-ops if the column is absent.
## Verification
```
$ bash tools/freshen-fixture.sh /tmp/test.db
nodes: min=2026-05-01T11:07:29Z max=2026-05-01T18:49:02Z
observers: count=31 max=2026-05-01T18:49:02Z
```
After: 31 observers, oldest 3 days old, within the 14d retention window.
Server's startup prune won't touch them.
Co-authored-by: Kpa-clawbot <bot@example.invalid>
## Bug
`/api/observers` returned soft-deleted (inactive=1) observers. Operators
saw stale observers in the UI even after the auto-prune marked them
inactive on schedule. Reproduced on staging: 14 observers older than 14
days returned by the API; all of them had `inactive=1` in the DB.
## Root cause
`DB.GetObservers()` (`cmd/server/db.go:974`) ran `SELECT ... FROM
observers ORDER BY last_seen DESC` with no WHERE filter. The
`RemoveStaleObservers` path correctly soft-deletes by setting
`inactive=1`, but the read path didn't honor it.
`statsRow` (`cmd/server/db.go:234`) had the same bug — `totalObservers`
count included soft-deleted rows.
## Fix
Add `WHERE inactive IS NULL OR inactive = 0` to both:
```go
// GetObservers
"SELECT ... FROM observers WHERE inactive IS NULL OR inactive = 0 ORDER BY last_seen DESC"
// statsRow.TotalObservers
"SELECT COUNT(*) FROM observers WHERE inactive IS NULL OR inactive = 0"
```
The `NULL` check preserves backward compatibility with rows from before
the `inactive` migration.
## Tests
Added regression `TestGetObservers_ExcludesInactive`:
- Seed two observers, mark one inactive, assert `GetObservers()` returns
only the other.
- **Anti-tautology gate verified**: reverting the WHERE clause causes
the test to fail with `expected 1 observer, got 2` and `inactive
observer obs2 should be excluded`.
`go test ./...` passes (19.6s).
## Out of scope
- `GetObserverByID` lookup at line 1009 still returns inactive observers
— this is intentional, so an old deep link to `/observers/<id>` shows
"inactive" rather than 404.
- Frontend may also have its own caching layer; this fix is server-side
only.
---------
Co-authored-by: Kpa-clawbot <bot@example.invalid>
Co-authored-by: you <you@example.com>
Co-authored-by: KpaBap <kpabap@gmail.com>
## Bug
PR #958 added `data-loaded="true"` attributes for E2E sync, but placed
the `setAttribute` call inside the `try` block of `loadNodes()` /
`loadPackets()` / `loadNodes()` (map). When the API call failed (e.g.
`/api/observers` returned 500, or any other exception was thrown), the `catch`
swallowed the error and `setAttribute` was never reached. E2E tests then
waited 15s for `[data-loaded="true"]` and timed out.
This blocked PR #954 CI repeatedly with `Map page loads with markers:
page.waitForSelector: Timeout 15000ms exceeded`.
## Fix
Move `setAttribute('data-loaded', 'true')` to a `finally` block in all
three handlers (`map.js`, `nodes.js`, `packets.js`). The attribute now
fires on both success and error paths, so E2E tests proceed (the tests
still assert on the actual rendered state — markers, rows, etc. — so an
empty page still fails the right assertion, just much faster).
Removed the duplicate setAttribute calls inside the try blocks (the
finally is the single source of truth now).
## Verification
- `node test-packets.js` 82/82 ✅
- `node test-hash-color.js` 32/32 ✅
- Code reading: each `finally` runs after either success or catch, sets
the same attribute on the same container element.
## Why CI didn't catch this on #958
The PR #958 tests passed because the staging fixture happened to load
successfully when those tests ran. The flake only manifests when an
upstream fetch fails (e.g. observer API returning unexpected shape,
network blip, server still warming).
Co-authored-by: Kpa-clawbot <bot@example.invalid>
## Summary
Fixes the chained async init race identified in RCA #3 of #955.
`navigate()` (which dispatches page handlers and fetches data) was gated
behind `/api/config/theme` resolving via `.finally()`. Tests use
`waitUntil: 'domcontentloaded'` which returns BEFORE theme fetch
resolves, creating a race condition where 3+ serial network requests
must complete before any DOM rows appear.
## Changes
### Decouple navigate() from theme fetch (public/app.js)
- Move `navigate()` call out of the theme fetch `.finally()` block
- Call it immediately on DOMContentLoaded — theme is purely cosmetic and
applies in parallel
### Add data-loaded sync attributes (public/nodes.js, map.js,
packets.js)
- Set `data-loaded="true"` on the container element after each page's
data fetch resolves and DOM renders
- Nodes: set on `#nodesLeft` after `loadNodes()` renders rows
- Map: set on `#leaflet-map` after `renderMarkers()` completes
- Packets: set on `#pktLeft` after `loadPackets()` renders rows
### Update E2E tests (test-e2e-playwright.js)
- Add `await page.waitForSelector('[data-loaded="true"]', { timeout:
15000 })` before row/marker assertions
- Increase map marker timeout from 3s to 8s as additional safety margin
- Tests now synchronize on data readiness rather than racing DOM
appearance
## Verification
- Spun up local server on port 13586 with e2e-fixture.db
- Confirmed navigate() is called immediately (not gated on theme)
- Confirmed data-loaded attributes are present in served JS
- API returns data correctly (2 nodes from fixture)
Closes #955 (RCA #3)
Co-authored-by: you <you@example.com>
## Problem
The E2E fixture DB (`test-fixtures/e2e-fixture.db`) has static
timestamps from March 29, 2026. The map page applies a default
`lastHeard=30d` filter, so once the fixture ages past 30 days all nodes
are excluded from `/api/nodes?lastHeard=30d` — causing the "Map page
loads with markers" test to fail deterministically.
This started blocking all CI on ~April 28, 2026 (30 days after March
29).
Closes #955 (RCA #1: time-based fixture rot)
## Fix
Added `tools/freshen-fixture.sh` — a small script that shifts all
`last_seen`/`first_seen` timestamps forward so the newest is near
`now()`, preserving relative ordering between nodes. Runs in CI before
the Go server starts. Does **not** modify the checked-in fixture (no
binary blob churn).
## Verification
```
$ cp test-fixtures/e2e-fixture.db /tmp/fix4.db
$ bash tools/freshen-fixture.sh /tmp/fix4.db
Fixture timestamps freshened in /tmp/fix4.db
nodes: min=2026-05-01T07:10:00Z max=2026-05-01T14:51:33Z
$ ./corescope-server -port 13585 -db /tmp/fix4.db -public public &
$ curl -s "http://localhost:13585/api/nodes?limit=200&lastHeard=30d" | jq '{total, count: (.nodes | length)}'
{
"total": 200,
"count": 200
}
```
All 200 nodes returned with the 30-day filter after freshening (vs 0
without the fix).
Co-authored-by: you <you@example.com>