Commit Graph

2323 Commits

Author SHA1 Message Date
Kpa-clawbot f68fb2208e ci: update frontend-tests.json [skip ci] 2026-05-29 08:19:34 +00:00
Kpa-clawbot 5ccb9201e4 ci: update frontend-coverage.json [skip ci] 2026-05-29 08:19:33 +00:00
Kpa-clawbot c80637ddb9 ci: update e2e-tests.json [skip ci] 2026-05-29 08:19:31 +00:00
Kpa-clawbot 43b93c6bb9 feat(observers): surface naive-clock observers as ⚠️ chip + detail banner (#1478) (#1480)
## Summary

Issue #1478 — surface observers whose envelope timestamps are being
clamped because they're emitting zone-less local-time strings (UTC-N
observers showed up perpetually as "Stale" before #1466, and per-packet
rxTime is still clamped to ingest time for them, muddying
propagation-delay analytics).

Now the UI tells operators which observers are misconfigured + how to
fix it.

## What changed

### Ingestor (cmd/ingestor)
- New `observers_clock_naive_v1` migration adds three columns to
`observers`:
  - `clock_skew_seconds INTEGER` (signed: negative = behind UTC, positive = ahead)
  - `clock_skew_count_24h INTEGER` (rolling 24h event count)
  - `clock_last_naive_at TEXT` (RFC3339 timestamp of last clamp)
- `resolveRxTime` now returns `(rxTime, naiveSkewSec)`. The
packet-handler call site invokes `store.RecordNaiveSkew(observerID,
deltaSec)` whenever a naive envelope is clamped (the existing >15 min
naive-tolerance path). The counter resets to 1 if no event in the prior
24h, else increments. Single INSERT-or-UPDATE round trip per clamp.
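
The rolling-counter rule above (reset to 1 if no event in the prior 24h, else increment) can be sketched as follows. This is a minimal Python illustration, not the ingestor's Go code: `record_naive_skew` and the dict-based row are hypothetical stand-ins for `store.RecordNaiveSkew` and the `observers` row.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(hours=24)

def record_naive_skew(row, delta_sec, now):
    """Update a hypothetical in-memory observer row after one clamp event."""
    last = row.get("clock_last_naive_at")
    if last is None or now - datetime.fromisoformat(last) > WINDOW:
        # No clamp event in the prior 24h: restart the rolling counter.
        row["clock_skew_count_24h"] = 1
    else:
        row["clock_skew_count_24h"] = row.get("clock_skew_count_24h", 0) + 1
    row["clock_skew_seconds"] = delta_sec          # signed: negative = behind UTC
    row["clock_last_naive_at"] = now.isoformat()   # RFC3339-style timestamp
    return row
```

In the real store this would presumably be a single upsert statement; the dict version only shows the branching.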

### Server (cmd/server)
- `Observer` struct + `GetObservers` / `GetObserverByID` extended to
scan the three new columns.
- `ObserverResp` gains four JSON fields exposed by `/api/observers` and
`/api/observers/{id}`:
  - `clock_naive` (bool, derived from `clock_last_naive_at` being within 24h)
  - `clock_skew_seconds`, `clock_skew_count_24h`, `clock_last_naive_at`
- Decay is **read-side**: a stale event yields `clock_naive=false` with
zero counts. No background sweep, no writes from the read-only server,
no race with the ingestor.
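
A rough sketch of the read-side decay, assuming the server derives the response fields purely from the stored columns at request time. The function name `naive_clock_fields` is hypothetical, and passing the skew value through unchanged for stale rows is an assumption (the description only promises zero counts).

```python
from datetime import datetime, timedelta, timezone

def naive_clock_fields(last_naive_at, skew_seconds, count_24h, now):
    """Derive the four JSON fields without writing anything back."""
    fresh = (
        last_naive_at is not None
        and now - datetime.fromisoformat(last_naive_at) <= timedelta(hours=24)
    )
    return {
        "clock_naive": fresh,
        "clock_skew_seconds": skew_seconds,              # assumption: passed through
        "clock_skew_count_24h": count_24h if fresh else 0,
        "clock_last_naive_at": last_naive_at,
    }
```

Because the flag is computed on read, no background sweep is needed to clear it once the observer is fixed.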

### Frontend (public)
- `window.ObserversNaiveChip.render(o)` — total render helper, returns
⚠️ chip HTML when `o.clock_naive===true`, `""` otherwise. Used inline in
the observers-list `name` cell and in the row-detail slide-over. Tooltip
explains magnitude + direction + count + fix.
- `window.ObserverDetailNaiveBanner.render(obs)` — yellow alert banner
at the top of the observer-detail page with the skew magnitude,
last-event timestamp, and the actionable fix ("Set host clock to UTC, OR
emit Z-suffixed/offset-aware timestamps from the observer script").

## TDD trail
- `5ddd5b42` red: backend `cmd/server/observer_naive_clock_1478_test.go`
(3 tests asserting JSON fields + 24h decay) + frontend
`test-observer-naive-clock-1478.js` (8 jsdom-style tests asserting
helpers exist and render correctly). Both failed on master with
field-missing / export-missing assertions.
- `4ecc79c8` green backend: schema + Observer / GetObservers /
ObserverResp / handler decay.
- `2137ab81` green frontend: chip + banner helpers and call sites.

## Tests
- `cd cmd/server && go test ./...` → all green (full suite, 46s)
- `cd cmd/ingestor && go test ./...` → all green (full suite, 98s)
- `node test-observer-naive-clock-1478.js` → 8/8 pass
- `node test-frontend-helpers.js` → unchanged from master (pre-existing
failures only)

## Acceptance (issue #1478)
- Observer running with `python datetime.now().isoformat()` (naive,
off by N hours) → `clock_naive=true` after the next clamp → UI shows ⚠️
chip + banner.
- Observer with `datetime.now(timezone.utc).isoformat()` (offset-aware,
`+00:00` suffix) → never clamped → never flagged.
- Observer that fixed its clock → `clock_naive` returns to `false` 24h
after the last clamp event (read-side decay).
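
For operators checking their observer script, the difference between the two cases can be seen directly in Python. The helper `has_utc_offset` is a hypothetical name used only for this illustration. Note that `isoformat()` on a UTC-aware datetime emits `+00:00` rather than a literal `Z`; both are offset-aware.

```python
from datetime import datetime, timezone

naive = datetime.now()               # no tzinfo: serialized without any offset
aware = datetime.now(timezone.utc)   # tzinfo=UTC: serialized with "+00:00"

def has_utc_offset(ts):
    """True if an ISO 8601 string carries 'Z' or a numeric UTC offset."""
    # Replace a trailing 'Z' so older Python versions can parse it too.
    return datetime.fromisoformat(ts.replace("Z", "+00:00")).tzinfo is not None
```

A timestamp for which `has_utc_offset` returns `False` is the zone-less kind that this PR flags.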

Closes #1478.

---------

Co-authored-by: openclaw <bot@openclaw.local>
2026-05-29 01:08:12 -07:00
Kpa-clawbot 1fd95f6771 ci: update go-server-coverage.json [skip ci] 2026-05-29 07:57:46 +00:00
Kpa-clawbot f135e114f5 ci: update go-ingestor-coverage.json [skip ci] 2026-05-29 07:57:45 +00:00
Kpa-clawbot d9593df243 ci: update frontend-tests.json [skip ci] 2026-05-29 07:57:44 +00:00
Kpa-clawbot f1be30dc1f ci: update frontend-coverage.json [skip ci] 2026-05-29 07:57:43 +00:00
Kpa-clawbot 50bc073813 ci: update e2e-tests.json [skip ci] 2026-05-29 07:57:42 +00:00
efiten d4280befd4 fix(packets): use route-aware path byte offset for HB column (#1469)
## Summary

- The **HB** (hash bytes) column in the packet list always read byte 1
of `raw_hex` to compute the hash size
- For TRANSPORT routes (`route_type` 0 or 3), the path_len byte sits at
offset 5 — bytes 1–4 are transport codes
- Reading byte 1 for these packets produced the wrong hash size (e.g.
`0xBB` → bits 7-6 = `10` → **3** instead of the correct **2**)
- Fix: use `getPathLenOffset(route_type)` at all three render sites
(grouped header, grouped children, flat row)
- For grouped children that have no `raw_hex`, fall back to deriving
hash size from the path_json hop string lengths
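
The offset rule can be sketched in a few lines. This is a Python illustration of the frontend logic, not the actual JavaScript: the function names are hypothetical, and the "bits 7-6 plus one" formula is inferred from the `0xBB` example above.

```python
TRANSPORT_ROUTE_TYPES = {0, 3}

def get_path_len_offset(route_type):
    """Byte offset of path_len: 5 for TRANSPORT routes, 1 otherwise."""
    return 5 if route_type in TRANSPORT_ROUTE_TYPES else 1

def hash_size_from_raw_hex(raw_hex, route_type):
    """Hash size in bytes, taken from bits 7-6 of the path_len byte."""
    raw = bytes.fromhex(raw_hex)
    path_len = raw[get_path_len_offset(route_type)]
    return (path_len >> 6) + 1
```

With a packet whose bytes 1-4 are transport codes starting with `0xBB` and whose byte 5 is `0x40`, reading offset 5 gives 2, while reading offset 1 gives the misleading 3.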

## Test plan

- [ ] Open a TRANSPORT FLOOD packet (`route_type=0`) in the packet list
— HB column now shows the correct value (e.g. 2 instead of 3)
- [ ] Verify FLOOD packets (`route_type=1`) still show the correct hash
size (byte 1 unchanged for non-transport routes)
- [ ] Expand a grouped packet row and confirm child rows show correct
hash size from path_json hop lengths

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 00:52:53 -07:00
efiten b71b26a438 fix(live): decouple live animation from VCR.speed — always 1× in LIVE mode (#1427)
## Summary

- `drawAnimatedLine` and `drawMatrixLine` both used `33 / VCR.speed` and
`1100 / VCR.speed` as timing constants
- `VCR.speed` persists in localStorage, so a 4× or 8× replay setting
carried into live mode made packet animations run near-instantaneously
(e.g. 8.25ms steps at 4× instead of 33ms)
- Guard both constants behind `VCR.mode === 'REPLAY'` so live mode
always animates at the baseline rate regardless of saved speed
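
The guard amounts to ignoring the saved speed outside replay. A small Python sketch of the arithmetic follows (the real code is JavaScript; `animation_timing` is a hypothetical name):

```python
BASE_STEP_MS = 33
BASE_TOTAL_MS = 1100

def animation_timing(mode, speed):
    """Return (step_ms, total_ms); the saved speed only applies in REPLAY."""
    factor = speed if mode == "REPLAY" else 1
    return BASE_STEP_MS / factor, BASE_TOTAL_MS / factor
```

A stored 4× speed therefore yields 8.25ms steps in replay but leaves live mode at the 33ms baseline.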

## Test plan

- [ ] Set replay speed to 4×, end replay, reload page → live animation
runs at normal speed (~660ms for a full hop animation)
- [ ] Verify replay still respects slow-mo: 0.25× is visibly slower, 4×
is faster
- [ ] Verify live animations are unaffected by the stored
`live-vcr-speed` localStorage value

Closes #1346

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 00:52:31 -07:00
efiten 8151185ede fix(ci): Dockerfile COPY invariant check — prevent missing internal/<pkg> Docker failures (#1316) (#1432)
## Summary

- Adds `scripts/check-dockerfile-internal-pkgs.sh`: reads `replace =>
../../internal/<pkg>` directives from `cmd/server/go.mod` and
`cmd/ingestor/go.mod`, then verifies each referenced package has the
correct number of `COPY internal/<pkg>/` lines in `Dockerfile` (one per
builder section that needs it)
- Wired into CI as a step in the `go-test` job, before CSS lint — runs
on every PR, adds ~0.1s
- Prevents the recurring failure pattern (#1316): new `internal/<pkg>`
added to go.mod but COPY line forgotten in Dockerfile; non-Docker CI
passes, Docker build fails after merge with a cryptic module error

Key details:
- Counts COPY occurrences per package: if a pkg is referenced in both
go.mods (both binaries need it), it must appear in at least 2 builder
sections
- Anchored regex: only matches actual `replace` directives (not
comments)
- Anchored grep: skips commented-out `COPY internal/...` lines
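
The counting invariant can be illustrated as below. This is a Python sketch of the idea rather than the shell script itself; the regexes and function names are assumptions, and only single-line `replace` directives are handled.

```python
import re
from collections import Counter

# Anchored at line start so commented-out directives are ignored.
REPLACE_RE = re.compile(
    r"^replace\s+\S+\s+=>\s+\.\./\.\./internal/([\w.-]+)", re.MULTILINE
)

def required_pkgs(*go_mods):
    """Count how many go.mod files reference each internal/<pkg>."""
    need = Counter()
    for text in go_mods:
        need.update(set(REPLACE_RE.findall(text)))
    return need

def missing_copies(dockerfile, need):
    """Map pkg -> (found, wanted) for packages with too few COPY lines."""
    problems = {}
    for pkg, want in need.items():
        copy_re = re.compile(
            rf"^\s*COPY\s+internal/{re.escape(pkg)}/", re.MULTILINE
        )
        have = len(copy_re.findall(dockerfile))
        if have < want:
            problems[pkg] = (have, want)
    return problems
```

A non-empty result from `missing_copies` corresponds to the script exiting 1.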

Closes #1316.

## Test plan

- [ ] Run `bash scripts/check-dockerfile-internal-pkgs.sh` locally —
exits 0 on current Dockerfile
- [ ] Manually remove a `COPY internal/perfio/` line from Dockerfile →
script exits 1 with a clear error
- [ ] CI step visible in the `go-test` job on this PR

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 00:52:08 -07:00
Kpa-clawbot e11ce54059 fix(#1480): update E2E #534 to click navbar mirror; simplify CSS (#1484)
Sequence of errors:
- #1475: hid in-page button with visibility:hidden → Playwright
won't click visibility:hidden → broke E2E #534
- #1482: tried opacity:0 instead → Playwright won't click opacity:0
either → still broken
- This PR: UPDATE THE TEST instead of fighting Playwright. The mobile UX
since #1471 is: operator-visible Filters control = navbar mirror
(.filter-toggle-btn-mirror). The test should click THAT, not the
now-hidden in-page button.

Test now tries the mirror first, falls back to in-page button for any
test rig without the mirror script. CSS simplified to display:none.

Unblocks #1480 (#1478 naive-TS observer UI surface) CI. Also any other
PR inheriting this same regression.

Hot-deploy candidate (CSS + test only).

Co-authored-by: openclaw-bot <bot@openclaw.local>
2026-05-29 07:38:20 +00:00
Kpa-clawbot b6e005009c fix(#1475 followup): opacity:0 not visibility:hidden so E2E #534 click works (#1482)
Regression I introduced in #1475. Playwright's elementHandle.click()
refuses to act on elements with visibility:hidden — the in-page Filters
button became unclickable, breaking E2E test #534 'Mobile filter toggle
expands filter bar on packets page'.

Caught by CI on #1480.

Switch to opacity:0 + 0×0 + position:absolute. Element renders zero
pixels for the user but stays 'visible' per Playwright's actionability
check — E2E #534 click works, no duplicate Filters button visible.

Hot-deploy candidate (CSS-only).

Co-authored-by: openclaw-bot <bot@openclaw.local>
2026-05-29 07:15:43 +00:00
hrtndev 462cb2cb5a chore: update MeshCore URLs to use new site (#1445)
# Summary
The main MeshCore website is https://meshcore.io. Reasons for the new
website are listed here: https://blog.meshcore.io/2026/04/23/the-split

# Changes
Any occurrence of `meshcore.co.uk` was replaced with `meshcore.io`. No
logic was changed, only updated strings.

Co-authored-by: hrtndev <hrtndev@users.noreply.github.com>
2026-05-29 00:06:29 -07:00
Kpa-clawbot 0a58aa146a fix(ingestor): silence per-message naive-timestamp log (#1478 followup) (#1479)
Operator on prod reports the per-message naive-timestamp warning drowns
the log when an observer's local clock isn't UTC.

Since observer.last_seen already uses ingest time regardless of envelope
(#1466), and per-packet rxTime is already clamped (#1464), the
per-message console log adds nothing actionable.

This PR silences the log. #1478 tracks the proper followup: surface
broken observers in the UI (chip + banner on observer detail).

Backend-only, hot-deployable via image pull (no API/schema change).

Co-authored-by: openclaw-bot <bot@openclaw.local>
2026-05-29 06:27:58 +00:00
efiten 196f1c6720 fix(ingestor): don't stamp timestamp in procIO snapshot on os.Open failure (#1428)
## Summary

- `readProcSelfIO()` stamped `at=time.Now()` before attempting to open
`/proc/self/io`
- On non-Linux hosts or when the kernel file is unavailable, it returned
a snapshot with `ok=false` but a fresh timestamp
- The rate calculator used `prevIO.at` for delta computation, so the
next successful read produced a phantom rate spike spanning the entire
failure interval
- Fix: move the timestamp stamp to after successful `os.Open`, so failed
opens return a zero-value snapshot with no timestamp — `procIORate`
short-circuits on `prev.ok=false` and returns nil
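
To make the short-circuit concrete, here is a minimal Python model of the snapshot and rate calculation. The field and function names are illustrative only and a single `read_bytes` counter stands in for whatever the real snapshot tracks.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProcIOSnapshot:
    ok: bool = False       # False when /proc/self/io could not be opened
    at: float = 0.0        # seconds; stays zero on a failed read
    read_bytes: int = 0

def proc_io_rate(prev: ProcIOSnapshot, cur: ProcIOSnapshot) -> Optional[float]:
    """Bytes per second between two snapshots, or None if either is unusable."""
    if not prev.ok or not cur.ok:
        return None
    elapsed = cur.at - prev.at
    if elapsed <= 0:
        return None
    return (cur.read_bytes - prev.read_bytes) / elapsed
```

A failed read yields the zero-value `ProcIOSnapshot()`, so the next successful read produces no rate at all instead of a spike.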

## Test plan

- [ ] `go test ./...` in `cmd/ingestor` — both new unit tests pass:
  - `TestProcIORate_ZeroValuePrevSuppressesRate`: asserts nil rate when prev is zero-value
  - `TestProcIORate_NormalPath`: asserts correct rate for valid prev/cur pair
- [ ] On Linux: confirm `procIO` block still appears in the stats file
after 2 ticks

Closes #1169

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 22:50:23 -07:00
Kpa-clawbot 451b5e8848 fix: add default Public channel key to rainbow table (#897)
## Problem
The MeshCore default `Public` channel uses the well-known PSK
`8b3387e9c5cdea6ac9e5edbaa115cd72` (channel-hash byte `0x11`) per the
[companion protocol
spec](https://github.com/ripplebiz/MeshCore/blob/main/docs/companion_protocol.md#default-public-channel).

This key is **missing from `channel-rainbow.json`** in the repo. As a
result, the ingestor sees GRP_TXT messages on the default Public channel
(the most common channel on the mesh), can't find a key for hash `0x11`
(the only entry that hashes to 0x11 in the current rainbow is `#bogota`,
which obviously isn't the right key), and reports `decryption_failed`.
Fresh deploys see almost no decrypted public traffic.

## Fix
Add the well-known Public channel key to the rainbow as `"Public":
"8b3387e9c5cdea6ac9e5edbaa115cd72"`.

## Verification
```
python3 -c "import hashlib; print(hex(hashlib.sha256(bytes.fromhex('8b3387e9c5cdea6ac9e5edbaa115cd72')).digest()[0]))"
# 0x11
```

Matches the channel-hash byte we observe on incoming Public channel
GRP_TXT packets.

## Discovered via
Fresh MikroTik container deploy with no local channel additions — every
Public message showed up as `decryption_failed` while `#LongFast` etc
decrypted fine.

---------

Co-authored-by: you <you@example.com>
2026-05-28 22:50:20 -07:00
Kpa-clawbot 497e419f83 fix(#1471 followup): re-inject Customizer/Search/Favorites mirrors when More sheet opens (#1476)
**Problem:** Operator reports Customizer link missing from the
bottom-nav More sheet on prod (v3.8.2). bottom-nav.js builds the sheet
lazily on first More-click. mobile-page-actions.js calls
addMissingMoreSheetItems() at DOMContentLoaded + retries 10×500ms — so
if operator doesn't tap More within 5s of page load, mirrors never
appear.

**Root cause:** The earlier polish round (commit 70a570c6 within #1471)
dropped the click-listener that re-attempted injection. Init-time retry
alone isn't enough; bottom-nav builds the sheet ON DEMAND.

**Fix:** Re-add the catch-all click delegate that fires
addMissingMoreSheetItems on any More button click (with
belt-and-suspenders 50ms + 250ms timeouts to handle slow builds).

Hot-deploy candidate (JS-only).

Co-authored-by: openclaw-bot <bot@openclaw.local>
2026-05-29 04:08:03 +00:00
Kpa-clawbot f0da38f435 fix(#1471 followup): hide duplicate in-page Filters button on mobile (#1475)
**Problem:** Operator on prod reports two Filters buttons rendering on
mobile — the navbar mirror (#1467/#1471) AND the original
`.filter-toggle-btn` inside `.filter-bar`. Both are clickable, both
toggle filters, confusing UI.

**Root cause:** Commit `f88c413d` from #1471 deliberately kept
`.filter-bar` visible to satisfy E2E #534 (which queries
`.filter-toggle-btn` and clicks it). The in-page button stayed
display:flex while the navbar mirror was added — duplicate.

**Fix:** Switch the in-page button to `visibility: hidden` + 0×0 size +
`position: absolute` on mobile. Element stays in DOM,
`page.$('.filter-toggle-btn').click()` still works (visibility:hidden
elements are clickable in Playwright), but takes zero visual space.
Navbar mirror is the visible affordance.

**Test:** existing E2E #534 should pass unchanged (verifiable by running
test-e2e-playwright.js locally after this lands).

Hot-deployable (CSS only).

Closes the regression introduced in #1471.

Co-authored-by: openclaw-bot <bot@openclaw.local>
2026-05-29 03:57:12 +00:00
Kpa-clawbot c0f3ac4455 ci: update go-server-coverage.json [skip ci] 2026-05-29 02:03:49 +00:00
Kpa-clawbot 070d2b3bb7 ci: update go-ingestor-coverage.json [skip ci] 2026-05-29 02:03:48 +00:00
Kpa-clawbot 6a7e901f3f ci: update frontend-tests.json [skip ci] 2026-05-29 02:03:47 +00:00
Kpa-clawbot 9beb0aa277 ci: update frontend-coverage.json [skip ci] 2026-05-29 02:03:45 +00:00
Kpa-clawbot 004aa98474 ci: update e2e-tests.json [skip ci] 2026-05-29 02:03:44 +00:00
Kpa-clawbot 93b2f4b6bb fix(#1473): treat 0x00 and 0xFF as reserved prefixes (matrix + generator) (#1474)
## Summary

Two CoreScope surfaces treated `0x00` and `0xFF` as ordinary node
prefixes, but the MeshCore firmware actively rerolls any identity whose
public-key first byte is `0x00` or `0xFF` (see
[`examples/simple_repeater/main.cpp:64`](https://github.com/meshcore-dev/MeshCore/blob/6b52fb32301c273fc78d96183501eb23ad33c5bb/examples/simple_repeater/main.cpp#L64)):

```cpp
while (count < 10 && (the_mesh.self_id.pub_key[0] == 0x00
                   || the_mesh.self_id.pub_key[0] == 0xFF)) {
  // reserved id hashes
  the_mesh.self_id = radio_new_identity(); count++;
}
```

As a result the analyzer was steering new operators toward identities
the firmware will silently refuse — `0xFF` is also used as a wildcard
flood marker in parts of the routing flow, so this isn't cosmetic.

Reporter: **@halo779** (community).

## What this PR does

* **`public/prefix-reserved.js`** — small new module, single source of
truth. Exposes `isReservedPrefix`, `filterReserved`, `reservedCount`,
`markReservedCells`. Firmware citation lives in the file header.
* **Hash matrix (1-byte view)** — cells `00` and `FF` get the
`.prefix-reserved` class, lose `.hash-active` so the matrix click
handler skips them, and pick up an `aria-disabled` + a tooltip
explaining why.
* **Prefix generator** — random sampling, enumeration fallback, and the
"available count" all filter out reserved prefixes. A visible note under
the generator card cites `simple_repeater/main.cpp:64` directly.
* **Prefix checker** — pasting a reserved prefix or full pubkey now
surfaces a red `⚠️ Reserved prefix` alert above the per-tier breakdown.
* **`public/style.css`** — `.prefix-reserved` greys + strikes through
the cell and sets `pointer-events: none`.
* **`public/index.html`** — loads `prefix-reserved.js` before
`analytics.js`.
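
The helper semantics can be sketched as follows. This is a Python rendering for illustration (the module itself is JavaScript): the snake_case names mirror the exported helpers, and the `reserved_count` formula is an assumption based on "per byte length".

```python
RESERVED_FIRST_BYTES = {"00", "FF"}

def is_reserved_prefix(prefix):
    """True if the first byte of a hex prefix or full pubkey is 0x00 or 0xFF."""
    p = prefix.strip().upper()
    return len(p) >= 2 and p[:2] in RESERVED_FIRST_BYTES

def filter_reserved(prefixes):
    """Drop every prefix the firmware would reroll."""
    return [p for p in prefixes if not is_reserved_prefix(p)]

def reserved_count(byte_len):
    """Assumed count of reserved prefixes among all byte_len-byte prefixes."""
    return 2 * 256 ** (byte_len - 1)
```

Only the first byte matters, so `A0FF` is fine while `00A1` is reserved.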

## Tests

Red-then-green visible in commit history:
* `test-issue-1473-reserved-prefixes.js` — `isReservedPrefix()`
semantics (case + multi-byte) and `markReservedCells()` behavior on a
mock 256-cell matrix.
* `test-issue-1473-prefix-generator.js` — `filterReserved`,
`reservedCount` per byte length, RNG-bias simulator showing the
generator never returns a reserved prefix, enumeration-first-free skips
`00`, and an assertion that `analytics.js` actually wires
`PrefixReserved` into the generator.

Both added to `test-all.sh`.

Fixes #1473

---------

Co-authored-by: clawbot <bot@openclaw.invalid>
v3.8.2
2026-05-28 18:43:03 -07:00
Kpa-clawbot ff76b1bf71 ci: update go-server-coverage.json [skip ci] 2026-05-29 00:45:28 +00:00
Kpa-clawbot 2de53d19a3 ci: update go-ingestor-coverage.json [skip ci] 2026-05-29 00:45:27 +00:00
Kpa-clawbot 1e88d00ee9 ci: update frontend-tests.json [skip ci] 2026-05-29 00:45:26 +00:00
Kpa-clawbot e26c961138 ci: update frontend-coverage.json [skip ci] 2026-05-29 00:45:25 +00:00
Kpa-clawbot 68d5c3ae82 ci: update e2e-tests.json [skip ci] 2026-05-29 00:45:25 +00:00
efiten cc37f9f689 fix(ci): stop cancelling master runs — only cancel stale PR builds (#1426)
## Summary

- `cancel-in-progress: true` was silently killing staging deploys
whenever a new commit landed on master during an active CI run
- During burst-merge sessions (7 cancelled runs documented in #1395),
staging drifted hours behind master with no failure signal (cancelled =
grey, not red)
- Fix: evaluate to `true` only for `pull_request` events, so PR branches
still drop stale runs but master runs always complete

## Test plan

- [ ] Verify expression evaluates correctly: PRs → `true` (cancel
stale), master push → `false` (never cancel), `workflow_dispatch` →
`false` (let manual runs complete)
- [ ] Manually trigger: merge 3 PRs in quick succession, confirm all 3
staging deploys complete
- [ ] Confirm no master CI run shows `cancelled` status after the fix

Closes #1395

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 17:25:49 -07:00
Kpa-clawbot 0386eba374 ci: update go-server-coverage.json [skip ci] 2026-05-28 23:31:37 +00:00
Kpa-clawbot 884e60d2b5 ci: update go-ingestor-coverage.json [skip ci] 2026-05-28 23:31:36 +00:00
Kpa-clawbot 7e2b5f2878 ci: update frontend-tests.json [skip ci] 2026-05-28 23:31:35 +00:00
Kpa-clawbot 03e1d135d6 ci: update frontend-coverage.json [skip ci] 2026-05-28 23:31:34 +00:00
Kpa-clawbot 784f44d213 ci: update e2e-tests.json [skip ci] 2026-05-28 23:31:33 +00:00
Kpa-clawbot d964c27964 feat(mobile): packets UX overhaul + nav surface + map inset + channel synthesis fixes (#1471)
## Summary

Mobile UX overhaul for the packets surface plus two discoverable defects
found along the way. All UI changes are mobile-only (`@media (max-width:
900px)` or `isMobile()` gates) — desktop unchanged.

## Closes
- #1415 — packets layout cross-viewport jank
- #1458 — Tufte mobile packets critique (P0s)
- #1461 — Tufte v2 mobile packets critique (P0/P1)
- #1467 — Favorites/Search/Customize unreachable on mobile
- #1468 — client-side "unknown" channel synthesis
- #1470 — node-detail map inset doesn't honor customizer dark provider

## Commits

1. `fix(#1468): drop client-side "unknown" channel synthesis` —
`channels.js`
2. `feat(#1470): node-detail map inset honors customizer dark-tile
provider` — `nodes.js`, `roles.js`
3. `feat(mobile): packets UX overhaul + bottom-nav More controls (#1415,
#1458, #1461, #1467)` — `style.css`, `index.html`,
`mobile-page-actions.js` (new)

## Mobile-list view changes
- Kill empty chevron rail
- Slim sticky THEAD (24px, retains sort affordance per operator
preference)
- Hide entire page-header on mobile
- Mirror pause + Filters pill into navbar via new
`mobile-page-actions.js`
- Convert group-header `toggle-select` → `select-hash` on mobile (no
dead-end expand)

## Mobile detail-panel changes
- Drop redundant src→dst line (identity already in sticky header)
- Hide boxed "decoded message" duplication card
- Hide PAYLOAD TYPE row (already in header badge)
- 2-col label/value grid (cuts panel height ~40%)
- Sticky in-sheet header for packet identity
- Kill iOS-style drag handle (conflicts with browser pull-to-refresh)
- Make ✕ close visible + always reachable
- Outer sheet `overflow:hidden`, inner content `overflow-y:auto`
(scrollable region distinct, scrollbar visible)
- Bottom-nav clearance (`padding-bottom: 60px`)
- Close detail sheet on route change away from /packets
- Tap-to-toast popovers for score tooltips (`title=` doesn't fire on
touch)

## Mobile nav surface
- Mirror Favorites / Search 🔍 / Customize 🎨 into bottom-nav More sheet
(#1467)
- Brand stays in top nav; per-page controls (pause, Filters) injected
into `.nav-left`

## Other fixes shipped together
- **#1468**: drop CHAN messages with no decoded channel name (eliminates
fake "unknown" channel row)
- **#1470**: `_applyTilesToNodeMap` helper — node-detail inset map reads
from `MC_TILE_PROVIDERS[active]` instead of hardcoded OSM; honors
customizer's dark-tile provider pick + applies invert filter for
inverted variants
- `getTileUrl()` + new `getActiveTileProvider()` in `roles.js` now
consult `MC_TILE_PROVIDERS`

## CDP verification (local chromium)

Tested on staging at viewport 390×844 + 1206×928.

| Surface | Before | After |
|---|---|---|
| Chrome above first data row | 231px (27% viewport) | ~80px (10% viewport) |
| Packets visible above fold | 10 | 17 |
| Detail panel duplications | 3× identity | 1× (header only) |
| Mobile group-expand UX | dead-end (no chevron) | converts to select-hash |
| Score tooltips on touch | broken (title= silent) | tap → toast popover |
| Node detail map inset (dark mode) | always OSM light tiles | honors customizer provider + invert filter |
| Bottom-nav More controls | Dark mode only | + Favorites, Search, Customize |

## What's NOT in this PR
- Paths-through-node sort fix lives in #1431 (parallel PR for #1145)
- Detail-panel hex byte-grid behind disclosure — operator wants it;
follow-up
- Group-header row sizing (some render 200–700px tall) — existing
behavior, follow-up

## Test plan
- [ ] Existing frontend tests stay green
(`test-issue-1415-packets-layout.js`,
`test-issue-1420-tile-providers.js`,
`test-issue-1454-channels-toggle.js` all pass locally on this branch)
- [ ] Existing Playwright E2E stays green
- [ ] CDP on local chromium: 390×844 mobile + 1024×768 tablet + 1440×900
desktop — no regressions

---------

Co-authored-by: openclaw-bot <bot@openclaw.local>
2026-05-28 16:11:25 -07:00
Kpa-clawbot fe997fefb2 ci: update go-server-coverage.json [skip ci] 2026-05-28 22:26:47 +00:00
Kpa-clawbot df60aa1d9f ci: update go-ingestor-coverage.json [skip ci] 2026-05-28 22:26:46 +00:00
Kpa-clawbot 92afdd6dce ci: update frontend-tests.json [skip ci] 2026-05-28 22:26:45 +00:00
Kpa-clawbot 4364f34b85 ci: update frontend-coverage.json [skip ci] 2026-05-28 22:26:45 +00:00
Kpa-clawbot b5b0cfcb60 ci: update e2e-tests.json [skip ci] 2026-05-28 22:26:44 +00:00
efiten 7c40e24a35 feat(server): warn at startup when GOMEMLIMIT < 50% of container memory limit (#1264) (#1429)
## Summary

- Adds `readCgroupMemoryMB()` to detect container memory ceiling from
cgroup v2 (`/sys/fs/cgroup/memory.max`) and v1
(`/sys/fs/cgroup/memory.limit_in_bytes`)
- Adds `warnIfMemlimitUnderprovisioned()` called once from `main()`
after the existing memlimit block — logs a `[memlimit] WARN` at startup
if the effective GOMEMLIMIT is below 50% of the container limit
- Works whether the limit was set via `GOMEMLIMIT` env var or derived
from `packetStore.maxMemoryMB`
- Adds `readCgroupMemoryMBFn` package-level hook for test injection
(same pattern as `readProcSelfIOFn` in the ingestor)

Fixes #1264. In the reported incident, GOMEMLIMIT was 1536 MiB on a 7.7
GB container; GC consumed 82% of CPU and all endpoints were 3–100×
slower. This warning fires at startup so operators catch the
misconfiguration before it causes an incident.
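
The check itself is a simple comparison. A Python sketch under stated assumptions: the function names are hypothetical, only the cgroup v2 `max` sentinel is handled, and cgroup v1's very large "unlimited" value is not special-cased.

```python
def parse_cgroup_limit_mb(raw):
    """Parse memory.max / memory.limit_in_bytes contents into MiB, or None."""
    raw = raw.strip()
    if raw == "max" or not raw.isdigit():
        return None  # cgroup v2 reports "max" when no limit is set
    return int(raw) // (1024 * 1024)

def memlimit_underprovisioned(effective_mb, cgroup_mb):
    """True when the effective GOMEMLIMIT is below 50% of the container limit."""
    if cgroup_mb is None or cgroup_mb <= 0:
        return False  # no container limit detected: stay silent
    return effective_mb < cgroup_mb * 0.5
```

For the reported incident, 1536 MiB against a container of roughly 7.7 GB falls well under the threshold, so the warning would fire.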

## Test plan

- [ ] `TestWarnIfMemlimitUnderprovisioned_EmitsWarning` — warning fires
when effective < 50% of cgroup
- [ ] `TestWarnIfMemlimitUnderprovisioned_NoWarnWhenAdequate` — no
warning at boundary (effective = 1024 MiB, cgroup = 1536 MiB)
- [ ] `TestWarnIfMemlimitUnderprovisioned_NoCgroupNoLog` — silent on
non-container hosts
- [ ] `TestWarnIfMemlimitUnderprovisioned_NoneSource` — no warning when
`source="none"` (no limit configured, runtime returns math.MaxInt64)
- [ ] `TestMemlimitUnderprovisioned` — boundary table for the comparison
helper
- [ ] All existing `TestApplyMemoryLimit_*` still pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 15:06:30 -07:00
efiten ad45a774d7 test(paths): regression test for #1144 — hop name mis-resolution on prefix collision (#1433)
## Summary

- Adds `TestHandleNodePaths_HopName_CanonicalPathShowsTarget_1144` as a
regression test for issue #1144
- When two nodes share a short pubkey prefix (e.g. `"37"`), the biased
hop resolver (`resolveWithContext`) could pick a GPS-having sibling over
the actual target node, producing the wrong name in hop display
- The bug was already fixed during the #1352 canonical-path work: the
canonical-path branch (Option A) uses `lookupNode(resolvedPK)` with the
full pubkey from `resolved_path`, bypassing the biased resolver entirely
- This PR documents and locks in the correct behaviour with a targeted
test

## Test setup

- `targetPK` (`37cf...`): no GPS
- `siblingPK` (`37bb...`): has GPS — the biased resolver's tier-3 picks
this without the fix
- One TX with `resolved_path = [targetPK]` → Option A fires →
`lookupNode(targetPK)` → hop shows `"CJS SF Mission"`, not `"Templeton
Hills"`

If Option A were removed (bug re-introduced), `resolveWithContext("37",
...)` on the two candidates would return the GPS-having sibling,
triggering the test failure.

## Test plan

- [x] `go test -run TestHandleNodePaths_HopName -v` passes
- [x] Full `go test ./...` passes
- [x] Code review addressed (collapsed redundant error checks)

Closes #1144

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 15:02:59 -07:00
efiten 981664528e perf(server): serve stale repeater enrich cache instead of inline rebuild (#1272) (#1436)
## Summary

- Removes the TTL-based inline rebuild from `GetRepeaterRelayInfoMap`
and `GetRepeaterUsefulnessScoreMap`
- When the cache is non-nil it is returned immediately, regardless of
age — no more 700ms on-request recompute
- Inline compute is retained only as a nil-cache guard (edge case: tests
without a running recomputer)
- Fixes the stale `// 15s-TTL gate` comment in
`recomputeRepeaterEnrichmentSafe`

**Root cause:** `computeRepeaterRelayInfoMap` runs inline when the TTL
expires, taking ~700ms on a busy instance.
`StartRepeaterEnrichmentRecomputer` (introduced in #1262) already keeps
the cache warm via synchronous prewarm at startup + 5-min ticks, making
the inline path dead code that fires only when the TTL is shorter than
the recomputer interval (e.g. custom `analytics.defaultIntervalSeconds >
600`).
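
The resulting access pattern is "serve whatever is cached, rebuild only in the background". A minimal Python sketch, with `EnrichCache` and its methods as hypothetical names rather than the server's Go API:

```python
import threading

class EnrichCache:
    def __init__(self, compute):
        self._compute = compute      # expensive rebuild function
        self._lock = threading.Lock()
        self._value = None

    def get(self):
        """Return the cached value regardless of age; build only if empty."""
        with self._lock:
            if self._value is not None:
                return self._value   # possibly stale, never rebuilt inline
        value = self._compute()      # nil-cache guard only
        with self._lock:
            if self._value is None:
                self._value = value
            return self._value

    def recompute(self):
        """Called by the periodic recomputer, never by request handlers."""
        value = self._compute()
        with self._lock:
            self._value = value
```

Request latency then no longer depends on cache age; freshness is bounded by the recomputer interval instead.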

## Test plan

- [ ] `TestGetRepeaterRelayInfoMap_ServesStaleOnTTLExpiry` — regression
guard: stale sentinel is returned without recompute
- [ ] `TestGetRepeaterUsefulnessScoreMap_ServesStaleOnTTLExpiry` — same
for usefulness score map
- [ ] `TestGetRepeaterRelayInfoMap_BuildsWhenNil` — nil-cache fallback
still works
- [ ] Full `-short` suite passes (`go test -short ./...`)

Closes #1272

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 15:01:58 -07:00
efiten 52f131e2dc fix(ingestor): add hourly WAL checkpoint to prevent unbounded WAL growth (#1435)
Fixes #1434.

## Problem

The ingestor's `Checkpoint()` (`PRAGMA wal_checkpoint(TRUNCATE)`) was
only called on shutdown. SQLite's built-in auto-checkpoint runs in
PASSIVE mode which cannot truncate the WAL while the server holds an
active read connection. Result: the WAL grows at ~40–50 MB/hour and is
never reset during a running instance.

Observed on analyzer.on8ar.eu: **183.4 MB WAL** after ~4h uptime.

## Changes

**`cmd/ingestor/main.go`**
- Add a periodic goroutine that calls `Checkpoint()` every hour,
staggered 30s after startup
- Hoist `walCheckpointTicker` to function scope so it is stopped cleanly
at shutdown alongside all other tickers

**`cmd/ingestor/db.go`**
- Switch `Checkpoint()` from `Exec` to `QueryRow(...).Scan` to capture
SQLite's 3-column result (`busy`, `log`, `checkpointed`)
- Return the checkpointed frame count (callers that discard it are
unaffected)
- Log only when `walFrames > 0` — silent when WAL is already empty,
avoiding log spam
- Log `blocked=true/false` instead of raw `busy` integer to make it
clear when the server's read lock is preventing full truncation

## Behaviour after fix

Each hourly tick flushes all WAL frames not held by an active server
reader. Worst-case WAL size is now bounded to roughly one hour of write
traffic (~45 MB) instead of unbounded growth. If the server holds a read
lock at checkpoint time, the log shows `blocked=true` and remaining
frames are retried on the next tick.
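
The three-column result can be observed with any SQLite binding. The sketch below uses Python's `sqlite3` module against a throwaway database; it illustrates the pragma only and is not the ingestor's Go code, and `checkpoint` and `demo` are hypothetical names.

```python
import os
import sqlite3
import tempfile

def checkpoint(conn):
    """Run a TRUNCATE checkpoint; return (blocked, log_frames, checkpointed)."""
    rows = conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
    busy, log_frames, checkpointed = rows[0]
    return bool(busy), log_frames, checkpointed

def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # isolation_level=None gives autocommit, so the insert is committed
        # before the checkpoint runs.
        conn = sqlite3.connect(os.path.join(tmp, "demo.db"), isolation_level=None)
        try:
            conn.execute("PRAGMA journal_mode=WAL").fetchall()
            conn.execute("CREATE TABLE t (x INTEGER)")
            conn.execute("INSERT INTO t VALUES (1)")
            return checkpoint(conn)
        finally:
            conn.close()
```

With no competing reader, `blocked` comes back `False`; a long-lived reader on another connection is what would set it.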

## Test plan

- [x] `go build ./...` (ingestor module)
- [x] `go test ./...` passes
- [x] Code review addressed (ticker stop on shutdown, log message
clarity)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 15:01:54 -07:00
Eric Muehlstein 29432d4fe0 feat(ingestor): document and test ws:// / wss:// WebSocket MQTT broker support (#902)
## Summary

CoreScope's ingestor already supports WebSocket MQTT connections today —
`paho.mqtt.golang` v1.5.0 handles `ws://` and `wss://` natively via
gorilla/websocket. However this support was **undocumented, untested,
and had a TLS gap** for `wss://` connections.

This PR closes those gaps without any breaking changes.

## Changes

### `cmd/ingestor/config.go`
- Added godoc comment to `ResolvedSources()` explaining all four
supported schemes and which ones require translation vs. pass-through
- `ws://` and `wss://` explicitly documented as native paho schemes
requiring no mapping

### `cmd/ingestor/main.go`
- Extended TLS config to cover `wss://` in addition to `ssl://`
- Before: `wss://` connections would use paho's default TLS (no explicit
`tls.Config` set), which works for valid certs but doesn't apply the
same predictable setup as `ssl://`
- After: both `ssl://` and `wss://` get `tls.Config{}` (system CA pool),
matching behavior; `rejectUnauthorized: false` still works for
self-signed certs on both schemes
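
A sketch of how scheme resolution and the TLS decision fit together. The translation table (`mqtt://` to `tcp://`, `mqtts://` to `ssl://`) is an assumption; the description above only confirms that `ws://` and `wss://` pass through unchanged and that `ssl://` and `wss://` get a TLS config. `resolve_broker` is a hypothetical name.

```python
from urllib.parse import urlsplit

# Assumed translation table; ws:// and wss:// are passed through as-is.
SCHEME_MAP = {"mqtt": "tcp", "mqtts": "ssl"}
TLS_SCHEMES = {"ssl", "wss"}

def resolve_broker(url):
    """Return (resolved_url, needs_tls) for a configured broker URL."""
    parts = urlsplit(url)
    scheme = SCHEME_MAP.get(parts.scheme, parts.scheme)
    resolved = url.replace(parts.scheme + "://", scheme + "://", 1)
    return resolved, scheme in TLS_SCHEMES
```

Paths such as `/mqtt` are preserved because only the scheme prefix is rewritten.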

### `cmd/ingestor/config_test.go`
Two new tests:
- `TestResolvedSourcesSchemeMapping`: validates all six scheme
variations (`mqtt://`, `mqtts://`, `tcp://`, `ssl://`, `ws://`,
`wss://`) including paths like `wss://host/mqtt`
- `TestLoadConfigWSSource`: full round-trip of a dual-source config (TCP
+ wss:// with username/password), verifies scheme unchanged through
`LoadConfig` and `ResolvedSources`

### `config.example.json`
- Added `wsmqtt` example entry showing `wss://` with username/password
- Updated `_comment_mqttSources` to enumerate all supported schemes:
`mqtt://`, `mqtts://`, `ws://`, `wss://`

## Motivation

We run
[meshcore-mqtt-broker](https://github.com/andrewjfreyer/meshcore-mqtt-broker)
(a WebSocket MQTT bridge with JWT auth) alongside Mosquitto, and
subscribe to both via `mqttSources`. The dual-source config works in
production but nothing in the docs or example config made this
discoverable for other operators.

## Testing

```
cd cmd/ingestor && go test ./...
ok    github.com/corescope/ingestor  1.568s
```

All existing tests pass. Two new tests added.

## No breaking changes

- Existing configs: no change in behavior
- `ws://` / `wss://` configs that were already working: same behavior +
explicit TLS setup for `wss://`
2026-05-28 14:58:52 -07:00
efiten b3e55ae8d5 fix(nodes): sort paths-through-node by recency, count as tiebreaker (#1145) (#1431)
## Summary

- `/api/nodes/{pk}/paths` returned paths in non-deterministic map
iteration order; with many paths the UI showed a random ordering on each
page load
- Now sorted by `LastSeen` descending (newest-first), with `Count` as a
tiebreaker (higher first)
- Nil `LastSeen` sorts last (treated as oldest)
- `LastSeen` is an RFC 3339 string so lexicographic comparison is
correct
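
The ordering rule can be expressed as a single sort key. This is a Python sketch with hypothetical dict fields `last_seen` and `count` standing in for the Go struct's `LastSeen` and `Count`:

```python
def sort_paths(paths):
    """Newest last_seen first, higher count breaks ties, missing last_seen last."""
    return sorted(
        paths,
        key=lambda p: (
            p["last_seen"] is not None,  # rows with a timestamp sort ahead of None
            p["last_seen"] or "",        # lexicographic compare of RFC 3339 strings
            p["count"],                  # tiebreaker: higher count first
        ),
        reverse=True,
    )
```

The lexicographic comparison relies on all timestamps sharing one format and offset, as the description assumes.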

Closes #1145.

## Test plan

- [ ] `TestHandleNodePaths_SortByRecency_1145` — 3 distinct paths (via
relay1, relay2, direct), verifies newest appears first
- [ ] `TestHandleNodePaths_SortCountTiebreaker_1145` — two paths with
identical `LastSeen`, verifies higher-count path wins the tiebreak
- [ ] All existing `TestHandleNodePaths_*` tests still pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 14:55:59 -07:00
Kpa-clawbot 889a785058 ci: update go-server-coverage.json [skip ci] 2026-05-28 19:38:42 +00:00