Commit Graph

16 Commits

Author SHA1 Message Date
Kpa-clawbot 094a96bd6c perf(#1258): /#/perf — parallel health fetch, sort endpoints, pause refresh while hidden (#1261)
Fixes #1258 — Perf dashboard (/#/perf) was slow because of three
frontend issues; backend APIs were never the problem.

## Findings

1. **`/api/health` fetched sequentially after `Promise.all`** in
`refresh()` — added a full RTT (~50-200ms) on every 5s tick on top of
the parallel batch.
2. **Endpoints table not actually sorted** despite the heading "sorted
by total time". JSON shape is `map[string]EndpointStatsResp` (no defined
order); frontend rendered map iteration order. Visible correctness bug
surfaced during investigation.
3. **`setInterval(refresh, 5000)` kept firing while tab was hidden**,
rebuilding the entire ~10-section `innerHTML` (cards + 3 tables) in the
background. On tab return the user saw a backlog thrash + felt the page
was "slow to render".

## Fix (`public/perf.js`)

- Move `/api/health` into the same `Promise.all` as the other 4
endpoints — saves one RTT per refresh.
- Sort `Object.entries(server.endpoints)` by `count * avgMs` DESC
client-side.
- Add `document.hidden` guard in the interval tick + `visibilitychange`
listener that refreshes once on return; `destroy()` removes the
listener.

## Tests

`test-perf-render-1258.js` (new):
- All 5 initial fetches issued in parallel (including `/api/health`)
- Refresh suppressed while `document.hidden`
- Endpoints table sorted by total time DESC, regardless of input map
order

RED commit first (`6b54f9e8`, 0/3 pass) → GREEN commit (`be81303b`, 3/3
pass). Existing `test-perf-go-runtime.js` (13/13) and
`test-perf-disk-io-1120.js` (15/15) still green.

## Investigation exemption

No Playwright timing test — sandbox can't run a real browser. Static
analysis + render-shape unit tests cover the three identified
bottlenecks. Documented per AGENTS "investigation surfaces" exemption.

## Measurement

Before: refresh = parallel batch (~max(server-side)) + sequential
`/api/health` (~50ms) + full innerHTML rebuild every 5s including hidden
tabs.
After: refresh = single parallel batch, runs only while visible.
Expected improvement on tab-return ≈ -1 RTT per refresh + zero
background work.

---------

Co-authored-by: corescope-bot <bot@corescope.local>
2026-05-18 08:02:27 -07:00
Kpa-clawbot f4cf2acbc0 perf: cancelled writes + ingestor I/O + threshold tests (#1120 follow-up) (#1167)
Red commit: e964ec9c46 (CI run: pending —
workflow only triggers on PR open)

Partial fix for #1120 — finishes the four follow-up items left open
after PR #1123 (cancelled writes, ingestor I/O, threshold-flag tests,
docs).

## What's done

- **`cancelledWriteBytesPerSec`** — server `/proc/self/io` parser
handles `cancelled_write_bytes`; `/api/perf/io` exposes the per-second
rate; Perf page renders it next to Read/Write with ⚠️ when sustained >1
MB/s.
- **Ingestor `/proc/<pid>/io`** — `cmd/ingestor/stats_file.go` samples
its own `/proc/self/io` each tick and includes `procIO` in the snapshot.
The server's `/api/perf/io` reads it and surfaces `.ingestor`. Frontend
renders an `Ingestor process` Disk I/O block alongside the existing
`server process` block (issue mockup: "Both ingestor and server").
- **Threshold + anomaly tests** — `test-perf-disk-io-1120.js` now
asserts ⚠️ fires/suppresses on WAL>100MB, cache_hit<90%, and the
backfill-rate-vs-tx-rate guard with the `tx_inserted >= 100` baseline
floor. Drops the tautological `|| ... === false` short-circuits flagged
in MINOR m4.
- **Docs (m8)** — `config.example.json` adds `_comment_ingestorStats`
(env var, default path, shared-tmp security note);
`cmd/ingestor/README.md` adds `CORESCOPE_INGESTOR_STATS` to the env-var
table plus a `Stats file` section.

## What's NOT done (deferred)

m1 sync.Map → map+RWMutex, m2 perfIOMu rate caching, m3 negative
cacheSize translation, m5 deterministic-write test, m7 ctx-aware
shutdown — pure polish; will file a follow-up issue if the operator
wants them tracked.

## TDD

- Red: `e964ec9` — adds failing tests + stub field/handler shape
(cancelled missing from struct, ingestor stub returns nil, ingestor
procIO absent).
- Green: `1240703` — wires up the parser case, ingestor sampler,
frontend rendering, docs.

E2E assertion added: test-perf-disk-io-1120.js:108

---------

Co-authored-by: clawbot <clawbot@users.noreply.github.com>
Co-authored-by: Kpa-clawbot <bot@kpa-clawbot.local>
Co-authored-by: Kpa-clawbot <bot@kpa-clawbot>
2026-05-08 16:29:23 -07:00
Kpa-clawbot 74dffa2fb7 feat(perf): per-component disk I/O + write source metrics on Perf page (#1120) (#1123)
## Summary

Implements per-component disk I/O + write source metrics on the Perf
page so operators can self-diagnose write-volume anomalies (cf. the
BackfillPathJSON loop debugged in #1119) without SSHing in to run
iotop/fatrace.

Partial fix for #1120

## What's done (4/6 ACs)
-  `/api/perf/io` — server-process `/proc/self/io` delta rates
(read/write bytes per sec, syscalls)
-  `/api/perf/sqlite` — WAL size, page count, page size, cache hit rate
-  `/api/perf/write-sources` — per-component counters from ingestor
(tx/obs/upserts/backfill_*)
-  Frontend Perf page — three new sections with anomaly thresholds +
per-second rate columns

## What's NOT done (deferred to follow-up)
-  `cancelledWriteBytesPerSec` field — issue #1120 lists this under
server-process I/O ("writes the kernel discarded — interesting signal");
not exposed in this PR
-  Ingestor `/proc/<pid>/io` — issue #1120 says "Both ingestor and
server"; only server-process I/O lands here. Adding ingestor I/O
requires either a unix socket back to the server, or surfacing the
ingestor pid through the stats file. Doable without changing the
existing API shape.
-  Adaptive baselining — anomaly thresholds remain static (10×, 100 MB,
90%); steady-state baselining can come once we have enough deployed
Perf-page telemetry

Per AGENTS.md rule 34, this PR uses "Partial fix for #1120" rather than
"Fixes #1120" so the issue stays open until the remaining ACs land.

## Backend

**Server (`cmd/server/perf_io.go`)**
- `GET /api/perf/io` — reads `/proc/self/io` and returns delta-rate
`{readBytesPerSec, writeBytesPerSec, syscallsRead, syscallsWrite}` since
last call (in-memory tracker, no allocation per sample).
- `GET /api/perf/sqlite` — returns `{walSize, walSizeMB, pageCount,
pageSize, cacheSize, cacheHitRate}`. `cacheHitRate` is proxied from the
in-process row cache (closest available signal under the modernc sqlite
driver).
- `GET /api/perf/write-sources` — reads the ingestor's stats JSON file
and returns a flat `{sources: {...}, sampleAt}` payload.

**Ingestor (`cmd/ingestor/`)**
- `DBStats` gains `WALCommits atomic.Int64` (incremented on every
successful `tx.Commit()` and on every auto-commit `InsertTransmission`
write) and `BackfillUpdates sync.Map` keyed by backfill name with
`IncBackfill(name)` / `SnapshotBackfills()` helpers.
- `BackfillPathJSONAsync` now increments `BackfillUpdates["path_json"]`
per row write — the BackfillPathJSON-style infinite loop becomes
immediately visible at `backfill_path_json` in the Write Sources table.
- New `StartStatsFileWriter` publishes a JSON snapshot to
`/tmp/corescope-ingestor-stats.json` (override via
`CORESCOPE_INGESTOR_STATS`) every second using atomic tmp+rename. The
tmp file is opened with `O_CREATE|O_WRONLY|O_TRUNC|O_NOFOLLOW` mode
`0o600` so a pre-planted symlink in a world-writable `/tmp` cannot
redirect the write to an arbitrary file.

## Frontend (`public/perf.js`)

Three new sections on the Perf page, all auto-refreshed via the existing
5s interval:

- **Disk I/O (server process)** — read/write rates (formatted
B/KB/MB-per-sec) + syscall counts. Write rate >10 MB/s flags ⚠️.
- **Write Sources** — sorted table of per-component counters with a
per-second rate column derived from snapshot deltas. Backfill rows show
⚠️ only when `tx_inserted >= 100` (meaningful baseline) AND the
backfill's per-second rate exceeds 10× the live tx rate. Avoids the
startup-spurious-alarm where cumulative-vs-cumulative was a tautology.
- **SQLite (WAL + Cache Hit)** — WAL size (⚠️ when >100 MB), page count,
page size, cache hit rate (⚠️ when <90%).

## Tests

- **Backend** (`cmd/server/perf_io_test.go`) —
`TestPerfIOEndpoint_ReturnsValidJSON`,
`TestPerfSqliteEndpoint_ReturnsValidJSON`,
`TestPerfWriteSourcesEndpoint_ReturnsSources` exercise the three new
endpoints. Skips the `/proc/self/io` non-zero-rate assertion when
`/proc` is unavailable.
- **Frontend** (`test-perf-disk-io-1120.js`) — vm-sandbox runs `perf.js`
with stubbed `fetch`, asserts the three new sections render with their
headings + values.

E2E assertion added: test-perf-disk-io-1120.js:91

## TDD

1. Red commit (`21abd22`) — added the three handlers as no-op stubs
returning empty values; tests fail on assertion mismatches (non-zero
rate, `pageSize > 0`, headings present).
2. Green commit (`d8da54c`) — fills in the real `/proc/self/io` parser,
PRAGMA queries, ingestor stats writer, and Perf page rendering.

---------

Co-authored-by: corescope-bot <bot@corescope.local>
Co-authored-by: Kpa-clawbot <kpa-clawbot@users.noreply.github.com>
2026-05-05 17:56:56 -07:00
Kpa-clawbot 7af91f7ef6 fix: perf page shows tracked memory instead of heap allocation (#718)
## Summary

The perf page "Memory Used" tile displayed `estimatedMB` (Go
`runtime.HeapAlloc`), which includes all Go runtime allocations — not
just packet store data. This made the displayed value misleading: it
showed ~2.4GB heap when only ~833MB was actual tracked packet data.

## Changes

### Frontend (`public/perf.js`)
- Primary tile now shows `trackedMB` as **"Tracked Memory"** — the
self-accounted packet store memory
- Added separate **"Heap (debug)"** tile showing `estimatedMB` for
runtime visibility

### Backend
- **`types.go`**: Added `TrackedMB` field to `HealthPacketStoreStats`
struct
- **`routes.go`**: Populate `TrackedMB` in `/health` endpoint response
from `GetPerfStoreStatsTyped()`
- **`routes_test.go`**: Assert `trackedMB` exists in health endpoint's
`packetStore`
- **`testdata/golden/shapes.json`**: Updated shape fixture with new
field

### What was already correct
- `/api/perf/stats` already exposed both `estimatedMB` and `trackedMB`
- `trackedMemoryMB()` method already existed in store.go
- Eviction logic already used `trackedBytes` (not HeapAlloc)

## Testing
- All Go tests pass (`go test ./... -count=1`)
- No frontend logic changes beyond template string field swap

Fixes #717

Co-authored-by: you <you@example.com>
2026-04-12 12:40:17 -07:00
Kpa-clawbot 10f712f9d7 fix: restructure scroll containers for iOS status bar tap-to-scroll (#330) (#554)
## Summary

Fixes #330 — iOS status bar tap-to-scroll broken because `#app` had
`overflow: hidden`, preventing `<body>` from being the scroll container.

## Approach: Option B from the issue

Instead of a JS polyfill, this restructures scroll containers so
`<body>` is the primary scroll container by default, which iOS Safari
requires for native status-bar tap-to-scroll.

### How it works

**`#app` default (body-scroll mode):** Uses `min-height` instead of
fixed `height`, no `overflow: hidden`. Content pushes beyond the
viewport and body scrolls naturally.

**`#app.app-fixed` (fixed-layout mode):** Restores the original `height:
calc(100dvh - 52px); overflow: hidden` for pages that need constrained
containers. The router in `app.js` toggles this class based on the
current page.

### Fixed-layout pages (`.app-fixed`)
These pages need fixed-height containers and are unchanged in behavior:
- **packets** — virtual scroll requires fixed-height `.panel-left` to
calculate visible rows
- **nodes** — split-panel layout with independently scrollable panels
- **map** — Leaflet requires fixed-dimension container
- **live** — Leaflet map (also has its own `#app:has(.live-page)`
override in live.css)
- **channels** — split-panel chat layout
- **audio-lab** — split-panel layout

### Body-scroll pages (no `.app-fixed`)
These pages now let the body scroll, enabling iOS tap-to-scroll:
- **analytics** — removed `overflow-y: auto; height: 100%`
- **observers** — removed `overflow-y: auto; height: calc(100vh - 56px)`
- **traces** — removed `overflow-y: auto; height: 100%`
- **home** — removed `#app:has(.home-hero)` override (no longer needed)
- **compare** — removed inline `overflow-y:auto; height:calc(100vh -
56px)`
- **perf** — removed inline `height:100%; overflow-y:auto`
- **observer-detail** — removed inline `overflow-y:auto;
height:calc(100vh - 56px)`
- **node-analytics** — removed inline `height:100%; overflow-y:auto`

### Files changed
| File | Change |
|------|--------|
| `public/style.css` | `#app` default → `min-height`; added `.app-fixed`
class |
| `public/app.js` | Router toggles `.app-fixed` based on page |
| `public/home.css` | Removed `#app:has()` workaround |
| `public/compare.js` | Removed inline overflow/height |
| `public/perf.js` | Removed inline overflow/height |
| `public/observer-detail.js` | Removed inline overflow/height |
| `public/node-analytics.js` | Removed inline overflow/height |

### What's preserved
- Sticky nav (`position: sticky; top: 0`) — works with body scroll
- Split-panel resize handles — unchanged, still in fixed containers
- Virtual scroll on packets page — unchanged, `.panel-left` still has
fixed height
- Leaflet maps — unchanged, containers still have fixed dimensions
- Mobile responsive overrides — unchanged

Co-authored-by: you <you@example.com>
2026-04-03 16:54:36 -07:00
efiten 8ede8427c8 fix: round Go Runtime floats to 1dp, prevent nav stats dot wrapping
- perf.js: toFixed(1) on all ms/MB values in Go Runtime section
- style.css: white-space: nowrap on .nav-stats to prevent the · separator
  from wrapping onto its own line

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 07:51:26 -07:00
Kpa-clawbot 71ec5e6fca rename: MeshCore Analyzer → CoreScope (frontend + .squad)
Phase 1 of the CoreScope rename — frontend display strings and
squad agent metadata only.

index.html:
- <title>, og:title, twitter:title → CoreScope
- Brand text span → CoreScope
- og:image/twitter:image URLs → corescope repo (placeholder)
- Cache busters bumped

public/*.js headers (19 files):
- All file header comments updated

public/*.css headers:
- style.css, home.css updated

JavaScript strings:
- app.js: GitHub URL → corescope
- home.js: 3 fallback siteName references
- customize.js: default siteName + heroTitle

Tests:
- test-e2e-playwright.js: title assertion → corescope
- test-frontend-helpers.js: GitHub URL constant
- benchmark.js: header string
- test-all.sh: header string

.squad:
- team.md, casting/history.json
- All 7 agent charters + 5 history files

NOT renamed (intentional):
- localStorage keys (meshcore-*)
- CSS classes (.meshcore-marker)
- Window globals (_meshcore*)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-28 14:03:32 -07:00
Kpa-clawbot 447c5d7073 fix: mobile responsive — #203 live bottom-sheet, #204 perf layout, #205 nodes col-hide
#203: Live page node detail panel becomes a bottom-sheet on mobile
      (width:100%, bottom:0, max-height:60vh, rounded top corners).
#204: Perf page reduces padding to 12px, perf-cards stack in 2-col
      grid, tables get smaller font/padding on mobile.
#205: Nodes table hides Public Key column on mobile via .col-pubkey
      class (same pattern as packets page .col-region/.col-rpt).

Cache busters bumped in index.html.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-28 02:38:23 -07:00
Kpa-clawbot 744702ccf6 feat(perf): show Go Runtime stats instead of Event Loop on Go backend
When engine=go, the perf page now renders Go-specific runtime stats
(goroutines, GC collections, GC pause times, heap breakdown, CPUs)
instead of the misleading Node.js Event Loop metrics. Falls back to
the existing Node UI when engine is not 'go' or goRuntime data is
missing. Includes color-coded GC pause thresholds.

fixes #153

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-27 11:49:10 -07:00
you e340949253 feat: optimize observations table — 478MB → 141MB
Schema v3 migration:
- Replace observer_id TEXT (64-char hex) with observer_idx INTEGER FK
- Drop redundant hash, observer_name, created_at columns
- Store timestamp as epoch integer instead of ISO string
- In-memory dedup Set replaces expensive unique index lookups
- Auto-migration on startup with timestamped backup (never overwrites)
- Detects already-migrated DBs via pragma user_version + column inspection

Fixes:
- disambiguateHops: restore 'known' field dropped during refactor (fba5649)
- Skip MQTT connections when NODE_ENV=test
- e2e test: encodeURIComponent for # channel hashes in URLs
- VACUUM + TRUNCATE checkpoint after migration (not just VACUUM)
- Daily TRUNCATE checkpoint at 2:00 AM UTC to reclaim WAL space

Observability:
- SQLite stats in /api/perf (DB size, WAL size, freelist, row counts, busy pages)
- Rendered in perf dashboard with color-coded thresholds

Tests: 839 pass (89 db + 30 migration + 70 helpers + 200 routes + 34 packet-store + 52 decoder + 255 decoder-spec + 62 filter + 47 e2e)
2026-03-25 22:33:39 +00:00
you 1cb3baf4ab fix: replace all hardcoded colors with CSS variables
- Move --status-green/yellow/red from home.css to style.css :root (light+dark)
- Replace hardcoded status colors in style.css (.tl-snr, .health-dot, .byop-err,
  .badge-hash-*, .fav-star.on, .spark-fill) with CSS variable references
- Replace hardcoded colors in live.css (VCR mode, stat pills, fdc-link, playhead)
- Replace --primary/--bg-secondary/--text-primary/--text-secondary dead vars with
  canonical --accent/--input-bg/--text/--text-muted in style.css, map.js, live.js,
  traces.js, packets.js
- Fix nodes.js legend colors to use ROLE_COLORS globals instead of hardcoded hex
- Replace hardcoded hex in home.js (SNR), perf.js (indicators), map.js (accuracy
  circles) with CSS variable references via getComputedStyle or var()
- Add --detail-bg to customizer (THEME_CSS_MAP, DEFAULTS, ADVANCED_KEYS, labels)
- Move font/mono out of ADVANCED_KEYS into separate Fonts section in customizer
- Remove debug console.log lines from customize.js
- Bump cache busters in index.html
2026-03-23 03:29:38 +00:00
you 46d9b690ee fix: close if(health) block in perf dashboard — was swallowing all content 2026-03-20 05:47:11 +00:00
you 2e51e5f743 feat: system health + SWR stats in perf dashboard
Perf page now shows: heap usage, RSS, event loop p95/max/current,
WS client count, stale-while-revalidate hits, recompute count.
Color-coded: green/yellow/red based on thresholds.
2026-03-20 05:45:51 +00:00
you 706227b106 feat: add Perf dashboard to nav bar, show packet store stats
Perf page now accessible from main nav ( Perf).
Shows in-memory packet store metrics: packets in RAM, memory used/limit,
queries served, live inserts, evictions, index sizes.
2026-03-20 04:25:05 +00:00
you 2b7ed064d1 perf page: show server + client cache stats 2026-03-20 02:10:27 +00:00
you 4fff11976e feat: performance instrumentation — server timing middleware, client API tracking, /api/perf endpoint, #/perf dashboard 2026-03-20 01:34:25 +00:00