Files
meshcore-analyzer/scripts
Kpa-clawbot a2004351d3 fix(#1684): staging disk monitor + cleanup cron (#1686)
## Summary
Adds a staging VM disk-usage monitor + daily cleanup cron, fixing the
gap surfaced by #1684 (staging hit 100% disk during a hot-patch, no
alert, no cleanup).

## What landed
- **`scripts/staging/disk-monitor.sh`** — parses `df -P <mount>`,
classifies usage `<80 ok / >=80 warn / >=90 error / >=95 alert`, emits
to stderr + journald via `logger -p`, exits non-zero on `error|alert` so
the systemd unit surfaces as failed.
- **`scripts/staging/disk-cleanup.sh`** — daily prune of `/tmp` snapshot
patterns (`*.db`, `staging-snap.*`, `cs-*`, `node-compile-cache`) older
than 7d + `docker builder/image prune --filter until=72h --filter
label!=keep`. Honors `CORESCOPE_CLEANUP_DRY_RUN=1`.
- **`scripts/staging/test-disk-monitor.sh`** — pure-bash unit tests for
the testable helpers (22 cases covering threshold boundaries, df
parsing, invalid input, severity→priority mapping).
- **`DEPLOY.md`** — install one-liner with full inline systemd unit +
timer content (15-min monitor, daily 03:30 cleanup). Uses
`<STAGING_HOST>` placeholder.
- **`.github/workflows/deploy.yml`** — wires `test-disk-monitor.sh` into
the Go build & test job.

## TDD
- Commit `26185967` (RED): tests against stub helpers — `PASS=5 FAIL=17`
on assertions.
- Commit `d31a1082` (GREEN): real helpers — `PASS=22 FAIL=0`.

## Phase 3 — `staging-snap.db` root cause
`grep -rn staging-snap.db cmd/ public/ scripts/` → **zero hits**. The
4.4 GB orphan was a manual debug artifact, not committed code. The
cleanup retention rule prevents recurrence.

Partial fix for #1684 — leaves issue open for operator to verify install
on staging and confirm alert fires at 85%.

---------

Co-authored-by: corescope-bot <bot@corescope.local>
Co-authored-by: clawbot <bot@openclaw.dev>
2026-06-12 11:38:39 -07:00
..