Files
meshcore-analyzer/cmd/server/readonly_invariant_test.go
T
Kpa-clawbot 1da2034341 refactor(db): move all writes from server to ingestor; server truly read-only (fixes #1283) (#1286)
**Red commit:** f6290b63 — CI run will appear at
https://github.com/Kpa-clawbot/CoreScope/actions

Fixes #1283.

## What

Moves all four DB write operations out of `cmd/server/` into
`cmd/ingestor/`, making the server truly read-only and eliminating the
SQLITE_BUSY VACUUM bug at its root: the server can no longer race the
ingestor for the write lock because the server has no write path.

## The four operations

| # | Was in | Now in |
|---|--------|--------|
| 1 | `cmd/server/vacuum.go` (`checkAutoVacuum`, full VACUUM +
`auto_vacuum=INCREMENTAL` migration) | `cmd/ingestor/db.go`
`Store.CheckAutoVacuum` (already existed; ingestor runs it at startup
**before** the MQTT subscriber starts → no contention) |
| 2 | `cmd/server/db.go` `PruneOldPackets` (`DELETE FROM transmissions`)
| `cmd/ingestor/maintenance.go` `Store.PruneOldPackets` (new) + 24h
ticker in `cmd/ingestor/main.go` |
| 3 | `cmd/server/db.go` `PruneOldMetrics` (`DELETE FROM
observer_metrics`) | `cmd/ingestor/db.go` `Store.PruneOldMetrics`
(already existed) |
| 4 | `cmd/server/db.go` `RemoveStaleObservers` (`UPDATE observers SET
inactive=1`) | `cmd/ingestor/db.go` `Store.RemoveStaleObservers`
(already existed) |

## HTTP surface

- **Removed:** `POST /api/admin/prune` (`handleAdminPrune`, route,
openapi entry). Operators trigger an ad-hoc prune by restarting the
ingestor.
- **Kept:** `GET /api/backup` — uses `VACUUM INTO` which writes to a
separate file, not the live DB; read-only-safe.

## Tests

- `cmd/server/readonly_invariant_test.go` (RED gate) — reflect-asserts
`PruneOldPackets`/`PruneOldMetrics`/`RemoveStaleObservers` are NOT
methods on the server's `*DB`. Fails on master, passes after this PR.
- `cmd/ingestor/issue1283_test.go` — exercises `Store.PruneOldPackets`
and the auto_vacuum=NONE → INCREMENTAL migration through
`Store.CheckAutoVacuum` with `vacuumOnStartup=true`.

## Why the bug is gone

The SQLITE_BUSY VACUUM failure happened because supervisord launched
both ingestor + server in one container; the ingestor took the write
lock for INSERTs and the server's `checkAutoVacuum` then failed to
acquire it within `busy_timeout=5000`. After this PR, only the ingestor
ever opens a writable connection, and it runs `CheckAutoVacuum`
**before** spawning the MQTT subscriber → no contention possible.

## Scope notes

- `cachedRW()` still has three pre-existing callers in `cmd/server/`
(`neighbor_persist.go`, `ensure_indexes.go`,
`from_pubkey_migration.go`). These pre-date #1283 and are not in the
issue's four-operation list. Leaving them for follow-up keeps this PR
honest about scope; AGENTS.md documents the invariant so new write paths
can't sneak in.
- PII preflight reports false positives on the Go method name
`requireAPIKey` in `routes.go` diff context — no real PII.
- Server-side neighbor-edge prune (`PruneNeighborEdges`) intentionally
left in place — out of scope of #1283.

---------

Co-authored-by: MeshCore Bot <bot@meshcore.local>
2026-05-18 23:52:27 -07:00

78 lines
2.5 KiB
Go

package main
import (
"database/sql"
"fmt"
"reflect"
"testing"
_ "modernc.org/sqlite"
)
// TestServerDBHasNoWriteMethods enforces the architectural invariant from
// issue #1283: cmd/server is the read path. All write/maintenance methods
// (PruneOldPackets, PruneOldMetrics, RemoveStaleObservers) MUST live on
// the ingestor's *Store, not on the server's *DB.
//
// Before the fix, these methods existed on cmd/server/*DB and used
// cachedRW(db.path) to acquire a write lock, racing with the ingestor's
// concurrent INSERTs and producing SQLITE_BUSY (the bug in #1283).
// After the fix, this test passes because the methods are gone.
func TestServerDBHasNoWriteMethods(t *testing.T) {
forbidden := []string{
"PruneOldPackets",
"PruneOldMetrics",
"RemoveStaleObservers",
}
typ := reflect.TypeOf((*DB)(nil))
for _, name := range forbidden {
if _, ok := typ.MethodByName(name); ok {
t.Errorf("server *DB exposes forbidden write method %q — must be relocated to ingestor (#1283)", name)
}
}
}
// TestServerDBConnIsReadOnly asserts that the *sql.DB the server opens
// cannot acquire a write lock. The server has always opened mode=ro, but
// before #1283 it routed around that by calling cachedRW(path) to get a
// second RW handle. After the fix, server-side writes are impossible
// because there is no helper to open a writable connection.
func TestServerDBConnIsReadOnly(t *testing.T) {
dir := t.TempDir()
path := dir + "/ro_invariant.db"
// Bootstrap a minimal DB with the ingestor-style WAL opener so the
// server can attach in read-only mode.
if err := bootstrapMinimalDB(path); err != nil {
t.Fatalf("bootstrap: %v", err)
}
d, err := OpenDB(path)
if err != nil {
t.Fatalf("OpenDB: %v", err)
}
defer d.conn.Close()
_, err = d.conn.Exec(`INSERT INTO nodes (public_key, name) VALUES ('x','y')`)
if err == nil {
t.Fatalf("expected INSERT via server *DB to fail (read-only invariant)")
}
}
// bootstrapMinimalDB creates a tiny DB with the columns these tests
// need, opened with WAL so the read-only opener in OpenDB can attach.
// Kept in *_test.go so it does NOT add any write capability to the
// production server binary.
func bootstrapMinimalDB(path string) error {
dsn := fmt.Sprintf("file:%s?_journal_mode=WAL&_busy_timeout=5000", path)
rw, err := sql.Open("sqlite", dsn)
if err != nil {
return err
}
defer rw.Close()
if _, err := rw.Exec(`CREATE TABLE IF NOT EXISTS nodes (public_key TEXT PRIMARY KEY, name TEXT)`); err != nil {
return err
}
return nil
}