mirror of
https://github.com/Kpa-clawbot/meshcore-analyzer.git
synced 2026-06-06 06:21:19 +00:00
1da2034341
**Red commit:** f6290b63 — CI run will appear at
https://github.com/Kpa-clawbot/CoreScope/actions
Fixes #1283.
## What
Moves all four DB write operations out of `cmd/server/` into
`cmd/ingestor/`, making the server read-only for maintenance and
eliminating the SQLITE_BUSY VACUUM bug at its root: the server can no
longer race the ingestor for the write lock on these operations because
it no longer has a write path for them (the remaining `cachedRW()`
callers are listed under Scope notes).
## The four operations
| # | Was in | Now in |
|---|--------|--------|
| 1 | `cmd/server/vacuum.go` (`checkAutoVacuum`, full VACUUM + `auto_vacuum=INCREMENTAL` migration) | `cmd/ingestor/db.go` `Store.CheckAutoVacuum` (already existed; ingestor runs it at startup **before** the MQTT subscriber starts → no contention) |
| 2 | `cmd/server/db.go` `PruneOldPackets` (`DELETE FROM transmissions`) | `cmd/ingestor/maintenance.go` `Store.PruneOldPackets` (new) + 24h ticker in `cmd/ingestor/main.go` |
| 3 | `cmd/server/db.go` `PruneOldMetrics` (`DELETE FROM observer_metrics`) | `cmd/ingestor/db.go` `Store.PruneOldMetrics` (already existed) |
| 4 | `cmd/server/db.go` `RemoveStaleObservers` (`UPDATE observers SET inactive=1`) | `cmd/ingestor/db.go` `Store.RemoveStaleObservers` (already existed) |
## HTTP surface
- **Removed:** `POST /api/admin/prune` (`handleAdminPrune`, route,
openapi entry). Operators trigger an ad-hoc prune by restarting the
ingestor.
- **Kept:** `GET /api/backup` — uses `VACUUM INTO` which writes to a
separate file, not the live DB; read-only-safe.
## Tests
- `cmd/server/readonly_invariant_test.go` (RED gate) — reflect-asserts
`PruneOldPackets`/`PruneOldMetrics`/`RemoveStaleObservers` are NOT
methods on the server's `*DB`. Fails on master, passes after this PR.
- `cmd/ingestor/issue1283_test.go` — exercises `Store.PruneOldPackets`
and the auto_vacuum=NONE → INCREMENTAL migration through
`Store.CheckAutoVacuum` with `vacuumOnStartup=true`.
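
The reflect-based gate relies on a detail worth spelling out: pointer-receiver methods only show up when the check is made on the pointer type, and `reflect` only exposes exported methods. The standalone sketch below demonstrates both points; `hasMethod`, `readOnlyDB`, and `writableStore` are hypothetical names standing in for the real types.

```go
package main

import "reflect"

// hasMethod reports whether the dynamic type of v has an exported method
// with the given name. reflect does not expose unexported methods, so
// lower-case helpers are invisible to this check.
func hasMethod(v any, name string) bool {
	t := reflect.TypeOf(v)
	if t == nil { // v was an untyped nil
		return false
	}
	_, ok := t.MethodByName(name)
	return ok
}

// readOnlyDB is a hypothetical stand-in for the server's DB type.
type readOnlyDB struct{}

func (*readOnlyDB) CountPackets() int { return 0 }

// writableStore is a hypothetical stand-in for the ingestor's Store type.
type writableStore struct{}

func (*writableStore) PruneOldPackets(days int) (int64, error) { return 0, nil }
```

With these types, `hasMethod((*readOnlyDB)(nil), "PruneOldPackets")` is false and `hasMethod((*writableStore)(nil), "PruneOldPackets")` is true, while `hasMethod(writableStore{}, "PruneOldPackets")` is false because the method has a pointer receiver. That is why the gate inspects a typed nil pointer rather than a value.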
## Why the bug is gone
The SQLITE_BUSY VACUUM failure happened because supervisord launched
both the ingestor and the server in one container; the ingestor took the
write lock for INSERTs and the server's `checkAutoVacuum` then failed to
acquire it within `busy_timeout=5000` (5 seconds). After this PR, VACUUM
runs only in the ingestor, and it runs `CheckAutoVacuum` **before**
spawning the MQTT subscriber, so the VACUUM cannot contend with ingest
INSERTs.
## Scope notes
- `cachedRW()` still has three pre-existing callers in `cmd/server/`
(`neighbor_persist.go`, `ensure_indexes.go`,
`from_pubkey_migration.go`). These pre-date #1283 and are not in the
issue's four-operation list. Leaving them for follow-up keeps this PR
honest about scope; AGENTS.md documents the invariant so new write paths
can't sneak in.
- PII preflight reports false positives on the Go method name
`requireAPIKey` in `routes.go` diff context — no real PII.
- Server-side neighbor-edge prune (`PruneNeighborEdges`) intentionally
left in place — out of scope of #1283.
---------
Co-authored-by: MeshCore Bot <bot@meshcore.local>
78 lines
2.5 KiB
Go
package main

import (
	"database/sql"
	"fmt"
	"reflect"
	"testing"

	_ "modernc.org/sqlite"
)

// TestServerDBHasNoWriteMethods enforces the architectural invariant from
// issue #1283: cmd/server is the read path. All write/maintenance methods
// (PruneOldPackets, PruneOldMetrics, RemoveStaleObservers) MUST live on
// the ingestor's *Store, not on the server's *DB.
//
// Before the fix, these methods existed on cmd/server/*DB and used
// cachedRW(db.path) to acquire a write lock, racing with the ingestor's
// concurrent INSERTs and producing SQLITE_BUSY (the bug in #1283).
// After the fix, this test passes because the methods are gone.
func TestServerDBHasNoWriteMethods(t *testing.T) {
	forbidden := []string{
		"PruneOldPackets",
		"PruneOldMetrics",
		"RemoveStaleObservers",
	}
	typ := reflect.TypeOf((*DB)(nil))
	for _, name := range forbidden {
		if _, ok := typ.MethodByName(name); ok {
			t.Errorf("server *DB exposes forbidden write method %q; must be relocated to ingestor (#1283)", name)
		}
	}
}

// TestServerDBConnIsReadOnly asserts that the *sql.DB the server opens
// cannot acquire a write lock. The server has always opened mode=ro, but
// before #1283 it routed around that by calling cachedRW(path) to get a
// second RW handle. After the fix, server-side writes are impossible
// because there is no helper to open a writable connection.
func TestServerDBConnIsReadOnly(t *testing.T) {
	dir := t.TempDir()
	path := dir + "/ro_invariant.db"

	// Bootstrap a minimal DB with the ingestor-style WAL opener so the
	// server can attach in read-only mode.
	if err := bootstrapMinimalDB(path); err != nil {
		t.Fatalf("bootstrap: %v", err)
	}

	d, err := OpenDB(path)
	if err != nil {
		t.Fatalf("OpenDB: %v", err)
	}
	defer d.conn.Close()

	_, err = d.conn.Exec(`INSERT INTO nodes (public_key, name) VALUES ('x','y')`)
	if err == nil {
		t.Fatalf("expected INSERT via server *DB to fail (read-only invariant)")
	}
}

// bootstrapMinimalDB creates a tiny DB with the columns these tests
// need, opened with WAL so the read-only opener in OpenDB can attach.
// Kept in *_test.go so it does NOT add any write capability to the
// production server binary.
func bootstrapMinimalDB(path string) error {
	dsn := fmt.Sprintf("file:%s?_journal_mode=WAL&_busy_timeout=5000", path)
	rw, err := sql.Open("sqlite", dsn)
	if err != nil {
		return err
	}
	defer rw.Close()
	if _, err := rw.Exec(`CREATE TABLE IF NOT EXISTS nodes (public_key TEXT PRIMARY KEY, name TEXT)`); err != nil {
		return err
	}
	return nil
}