refactor(db): move all server writes to ingestor; server truly read-only (#1283)

Eliminates the SQLITE_BUSY VACUUM bug from #1283 by making cmd/server
read-only for every write path in scope of that issue (the remaining
out-of-scope writers are listed in the note at the end). The bug
surfaced when supervisord launched the ingestor and the server in one
container: the ingestor took the write lock for INSERTs, then the
server's VACUUM-on-startup immediately failed with SQLITE_BUSY. The same
race latently affected three other server-side writes.

Four write operations moved out of cmd/server/:

1. VACUUM / auto_vacuum migration (cmd/server/vacuum.go, entire file)
   → cmd/ingestor/db.go Store.CheckAutoVacuum (already existed;
     ingestor runs it BEFORE the MQTT subscriber starts so there is
     no contention with concurrent writes).

2. PruneOldPackets (DELETE FROM transmissions)
   cmd/server/db.go → cmd/ingestor/maintenance.go (new file,
     Store.PruneOldPackets) + main.go scheduler.

3. PruneOldMetrics (DELETE FROM observer_metrics)
   cmd/server/db.go → cmd/ingestor/db.go Store.PruneOldMetrics
     (already existed).

4. RemoveStaleObservers (UPDATE observers SET inactive = 1)
   cmd/server/db.go → cmd/ingestor/db.go Store.RemoveStaleObservers
     (already existed).

Server-side changes:
- vacuum.go deleted; checkAutoVacuum / runIncrementalVacuum gone.
- cmd/server/db.go: PruneOldPackets, PruneOldMetrics, RemoveStaleObservers
  deleted.
- cmd/server/main.go: packet/metrics/observer prune schedulers removed;
  the neighbor-edge prune scheduler (PruneNeighborEdges) is intentionally
  left in place — outside scope of #1283, tracked separately.
- routes.go + openapi.go: /api/admin/prune endpoint removed (prune is
  now scheduled by the ingestor, which also prunes at startup, so
  operators restart the ingestor for an ad-hoc pass).

Ingestor changes:
- New cmd/ingestor/maintenance.go with Store.PruneOldPackets.
- cmd/ingestor/config.go gains RetentionConfig.PacketDays and
  Config.PacketDaysOrZero().
- cmd/ingestor/main.go runs PruneOldPackets at startup (if
  packetDays > 0) and on a 24h ticker.

Docs:
- AGENTS.md: documents the read/write separation invariant.
- config.example.json: notes that retention + vacuumOnStartup are
  consumed by the ingestor.
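
The ordering this relies on (the auto-vacuum check completes before the
MQTT subscriber starts writing) can be sketched as follows. This is an
illustration only: the `maintainer` interface, the CheckAutoVacuum
signature shown here, `startIngestor`, and the `recorder` fake are all
hypothetical and are not taken from the ingestor code.

```go
package main

import (
	"errors"
	"fmt"
)

// maintainer is a hypothetical stand-in for the ingestor Store.
// ASSUMPTION: the real CheckAutoVacuum signature may differ.
type maintainer interface {
	CheckAutoVacuum(vacuumOnStartup bool) error
}

// startIngestor runs the exclusive maintenance step first and starts the
// subscriber only after it has returned, so the two never hold or wait on
// the SQLite write lock at the same time.
func startIngestor(store maintainer, vacuumOnStartup bool, startSubscriber func() error) error {
	if err := store.CheckAutoVacuum(vacuumOnStartup); err != nil {
		return fmt.Errorf("auto-vacuum check: %w", err)
	}
	return startSubscriber()
}

// recorder is a fake store that logs the order of startup steps.
type recorder struct {
	steps []string
	fail  bool
}

func (r *recorder) CheckAutoVacuum(vacuumOnStartup bool) error {
	r.steps = append(r.steps, "vacuum")
	if r.fail {
		return errors.New("vacuum failed")
	}
	return nil
}

func main() {
	r := &recorder{}
	err := startIngestor(r, true, func() error {
		r.steps = append(r.steps, "subscribe")
		return nil
	})
	fmt.Println(r.steps, err) // prints "[vacuum subscribe] <nil>"
}
```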

TDD:
- Red: bb1d749a — invariant tests + Store.PruneOldPackets stub.
- Green: this commit — real implementation + server-side removals.

Note: cachedRW() still has three out-of-scope callers in cmd/server
(neighbor_persist.go, ensure_indexes.go, from_pubkey_migration.go).
These are pre-existing write paths outside the scope of #1283 and are
left untouched. Future work can relocate them under the same invariant.
This commit is contained in:
MeshCore Bot
2026-05-19 06:30:16 +00:00
parent f6290b6373
commit dbadef3e2f
11 changed files with 121 additions and 601 deletions
+2 -2
@@ -9,12 +9,12 @@
 "nodeDays": 7,
 "observerDays": 14,
 "packetDays": 30,
-"_comment": "nodeDays: nodes not seen in N days moved to inactive_nodes (default 7). observerDays: observers not sending data in N days are removed (-1 = keep forever, default 14). packetDays: transmissions older than N days are deleted (0 = disabled)."
+"_comment": "nodeDays: nodes not seen in N days moved to inactive_nodes (default 7). observerDays: observers not sending data in N days are removed (-1 = keep forever, default 14). packetDays: transmissions older than N days are deleted (0 = disabled). NOTE (#1283): all four retention fields are consumed by the INGESTOR process. The server is read-only and never prunes."
 },
 "db": {
 "vacuumOnStartup": false,
 "incrementalVacuumPages": 1024,
-"_comment": "vacuumOnStartup: run one-time full VACUUM to enable incremental auto-vacuum on existing DBs (blocks startup for minutes on large DBs; requires 2x DB file size in free disk space). incrementalVacuumPages: free pages returned to OS after each retention reaper cycle (default 1024). See #919."
+"_comment": "vacuumOnStartup: run one-time full VACUUM to enable incremental auto-vacuum on existing DBs. Executed by the INGESTOR at startup, BEFORE the MQTT subscriber starts (#1283), so there is no contention with concurrent writes. Blocks ingestor startup for minutes on large DBs; requires 2x DB file size in free disk space. incrementalVacuumPages: free pages returned to OS after each retention reaper cycle (default 1024). See #919."
 },
 "_comment_ingestorStats": "Ingestor publishes a 1-Hz stats snapshot consumed by the server's /api/perf/io and /api/perf/write-sources endpoints (#1120). Path is configured via the CORESCOPE_INGESTOR_STATS environment variable on the INGESTOR process. Default: /tmp/corescope-ingestor-stats.json. The writer uses O_NOFOLLOW + 0o600, so a pre-planted symlink in /tmp cannot be used to clobber an arbitrary file. SECURITY: in shared-tmp environments (multi-tenant hosts), point CORESCOPE_INGESTOR_STATS at a private directory like /var/lib/corescope/ingestor-stats.json that only the corescope user can write to.",
 "https": {