meshcore-analyzer/cmd/server/main.go
Kpa-clawbot fb744d895f fix(#1143): structural pubkey attribution via from_pubkey column (#1152)
Fixes #1143.

## Summary

Replaces the structurally unsound `decoded_json LIKE '%pubkey%'` (and
`OR LIKE '%name%'`) attribution path with an exact-match lookup on a
dedicated, indexed `transmissions.from_pubkey` column.

This closes both holes documented in #1143, plus a perf win:
- **Hole 1** — same-name false positives via `OR LIKE '%name%'`
- **Hole 2a** — adversarial spoofing: a malicious node names itself with
another node's pubkey and gets attributed to the victim
- **Hole 2b** — accidental false positive when any free-text field (path
elements, channel names, message bodies) contains a 64-char hex
substring matching a real pubkey
- **Perf** — query now uses an index instead of a full-table scan
against `LIKE '%substring%'`
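
To make the contrast concrete, here is a minimal sketch of the predicate change. The helper name `buildAttributionWhere` is hypothetical; the real call sites are listed under Changes below.

```go
// Hypothetical sketch of the attribution predicate. The old form
// substring-matched free text, so any path element, channel name, or
// message body containing a node's 64-char hex pubkey attributed the
// row to that node; it also forced a full-table scan.
func buildAttributionWhere(pubkey string) (clause string, args []any) {
	// Before: "t.decoded_json LIKE ?" with arg "%"+pubkey+"%"
	// After: exact match on the dedicated, indexed column.
	return "t.from_pubkey = ?", []any{pubkey}
}
```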

## TDD

Two-commit history shows red-then-green:

| Commit | Status | Purpose |
|---|---|---|
| `7f0f08e` | RED — tests assertion-fail on master behaviour | Adversarial fixtures + spec |
| `59327db` | GREEN — schema + ingestor + server + migration | Implementation |

The red commit's test schema includes the new column so the file
compiles, but the production code still uses LIKE — the assertions fail
because the malicious / same-name / free-text rows are returned. The
green commit changes the query and adds the migration/ingest path.

## Changes

### Schema
- new column `transmissions.from_pubkey TEXT`
- new index `idx_transmissions_from_pubkey`
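
A minimal sketch of the idempotent migration, assuming a plain `database/sql` handle (the real `ensureFromPubkeyColumn` takes a DB path, and the ingestor gates this behind the `from_pubkey_v1` migration record):

```go
package main

import "database/sql"

// ensureFromPubkeySchema sketches the migration shape. The ALTER is
// metadata-only in SQLite; the CREATE INDEX scans the table once.
func ensureFromPubkeySchema(db *sql.DB) error {
	// Probe via pragma_table_info first: SQLite has no
	// ADD COLUMN IF NOT EXISTS and errors on a duplicate column.
	var n int
	err := db.QueryRow(`SELECT COUNT(*) FROM pragma_table_info('transmissions')
		WHERE name = 'from_pubkey'`).Scan(&n)
	if err != nil {
		return err
	}
	if n == 0 {
		if _, err := db.Exec(`ALTER TABLE transmissions ADD COLUMN from_pubkey TEXT`); err != nil {
			return err
		}
	}
	_, err = db.Exec(`CREATE INDEX IF NOT EXISTS idx_transmissions_from_pubkey
		ON transmissions(from_pubkey)`)
	return err
}
```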

### Ingestor (`cmd/ingestor/`)
- `PacketData.FromPubkey` is populated from the decoded ADVERT `pubKey`
at write time. Cheap — the ingestor is already parsing `decoded_json`.
Non-ADVERTs stay NULL (sketched after this list).
- `stmtInsertTransmission` writes the column.
- Migration `from_pubkey_v1` ALTERs legacy DBs to add the column +
index.
- Bonus: rewrote the recipe in the gated one-shot
`advert_count_unique_v1` migration to use `from_pubkey` (already marked
done on existing DBs; kept correct for fresh installs).
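
A sketch of the write-time population referenced above. The JSON shape (top-level `pubKey`) and the nullable-string field type are assumptions; the real logic lives in `BuildPacketData`:

```go
package main

import (
	"encoding/json"
	"strings"
)

// PacketData is reduced here to the one field that matters.
type PacketData struct {
	FromPubkey *string // nil maps to SQL NULL for non-ADVERTs
}

const payloadTypeAdvert = 4

// populateFromPubkey is a hypothetical helper showing the extraction.
// It is cheap because the ingestor already parses decoded_json.
func populateFromPubkey(pd *PacketData, payloadType int, decodedJSON []byte) {
	if payloadType != payloadTypeAdvert {
		return // non-ADVERTs stay NULL
	}
	var adv struct {
		PubKey string `json:"pubKey"`
	}
	if err := json.Unmarshal(decodedJSON, &adv); err == nil && adv.PubKey != "" {
		pk := strings.ToLower(adv.PubKey) // normalize for exact-match lookups
		pd.FromPubkey = &pk
	}
}
```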

### Server (`cmd/server/`)
- `ensureFromPubkeyColumn` mirrors the ingestor migration so the server
can boot against a DB the ingestor has never touched (e2e fixture, fresh
installs).
- `backfillFromPubkeyAsync` runs **after** HTTP starts. It scans `WHERE
from_pubkey IS NULL AND payload_type = 4` in 5000-row chunks with a
100ms yield between chunks, so it cannot block boot even on prod-sized
DBs (100K+ transmissions); the loop is sketched at the end of this
section. Queries handle NULL gracefully (return empty for that pubkey,
same as today's unknown-pubkey path).
- All in-scope LIKE call sites switched to exact match:

| Site | Before | After |
|---|---|---|
| `buildPacketWhere` (was db.go:582) | `decoded_json LIKE '%pubkey%'` | `from_pubkey = ?` |
| `buildTransmissionWhere` (was db.go:626) | `t.decoded_json LIKE '%pubkey%'` | `t.from_pubkey = ?` |
| `GetRecentTransmissionsForNode` (was db.go:910) | `LIKE '%pubkey%' OR LIKE '%name%'` | `t.from_pubkey = ?` |
| `QueryMultiNodePackets` (was db.go:1785) | `decoded_json LIKE '%pubkey%' OR ...` | `t.from_pubkey IN (?, ?, ...)` |
| `advert_count_unique_v1` (was ingestor/db.go:257) | `decoded_json LIKE '%' \|\| nodes.public_key \|\| '%'` | `t.from_pubkey = nodes.public_key` |

`GetRecentTransmissionsForNode` signature simplifies: the `name`
parameter is gone (it was only ever used for the legacy `OR LIKE
'%name%'` fallback). Sole caller in `routes.go:1243` updated.
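
The chunking loop referenced above, sketched under the assumption that the backfill opens its own connection from `dbPath`; the driver name and error handling are illustrative:

```go
package main

import (
	"database/sql"
	"encoding/json"
	"log"
	"strings"
	"time"

	_ "github.com/mattn/go-sqlite3" // assumed driver; the repo's choice may differ
)

// backfillFromPubkeyAsync sketch: runs after HTTP is up, so a slow
// legacy DB only delays attribution, never boot.
func backfillFromPubkeyAsync(dbPath string, chunkSize int, yield time.Duration) {
	db, err := sql.Open("sqlite3", dbPath)
	if err != nil {
		log.Printf("[backfill] open failed: %v", err)
		return
	}
	defer db.Close()
	type row struct {
		id          int64
		decodedJSON string
	}
	for {
		var batch []row
		rows, err := db.Query(`SELECT id, decoded_json FROM transmissions
			WHERE from_pubkey IS NULL AND payload_type = 4 LIMIT ?`, chunkSize)
		if err != nil {
			log.Printf("[backfill] query failed: %v", err)
			return
		}
		for rows.Next() {
			var r row
			if rows.Scan(&r.id, &r.decodedJSON) == nil {
				batch = append(batch, r)
			}
		}
		rows.Close()
		if len(batch) == 0 {
			return // no NULL ADVERTs left
		}
		for _, r := range batch {
			var adv struct {
				PubKey string `json:"pubKey"`
			}
			// Assumption: unparseable rows get '' so they are not
			// re-selected forever; the real sentinel may differ.
			pk := ""
			if json.Unmarshal([]byte(r.decodedJSON), &adv) == nil {
				pk = strings.ToLower(adv.PubKey)
			}
			if _, err := db.Exec(`UPDATE transmissions SET from_pubkey = ? WHERE id = ?`, pk, r.id); err != nil {
				log.Printf("[backfill] update failed: %v", err)
			}
		}
		time.Sleep(yield) // the 100ms yield keeps SQLite available to API handlers
	}
}
```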

### Tests
- `cmd/server/from_pubkey_attribution_test.go` — adversarial fixtures +
Hole 1/2a/2b/QueryMultiNodePackets exact-match assertions, EXPLAIN QUERY
PLAN index check, migration backfill correctness.
- `cmd/ingestor/from_pubkey_test.go` — write-time correctness
(BuildPacketData populates FromPubkey for ADVERT only;
InsertTransmission persists it; non-ADVERTs stay NULL).
- Existing test schemas (server v2, server v3, coverage) get the new
column **plus a SQLite trigger** that auto-populates `from_pubkey` from
`decoded_json` on ADVERT inserts (sketched after this list). This means
existing fixtures (which only seed `decoded_json`) keep attributing
correctly without per-test edits.
- `seedTestData`'s ADVERTs explicitly set `from_pubkey`.
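
The fixture trigger mentioned above, sketched as a test-schema constant. The trigger name and the `$.pubKey` JSON path are assumptions; it relies on SQLite's JSON1 `json_extract`:

```go
package main

// Fixtures that only seed decoded_json still get from_pubkey
// populated on ADVERT inserts (payload_type 4).
const fixtureFromPubkeyTrigger = `
CREATE TRIGGER IF NOT EXISTS trg_from_pubkey_advert
AFTER INSERT ON transmissions
WHEN NEW.payload_type = 4 AND NEW.from_pubkey IS NULL
BEGIN
	UPDATE transmissions
	SET from_pubkey = lower(json_extract(NEW.decoded_json, '$.pubKey'))
	WHERE id = NEW.id;
END;`
```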

## Performance — index is used

```
$ EXPLAIN QUERY PLAN SELECT id FROM transmissions WHERE from_pubkey = ?
SEARCH transmissions USING INDEX idx_transmissions_from_pubkey (from_pubkey=?)
```

Asserted in `TestFromPubkeyIndexUsed`.
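
The shape of that assertion, sketched (`openTestDB` is an assumed fixture helper; the real test is `TestFromPubkeyIndexUsed`):

```go
package main

import (
	"strings"
	"testing"
)

func TestFromPubkeyIndexUsed_Sketch(t *testing.T) {
	db := openTestDB(t) // assumed helper: returns *sql.DB with the test schema applied
	// SQLite's EXPLAIN QUERY PLAN rows are (id, parent, notused, detail).
	var id, parent, notused int
	var detail string
	err := db.QueryRow(
		`EXPLAIN QUERY PLAN SELECT id FROM transmissions WHERE from_pubkey = ?`,
		strings.Repeat("ab", 32), // any 64-char hex pubkey
	).Scan(&id, &parent, &notused, &detail)
	if err != nil {
		t.Fatal(err)
	}
	if !strings.Contains(detail, "idx_transmissions_from_pubkey") {
		t.Fatalf("expected index search, got: %s", detail)
	}
}
```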

## Migration approach

- **Sync at boot**: `ALTER TABLE transmissions ADD COLUMN from_pubkey
TEXT` is a metadata-only operation in SQLite — microseconds regardless
of table size. `CREATE INDEX IF NOT EXISTS
idx_transmissions_from_pubkey` is **not** metadata-only: it scans the
table once. Empirically a few hundred ms on a 100K-row table; expect a
few seconds on a 10M-row table (one-time cost, blocking boot during that
window). Subsequent boots no-op via `IF NOT EXISTS`. If this boot delay
becomes an operational concern at prod scale we can defer the `CREATE
INDEX` to a goroutine — for now a few-second one-time delay is
acceptable.
- **Async**: row-level backfill of legacy NULL ADVERTs (chunked 5000 /
100ms yield). On a 100K-ADVERT prod DB, this completes in seconds in the
background; HTTP is fully available throughout.
- **Safety**: queries handle NULL gracefully — a node whose ADVERTs
haven't backfilled yet returns empty, identical to today's behaviour for
unknown pubkeys. No half-state regression.

## Out of scope (intentionally)

The free-text `LIKE` paths the issue explicitly leaves alone (e.g.
user-typed packet search) are untouched. Only the pubkey-attribution
sites get the column treatment.



## Cycle-3 review fixes

| Finding | Status | Commit |
|---|---|---|
| **M1c** — async-contract test was tautological (test's own `go`, not production's) | Fixed | `23ace71` (red) → `a05b50c` (green) |
| **m1c** — package-global atomic resets unsafe under `t.Parallel()` | Fixed (`// DO NOT t.Parallel` comment + `Reset()` helper) | rolled into `23ace71` / `241ec69` |
| **m2c** — `/api/healthz` read 3 atomics non-atomically (torn snapshot) | Fixed (single RWMutex-guarded snapshot + race test) | `241ec69` |
| **n3c.m1** — vestigial OR-scaffolding in `QueryMultiNodePackets` | Fixed (cleanup) | `5a53ceb` |
| **n3c.m2** — verify PR body language about `ALTER` vs `CREATE INDEX` | Verified accurate (already corrected in cycle 2) | (no change) |
| **n3c.m3** — `json.Unmarshal` per row in backfill → could use SQL `json_extract` (sketched below) | **Deferred as known followup** — pure perf optimization (current per-row Unmarshal is correct, just slower); SQL rewrite would unwind the chunked-yield architecture and is non-trivial. Acceptable for a one-time backfill at boot on legacy DBs. | |
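
For reference, the deferred n3c.m3 rewrite would look roughly like the single set-based statement below, assuming SQLite's JSON1 extension and a top-level `pubKey` field. It is fast, but it runs as one statement and therefore bypasses the chunked-yield loop, which is why it was deferred rather than dropped in:

```go
package main

// Sketch of the deferred followup: one UPDATE instead of per-row
// json.Unmarshal. Holds the write lock for the whole table pass.
const backfillViaJSONExtract = `
UPDATE transmissions
SET from_pubkey = lower(json_extract(decoded_json, '$.pubKey'))
WHERE from_pubkey IS NULL AND payload_type = 4;`
```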

### M1c implementation detail

`startFromPubkeyBackfill(dbPath, chunkSize, yieldDuration)` is now the
single production entry point used by `main.go`. It internally does `go
backfillFromPubkeyAsync(...)`. The test calls `startFromPubkeyBackfill`
(no `go` prefix) and asserts the dispatch returns within 50ms — so if
anyone removes the `go` keyword inside the wrapper, the test fails.
**Manually verified**: removing the `go` keyword causes
`TestBackfillFromPubkey_DoesNotBlockBoot` to fail with "backfill
dispatch took ~1s (>50ms): not async — would block boot."
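
A sketch of the wrapper and the contract test. The wrapper signature is from this PR; `testDBPath` is an assumed fixture helper, and `backfillFromPubkeyAsync` is the loop sketched earlier:

```go
package main

import (
	"testing"
	"time"
)

// startFromPubkeyBackfill is the single production entry point; the
// goroutine dispatch lives inside it so a test can call it directly.
func startFromPubkeyBackfill(dbPath string, chunkSize int, yield time.Duration) {
	go backfillFromPubkeyAsync(dbPath, chunkSize, yield)
}

// DO NOT t.Parallel: exercises package-global backfill progress state.
func TestBackfillFromPubkey_DoesNotBlockBoot_Sketch(t *testing.T) {
	start := time.Now()
	// Deliberately no `go` prefix: if anyone removes the `go` inside
	// the wrapper, this call blocks on the full backfill and fails.
	startFromPubkeyBackfill(testDBPath(t), 5000, 100*time.Millisecond)
	if d := time.Since(start); d > 50*time.Millisecond {
		t.Fatalf("backfill dispatch took %v (>50ms): not async — would block boot", d)
	}
}
```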

### m2c implementation detail

`fromPubkeyBackfillTotal/Processed/Done` are now plain `int64`/`bool`
package globals guarded by a single `sync.RWMutex`.
`fromPubkeyBackfillSnapshot()` returns all three under one RLock.
`TestHealthzFromPubkeyBackfillConsistentSnapshot` races a writer
(lock-step total/processed updates with periodic done flips) against 8
readers hammering `/api/healthz`, asserting `processed<=total` and
`(done => processed==total)` on every response. Verified the test
catches torn reads (manually injected a 3-RLock implementation; test
failed within milliseconds with "processed>total" and "done=true but
processed!=total" errors).
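
A sketch of the single-lock snapshot. The variable and snapshot-function names are from this PR; the writer-side helper is hypothetical:

```go
package main

import "sync"

var (
	fromPubkeyBackfillMu        sync.RWMutex
	fromPubkeyBackfillTotal     int64
	fromPubkeyBackfillProcessed int64
	fromPubkeyBackfillDone      bool
)

// fromPubkeyBackfillSnapshot returns all three fields under one RLock,
// so /api/healthz can never observe processed > total, or a done flag
// that disagrees with the counters.
func fromPubkeyBackfillSnapshot() (total, processed int64, done bool) {
	fromPubkeyBackfillMu.RLock()
	defer fromPubkeyBackfillMu.RUnlock()
	return fromPubkeyBackfillTotal, fromPubkeyBackfillProcessed, fromPubkeyBackfillDone
}

// setFromPubkeyBackfillProgress is a hypothetical writer helper: the
// backfill loop updates all three fields in lock-step under the
// write lock.
func setFromPubkeyBackfillProgress(total, processed int64, done bool) {
	fromPubkeyBackfillMu.Lock()
	defer fromPubkeyBackfillMu.Unlock()
	fromPubkeyBackfillTotal, fromPubkeyBackfillProcessed, fromPubkeyBackfillDone = total, processed, done
}
```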

---------

Co-authored-by: openclaw-bot <bot@openclaw.local>
Co-authored-by: openclaw-bot <bot@openclaw.dev>
2026-05-06 23:50:44 -07:00

592 lines · 18 KiB · Go

package main

import (
	"context"
	"database/sql"
	"flag"
	"fmt"
	"log"
	"net/http"
	_ "net/http/pprof"
	"os"
	"os/exec"
	"os/signal"
	"path/filepath"
	"strings"
	"sync"
	"syscall"
	"time"

	"github.com/gorilla/mux"
)

// Set via -ldflags at build time
var Version string
var Commit string
var BuildTime string

func resolveCommit() string {
	if Commit != "" {
		return Commit
	}
	// Try .git-commit file (baked by Docker / CI)
	if data, err := os.ReadFile(".git-commit"); err == nil {
		if c := strings.TrimSpace(string(data)); c != "" && c != "unknown" {
			return c
		}
	}
	// Try git rev-parse at runtime
	if out, err := exec.Command("git", "rev-parse", "--short", "HEAD").Output(); err == nil {
		return strings.TrimSpace(string(out))
	}
	return "unknown"
}

func resolveVersion() string {
	if Version != "" {
		return Version
	}
	return "unknown"
}

func resolveBuildTime() string {
	if BuildTime != "" {
		return BuildTime
	}
	return "unknown"
}

func main() {
	// pprof profiling — off by default, enable with ENABLE_PPROF=true
	if os.Getenv("ENABLE_PPROF") == "true" {
		pprofPort := os.Getenv("PPROF_PORT")
		if pprofPort == "" {
			pprofPort = "6060"
		}
		go func() {
			log.Printf("[pprof] profiling UI at http://localhost:%s/debug/pprof/", pprofPort)
			if err := http.ListenAndServe(":"+pprofPort, nil); err != nil {
				log.Printf("[pprof] failed to start: %v (non-fatal)", err)
			}
		}()
	}

	var (
		configDir string
		port      int
		dbPath    string
		publicDir string
		pollMs    int
	)
	flag.StringVar(&configDir, "config-dir", ".", "Directory containing config.json")
	flag.IntVar(&port, "port", 0, "HTTP port (overrides config)")
	flag.StringVar(&dbPath, "db", "", "SQLite database path (overrides config/env)")
	flag.StringVar(&publicDir, "public", "public", "Directory to serve static files from")
	flag.IntVar(&pollMs, "poll-ms", 1000, "SQLite poll interval for WebSocket broadcast (ms)")
	flag.Parse()

	// Load config
	cfg, err := LoadConfig(configDir)
	if err != nil {
		log.Printf("[config] warning: %v (using defaults)", err)
	}
	// CLI flags override config
	if port > 0 {
		cfg.Port = port
	}
	if cfg.Port == 0 {
		cfg.Port = 3000
	}
	if dbPath != "" {
		cfg.DBPath = dbPath
	}
	if cfg.APIKey == "" {
		log.Printf("[security] WARNING: no apiKey configured — write endpoints are BLOCKED (set apiKey in config.json to enable them)")
	} else if IsWeakAPIKey(cfg.APIKey) {
		log.Printf("[security] WARNING: API key is weak or a known default — write endpoints are vulnerable")
	}

	// Apply Go runtime soft memory limit (#836).
	// Honors GOMEMLIMIT if set; otherwise derives from packetStore.maxMemoryMB.
	{
		_, envSet := os.LookupEnv("GOMEMLIMIT")
		maxMB := 0
		if cfg.PacketStore != nil {
			maxMB = cfg.PacketStore.MaxMemoryMB
		}
		limit, source := applyMemoryLimit(maxMB, envSet)
		switch source {
		case "env":
			log.Printf("[memlimit] using GOMEMLIMIT from environment (%s)", os.Getenv("GOMEMLIMIT"))
		case "derived":
			log.Printf("[memlimit] derived from packetStore.maxMemoryMB=%d → %d MiB (1.5x headroom)", maxMB, limit/(1024*1024))
		default:
			log.Printf("[memlimit] no soft memory limit set (GOMEMLIMIT unset, packetStore.maxMemoryMB=0); recommend setting one to avoid container OOM-kill")
		}
	}

	// Resolve DB path
	resolvedDB := cfg.ResolveDBPath(configDir)
	log.Printf("[config] port=%d db=%s public=%s", cfg.Port, resolvedDB, publicDir)
	if len(cfg.NodeBlacklist) > 0 {
		log.Printf("[config] nodeBlacklist: %d node(s) will be hidden from API", len(cfg.NodeBlacklist))
		for _, pk := range cfg.NodeBlacklist {
			if trimmed := strings.ToLower(strings.TrimSpace(pk)); trimmed != "" {
				log.Printf("[config] blacklisted: %s", trimmed)
			}
		}
	}

	// Open database
	database, err := OpenDB(resolvedDB)
	if err != nil {
		log.Fatalf("[db] failed to open %s: %v", resolvedDB, err)
	}
	var dbCloseOnce sync.Once
	dbClose := func() error {
		var err error
		dbCloseOnce.Do(func() { err = database.Close() })
		return err
	}
	defer dbClose()

	// Verify DB has expected tables
	var tableName string
	err = database.conn.QueryRow("SELECT name FROM sqlite_master WHERE type='table' AND name='transmissions'").Scan(&tableName)
	if err == sql.ErrNoRows {
		log.Fatalf("[db] table 'transmissions' not found — is this a CoreScope database?")
	} else if err != nil {
		log.Fatalf("[db] schema check failed: %v", err)
	}
	stats, err := database.GetStats()
	if err != nil {
		log.Printf("[db] warning: could not read stats: %v", err)
	} else {
		log.Printf("[db] transmissions=%d observations=%d nodes=%d observers=%d",
			stats.TotalTransmissions, stats.TotalObservations, stats.TotalNodes, stats.TotalObservers)
	}

	// Check auto_vacuum mode and optionally migrate (#919)
	checkAutoVacuum(database, cfg, resolvedDB)

	// In-memory packet store
	store := NewPacketStore(database, cfg.PacketStore, cfg.CacheTTL)
	if err := store.Load(); err != nil {
		log.Fatalf("[store] failed to load: %v", err)
	}

	// Initialize persisted neighbor graph
	dbPath = database.path
	if err := ensureNeighborEdgesTable(dbPath); err != nil {
		log.Printf("[neighbor] warning: could not create neighbor_edges table: %v", err)
	}

	// Add resolved_path column if missing.
	// NOTE on startup ordering (review item #10): ensureResolvedPathColumn runs AFTER
	// OpenDB/detectSchema, so db.hasResolvedPath will be false on first run with a
	// pre-existing DB. This means Load() won't SELECT resolved_path from SQLite.
	// Async backfill runs after HTTP starts (see backfillResolvedPathsAsync below)
	// AND to SQLite. On next restart, detectSchema finds the column and Load() reads it.
	if err := ensureResolvedPathColumn(dbPath); err != nil {
		log.Printf("[store] warning: could not add resolved_path column: %v", err)
	} else {
		database.hasResolvedPath = true // detectSchema ran before column was added; fix the flag
	}

	// Ensure observers.inactive column exists (PR #954 filters on it; ingestor migration
	// adds it but server may run against DBs ingestor never touched, e.g. e2e fixture).
	if err := ensureObserverInactiveColumn(dbPath); err != nil {
		log.Printf("[store] warning: could not add observers.inactive column: %v", err)
	}
	// Ensure observers.last_packet_at column exists (PR #905 reads it; ingestor migration
	// adds it but server may run against DBs ingestor never touched, e.g. e2e fixture).
	if err := ensureLastPacketAtColumn(dbPath); err != nil {
		log.Printf("[store] warning: could not add observers.last_packet_at column: %v", err)
	}
	// Ensure nodes.foreign_advert column exists (#730 reads it on every /api/nodes
	// scan; ingestor migration foreign_advert_v1 adds it but server may run against
	// DBs ingestor never touched, e.g. e2e fixture).
	if err := ensureForeignAdvertColumn(dbPath); err != nil {
		log.Printf("[store] warning: could not add nodes.foreign_advert column: %v", err)
	}
	// Ensure transmissions.from_pubkey column + index exists (#1143). Backfill
	// for legacy NULL rows runs async after HTTP starts so it can't block boot
	// even on prod-sized DBs (100K+ transmissions).
	if err := ensureFromPubkeyColumn(dbPath); err != nil {
		log.Printf("[store] warning: could not add transmissions.from_pubkey column: %v", err)
	}

	// Soft-delete observers that are in the blacklist (mark inactive=1) so
	// historical data from a prior unblocked window is hidden too.
	if len(cfg.ObserverBlacklist) > 0 {
		softDeleteBlacklistedObservers(dbPath, cfg.ObserverBlacklist)
	}

	// WaitGroup for background init steps that gate /api/healthz readiness.
	var initWg sync.WaitGroup

	// Load or build neighbor graph
	if neighborEdgesTableExists(database.conn) {
		store.graph = loadNeighborEdgesFromDB(database.conn)
		log.Printf("[neighbor] loaded persisted neighbor graph")
	} else {
		log.Printf("[neighbor] no persisted edges found, will build in background...")
		store.graph = NewNeighborGraph() // empty graph — gets populated by background goroutine
		initWg.Add(1)
		go func() {
			defer initWg.Done()
			defer func() {
				if r := recover(); r != nil {
					log.Printf("[neighbor] graph build panic recovered: %v", r)
				}
			}()
			rw, rwErr := cachedRW(dbPath)
			if rwErr == nil {
				edgeCount := buildAndPersistEdges(store, rw)
				log.Printf("[neighbor] persisted %d edges", edgeCount)
			}
			built := BuildFromStore(store)
			store.mu.Lock()
			store.graph = built
			store.mu.Unlock()
			log.Printf("[neighbor] graph build complete")
		}()
	}

	// Initial pickBestObservation runs in background — doesn't need to block HTTP.
	// API serves best-effort data until this completes (~10s for 100K txs).
	// Processes in chunks of 5000, releasing the lock between chunks so API
	// handlers remain responsive.
	initWg.Add(1)
	go func() {
		defer initWg.Done()
		defer func() {
			if r := recover(); r != nil {
				log.Printf("[store] pickBestObservation panic recovered: %v", r)
			}
		}()
		const chunkSize = 5000
		store.mu.RLock()
		totalPackets := len(store.packets)
		store.mu.RUnlock()
		for i := 0; i < totalPackets; i += chunkSize {
			end := i + chunkSize
			if end > totalPackets {
				end = totalPackets
			}
			store.mu.Lock()
			for j := i; j < end && j < len(store.packets); j++ {
				pickBestObservation(store.packets[j])
			}
			store.mu.Unlock()
			if end < totalPackets {
				time.Sleep(10 * time.Millisecond) // yield to API handlers
			}
		}
		log.Printf("[store] initial pickBestObservation complete (%d transmissions)", totalPackets)
	}()

	// Mark server ready once all background init completes.
	go func() {
		initWg.Wait()
		readiness.Store(1)
		log.Printf("[server] readiness: ready=true (background init complete)")
	}()

	// WebSocket hub
	hub := NewHub()

	// HTTP server
	srv := NewServer(database, cfg, hub)
	srv.store = store
	router := mux.NewRouter()
	srv.RegisterRoutes(router)

	// WebSocket endpoint
	router.HandleFunc("/ws", hub.ServeWS)

	// Static files + SPA fallback
	absPublic, _ := filepath.Abs(publicDir)
	if _, err := os.Stat(absPublic); err == nil {
		fs := http.FileServer(http.Dir(absPublic))
		router.PathPrefix("/").Handler(wsOrStatic(hub, spaHandler(absPublic, fs)))
		log.Printf("[static] serving %s", absPublic)
	} else {
		log.Printf("[static] directory %s not found — API-only mode", absPublic)
		router.PathPrefix("/").HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.Header().Set("Content-Type", "text/html")
			w.Write([]byte(`<!DOCTYPE html><html><body><h1>CoreScope</h1><p>Frontend not found. API available at /api/</p></body></html>`))
		})
	}

	// Start SQLite poller for WebSocket broadcast
	poller := NewPoller(database, hub, time.Duration(pollMs)*time.Millisecond)
	poller.store = store
	go poller.Start()

	// Start periodic eviction
	stopEviction := store.StartEvictionTicker()
	defer stopEviction()

	// Auto-prune old packets if retention.packetDays is configured
	vacuumPages := cfg.IncrementalVacuumPages()
	var stopPrune func()
	if cfg.Retention != nil && cfg.Retention.PacketDays > 0 {
		days := cfg.Retention.PacketDays
		pruneTicker := time.NewTicker(24 * time.Hour)
		pruneDone := make(chan struct{})
		stopPrune = func() {
			pruneTicker.Stop()
			close(pruneDone)
		}
		go func() {
			defer func() {
				if r := recover(); r != nil {
					log.Printf("[prune] panic recovered: %v", r)
				}
			}()
			time.Sleep(1 * time.Minute)
			if n, err := database.PruneOldPackets(days); err != nil {
				log.Printf("[prune] error: %v", err)
			} else {
				log.Printf("[prune] deleted %d transmissions older than %d days", n, days)
				if n > 0 {
					runIncrementalVacuum(resolvedDB, vacuumPages)
				}
			}
			for {
				select {
				case <-pruneTicker.C:
					if n, err := database.PruneOldPackets(days); err != nil {
						log.Printf("[prune] error: %v", err)
					} else {
						log.Printf("[prune] deleted %d transmissions older than %d days", n, days)
						if n > 0 {
							runIncrementalVacuum(resolvedDB, vacuumPages)
						}
					}
				case <-pruneDone:
					return
				}
			}
		}()
		log.Printf("[prune] auto-prune enabled: packets older than %d days will be removed daily", days)
	}

	// Auto-prune old metrics
	var stopMetricsPrune func()
	{
		metricsDays := cfg.MetricsRetentionDays()
		metricsPruneTicker := time.NewTicker(24 * time.Hour)
		metricsPruneDone := make(chan struct{})
		stopMetricsPrune = func() {
			metricsPruneTicker.Stop()
			close(metricsPruneDone)
		}
		go func() {
			defer func() {
				if r := recover(); r != nil {
					log.Printf("[metrics-prune] panic recovered: %v", r)
				}
			}()
			time.Sleep(2 * time.Minute) // stagger after packet prune
			database.PruneOldMetrics(metricsDays)
			runIncrementalVacuum(resolvedDB, vacuumPages)
			for {
				select {
				case <-metricsPruneTicker.C:
					database.PruneOldMetrics(metricsDays)
					runIncrementalVacuum(resolvedDB, vacuumPages)
				case <-metricsPruneDone:
					return
				}
			}
		}()
		log.Printf("[metrics-prune] auto-prune enabled: metrics older than %d days", metricsDays)
	}

	// Auto-prune stale observers
	var stopObserverPrune func()
	{
		observerDays := cfg.ObserverDaysOrDefault()
		if observerDays <= -1 {
			// -1 means keep forever, skip
		} else {
			observerPruneTicker := time.NewTicker(24 * time.Hour)
			observerPruneDone := make(chan struct{})
			stopObserverPrune = func() {
				observerPruneTicker.Stop()
				close(observerPruneDone)
			}
			go func() {
				defer func() {
					if r := recover(); r != nil {
						log.Printf("[observer-prune] panic recovered: %v", r)
					}
				}()
				time.Sleep(3 * time.Minute) // stagger after metrics prune
				database.RemoveStaleObservers(observerDays)
				runIncrementalVacuum(resolvedDB, vacuumPages)
				for {
					select {
					case <-observerPruneTicker.C:
						database.RemoveStaleObservers(observerDays)
						runIncrementalVacuum(resolvedDB, vacuumPages)
					case <-observerPruneDone:
						return
					}
				}
			}()
			log.Printf("[observer-prune] auto-prune enabled: observers not seen in %d days will be removed", observerDays)
		}
	}

	// Auto-prune old neighbor edges
	var stopEdgePrune func()
	{
		maxAgeDays := cfg.NeighborMaxAgeDays()
		edgePruneTicker := time.NewTicker(24 * time.Hour)
		edgePruneDone := make(chan struct{})
		stopEdgePrune = func() {
			edgePruneTicker.Stop()
			close(edgePruneDone)
		}
		go func() {
			defer func() {
				if r := recover(); r != nil {
					log.Printf("[neighbor-prune] panic recovered: %v", r)
				}
			}()
			time.Sleep(4 * time.Minute) // stagger after metrics prune
			store.mu.RLock()
			g := store.graph
			store.mu.RUnlock()
			PruneNeighborEdges(dbPath, g, maxAgeDays)
			runIncrementalVacuum(resolvedDB, vacuumPages)
			for {
				select {
				case <-edgePruneTicker.C:
					store.mu.RLock()
					g := store.graph
					store.mu.RUnlock()
					PruneNeighborEdges(dbPath, g, maxAgeDays)
					runIncrementalVacuum(resolvedDB, vacuumPages)
				case <-edgePruneDone:
					return
				}
			}
		}()
		log.Printf("[neighbor-prune] auto-prune enabled: edges older than %d days", maxAgeDays)
	}

	// Graceful shutdown
	httpServer := &http.Server{
		Addr:         fmt.Sprintf(":%d", cfg.Port),
		Handler:      router,
		ReadTimeout:  30 * time.Second,
		WriteTimeout: 60 * time.Second,
		IdleTimeout:  120 * time.Second,
	}
	go func() {
		sigCh := make(chan os.Signal, 1)
		signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
		sig := <-sigCh
		log.Printf("[server] received %v, shutting down...", sig)
		// 1. Stop accepting new WebSocket/poll data
		poller.Stop()
		// 1b. Stop auto-prune ticker
		if stopPrune != nil {
			stopPrune()
		}
		if stopMetricsPrune != nil {
			stopMetricsPrune()
		}
		if stopObserverPrune != nil {
			stopObserverPrune()
		}
		if stopEdgePrune != nil {
			stopEdgePrune()
		}
		// 2. Gracefully drain HTTP connections (up to 15s)
		ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
		defer cancel()
		if err := httpServer.Shutdown(ctx); err != nil {
			log.Printf("[server] HTTP shutdown error: %v", err)
		}
		// 3. Close WebSocket hub
		hub.Close()
		// 4. Close database (release SQLite WAL lock)
		if err := dbClose(); err != nil {
			log.Printf("[server] DB close error: %v", err)
		}
		log.Println("[server] shutdown complete")
	}()

	log.Printf("[server] CoreScope (Go) listening on http://localhost:%d", cfg.Port)

	// Start async backfill in background — HTTP is now available.
	go backfillResolvedPathsAsync(store, dbPath, 5000, 100*time.Millisecond, cfg.BackfillHours())
	// #1143: backfill from_pubkey for legacy ADVERT rows. Async so even
	// 100K+ rows can't block boot; queries handle NULL gracefully.
	// startFromPubkeyBackfill wraps the goroutine dispatch so the async
	// contract is testable (see TestBackfillFromPubkey_DoesNotBlockBoot).
	startFromPubkeyBackfill(dbPath, 5000, 100*time.Millisecond)
	// Migrate old content hashes in background (one-time, idempotent).
	go migrateContentHashesAsync(store, 5000, 100*time.Millisecond)

	if err := httpServer.ListenAndServe(); err != http.ErrServerClosed {
		log.Fatalf("[server] %v", err)
	}
}

// spaHandler serves static files, falling back to index.html for SPA routes.
// It reads index.html once at creation time and replaces the __BUST__ placeholder
// with a Unix timestamp so browsers fetch fresh JS/CSS after each server restart.
func spaHandler(root string, fs http.Handler) http.Handler {
	// Pre-process index.html: replace __BUST__ with a cache-bust timestamp
	indexPath := filepath.Join(root, "index.html")
	rawHTML, err := os.ReadFile(indexPath)
	if err != nil {
		log.Printf("[static] warning: could not read index.html for cache-bust: %v", err)
		rawHTML = []byte("<!DOCTYPE html><html><body><h1>CoreScope</h1><p>index.html not found</p></body></html>")
	}
	bustValue := fmt.Sprintf("%d", time.Now().Unix())
	indexHTML := []byte(strings.ReplaceAll(string(rawHTML), "__BUST__", bustValue))
	log.Printf("[static] cache-bust value: %s", bustValue)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Serve pre-processed index.html for root and /index.html
		if r.URL.Path == "/" || r.URL.Path == "/index.html" {
			w.Header().Set("Content-Type", "text/html; charset=utf-8")
			w.Header().Set("Cache-Control", "no-cache, no-store, must-revalidate")
			w.Write(indexHTML)
			return
		}
		path := filepath.Join(root, r.URL.Path)
		if _, err := os.Stat(path); os.IsNotExist(err) {
			// SPA fallback — serve pre-processed index.html
			w.Header().Set("Content-Type", "text/html; charset=utf-8")
			w.Header().Set("Cache-Control", "no-cache, no-store, must-revalidate")
			w.Write(indexHTML)
			return
		}
		// Disable caching for JS/CSS/HTML
		if filepath.Ext(path) == ".js" || filepath.Ext(path) == ".css" || filepath.Ext(path) == ".html" {
			w.Header().Set("Cache-Control", "no-cache, no-store, must-revalidate")
		}
		fs.ServeHTTP(w, r)
	})
}