## Root cause
`repeaterEnrichTTL` was **15 seconds**, but the background recomputer
(`StartRepeaterEnrichmentRecomputer`) runs every **5 minutes**.
After each recomputer tick, the relay/usefulness caches were valid for
only 15 seconds. For the remaining 4m45s of each 5-minute cycle, every
`/api/nodes` request found the cache expired at the TTL gate in
`GetRepeaterRelayInfoMap` / `GetRepeaterUsefulnessScoreMap` and fell
through to `computeRepeaterRelayInfoMap` **on the request goroutine**.
On production (16k+ transmissions, 240k hop records) that rebuild takes
~18 seconds, so `/api/nodes?limit=5000` stalled on virtually every page
load.
The pattern was:
```
recomputer runs at T=0 → cache valid
T=15s → TTL expires
T=15s … T=5min → every request rebuilds on-thread (18s each)
T=5min → recomputer runs again → 15s valid window
repeat
```
## Fix
One line in `repeater_enrich_bulk.go`:
```go
// Before
const repeaterEnrichTTL = 15 * time.Second
// After
const repeaterEnrichTTL = 10 * time.Minute
```
The TTL now exceeds the recomputer interval, so the cache stays warm
between background ticks. The TTL remains as a safety net for cases
where the recomputer isn't running (tests, early-startup edge cases);
it just no longer expires between ticks.
## Production results (analyzer.on8ar.eu)
Tested with binary injection on the live server before opening this PR.
| Metric | Before | After |
|--------|--------|-------|
| TTFB (`/api/nodes?limit=5000`) | 18.6 s | 0.47–0.54 s |
| Total response time | 18.9 s | 1.55–1.73 s |
| TTFB improvement | n/a | **34–39×** |
Confirmed still fast at t+60s (well past the old 15s window).
## Test results
```
TestHandleNodesPerfLargeFleet elapsed=1.9ms budget=2s PASS
TestHandleNodesLimit2000ColdMiss elapsed=5.3ms budget=2s PASS
```
Both existing perf regression tests pass unchanged — the TTL change
doesn't affect their behavior (they test the cold-prewarm path, not TTL
expiry).
## Why this wasn't caught by tests
`TestHandleNodesLimit2000ColdMiss` only tests the cold-startup path
(cache nil → on-thread build → cache hit). It doesn't test the
TTL-expiry path (cache exists but stale → on-thread rebuild). A test
covering the latter would need to fast-forward time past the TTL, which
the existing fixture doesn't do.
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
// repeaterEnrichmentRecomputerInterval is the default tick interval
// for the steady-state recompute of the repeater enrichment bulk
// caches. The on-request TTL fallback in repeater_enrich_bulk.go is
// kept as a safety net — the recomputer just makes sure the cache is
// populated before any request arrives.
//
// 5min mirrors the analytics_recomputer default from #1240 and is
// plenty fresh for an at-a-glance status column.