mirror of
https://github.com/Kpa-clawbot/meshcore-analyzer.git
synced 2026-06-28 07:42:47 +00:00
Closes #1561. Follow-up to #1551. ## Why #1551 added `Cache-Control: no-store` to all `/api/*` responses. That's sufficient for CDNs that honour origin headers (Varnish, nginx). It is **not** sufficient for Cloudflare zones where Cache Rules / Page Rules override origin Cache-Control. Field evidence from the meshat.se diagnosis (2026-06-04): observers behind Cloudflare were returning `cf-cache-status: HIT` with `age` up to ~6 hours despite the origin emitting `no-store`. The CDN was caching per zone policy and ignoring the upstream directive — exactly the failure mode #1551 cannot reach. The application has no way to inject CDN rules; the only durable fix is operator-side. This PR makes that operator step discoverable and verifiable. ## What ### Server-side detection (log-only) `cmd/server/cdn_detection.go` adds a middleware wired into the `/api/*` chain after `noStoreAPIMiddleware`. On the **first** request bearing any CDN-typical header (`CF-Connecting-IP`, `CF-Ray`, `X-Forwarded-For`, `X-Real-IP`, `Fastly-Client-IP`, `True-Client-IP`) it logs: ``` [security] WARNING: detected request via CDN (CF-Ray header present). Ensure /api/* is bypassed in your CDN config — see docs/deployment-behind-cdn.md. Cached API responses cause observer-flap and incorrect dashboards. ``` `sync.Once` guarantees the warning fires at most once per process boot. The middleware never blocks, never modifies the response, never adds headers. Detection is observational only — operators who run behind a CDN without bypass have a real bug; the warning is appropriate. ### Operator documentation `docs/deployment.md` gains a new **"Behind a CDN (Cloudflare, Fastly)"** section covering: 1. Curl verification command + healthy vs unhealthy output examples 2. Cloudflare Cache Rule creation (URI Path starts-with `/api/` → Bypass cache) 3. Legacy Page Rules equivalent 4. Fastly note 5. Re-verification 6. Meaning of the startup log warning 7. Why we can't fix this server-side `docs/deployment-behind-cdn.md` is the canonical path the log message references — it's a short TL;DR that links back to the full section. ### Healthcheck script `scripts/check-cdn-bypass.sh` — POSIX sh, no dependencies beyond curl + grep + awk. Operators run: ```sh scripts/check-cdn-bypass.sh https://your-domain.example.com ``` Exits `0` with `OK: no CDN caching detected ...` or `1` with a precise diagnostic naming the offending header (`cf-cache-status: HIT` or stale `age`). ## TDD - **Red commit `e90ccaba`** (`test(security): RED ...`) — `cmd/server/cdn_detection_test.go` (4 Go tests + 6 subtests for each header) and `scripts/test-check-cdn-bypass.sh` (3 shell harness cases). Middleware stub returns `next` unchanged so tests compile and fail on assertions, not build errors. - **Green commit `5e6a60b5`** (`feat(security): GREEN ...`) — real middleware, wiring in `routes.go`, healthcheck script, doc. ## Deliverables | File | Status | Purpose | |------|--------|---------| | `cmd/server/cdn_detection.go` | new | middleware + sync.Once warning | | `cmd/server/cdn_detection_test.go` | new | 4 Go tests (1 stand-alone + 1 silence + 1 once + 1 table-driven over 6 headers) | | `cmd/server/routes.go` | modified | `r.Use(cdnDetectionMiddleware)` after no-store | | `docs/deployment.md` | modified | TOC entry + "Behind a CDN" section | | `docs/deployment-behind-cdn.md` | new | canonical path referenced by log message + script output | | `scripts/check-cdn-bypass.sh` | new | operator-runnable healthcheck | | `scripts/test-check-cdn-bypass.sh` | new | shell harness with fake curl | ## What this PR explicitly does NOT do - Does not block requests based on CDN detection (log-only). - Does not enforce CDN bypass (impossible — operator-controlled). - Does not spoof, strip or modify CDN headers. - Does not add CSP / HSTS / other security headers (out of scope). - Warning is not configurable — operators behind a CDN without bypass have a real bug, surfacing it is correct. ## Verification - `go test ./...` in `cmd/server/` — full suite green. - `sh scripts/test-check-cdn-bypass.sh` — 3/3 pass. - Preflight checklist — all 11 gates clean (PII, branch scope, red commit, CSS vars, CSS self-fallback, LIKE-on-JSON, sync migration, async-migration annotation, XSS sinks, img/SVG ratio, themed-img/SVG, fixture coverage). --------- Co-authored-by: openclaw-bot <bot@openclaw.local> Co-authored-by: clawbot <bot@clawbot.invalid>
This commit is contained in:
@@ -0,0 +1,87 @@
|
||||
package main
|
||||
|
||||
// Issue #1561: detect CDN-fronted deployments and warn ONCE.
|
||||
//
|
||||
// When operators put CoreScope behind Cloudflare/Fastly without
|
||||
// configuring a /api/* cache bypass, dashboards go stale — the origin
|
||||
// emits Cache-Control: no-store (#1551), but the CDN's zone-level
|
||||
// caching policy can still cache JSON responses for hours
|
||||
// (cf-cache-status: HIT, age > 0). We can't fix the CDN config from
|
||||
// the server side; the best we can do is detect the situation and
|
||||
// loudly tell the operator at the logs.
|
||||
//
|
||||
// Detection: presence of any CDN-specific request header
|
||||
// (CF-Connecting-IP, CF-Ray, Fastly-Client-IP, True-Client-IP).
|
||||
// We deliberately exclude X-Forwarded-For and X-Real-IP: every
|
||||
// generic reverse proxy (nginx, Caddy, Traefik, k8s ingress) sets
|
||||
// those, so including them would warn operators who aren't behind
|
||||
// a CDN at all and train them to ignore the warning entirely
|
||||
// (defeating the point of #1561).
|
||||
//
|
||||
// Side effects: a single log line per process boot — never blocks
|
||||
// the request, never modifies the response, never logs again.
|
||||
|
||||
import (
|
||||
"log"
|
||||
"net/http"
|
||||
"sync"
|
||||
"sync/atomic"
|
||||
)
|
||||
|
||||
var cdnWarnOnce sync.Once
|
||||
|
||||
// cdnWarned is set true after the first CDN-fronted request has been
|
||||
// observed and logged. Subsequent requests short-circuit before the
|
||||
// per-request header scan in firstCDNHeader — a hot-path optimization
|
||||
// for the steady state (warning already emitted, every /api request
|
||||
// otherwise pays for 4 http.Header.Get lookups forever).
|
||||
var cdnWarned atomic.Bool
|
||||
|
||||
// cdnHeaders are HTTP request headers injected ONLY by CDNs
|
||||
// (Cloudflare, Fastly, Akamai) — never by a generic reverse proxy.
|
||||
// Detected case-insensitively by http.Header.Get.
|
||||
//
|
||||
// X-Forwarded-For / X-Real-IP are intentionally NOT in this list:
|
||||
// every nginx/Caddy/Traefik/k8s-ingress deployment sets them, so
|
||||
// using them as a CDN signal produces a false positive on every
|
||||
// reverse-proxied install (issue #1561 round-1 review).
|
||||
var cdnHeaders = []string{
|
||||
"CF-Connecting-IP", // Cloudflare
|
||||
"CF-Ray", // Cloudflare
|
||||
"Fastly-Client-IP", // Fastly
|
||||
"True-Client-IP", // Akamai (also set by Cloudflare Enterprise)
|
||||
}
|
||||
|
||||
// cdnDetectionMiddleware inspects each incoming request for CDN
|
||||
// headers and, on the FIRST one observed, logs a single warning
|
||||
// pointing the operator at docs/deployment-behind-cdn.md. The
|
||||
// middleware always calls next; it never blocks or rewrites.
|
||||
func cdnDetectionMiddleware(next http.Handler) http.Handler {
|
||||
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
// Fast path: once we've warned, skip the per-request header
|
||||
// scan entirely. Steady state for any CDN-fronted deploy is
|
||||
// ~every request hitting this branch.
|
||||
if cdnWarned.Load() {
|
||||
next.ServeHTTP(w, r)
|
||||
return
|
||||
}
|
||||
if hdr := firstCDNHeader(r.Header); hdr != "" {
|
||||
cdnWarnOnce.Do(func() {
|
||||
log.Printf("[security] WARNING: detected request via CDN (%s header present). "+
|
||||
"Ensure /api/* is bypassed in your CDN config — see docs/deployment-behind-cdn.md. "+
|
||||
"Cached API responses cause observer-flap and incorrect dashboards.", hdr)
|
||||
cdnWarned.Store(true)
|
||||
})
|
||||
}
|
||||
next.ServeHTTP(w, r)
|
||||
})
|
||||
}
|
||||
|
||||
func firstCDNHeader(h http.Header) string {
|
||||
for _, name := range cdnHeaders {
|
||||
if h.Get(name) != "" {
|
||||
return name
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
|
||||
@@ -0,0 +1,276 @@
|
||||
package main
|
||||
|
||||
// Issue #1561: When the server is fronted by a CDN (Cloudflare, Fastly,
|
||||
// Akamai) we cannot guarantee /api/* responses are not cached unless
|
||||
// the operator configures a bypass rule. Detect CDN-specific request
|
||||
// headers at the first such request and log a one-shot warning
|
||||
// pointing the operator at the bypass doc.
|
||||
//
|
||||
// Contract:
|
||||
// - Warning logs ONLY when a CDN-specific header is present
|
||||
// (CF-Connecting-IP, CF-Ray, Fastly-Client-IP, True-Client-IP).
|
||||
// - Generic reverse-proxy headers (X-Forwarded-For, X-Real-IP) MUST
|
||||
// NOT trigger the warning — every nginx/Caddy/Traefik/k8s install
|
||||
// sets those, so warning on them defeats the entire signal.
|
||||
// - Warning logs at most ONCE per process boot (sync.Once), even
|
||||
// under concurrent first-request load.
|
||||
// - Middleware NEVER blocks the request — it always calls
|
||||
// next.ServeHTTP.
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"log"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"strings"
|
||||
"sync"
|
||||
"sync/atomic"
|
||||
"testing"
|
||||
)
|
||||
|
||||
// resetCDNDetectionOnce restores a fresh sync.Once so each test starts
|
||||
// from a clean "have not warned yet" state.
|
||||
func resetCDNDetectionOnce() {
|
||||
cdnWarnOnce = sync.Once{}
|
||||
cdnWarned.Store(false)
|
||||
}
|
||||
|
||||
// runWithCDNMiddleware fires the request through the middleware and
|
||||
// returns (log output, whether next was called). The sentinel proves
|
||||
// the middleware did not silently drop the request.
|
||||
func runWithCDNMiddleware(t *testing.T, req *http.Request) (string, bool) {
|
||||
t.Helper()
|
||||
var buf bytes.Buffer
|
||||
prev := log.Writer()
|
||||
log.SetOutput(&buf)
|
||||
defer log.SetOutput(prev)
|
||||
|
||||
nextCalled := false
|
||||
h := cdnDetectionMiddleware(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
nextCalled = true
|
||||
w.WriteHeader(http.StatusOK)
|
||||
}))
|
||||
w := httptest.NewRecorder()
|
||||
h.ServeHTTP(w, req)
|
||||
if w.Code != http.StatusOK {
|
||||
t.Fatalf("middleware must not block request; got status %d", w.Code)
|
||||
}
|
||||
return buf.String(), nextCalled
|
||||
}
|
||||
|
||||
func TestCDNDetection_LogsOnCFRayHeader(t *testing.T) {
|
||||
resetCDNDetectionOnce()
|
||||
req := httptest.NewRequest("GET", "/api/observers", nil)
|
||||
req.Header.Set("CF-Ray", "abc123-LAX")
|
||||
|
||||
out, nextCalled := runWithCDNMiddleware(t, req)
|
||||
|
||||
if !nextCalled {
|
||||
t.Fatal("middleware did not call next handler")
|
||||
}
|
||||
if !strings.Contains(out, "detected request via CDN") {
|
||||
t.Errorf("expected log to contain 'detected request via CDN', got: %q", out)
|
||||
}
|
||||
if !strings.Contains(out, "deployment-behind-cdn") {
|
||||
t.Errorf("expected log to reference deployment-behind-cdn doc, got: %q", out)
|
||||
}
|
||||
}
|
||||
|
||||
func TestCDNDetection_SilentWithoutCDNHeader(t *testing.T) {
|
||||
resetCDNDetectionOnce()
|
||||
req := httptest.NewRequest("GET", "/api/observers", nil)
|
||||
// No CDN-typical headers set.
|
||||
|
||||
out, nextCalled := runWithCDNMiddleware(t, req)
|
||||
|
||||
if !nextCalled {
|
||||
t.Fatal("middleware did not call next handler")
|
||||
}
|
||||
if strings.Contains(out, "detected request via CDN") {
|
||||
t.Errorf("expected no CDN warning without CDN headers, got: %q", out)
|
||||
}
|
||||
}
|
||||
|
||||
// Regression for round-1 adversarial finding: generic reverse-proxy
|
||||
// headers must NOT trigger the warning. Every nginx/Caddy/Traefik/
|
||||
// k8s-ingress reverse proxy sets X-Forwarded-For and X-Real-IP, so
|
||||
// flagging them produces a false positive on every reverse-proxied
|
||||
// install and trains operators to ignore the warning.
|
||||
func TestCDNDetection_SilentOnReverseProxyHeadersAlone(t *testing.T) {
|
||||
cases := []struct {
|
||||
name string
|
||||
header string
|
||||
}{
|
||||
{"x-forwarded-for-alone", "X-Forwarded-For"},
|
||||
{"x-real-ip-alone", "X-Real-IP"},
|
||||
}
|
||||
for _, tc := range cases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
resetCDNDetectionOnce()
|
||||
req := httptest.NewRequest("GET", "/api/observers", nil)
|
||||
req.Header.Set(tc.header, "10.0.0.1")
|
||||
// No CDN-specific headers — just the generic reverse-proxy one.
|
||||
|
||||
out, nextCalled := runWithCDNMiddleware(t, req)
|
||||
|
||||
if !nextCalled {
|
||||
t.Fatal("middleware did not call next handler")
|
||||
}
|
||||
if strings.Contains(out, "detected request via CDN") {
|
||||
t.Errorf("header %s alone must NOT trigger CDN warning (would false-positive every nginx/k8s deploy); got: %q", tc.header, out)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// When a CDN-specific header is present alongside generic proxy
|
||||
// headers (common: Cloudflare → nginx → app), the warning still fires.
|
||||
func TestCDNDetection_LogsWhenCDNHeaderAccompaniesProxyHeaders(t *testing.T) {
|
||||
resetCDNDetectionOnce()
|
||||
req := httptest.NewRequest("GET", "/api/observers", nil)
|
||||
req.Header.Set("X-Forwarded-For", "10.0.0.1")
|
||||
req.Header.Set("X-Real-IP", "10.0.0.1")
|
||||
req.Header.Set("CF-Connecting-IP", "1.2.3.4")
|
||||
|
||||
out, nextCalled := runWithCDNMiddleware(t, req)
|
||||
|
||||
if !nextCalled {
|
||||
t.Fatal("middleware did not call next handler")
|
||||
}
|
||||
if !strings.Contains(out, "detected request via CDN") {
|
||||
t.Errorf("expected CDN warning when CF-Connecting-IP present alongside proxy headers; got: %q", out)
|
||||
}
|
||||
}
|
||||
|
||||
func TestCDNDetection_LogsOnlyOnce(t *testing.T) {
|
||||
resetCDNDetectionOnce()
|
||||
|
||||
var buf bytes.Buffer
|
||||
prev := log.Writer()
|
||||
log.SetOutput(&buf)
|
||||
defer log.SetOutput(prev)
|
||||
|
||||
nextCalled := 0
|
||||
h := cdnDetectionMiddleware(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
nextCalled++
|
||||
w.WriteHeader(http.StatusOK)
|
||||
}))
|
||||
|
||||
for i := 0; i < 3; i++ {
|
||||
req := httptest.NewRequest("GET", "/api/observers", nil)
|
||||
req.Header.Set("CF-Ray", "abc123")
|
||||
w := httptest.NewRecorder()
|
||||
h.ServeHTTP(w, req)
|
||||
}
|
||||
|
||||
if nextCalled != 3 {
|
||||
t.Fatalf("middleware must call next on every request; got %d calls, want 3", nextCalled)
|
||||
}
|
||||
got := strings.Count(buf.String(), "detected request via CDN")
|
||||
if got != 1 {
|
||||
t.Errorf("expected CDN warning exactly once across multiple requests; got %d in output: %q", got, buf.String())
|
||||
}
|
||||
}
|
||||
|
||||
// Each genuinely CDN-specific header should trip the detector on its
|
||||
// own. X-Forwarded-For / X-Real-IP are NOT in this set — see the
|
||||
// negative test TestCDNDetection_SilentOnReverseProxyHeadersAlone.
|
||||
func TestCDNDetection_RecognizesAllCommonCDNHeaders(t *testing.T) {
|
||||
headers := []string{
|
||||
"CF-Connecting-IP",
|
||||
"CF-Ray",
|
||||
"Fastly-Client-IP",
|
||||
"True-Client-IP",
|
||||
}
|
||||
for _, h := range headers {
|
||||
t.Run(h, func(t *testing.T) {
|
||||
resetCDNDetectionOnce()
|
||||
req := httptest.NewRequest("GET", "/api/observers", nil)
|
||||
req.Header.Set(h, "1.2.3.4")
|
||||
out, nextCalled := runWithCDNMiddleware(t, req)
|
||||
if !nextCalled {
|
||||
t.Fatal("middleware did not call next handler")
|
||||
}
|
||||
if !strings.Contains(out, "detected request via CDN") {
|
||||
t.Errorf("header %s should trip CDN detection; log was: %q", h, out)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// Round-1 KB finding #2: sync.Once is what keeps the log from
|
||||
// spamming — verify it holds under concurrent first-request load.
|
||||
// CI runs `go test -race`, so this also stresses the underlying
|
||||
// primitive for data races. Without -race, the assertion still
|
||||
// catches a plain bool / non-atomic implementation.
|
||||
func TestCDNDetectionMiddlewareConcurrentFirstRequestLogsOnce(t *testing.T) {
|
||||
resetCDNDetectionOnce()
|
||||
|
||||
var buf bytes.Buffer
|
||||
var bufMu sync.Mutex
|
||||
prev := log.Writer()
|
||||
// log.Printf can be called concurrently; serialize writes to buf
|
||||
// so we never race the test's own assertion read.
|
||||
log.SetOutput(writerFunc(func(p []byte) (int, error) {
|
||||
bufMu.Lock()
|
||||
defer bufMu.Unlock()
|
||||
return buf.Write(p)
|
||||
}))
|
||||
defer log.SetOutput(prev)
|
||||
|
||||
var nextCalls int64
|
||||
h := cdnDetectionMiddleware(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
atomic.AddInt64(&nextCalls, 1)
|
||||
w.WriteHeader(http.StatusOK)
|
||||
}))
|
||||
|
||||
const n = 50
|
||||
var wg sync.WaitGroup
|
||||
wg.Add(n)
|
||||
for i := 0; i < n; i++ {
|
||||
go func() {
|
||||
defer wg.Done()
|
||||
req := httptest.NewRequest("GET", "/api/observers", nil)
|
||||
req.Header.Set("CF-Ray", "abc123-LAX")
|
||||
w := httptest.NewRecorder()
|
||||
h.ServeHTTP(w, req)
|
||||
}()
|
||||
}
|
||||
wg.Wait()
|
||||
|
||||
if got := atomic.LoadInt64(&nextCalls); got != n {
|
||||
t.Fatalf("middleware must call next on every concurrent request; got %d, want %d", got, n)
|
||||
}
|
||||
|
||||
bufMu.Lock()
|
||||
out := buf.String()
|
||||
bufMu.Unlock()
|
||||
got := strings.Count(out, "detected request via CDN")
|
||||
if got != 1 {
|
||||
t.Errorf("expected sync.Once to admit exactly ONE warning under %d concurrent first-requests; got %d. Output:\n%s", n, got, out)
|
||||
}
|
||||
}
|
||||
|
||||
// writerFunc adapts a function to io.Writer.
|
||||
type writerFunc func(p []byte) (int, error)
|
||||
|
||||
func (f writerFunc) Write(p []byte) (int, error) { return f(p) }
|
||||
|
||||
// Round-2 MAJOR finding: sync.Once only short-circuits the log.Printf,
|
||||
// not the per-request header scan. firstCDNHeader still iterates 4
|
||||
// http.Header.Get lookups on every /api request after warning fires.
|
||||
// The fix is an atomic.Bool fast-path checked BEFORE firstCDNHeader.
|
||||
// This test gates that the flag is actually set on the first CDN
|
||||
// request — without it, the middleware would have no signal to
|
||||
// short-circuit on, and the optimization would be a dead store.
|
||||
func TestCDNDetection_CdnWarnedFlagSet(t *testing.T) {
|
||||
resetCDNDetectionOnce()
|
||||
req := httptest.NewRequest("GET", "/api/x", nil)
|
||||
req.Header.Set("CF-Ray", "x")
|
||||
if _, nextCalled := runWithCDNMiddleware(t, req); !nextCalled {
|
||||
t.Fatal("middleware did not call next handler")
|
||||
}
|
||||
if !cdnWarned.Load() {
|
||||
t.Fatal("cdnWarned must be true after first CDN request (fast-path flag not set)")
|
||||
}
|
||||
}
|
||||
@@ -172,6 +172,14 @@ func (s *Server) RegisterRoutes(r *mux.Router) {
|
||||
// CDN-cacheable (their headers are set by spaHandler).
|
||||
r.Use(noStoreAPIMiddleware)
|
||||
|
||||
// Detect CDN-fronted deployments and warn the operator ONCE if
|
||||
// any CDN-typical header (CF-Ray, CF-Connecting-IP, etc.) is
|
||||
// observed. See #1561: no-store alone isn't sufficient on
|
||||
// Cloudflare zones with Cache Rules / Page Rules that ignore
|
||||
// origin Cache-Control. Operator must add a Bypass Cache rule
|
||||
// for /api/* — see docs/deployment-behind-cdn.md.
|
||||
r.Use(cdnDetectionMiddleware)
|
||||
|
||||
// Config endpoints
|
||||
r.HandleFunc("/api/config/cache", s.handleConfigCache).Methods("GET")
|
||||
r.HandleFunc("/api/config/client", s.handleConfigClient).Methods("GET")
|
||||
|
||||
@@ -0,0 +1,26 @@
|
||||
# Deployment behind a CDN
|
||||
|
||||
This page is referenced from the server log warning and from the
|
||||
`scripts/check-cdn-bypass.sh` helper output. The canonical content
|
||||
lives in [`docs/deployment.md` → "Behind a CDN (Cloudflare, Fastly)"](./deployment.md#behind-a-cdn-cloudflare-fastly).
|
||||
|
||||
**TL;DR for operators behind Cloudflare/Fastly/etc.:**
|
||||
|
||||
1. Verify from outside the CDN:
|
||||
```sh
|
||||
curl -sI 'https://<your-domain>/api/observers' | grep -iE 'cf-cache|age|cache-control'
|
||||
```
|
||||
Look for `cf-cache-status: BYPASS` and `age: 0`.
|
||||
2. If you see `cf-cache-status: HIT` or `age > 0`, add a Cloudflare
|
||||
**Cache Rule** (Caching → Cache Rules → Create rule):
|
||||
- When: URI Path starts with `/api/`
|
||||
- Then: Cache eligibility → **Bypass cache**
|
||||
3. Re-verify with the curl in step 1.
|
||||
4. Run `scripts/check-cdn-bypass.sh https://<your-domain>` — should exit 0.
|
||||
|
||||
See [`docs/deployment.md`](./deployment.md#behind-a-cdn-cloudflare-fastly)
|
||||
for the full discussion (Fastly equivalent, why the origin header
|
||||
alone isn't sufficient, what the startup warning means).
|
||||
|
||||
Issue: [#1561](https://github.com/Kpa-clawbot/CoreScope/issues/1561).
|
||||
Related: [#1551](https://github.com/Kpa-clawbot/CoreScope/issues/1551).
|
||||
@@ -9,6 +9,7 @@ Comprehensive guide to deploying and operating CoreScope. For a quick start, see
|
||||
- [Configuration Reference](#configuration-reference)
|
||||
- [MQTT Setup](#mqtt-setup)
|
||||
- [TLS / HTTPS](#tls--https)
|
||||
- [Behind a CDN (Cloudflare, Fastly)](#behind-a-cdn-cloudflare-fastly)
|
||||
- [Monitoring & Health Checks](#monitoring--health-checks)
|
||||
- [Backup & Restore](#backup--restore)
|
||||
- [Troubleshooting](#troubleshooting)
|
||||
@@ -342,6 +343,109 @@ The live instance at [analyzer.00id.net](https://analyzer.00id.net) has all API
|
||||
|
||||
---
|
||||
|
||||
## Behind a CDN (Cloudflare, Fastly)
|
||||
|
||||
If you front CoreScope with a CDN — Cloudflare, Fastly, Akamai, or
|
||||
similar — you **must** configure the CDN to bypass cache for `/api/*`.
|
||||
The server emits `Cache-Control: no-store` on every API response
|
||||
(see #1551), but Cloudflare's zone-level Cache Rules and legacy Page
|
||||
Rules can override origin headers. When that happens, observers, packets,
|
||||
stats and other API responses get cached at the edge for minutes to hours,
|
||||
producing observer-flap, stale dashboards and inconsistent state across
|
||||
viewers.
|
||||
|
||||
### 1. Verify whether your CDN is caching `/api/*`
|
||||
|
||||
From **outside** the CDN (a different network than your origin), run:
|
||||
|
||||
```sh
|
||||
curl -sI 'https://<your-domain>/api/observers' | grep -iE 'cf-cache|age|cache-control'
|
||||
```
|
||||
|
||||
Healthy output (cache is bypassed):
|
||||
|
||||
```
|
||||
cache-control: no-store
|
||||
cf-cache-status: BYPASS
|
||||
age: 0
|
||||
```
|
||||
|
||||
Unhealthy output (CDN is caching despite `no-store`):
|
||||
|
||||
```
|
||||
cache-control: no-store
|
||||
cf-cache-status: HIT
|
||||
age: 4732
|
||||
```
|
||||
|
||||
`HIT` or `age > 0` means an intermediary is serving cached JSON. Fix it
|
||||
before relying on the dashboard.
|
||||
|
||||
You can also run the bundled helper, which exits non-zero with a precise
|
||||
diagnostic when caching is detected:
|
||||
|
||||
```sh
|
||||
scripts/check-cdn-bypass.sh https://<your-domain>
|
||||
```
|
||||
|
||||
### 2. Cloudflare: add a Cache Rule (recommended)
|
||||
|
||||
Cloudflare Dashboard → your zone → **Caching → Cache Rules → Create rule**:
|
||||
|
||||
- **When incoming requests match:** Field = `URI Path`, Operator = `starts with`, Value = `/api/`
|
||||
- **Then:** Cache eligibility → **Bypass cache**
|
||||
|
||||
Save and deploy. Re-run the curl above; you should now see
|
||||
`cf-cache-status: BYPASS`.
|
||||
|
||||
Legacy Page Rules equivalent (if your zone has no Cache Rules):
|
||||
|
||||
- URL pattern: `*your-domain*/api/*`
|
||||
- Setting: **Cache Level → Bypass**
|
||||
|
||||
### 3. Fastly / other CDNs
|
||||
|
||||
Apply the equivalent bypass-cache rule for the `/api/` path prefix.
|
||||
The key invariant is: any response from `/api/*` must reach the
|
||||
browser uncached (no shared-cache HIT, no positive `Age` header).
|
||||
|
||||
### 4. Re-verify
|
||||
|
||||
After applying the rule, run step 1's curl from outside the CDN again
|
||||
and confirm `cf-cache-status: BYPASS` (or absence of HIT) and `age: 0`.
|
||||
|
||||
### 5. Watch the server log
|
||||
|
||||
The server logs a one-shot warning at the first request bearing a
|
||||
CDN-specific header (`CF-Connecting-IP`, `CF-Ray`, `Fastly-Client-IP`,
|
||||
or `True-Client-IP`):
|
||||
|
||||
Generic reverse-proxy headers (`X-Forwarded-For`, `X-Real-IP`) are
|
||||
deliberately NOT used as the signal — every nginx/Caddy/Traefik/k8s
|
||||
deploy sets them, so they'd produce false positives on every
|
||||
reverse-proxied install.
|
||||
|
||||
```
|
||||
[security] WARNING: detected request via CDN (CF-Ray header present).
|
||||
Ensure /api/* is bypassed in your CDN config — see docs/deployment-behind-cdn.md.
|
||||
Cached API responses cause observer-flap and incorrect dashboards.
|
||||
```
|
||||
|
||||
This is informational — the request is not blocked. Treat it as a
|
||||
prompt to run the verification curl above. The warning logs at most
|
||||
once per process boot, regardless of how many CDN-fronted requests
|
||||
arrive.
|
||||
|
||||
### Why this can't be fixed server-side
|
||||
|
||||
CDN cache policy is operator-controlled. The application emits the
|
||||
most conservative cache header it can (`Cache-Control: no-store`),
|
||||
but Cloudflare Cache Rules / Page Rules have higher precedence than
|
||||
origin headers in many zone configurations. The only durable fix is
|
||||
the operator-side bypass rule.
|
||||
|
||||
---
|
||||
|
||||
## Monitoring & Health Checks
|
||||
|
||||
### Docker health check
|
||||
|
||||
Executable
+89
@@ -0,0 +1,89 @@
|
||||
#!/bin/sh
|
||||
# check-cdn-bypass.sh — verify a CoreScope deployment's /api/* is
|
||||
# NOT being cached by an upstream CDN (Cloudflare, Fastly, etc.).
|
||||
#
|
||||
# Issue #1561. Run this from outside the CDN (e.g. from a different
|
||||
# network than the origin). Exits 0 if no CDN caching is detected,
|
||||
# non-zero otherwise with a specific finding.
|
||||
#
|
||||
# Usage:
|
||||
# scripts/check-cdn-bypass.sh https://analyzer.example.com
|
||||
#
|
||||
# Requires: POSIX sh, curl, grep, awk. No other dependencies.
|
||||
|
||||
set -u
|
||||
|
||||
if [ $# -ne 1 ]; then
|
||||
echo "usage: $0 <https://your-host>" >&2
|
||||
exit 2
|
||||
fi
|
||||
|
||||
HOST="$1"
|
||||
URL="${HOST%/}/api/observers"
|
||||
|
||||
# -s silent, -S show errors, -I HEAD request, -L follow redirects.
|
||||
# -w '%{http_code}' so we can verify we actually reached the endpoint;
|
||||
# a 404/401/403/500 has no cf-cache-status / age header and would
|
||||
# otherwise be reported as "OK: no CDN caching" — a false success.
|
||||
TMP_HDRS="$(mktemp)"
|
||||
trap 'rm -f "$TMP_HDRS"' EXIT INT TERM
|
||||
|
||||
STATUS="$(curl -sSIL -o "$TMP_HDRS" -w '%{http_code}' "$URL" 2>/tmp/cdn-check-err)"
|
||||
rc=$?
|
||||
if [ "$rc" != "0" ] && [ ! -s "$TMP_HDRS" ]; then
|
||||
echo "FAIL: curl could not reach $URL (exit $rc): $(cat /tmp/cdn-check-err 2>/dev/null)" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Verify the endpoint actually responded successfully. A non-200 means
|
||||
# we cannot draw any conclusion about CDN caching from this URL —
|
||||
# previously the script would silently return "OK" on 404/401/500.
|
||||
case "$STATUS" in
|
||||
2*) : ;;
|
||||
"")
|
||||
echo "FAIL: no HTTP status returned for $URL — cannot verify CDN bypass (check URL / auth / network)." >&2
|
||||
exit 1
|
||||
;;
|
||||
*)
|
||||
echo "FAIL: endpoint returned HTTP $STATUS — cannot verify CDN bypass (check URL / auth / network)." >&2
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
|
||||
HDRS="$(cat "$TMP_HDRS")"
|
||||
|
||||
# Lowercase the header names so grep is case-insensitive without -i tricks.
|
||||
LOWER="$(printf '%s\n' "$HDRS" | awk '{
|
||||
n = index($0, ":");
|
||||
if (n > 0) {
|
||||
name = tolower(substr($0, 1, n-1));
|
||||
rest = substr($0, n);
|
||||
print name rest;
|
||||
} else {
|
||||
print $0;
|
||||
}
|
||||
}')"
|
||||
|
||||
CF_STATUS="$(printf '%s\n' "$LOWER" | awk -F': *' '/^cf-cache-status:/ {print $2; exit}' | tr -d '\r')"
|
||||
AGE="$(printf '%s\n' "$LOWER" | awk -F': *' '/^age:/ {print $2; exit}' | tr -d '\r')"
|
||||
|
||||
case "$CF_STATUS" in
|
||||
HIT|hit)
|
||||
echo "FAIL: cf-cache-status: HIT — CDN is caching /api/observers; payload may be ${AGE:-unknown} seconds stale. Add a Cloudflare Cache Rule (Bypass cache) for /api/* — see docs/deployment-behind-cdn.md."
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
|
||||
if [ -n "$AGE" ]; then
|
||||
# age=0 is fine — anything > 0 means an intermediary served from cache.
|
||||
case "$AGE" in
|
||||
0|0[!0-9]*) : ;;
|
||||
*)
|
||||
echo "FAIL: response is $AGE seconds stale (age header > 0) — an intermediary cache is serving /api/observers. Configure your CDN to bypass /api/* — see docs/deployment-behind-cdn.md."
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
fi
|
||||
|
||||
echo "OK: no CDN caching detected for /api/observers (http=$STATUS, cf-cache-status=${CF_STATUS:-absent}, age=${AGE:-0})"
|
||||
exit 0
|
||||
Executable
+125
@@ -0,0 +1,125 @@
|
||||
#!/bin/sh
|
||||
# Test harness for scripts/check-cdn-bypass.sh — issue #1561.
|
||||
# Substitutes a fake `curl` on PATH so we can simulate CDN responses
|
||||
# without network access. The fake curl honors `-o <file>` (writes
|
||||
# the mocked headers there) and `-w '%{http_code}'` (writes the
|
||||
# mocked HTTP status to stdout), matching the real curl interface
|
||||
# the production script depends on.
|
||||
|
||||
set -u
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
|
||||
TARGET="$SCRIPT_DIR/check-cdn-bypass.sh"
|
||||
|
||||
PASS=0
|
||||
FAIL=0
|
||||
|
||||
# mk_fake_curl HEADERS HTTP_STATUS
|
||||
# Builds a tmpdir containing a `curl` shim that:
|
||||
# - parses -o <file> from its args and writes HEADERS there
|
||||
# - parses -w <fmt> and prints HTTP_STATUS to stdout (the script
|
||||
# only ever uses -w '%{http_code}')
|
||||
# - exits 0 (curl considers a non-2xx still a successful transfer
|
||||
# unless -f is passed; production script does not pass -f, so
|
||||
# this matches reality).
|
||||
mk_fake_curl() {
|
||||
headers="$1"
|
||||
status="$2"
|
||||
tmpdir="$(mktemp -d)"
|
||||
# Embed the values via heredoc with single-quoted delimiter so
|
||||
# nothing is expanded inside the generated script except by the
|
||||
# outer shell now.
|
||||
cat >"$tmpdir/curl" <<EOF
|
||||
#!/bin/sh
|
||||
out=""
|
||||
while [ \$# -gt 0 ]; do
|
||||
case "\$1" in
|
||||
-o) out="\$2"; shift 2 ;;
|
||||
-w) shift 2 ;; # always %{http_code} in this script
|
||||
-*) shift ;; # -s, -S, -I, -L, etc.
|
||||
*) shift ;; # URL
|
||||
esac
|
||||
done
|
||||
if [ -n "\$out" ]; then
|
||||
cat >"\$out" <<'BODY'
|
||||
$headers
|
||||
BODY
|
||||
fi
|
||||
printf '%s' '$status'
|
||||
exit 0
|
||||
EOF
|
||||
chmod +x "$tmpdir/curl"
|
||||
echo "$tmpdir"
|
||||
}
|
||||
|
||||
run_case() {
|
||||
name="$1"
|
||||
headers="$2"
|
||||
status="$3"
|
||||
want_exit="$4"
|
||||
want_substr="$5"
|
||||
|
||||
if [ ! -f "$TARGET" ]; then
|
||||
echo "FAIL: $name — $TARGET missing"
|
||||
FAIL=$((FAIL+1))
|
||||
return
|
||||
fi
|
||||
|
||||
fakedir="$(mk_fake_curl "$headers" "$status")"
|
||||
out="$(PATH="$fakedir:$PATH" sh "$TARGET" https://example.test 2>&1)"
|
||||
rc=$?
|
||||
rm -rf "$fakedir"
|
||||
|
||||
if [ "$rc" != "$want_exit" ]; then
|
||||
echo "FAIL: $name — exit code $rc, want $want_exit; output: $out"
|
||||
FAIL=$((FAIL+1))
|
||||
return
|
||||
fi
|
||||
case "$out" in
|
||||
*"$want_substr"*) ;;
|
||||
*)
|
||||
printf 'FAIL: %s — output missing %s; got: %s\n' "$name" "$want_substr" "$out"
|
||||
FAIL=$((FAIL+1))
|
||||
return
|
||||
;;
|
||||
esac
|
||||
echo "PASS: $name"
|
||||
PASS=$((PASS+1))
|
||||
}
|
||||
|
||||
# Case 1: HTTP 200, bypass — no cf-cache HIT, age:0 → exit 0
|
||||
run_case "bypass-ok-200" "cache-control: no-store
|
||||
cf-cache-status: BYPASS
|
||||
age: 0" "200" 0 "OK"
|
||||
|
||||
# Case 2: HTTP 200, CDN HIT — exit 1, mention HIT
|
||||
run_case "cf-hit-200" "cache-control: no-store
|
||||
cf-cache-status: HIT
|
||||
age: 47" "200" 1 "HIT"
|
||||
|
||||
# Case 3: HTTP 200, stale by age — no cf-cache header but age > 0 → exit 1
|
||||
run_case "stale-age-200" "cache-control: no-store
|
||||
age: 120" "200" 1 "stale"
|
||||
|
||||
# Case 4: HTTP 200, no cache headers at all → exit 0 (existing behavior preserved)
|
||||
run_case "bare-200-no-cache-headers" "content-type: application/json" "200" 0 "OK"
|
||||
|
||||
# Round-1 regression: a non-200 endpoint had been silently reported
|
||||
# as "OK: no CDN caching detected" because no cache headers were
|
||||
# present. Script must now refuse to draw a conclusion.
|
||||
|
||||
# Case 5: HTTP 404 → exit 1, status-related error
|
||||
run_case "http-404-fails-with-status-error" "" "404" 1 "HTTP 404"
|
||||
|
||||
# Case 6: HTTP 401 → exit 1, status-related error
|
||||
run_case "http-401-fails-with-status-error" "" "401" 1 "HTTP 401"
|
||||
|
||||
# Case 7: HTTP 403 → exit 1, status-related error
|
||||
run_case "http-403-fails-with-status-error" "" "403" 1 "HTTP 403"
|
||||
|
||||
# Case 8: HTTP 500 → exit 1, status-related error
|
||||
run_case "http-500-fails-with-status-error" "" "500" 1 "HTTP 500"
|
||||
|
||||
echo
|
||||
echo "Results: $PASS passed, $FAIL failed"
|
||||
[ "$FAIL" = "0" ] || exit 1
|
||||
Reference in New Issue
Block a user