mirror of
https://github.com/Kpa-clawbot/meshcore-analyzer.git
synced 2026-07-03 10:31:44 +00:00
63bfa3d910
Closes #1561. Follow-up to #1551. ## Why #1551 added `Cache-Control: no-store` to all `/api/*` responses. That's sufficient for CDNs that honour origin headers (Varnish, nginx). It is **not** sufficient for Cloudflare zones where Cache Rules / Page Rules override origin Cache-Control. Field evidence from the meshat.se diagnosis (2026-06-04): observers behind Cloudflare were returning `cf-cache-status: HIT` with `age` up to ~6 hours despite the origin emitting `no-store`. The CDN was caching per zone policy and ignoring the upstream directive — exactly the failure mode #1551 cannot reach. The application has no way to inject CDN rules; the only durable fix is operator-side. This PR makes that operator step discoverable and verifiable. ## What ### Server-side detection (log-only) `cmd/server/cdn_detection.go` adds a middleware wired into the `/api/*` chain after `noStoreAPIMiddleware`. On the **first** request bearing any CDN-typical header (`CF-Connecting-IP`, `CF-Ray`, `X-Forwarded-For`, `X-Real-IP`, `Fastly-Client-IP`, `True-Client-IP`) it logs: ``` [security] WARNING: detected request via CDN (CF-Ray header present). Ensure /api/* is bypassed in your CDN config — see docs/deployment-behind-cdn.md. Cached API responses cause observer-flap and incorrect dashboards. ``` `sync.Once` guarantees the warning fires at most once per process boot. The middleware never blocks, never modifies the response, never adds headers. Detection is observational only — operators who run behind a CDN without bypass have a real bug; the warning is appropriate. ### Operator documentation `docs/deployment.md` gains a new **"Behind a CDN (Cloudflare, Fastly)"** section covering: 1. Curl verification command + healthy vs unhealthy output examples 2. Cloudflare Cache Rule creation (URI Path starts-with `/api/` → Bypass cache) 3. Legacy Page Rules equivalent 4. Fastly note 5. Re-verification 6. Meaning of the startup log warning 7. Why we can't fix this server-side `docs/deployment-behind-cdn.md` is the canonical path the log message references — it's a short TL;DR that links back to the full section. ### Healthcheck script `scripts/check-cdn-bypass.sh` — POSIX sh, no dependencies beyond curl + grep + awk. Operators run: ```sh scripts/check-cdn-bypass.sh https://your-domain.example.com ``` Exits `0` with `OK: no CDN caching detected ...` or `1` with a precise diagnostic naming the offending header (`cf-cache-status: HIT` or stale `age`). ## TDD - **Red commit `e90ccaba`** (`test(security): RED ...`) — `cmd/server/cdn_detection_test.go` (4 Go tests + 6 subtests for each header) and `scripts/test-check-cdn-bypass.sh` (3 shell harness cases). Middleware stub returns `next` unchanged so tests compile and fail on assertions, not build errors. - **Green commit `5e6a60b5`** (`feat(security): GREEN ...`) — real middleware, wiring in `routes.go`, healthcheck script, doc. ## Deliverables | File | Status | Purpose | |------|--------|---------| | `cmd/server/cdn_detection.go` | new | middleware + sync.Once warning | | `cmd/server/cdn_detection_test.go` | new | 4 Go tests (1 stand-alone + 1 silence + 1 once + 1 table-driven over 6 headers) | | `cmd/server/routes.go` | modified | `r.Use(cdnDetectionMiddleware)` after no-store | | `docs/deployment.md` | modified | TOC entry + "Behind a CDN" section | | `docs/deployment-behind-cdn.md` | new | canonical path referenced by log message + script output | | `scripts/check-cdn-bypass.sh` | new | operator-runnable healthcheck | | `scripts/test-check-cdn-bypass.sh` | new | shell harness with fake curl | ## What this PR explicitly does NOT do - Does not block requests based on CDN detection (log-only). - Does not enforce CDN bypass (impossible — operator-controlled). - Does not spoof, strip or modify CDN headers. - Does not add CSP / HSTS / other security headers (out of scope). - Warning is not configurable — operators behind a CDN without bypass have a real bug, surfacing it is correct. ## Verification - `go test ./...` in `cmd/server/` — full suite green. - `sh scripts/test-check-cdn-bypass.sh` — 3/3 pass. - Preflight checklist — all 11 gates clean (PII, branch scope, red commit, CSS vars, CSS self-fallback, LIKE-on-JSON, sync migration, async-migration annotation, XSS sinks, img/SVG ratio, themed-img/SVG, fixture coverage). --------- Co-authored-by: openclaw-bot <bot@openclaw.local> Co-authored-by: clawbot <bot@clawbot.invalid>
90 lines
3.0 KiB
Bash
Executable File
90 lines
3.0 KiB
Bash
Executable File
#!/bin/sh
|
|
# check-cdn-bypass.sh — verify a CoreScope deployment's /api/* is
|
|
# NOT being cached by an upstream CDN (Cloudflare, Fastly, etc.).
|
|
#
|
|
# Issue #1561. Run this from outside the CDN (e.g. from a different
|
|
# network than the origin). Exits 0 if no CDN caching is detected,
|
|
# non-zero otherwise with a specific finding.
|
|
#
|
|
# Usage:
|
|
# scripts/check-cdn-bypass.sh https://analyzer.example.com
|
|
#
|
|
# Requires: POSIX sh, curl, grep, awk. No other dependencies.
|
|
|
|
set -u
|
|
|
|
if [ $# -ne 1 ]; then
|
|
echo "usage: $0 <https://your-host>" >&2
|
|
exit 2
|
|
fi
|
|
|
|
HOST="$1"
|
|
URL="${HOST%/}/api/observers"
|
|
|
|
# -s silent, -S show errors, -I HEAD request, -L follow redirects.
|
|
# -w '%{http_code}' so we can verify we actually reached the endpoint;
|
|
# a 404/401/403/500 has no cf-cache-status / age header and would
|
|
# otherwise be reported as "OK: no CDN caching" — a false success.
|
|
TMP_HDRS="$(mktemp)"
|
|
trap 'rm -f "$TMP_HDRS"' EXIT INT TERM
|
|
|
|
STATUS="$(curl -sSIL -o "$TMP_HDRS" -w '%{http_code}' "$URL" 2>/tmp/cdn-check-err)"
|
|
rc=$?
|
|
if [ "$rc" != "0" ] && [ ! -s "$TMP_HDRS" ]; then
|
|
echo "FAIL: curl could not reach $URL (exit $rc): $(cat /tmp/cdn-check-err 2>/dev/null)" >&2
|
|
exit 1
|
|
fi
|
|
|
|
# Verify the endpoint actually responded successfully. A non-200 means
|
|
# we cannot draw any conclusion about CDN caching from this URL —
|
|
# previously the script would silently return "OK" on 404/401/500.
|
|
case "$STATUS" in
|
|
2*) : ;;
|
|
"")
|
|
echo "FAIL: no HTTP status returned for $URL — cannot verify CDN bypass (check URL / auth / network)." >&2
|
|
exit 1
|
|
;;
|
|
*)
|
|
echo "FAIL: endpoint returned HTTP $STATUS — cannot verify CDN bypass (check URL / auth / network)." >&2
|
|
exit 1
|
|
;;
|
|
esac
|
|
|
|
HDRS="$(cat "$TMP_HDRS")"
|
|
|
|
# Lowercase the header names so grep is case-insensitive without -i tricks.
|
|
LOWER="$(printf '%s\n' "$HDRS" | awk '{
|
|
n = index($0, ":");
|
|
if (n > 0) {
|
|
name = tolower(substr($0, 1, n-1));
|
|
rest = substr($0, n);
|
|
print name rest;
|
|
} else {
|
|
print $0;
|
|
}
|
|
}')"
|
|
|
|
CF_STATUS="$(printf '%s\n' "$LOWER" | awk -F': *' '/^cf-cache-status:/ {print $2; exit}' | tr -d '\r')"
|
|
AGE="$(printf '%s\n' "$LOWER" | awk -F': *' '/^age:/ {print $2; exit}' | tr -d '\r')"
|
|
|
|
case "$CF_STATUS" in
|
|
HIT|hit)
|
|
echo "FAIL: cf-cache-status: HIT — CDN is caching /api/observers; payload may be ${AGE:-unknown} seconds stale. Add a Cloudflare Cache Rule (Bypass cache) for /api/* — see docs/deployment-behind-cdn.md."
|
|
exit 1
|
|
;;
|
|
esac
|
|
|
|
if [ -n "$AGE" ]; then
|
|
# age=0 is fine — anything > 0 means an intermediary served from cache.
|
|
case "$AGE" in
|
|
0|0[!0-9]*) : ;;
|
|
*)
|
|
echo "FAIL: response is $AGE seconds stale (age header > 0) — an intermediary cache is serving /api/observers. Configure your CDN to bypass /api/* — see docs/deployment-behind-cdn.md."
|
|
exit 1
|
|
;;
|
|
esac
|
|
fi
|
|
|
|
echo "OK: no CDN caching detected for /api/observers (http=$STATUS, cf-cache-status=${CF_STATUS:-absent}, age=${AGE:-0})"
|
|
exit 0
|