fix(ci): aggregate e2e pass/fail across all suites instead of broken digits-before-slash regex (#1298)

RED 33d789c4f3 (test) → GREEN b43bd70f43 (fix).
CI: https://github.com/Kpa-clawbot/CoreScope/actions/workflows/deploy.yml?query=branch%3Afix%2Fe2e-badge-aggregate

Fixes #1296

## Problem

`.github/workflows/deploy.yml` was computing the e2e-tests badge with:

```
E2E_PASS=$(grep -oP '[0-9]+(?=/)' e2e-output.txt | tail -1 || echo "0")
```

This regex matched any digit-run immediately followed by `/` anywhere in the combined output of 45+ Playwright suites, then took the **last** match. The result was usually a small number scraped out of intermediate per-suite progress text (often `2` from something like `2/3 …`), so the badge perpetually showed `{"label":"e2e tests","message":"2 passed","color":"brightgreen"}` regardless of how many tests actually ran.

## Fix

- New `scripts/aggregate-e2e-pass.sh` parses every per-suite summary shape emitted by `test-*-e2e.js` (`N passed, M failed` / `passed N failed M` / `N/T tests passed` / `N/T PASS` / `N/T passed` / `<file>.js: PASS|FAIL`) and sums them. Per-test progress lines (`✓`, `PASS:`) are skipped so they can't double-count.
- `deploy.yml` evals the aggregator's `PASS=<n> FAIL=<n>` output and sets the badge to `"X passed"` (brightgreen) when `FAIL=0` and `"X passed, Y failed"` (red) otherwise. Badge schema (`schemaVersion / label / message / color`) unchanged.

## TDD

- **RED** 33d789c4f3: adds `test-e2e-badge-aggregate.sh` + vendored fixture `test-fixtures/e2e-output-sample.txt` (45 suites of realistic output). Aggregator stub returns zeros → test fails on assertion (`PASS=108 FAIL=0` expected, `PASS=0 FAIL=0` got).
- **GREEN** b43bd70f43: real aggregator implementation → all five sub-tests pass (fixture aggregate, broken-regex sanity, synthetic mixed pass/fail, per-test-progress-line guard, missing-file fallback).

No force-push. PII preflight clean.

---------

Co-authored-by: openclaw-bot <bot@openclaw.local>
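
The failure mode described under Problem can be reproduced in isolation. The following is a minimal sketch, not part of the commit: the three-line `sample` log is hypothetical, and it assumes GNU grep with PCRE support (`-P`), as on GitHub-hosted Ubuntu runners.

```shell
#!/usr/bin/env bash
# Hypothetical three-suite log; the real e2e-output.txt is far longer.
sample=$(printf '%s\n' \
  '=== Results: 5 passed, 0 failed ===' \
  '8/8 tests passed' \
  'test-nav.js: OK, 2/3 passed')

# Old badge logic: collect every digit run directly before a "/",
# then keep only the last one.
old=$(printf '%s\n' "$sample" | grep -oP '[0-9]+(?=/)' | tail -1)

# The matches are "8" (from 8/8) and "2" (from 2/3); tail -1 keeps "2".
# Summing the per-suite pass counts by hand gives 5 + 8 + 2 = 15.
echo "old badge count: $old"   # prints "old badge count: 2"
```

Whatever suite happens to print the last `N/T` summary decides the badge value, which is why the reported count stayed small no matter how many suites ran.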
2026-05-22 09:25:09 +00:00 · 2026-05-19 22:40:10 -07:00
parent 4525c87963
commit e267fb754d
4 changed files with 431 additions and 2 deletions
@@ -279,7 +279,13 @@ jobs:
      - name: Generate frontend coverage badges
        if: success()
        run: |
-          E2E_PASS=$(grep -oP '[0-9]+(?=/)' e2e-output.txt | tail -1 || echo "0")
+          # Aggregate per-suite PASS/FAIL across every test-*-e2e.js summary.
+          # The previous regex (grep -oP '[0-9]+(?=/)' | tail -1) caught a
+          # stray digits-before-slash like the '2' in '2/3 tests passed' from
+          # some sub-output and stamped the badge as '2 passed'. See #1296.
+          eval "$(bash scripts/aggregate-e2e-pass.sh e2e-output.txt)"
+          E2E_PASS=${PASS:-0}
+          E2E_FAIL=${FAIL:-0}

          mkdir -p .badges
          if [ -f .nyc_output/frontend-coverage.json ] || [ -f .nyc_output/e2e-coverage.json ]; then
@@ -292,7 +298,14 @@ jobs:
            echo "{\"schemaVersion\":1,\"label\":\"frontend coverage\",\"message\":\"${FE_COVERAGE}%\",\"color\":\"${FE_COLOR}\"}" > .badges/frontend-coverage.json
            echo "## Frontend: ${FE_COVERAGE}% coverage" >> $GITHUB_STEP_SUMMARY
          fi
-          echo "{\"schemaVersion\":1,\"label\":\"e2e tests\",\"message\":\"${E2E_PASS:-0} passed\",\"color\":\"brightgreen\"}" > .badges/e2e-tests.json
+          if [ "${E2E_FAIL:-0}" -gt 0 ]; then
+            E2E_MSG="${E2E_PASS:-0} passed, ${E2E_FAIL} failed"
+            E2E_COLOR="red"
+          else
+            E2E_MSG="${E2E_PASS:-0} passed"
+            E2E_COLOR="brightgreen"
+          fi
+          echo "{\"schemaVersion\":1,\"label\":\"e2e tests\",\"message\":\"${E2E_MSG}\",\"color\":\"${E2E_COLOR}\"}" > .badges/e2e-tests.json

      - name: Stop test server
        if: always()
@@ -0,0 +1,78 @@
+#!/usr/bin/env bash
+# Aggregate E2E pass/fail counts across all per-suite summary lines.
+#
+# Each test-*-e2e.js emits a summary line in one of these shapes:
+#   "N passed, M failed"             — most suites (=== Results: 4 passed, 0 failed ===)
+#   "passed N failed M"              — observer-iata style
+#   "N/T tests passed[, M failed]"   — issue-1224 / 1236 / 1273 style
+#   "N/T PASS"                       — logo-* suites
+#   "N/T passed"                     — nav-fluid / nav-priority / nav-more-floor
+#   "<file>.js: PASS"                — single-test suites (hamburger-dropdown)
+#   "<file>.js: FAIL ..."            — suite-level failure
+#
+# Per-test progress lines (leading whitespace + PASS:/FAIL:/✓/✗) are skipped to
+# avoid double-counting. Each suite's summary is matched by the FIRST pattern
+# that fits, so a single line cannot contribute to two counters.
+#
+# Usage: aggregate-e2e-pass.sh [path-to-e2e-output.txt]
+# Prints:  PASS=<n> FAIL=<n>
+set -u
+file=${1:-e2e-output.txt}
+pass=0
+fail=0
+if [ ! -f "$file" ]; then
+  echo "PASS=0 FAIL=0"
+  exit 0
+fi
+while IFS= read -r line || [ -n "$line" ]; do
+  # Skip per-test progress lines (leading whitespace + marker).
+  case "$line" in
+    *"✓"*|*"✗"*) continue;;
+    " "*"PASS"*|" "*"FAIL"*) continue;;
+  esac
+
+  # "N/T tests passed[, M failed]"   (must come BEFORE "N passed" to avoid
+  # capturing T as the pass count)
+  if [[ "$line" =~ ([0-9]+)/[0-9]+\ tests\ passed(,\ ([0-9]+)\ failed)? ]]; then
+    pass=$((pass + BASH_REMATCH[1]))
+    if [ -n "${BASH_REMATCH[3]:-}" ]; then
+      fail=$((fail + BASH_REMATCH[3]))
+    fi
+    continue
+  fi
+  # "N/T PASS"
+  if [[ "$line" =~ ([0-9]+)/[0-9]+\ PASS ]]; then
+    pass=$((pass + BASH_REMATCH[1]))
+    continue
+  fi
+  # "N/T passed" (no "tests" between)
+  if [[ "$line" =~ ([0-9]+)/[0-9]+\ passed ]]; then
+    pass=$((pass + BASH_REMATCH[1]))
+    continue
+  fi
+  # "N passed, M failed" / "N passed"  (most suites)
+  if [[ "$line" =~ ([0-9]+)\ passed(,\ ([0-9]+)\ failed)? ]]; then
+    pass=$((pass + BASH_REMATCH[1]))
+    if [ -n "${BASH_REMATCH[3]:-}" ]; then
+      fail=$((fail + BASH_REMATCH[3]))
+    fi
+    continue
+  fi
+  # "passed N failed M"
+  if [[ "$line" =~ passed\ ([0-9]+)\ failed\ ([0-9]+) ]]; then
+    pass=$((pass + BASH_REMATCH[1]))
+    fail=$((fail + BASH_REMATCH[2]))
+    continue
+  fi
+  # Standalone single-suite "<file>.js: PASS"
+  if [[ "$line" =~ \.js:\ PASS$ ]]; then
+    pass=$((pass + 1))
+    continue
+  fi
+  # Standalone single-suite "<file>.js: FAIL ..."
+  if [[ "$line" =~ \.js:\ FAIL ]]; then
+    fail=$((fail + 1))
+    continue
+  fi
+done < "$file"
+echo "PASS=$pass FAIL=$fail"
@@ -0,0 +1,98 @@
+#!/usr/bin/env bash
+# Test for scripts/aggregate-e2e-pass.sh: verifies that the aggregate across the
+# 45 Playwright suites in test-fixtures/e2e-output-sample.txt is correct, not the
+# broken old behavior of "grep digits-before-slash | tail -1" (which typically returned 2 in CI).
+#
+# Regression for #1296.
+set -u
+script_dir=$(cd "$(dirname "$0")" && pwd)
+aggregator="$script_dir/scripts/aggregate-e2e-pass.sh"
+fixture="$script_dir/test-fixtures/e2e-output-sample.txt"
+
+if [ ! -x "$aggregator" ]; then
+  chmod +x "$aggregator"
+fi
+
+# --- Test 1: fixture aggregate ---------------------------------------------
+out=$("$aggregator" "$fixture")
+# Count expected pass: sum N for every per-suite summary in the fixture.
+# Computed by hand from the fixture (45 suites, see file).
+EXPECTED_PASS=108
+EXPECTED_FAIL=0
+EXPECTED="PASS=$EXPECTED_PASS FAIL=$EXPECTED_FAIL"
+
+if [ "$out" != "$EXPECTED" ]; then
+  echo "FAIL: fixture aggregate"
+  echo "  expected: $EXPECTED"
+  echo "  got:      $out"
+  exit 1
+fi
+echo "PASS: fixture aggregates to $out"
+
+# --- Test 2: the broken old regex would have returned something tiny -------
+# (sanity check that we are NOT just reproducing the bug). Requires grep -P
+# (PCRE), which is available in GitHub-hosted Ubuntu runners and most Linux
+# distros but not in BusyBox; skip gracefully if absent.
+if echo "x1/2" | grep -qoP '[0-9]+(?=/)' 2>/dev/null; then
+  old=$(grep -oP '[0-9]+(?=/)' "$fixture" | tail -1)
+  if [ "$old" = "$EXPECTED_PASS" ]; then
+    echo "FAIL: old broken regex coincidentally matches expected — fixture is not discriminating"
+    exit 1
+  fi
+  echo "PASS: old broken regex returned '$old' (NOT $EXPECTED_PASS) — fixture proves the bug"
+else
+  echo "SKIP: grep -P unavailable, cannot verify old broken regex sanity"
+fi
+
+# --- Test 3: synthetic with failures, ensures FAIL accounting --------------
+tmp=$(mktemp)
+cat > "$tmp" <<'EOF'
+=== Results: 4 passed, 1 failed ===
+=== Results: 2 passed, 0 failed ===
+test-foo.js: 3/5 passed
+test-bar.js: PASS
+test-baz.js: FAIL — boom
+passed 7 failed 2
+EOF
+out2=$("$aggregator" "$tmp")
+rm -f "$tmp"
+EXP2="PASS=17 FAIL=4"
+if [ "$out2" != "$EXP2" ]; then
+  echo "FAIL: synthetic mixed pass/fail"
+  echo "  expected: $EXP2"
+  echo "  got:      $out2"
+  exit 1
+fi
+echo "PASS: synthetic mixed pass/fail aggregates to $out2"
+
+# --- Test 4: per-test progress lines must NOT be counted -------------------
+tmp=$(mktemp)
+cat > "$tmp" <<'EOF'
+  ✓ test alpha
+  ✓ test beta
+  ✗ test gamma failed
+  PASS: detail line
+  FAIL: detail line
+=== Results: 2 passed, 1 failed ===
+EOF
+out3=$("$aggregator" "$tmp")
+rm -f "$tmp"
+EXP3="PASS=2 FAIL=1"
+if [ "$out3" != "$EXP3" ]; then
+  echo "FAIL: per-test progress double-count"
+  echo "  expected: $EXP3"
+  echo "  got:      $out3"
+  exit 1
+fi
+echo "PASS: per-test progress lines correctly ignored ($out3)"
+
+# --- Test 5: missing file --------------------------------------------------
+out4=$("$aggregator" /nonexistent/path/nope.txt)
+if [ "$out4" != "PASS=0 FAIL=0" ]; then
+  echo "FAIL: missing file should yield PASS=0 FAIL=0, got $out4"
+  exit 1
+fi
+echo "PASS: missing file → PASS=0 FAIL=0"
+
+echo
+echo "ALL TESTS PASS"
@@ -0,0 +1,240 @@
+Running test-e2e-playwright.js against http://localhost:13581
+  ✓ home loads
+  ✓ nodes table renders
+  ✓ packets table renders
+  ✓ map renders
+  ✓ analytics renders
+  ✓ channels renders
+  ✓ live renders
+  ✓ customizer opens
+
+8/8 tests passed
+Running test-filter-ux-e2e.js
+  ✓ filter compiles
+  ✓ filter evaluates equality
+  ✓ filter evaluates AND/OR
+  ✓ filter handles parens
+  ✓ negation works
+
+=== Results: 5 passed, 0 failed ===
+Running test-channel-issue-1087-e2e.js
+  ✓ tab renders
+  ✓ message list loads
+=== Results: 2 passed, 0 failed ===
+Running test-channel-issue-1111-e2e.js
+  ✓ count badge updates
+  ✓ filter clears
+  ✓ scroll restores
+=== Results: 3 passed, 0 failed ===
+Running test-map-modal-fluid-e2e.js
+  ✓ modal opens
+  ✓ modal closes
+  ✓ scroll lock
+  ✓ keyboard escape
+
+=== Results: 4 passed, 0 failed ===
+Running test-observer-iata-1188-e2e.js
+  ✓ observer IATA resolves
+  ✓ observer IATA missing handled
+  ✓ observer IATA legacy
+  ✓ observer IATA mixed
+  ✓ observer IATA empty
+All observer-IATA E2E tests passed.
+
+=== Results: passed 5 failed 0 ===
+Running test-nav-fluid-1055-e2e.js
+  PASS: narrow viewport
+  PASS: medium viewport
+  PASS: wide viewport
+
+test-nav-fluid-1055-e2e.js: OK — 3/3 passed
+Running test-nav-priority-1102-e2e.js
+  PASS: priority A
+  PASS: priority B
+  PASS: priority C
+  PASS: overflow rule
+
+test-nav-priority-1102-e2e.js: OK — 4/4 passed
+Running test-nav-more-floor-1139-e2e.js
+  PASS: small
+  PASS: medium
+
+test-nav-more-floor-1139-e2e.js: OK — 2/2 passed
+Running test-bottom-nav-1061-e2e.js
+  PASS: shows on mobile
+  PASS: hidden on desktop
+  PASS: highlights active
+
+test-bottom-nav-1061-e2e.js: 3 passed, 0 failed
+Running test-gestures-1062-e2e.js
+  PASS: swipe left
+  PASS: swipe right
+
+test-gestures-1062-e2e.js: 2 passed, 0 failed
+Running test-gestures-1185-scroll-discriminator-e2e.js
+  PASS: vertical scroll
+  PASS: horizontal swipe
+  PASS: diagonal ignored
+
+test-gestures-1185-scroll-discriminator-e2e.js: 3 passed, 0 failed
+Running test-gesture-hints-1065-e2e.js
+  PASS: hint shows once
+  PASS: hint dismissed
+  PASS: localStorage stored
+
+test-gesture-hints-1065-e2e.js: 3 passed, 0 failed
+Running test-channel-fluid-e2e.js
+  ✓ fluid layout 1
+  ✓ fluid layout 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-table-fluid-e2e.js
+  ✓ table fluid 1
+  ✓ table fluid 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-charts-fluid-1058-e2e.js
+  ✓ chart 1
+  ✓ chart 2
+
+=== #1058 fluid analytics charts E2E: 2 passed, 0 failed ===
+Running test-slideover-1056-e2e.js
+  ✓ slideover open
+  ✓ slideover close
+  ✓ slideover backdrop
+  ✓ slideover keyboard
+
+=== #1056 AC#4 slide-over E2E: 4 passed, 0 failed, 0 skipped ===
+Running test-slideover-1168-munger-e2e.js
+  ✓ munger 1
+  ✓ munger 2
+
+=== #1168 Munger SlideOver hardening: 2 passed, 0 failed ===
+Running test-logo-pulse-1173-e2e.js
+  ✓ pulse animates
+  ✓ pulse respects reduced motion
+
+=== #1173 logo-pulse E2E: 2 passed, 0 failed ===
+Running test-issue-1122-packets-filter-ux-e2e.js
+  ✓ packets filter UX 1
+  ✓ packets filter UX 2
+  ✓ packets filter UX 3
+
+=== Results: 3 passed, 0 failed ===
+Running test-issue-1128-packets-layout-e2e.js
+  ✓ packets layout 1
+  ✓ packets layout 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1128-multi-viewport-e2e.js
+  ✓ multi viewport 1
+  ✓ multi viewport 2
+  ✓ multi viewport 3
+
+=== Results: 3 passed, 0 failed ===
+Running test-issue-1136-live-region-e2e.js
+  ✓ live region announces
+  ✓ live region debounced
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1150-404-state-e2e.js
+  ✓ 404 renders
+  ✓ 404 link back
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1146-path-link-contrast-e2e.js
+  ✓ contrast computed
+  ✓ contrast above threshold
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1147-section-order-e2e.js
+  ✓ order 1
+  ✓ order 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1151-orphan-separators-e2e.js
+  ✓ orphan separators handled
+
+=== #1151: 1 passed, 0 failed ===
+Running test-logo-rebrand-e2e.js
+  ✓ rebrand 1
+  ✓ rebrand 2
+  ✓ rebrand 3
+
+test-logo-rebrand-e2e.js: 3/3 PASS
+Running test-logo-theme-e2e.js
+  ✓ theme 1
+  ✓ theme 2
+
+test-logo-theme-e2e.js: 2/2 PASS
+Running test-logo-default-sage-teal-e2e.js
+  ✓ default sage teal
+
+test-logo-default-sage-teal-e2e.js: 1/1 PASS
+Running test-issue-1109-hamburger-dropdown-visible-e2e.js
+test-issue-1109-hamburger-dropdown-visible-e2e.js: PASS
+Running test-live-layout-1178-1179-e2e.js
+  ✓ live layout
+  ✓ live layout 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1205-live-controls-anchor-e2e.js
+  ✓ anchor 1
+  ✓ anchor 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-live-mql-leak-1180-e2e.js
+  ✓ no leak
+
+=== Results: 1 passed, 0 failed ===
+Running test-issue-1204-live-panel-structure-e2e.js
+  ✓ structure 1
+  ✓ structure 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1234-live-chrome-pass2-e2e.js
+  ✓ chrome pass2
+
+=== Results: 1 passed, 0 failed ===
+Running test-issue-1206-vcr-overlap-e2e.js
+  ✓ vcr 1
+  ✓ vcr 2
+
+#1206 VCR overlap: 2 passed, 0 failed
+Running test-issue-1244-live-vcr-row-hints-e2e.js
+  ✓ row hints
+
+=== Results: 1 passed, 0 failed ===
+Running test-issue-1224-channels-mobile-ux-e2e.js
+  ✓ mobile UX 1
+  ✓ mobile UX 2
+  ✓ mobile UX 3
+
+3/3 tests passed
+Running test-issue-1236-map-mobile-e2e.js
+  ✓ map mobile 1
+  ✓ map mobile 2
+
+2/2 tests passed
+Running test-issue-1273-qr-overlay-height-e2e.js
+  ✓ qr overlay
+
+1/1 tests passed
+Running test-issue-1281-location-row-e2e.js
+  ✓ location row 1
+  ✓ location row 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1279-legend-p2-e2e.js
+  ✓ legend p2
+
+=== Results: 1 passed, 0 failed ===
+Running test-issue-1206-resize-observer-leak-e2e.js
+  ✓ resize observer
+
+#1206 ResizeObserver leak: 1 passed, 0 failed
+Running test-nav-drawer-1064-e2e.js
+  PASS: drawer 1
+  PASS: drawer 2
+
+test-nav-drawer-1064-e2e.js: 2 passed, 0 failed