fix(ci): aggregate e2e pass/fail across all suites instead of broken digits-before-slash regex (#1298)

RED 33d789c4f3 (test) → GREEN b43bd70f43 (fix).
CI: https://github.com/Kpa-clawbot/CoreScope/actions/workflows/deploy.yml?query=branch%3Afix%2Fe2e-badge-aggregate

Fixes #1296

## Problem
`.github/workflows/deploy.yml` was computing the e2e-tests badge with:

```
E2E_PASS=$(grep -oP '[0-9]+(?=/)' e2e-output.txt | tail -1 || echo "0")
```

This regex matched any run of digits immediately followed by `/` anywhere
in the combined output of 45+ Playwright suites, then took the **last**
match. The result was usually a small number scraped from intermediate
per-suite progress text (often `2` from something like `2/3 …`). The
color was also hard-coded, so the badge perpetually showed
`{"label":"e2e tests","message":"2 passed","color":"brightgreen"}`
regardless of how many tests actually ran or failed.
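
To make the failure mode concrete, here is a minimal sketch run against a made-up three-suite excerpt. The sample lines and the `old_pass` variable are illustrative and not taken from real CI output, and the sketch assumes a grep with `-P` (PCRE) support:

```shell
# Hypothetical excerpt: three suite summaries, 15 tests passed in total.
sample=$(mktemp)
cat > "$sample" <<'EOF'
=== Results: 5 passed, 0 failed ===
8/8 tests passed
2/3 tests passed, 1 failed
EOF

# Old badge logic: every digit run followed by '/', keep only the last one.
old_pass=$(grep -oP '[0-9]+(?=/)' "$sample" | tail -1 || echo "0")
echo "old regex reports: ${old_pass} passed"   # prints "old regex reports: 2 passed"
rm -f "$sample"
```

The first summary has no slash and is ignored entirely, and the failure in the last suite never reaches the badge.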

## Fix
- New `scripts/aggregate-e2e-pass.sh` parses every per-suite summary
shape emitted by `test-*-e2e.js` (`N passed, M failed` /
`passed N failed M` / `N/T tests passed` / `N/T PASS` / `N/T passed` /
`<file>.js: PASS|FAIL`) and sums them. Per-test progress lines (`✓`,
`✗`, indented `PASS:` / `FAIL:`) are skipped so they can't double-count.
- `deploy.yml` runs the aggregator and `eval`s its `PASS=<n> FAIL=<n>`
output, then sets the badge to `"X passed"` (brightgreen) when `FAIL=0`
and `"X passed, Y failed"` (red) otherwise. Badge schema
(`schemaVersion / label / message / color`) is unchanged.
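
The badge branching can be sketched as a small function. `badge_json` is a hypothetical helper name used only for illustration; the workflow inlines the same logic instead of defining a function:

```shell
# badge_json is a hypothetical helper; deploy.yml inlines the same branching.
# Usage: badge_json <pass-count> <fail-count>
badge_json() (
  pass=${1:-0}
  fail=${2:-0}
  if [ "$fail" -gt 0 ]; then
    msg="$pass passed, $fail failed"
    color="red"
  else
    msg="$pass passed"
    color="brightgreen"
  fi
  printf '{"schemaVersion":1,"label":"e2e tests","message":"%s","color":"%s"}\n' "$msg" "$color"
)

badge_green=$(badge_json 108 0)
badge_red=$(badge_json 17 4)
echo "$badge_green"
echo "$badge_red"
```

The function body runs in a subshell (parentheses rather than braces) so its variables do not leak into the caller.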

## TDD
- **RED** 33d789c4f3: adds `test-e2e-badge-aggregate.sh` and the vendored
fixture `test-fixtures/e2e-output-sample.txt` (45 suites of realistic
output). The aggregator stub returns zeros, so the test fails on its
assertion (expected `PASS=108 FAIL=0`, got `PASS=0 FAIL=0`).
- **GREEN** b43bd70f43: real aggregator implementation, so all five
sub-tests pass (fixture aggregate, broken-regex sanity, synthetic mixed
pass/fail, per-test-progress-line guard, missing-file fallback).
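
A minimal sketch of the RED → GREEN transition, assuming two stand-in functions in place of `scripts/aggregate-e2e-pass.sh` at each commit (`aggregate_stub`, `aggregate_real`, and `check` are hypothetical names, and `aggregate_real` simply echoes the expected string rather than parsing anything):

```shell
# Stand-ins for scripts/aggregate-e2e-pass.sh at each commit (hypothetical).
aggregate_stub() { echo "PASS=0 FAIL=0"; }      # RED: stub returns zeros
aggregate_real() { echo "PASS=108 FAIL=0"; }    # GREEN: placeholder for the real parser

expected="PASS=108 FAIL=0"

# check <command>: prints GREEN if the command's output equals $expected.
check() {
  if [ "$($1)" = "$expected" ]; then echo "GREEN"; else echo "RED"; fi
}

red_result=$(check aggregate_stub)
green_result=$(check aggregate_real)
echo "stub: $red_result, real: $green_result"   # prints "stub: RED, real: GREEN"
```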

No force-push. PII preflight clean.

---------

Co-authored-by: openclaw-bot <bot@openclaw.local>
Kpa-clawbot
2026-05-19 22:40:10 -07:00
committed by GitHub
parent 4525c87963
commit e267fb754d
4 changed files with 431 additions and 2 deletions
.github/workflows/deploy.yml (+15 -2)
@@ -279,7 +279,13 @@ jobs:
 - name: Generate frontend coverage badges
   if: success()
   run: |
-    E2E_PASS=$(grep -oP '[0-9]+(?=/)' e2e-output.txt | tail -1 || echo "0")
+    # Aggregate per-suite PASS/FAIL across every test-*-e2e.js summary.
+    # The previous regex (grep -oP '[0-9]+(?=/)' | tail -1) caught a
+    # stray digits-before-slash like the '2' in '2/3 tests passed' from
+    # some sub-output and stamped the badge as '2 passed'. See #1296.
+    eval "$(bash scripts/aggregate-e2e-pass.sh e2e-output.txt)"
+    E2E_PASS=${PASS:-0}
+    E2E_FAIL=${FAIL:-0}
     mkdir -p .badges
     if [ -f .nyc_output/frontend-coverage.json ] || [ -f .nyc_output/e2e-coverage.json ]; then
@@ -292,7 +298,14 @@ jobs:
       echo "{\"schemaVersion\":1,\"label\":\"frontend coverage\",\"message\":\"${FE_COVERAGE}%\",\"color\":\"${FE_COLOR}\"}" > .badges/frontend-coverage.json
       echo "## Frontend: ${FE_COVERAGE}% coverage" >> $GITHUB_STEP_SUMMARY
     fi
-    echo "{\"schemaVersion\":1,\"label\":\"e2e tests\",\"message\":\"${E2E_PASS:-0} passed\",\"color\":\"brightgreen\"}" > .badges/e2e-tests.json
+    if [ "${E2E_FAIL:-0}" -gt 0 ]; then
+      E2E_MSG="${E2E_PASS:-0} passed, ${E2E_FAIL} failed"
+      E2E_COLOR="red"
+    else
+      E2E_MSG="${E2E_PASS:-0} passed"
+      E2E_COLOR="brightgreen"
+    fi
+    echo "{\"schemaVersion\":1,\"label\":\"e2e tests\",\"message\":\"${E2E_MSG}\",\"color\":\"${E2E_COLOR}\"}" > .badges/e2e-tests.json
 - name: Stop test server
   if: always()
scripts/aggregate-e2e-pass.sh (+78)
@@ -0,0 +1,78 @@
#!/usr/bin/env bash
# Aggregate E2E pass/fail counts across all per-suite summary lines.
#
# Each test-*-e2e.js emits a summary line in one of these shapes:
# "N passed, M failed" — most suites (=== Results: 4 passed, 0 failed ===)
# "passed N failed M" — observer-iata style
# "N/T tests passed[, M failed]" — issue-1224 / 1236 / 1273 style
# "N/T PASS" — logo-* suites
# "N/T passed" — nav-fluid / nav-priority / nav-more-floor
# "<file>.js: PASS" — single-test suites (hamburger-dropdown)
# "<file>.js: FAIL ..." — suite-level failure
#
# Per-test progress lines (leading whitespace + PASS:/FAIL:/✓/✗) are skipped to
# avoid double-counting. Each suite's summary is matched by the FIRST pattern
# that fits, so a single line cannot contribute to two counters.
#
# Usage: aggregate-e2e-pass.sh [path-to-e2e-output.txt]
# Prints: PASS=<n> FAIL=<n>
set -u
file=${1:-e2e-output.txt}
pass=0
fail=0
if [ ! -f "$file" ]; then
  echo "PASS=0 FAIL=0"
  exit 0
fi
while IFS= read -r line || [ -n "$line" ]; do
  # Skip per-test progress lines: anything containing ✓/✗, or a line with
  # leading whitespace that contains PASS/FAIL.
  case "$line" in
    *"✓"*|*"✗"*) continue;;
    " "*"PASS"*|" "*"FAIL"*) continue;;
  esac
  # "N/T tests passed[, M failed]"
  if [[ "$line" =~ ([0-9]+)/[0-9]+\ tests\ passed(,\ ([0-9]+)\ failed)? ]]; then
    pass=$((pass + BASH_REMATCH[1]))
    if [ -n "${BASH_REMATCH[3]:-}" ]; then
      fail=$((fail + BASH_REMATCH[3]))
    fi
    continue
  fi
  # "N/T PASS"
  if [[ "$line" =~ ([0-9]+)/[0-9]+\ PASS ]]; then
    pass=$((pass + BASH_REMATCH[1]))
    continue
  fi
  # "N/T passed" (no "tests" between). Must come BEFORE "N passed", which
  # would otherwise match "T passed" and capture T as the pass count.
  if [[ "$line" =~ ([0-9]+)/[0-9]+\ passed ]]; then
    pass=$((pass + BASH_REMATCH[1]))
    continue
  fi
  # "N passed, M failed" / "N passed" (most suites)
  if [[ "$line" =~ ([0-9]+)\ passed(,\ ([0-9]+)\ failed)? ]]; then
    pass=$((pass + BASH_REMATCH[1]))
    if [ -n "${BASH_REMATCH[3]:-}" ]; then
      fail=$((fail + BASH_REMATCH[3]))
    fi
    continue
  fi
  # "passed N failed M"
  if [[ "$line" =~ passed\ ([0-9]+)\ failed\ ([0-9]+) ]]; then
    pass=$((pass + BASH_REMATCH[1]))
    fail=$((fail + BASH_REMATCH[2]))
    continue
  fi
  # Standalone single-suite "<file>.js: PASS"
  if [[ "$line" =~ \.js:\ PASS$ ]]; then
    pass=$((pass + 1))
    continue
  fi
  # Standalone single-suite "<file>.js: FAIL ..."
  if [[ "$line" =~ \.js:\ FAIL ]]; then
    fail=$((fail + 1))
    continue
  fi
done < "$file"
echo "PASS=$pass FAIL=$fail"
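
The pattern order in the loop above matters for slashed summaries. A minimal sketch of why, using `grep -oE` as a stand-in for the script's `[[ =~ ]]` matching (the sample line is the one from the synthetic test case; the variable names are illustrative):

```shell
line='test-foo.js: 3/5 passed'

# Generic "N passed" shape: the only match is "5 passed", so it yields T.
generic=$(printf '%s\n' "$line" | grep -oE '[0-9]+ passed')
generic_count=${generic%% *}

# Slashed "N/T passed" shape: yields N, the number that actually passed.
slashed=$(printf '%s\n' "$line" | grep -oE '[0-9]+/[0-9]+ passed')
slashed_count=${slashed%%/*}

echo "generic=$generic_count slashed=$slashed_count"   # prints "generic=5 slashed=3"
```

Checking the slashed shape first and then calling `continue` keeps the generic shape from ever seeing such a line.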
test-e2e-badge-aggregate.sh (+98)
@@ -0,0 +1,98 @@
#!/usr/bin/env bash
# Test for scripts/aggregate-e2e-pass.sh — verifies aggregate across 45+
# Playwright suites in test-fixtures/e2e-output-sample.txt is correct, not the
# broken old behavior of "grep digits-before-slash | tail -1" (which returned 2).
#
# Regression for #1296.
set -u
script_dir=$(cd "$(dirname "$0")" && pwd)
aggregator="$script_dir/scripts/aggregate-e2e-pass.sh"
fixture="$script_dir/test-fixtures/e2e-output-sample.txt"
if [ ! -x "$aggregator" ]; then
  chmod +x "$aggregator"
fi
# --- Test 1: fixture aggregate ---------------------------------------------
out=$("$aggregator" "$fixture")
# Count expected pass: sum N for every per-suite summary in the fixture.
# Computed by hand from the fixture (45 suites, see file).
EXPECTED_PASS=108
EXPECTED_FAIL=0
EXPECTED="PASS=$EXPECTED_PASS FAIL=$EXPECTED_FAIL"
if [ "$out" != "$EXPECTED" ]; then
  echo "FAIL: fixture aggregate"
  echo " expected: $EXPECTED"
  echo " got: $out"
  exit 1
fi
echo "PASS: fixture aggregates to $out"
# --- Test 2: the broken old regex would have returned something tiny -------
# (sanity check that we are NOT just reproducing the bug). Requires grep -P
# (PCRE), which is available in GitHub-hosted Ubuntu runners and most Linux
# distros but not in BusyBox; skip gracefully if absent.
if echo "x1/2" | grep -qoP '[0-9]+(?=/)' 2>/dev/null; then
  old=$(grep -oP '[0-9]+(?=/)' "$fixture" | tail -1)
  if [ "$old" = "$EXPECTED_PASS" ]; then
    echo "FAIL: old broken regex coincidentally matches expected — fixture is not discriminating"
    exit 1
  fi
  echo "PASS: old broken regex returned '$old' (NOT $EXPECTED_PASS) — fixture proves the bug"
else
  echo "SKIP: grep -P unavailable, cannot verify old broken regex sanity"
fi
# --- Test 3: synthetic with failures, ensures FAIL accounting --------------
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
=== Results: 4 passed, 1 failed ===
=== Results: 2 passed, 0 failed ===
test-foo.js: 3/5 passed
test-bar.js: PASS
test-baz.js: FAIL — boom
passed 7 failed 2
EOF
out2=$("$aggregator" "$tmp")
rm -f "$tmp"
EXP2="PASS=17 FAIL=4"
if [ "$out2" != "$EXP2" ]; then
  echo "FAIL: synthetic mixed pass/fail"
  echo " expected: $EXP2"
  echo " got: $out2"
  exit 1
fi
echo "PASS: synthetic mixed pass/fail aggregates to $out2"
# --- Test 4: per-test progress lines must NOT be counted -------------------
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
✓ test alpha
✓ test beta
✗ test gamma failed
PASS: detail line
FAIL: detail line
=== Results: 2 passed, 1 failed ===
EOF
out3=$("$aggregator" "$tmp")
rm -f "$tmp"
EXP3="PASS=2 FAIL=1"
if [ "$out3" != "$EXP3" ]; then
  echo "FAIL: per-test progress double-count"
  echo " expected: $EXP3"
  echo " got: $out3"
  exit 1
fi
echo "PASS: per-test progress lines correctly ignored ($out3)"
# --- Test 5: empty / missing file ------------------------------------------
out4=$("$aggregator" /nonexistent/path/nope.txt)
if [ "$out4" != "PASS=0 FAIL=0" ]; then
  echo "FAIL: missing file should yield PASS=0 FAIL=0, got $out4"
  exit 1
fi
echo "PASS: missing file → PASS=0 FAIL=0"
echo
echo "ALL TESTS PASS"
test-fixtures/e2e-output-sample.txt (+240)
@@ -0,0 +1,240 @@
Running test-e2e-playwright.js against http://localhost:13581
✓ home loads
✓ nodes table renders
✓ packets table renders
✓ map renders
✓ analytics renders
✓ channels renders
✓ live renders
✓ customizer opens
8/8 tests passed
Running test-filter-ux-e2e.js
✓ filter compiles
✓ filter evaluates equality
✓ filter evaluates AND/OR
✓ filter handles parens
✓ negation works
=== Results: 5 passed, 0 failed ===
Running test-channel-issue-1087-e2e.js
✓ tab renders
✓ message list loads
=== Results: 2 passed, 0 failed ===
Running test-channel-issue-1111-e2e.js
✓ count badge updates
✓ filter clears
✓ scroll restores
=== Results: 3 passed, 0 failed ===
Running test-map-modal-fluid-e2e.js
✓ modal opens
✓ modal closes
✓ scroll lock
✓ keyboard escape
=== Results: 4 passed, 0 failed ===
Running test-observer-iata-1188-e2e.js
✓ observer IATA resolves
✓ observer IATA missing handled
✓ observer IATA legacy
✓ observer IATA mixed
✓ observer IATA empty
All observer-IATA E2E tests passed.
=== Results: passed 5 failed 0 ===
Running test-nav-fluid-1055-e2e.js
PASS: narrow viewport
PASS: medium viewport
PASS: wide viewport
test-nav-fluid-1055-e2e.js: OK — 3/3 passed
Running test-nav-priority-1102-e2e.js
PASS: priority A
PASS: priority B
PASS: priority C
PASS: overflow rule
test-nav-priority-1102-e2e.js: OK — 4/4 passed
Running test-nav-more-floor-1139-e2e.js
PASS: small
PASS: medium
test-nav-more-floor-1139-e2e.js: OK — 2/2 passed
Running test-bottom-nav-1061-e2e.js
PASS: shows on mobile
PASS: hidden on desktop
PASS: highlights active
test-bottom-nav-1061-e2e.js: 3 passed, 0 failed
Running test-gestures-1062-e2e.js
PASS: swipe left
PASS: swipe right
test-gestures-1062-e2e.js: 2 passed, 0 failed
Running test-gestures-1185-scroll-discriminator-e2e.js
PASS: vertical scroll
PASS: horizontal swipe
PASS: diagonal ignored
test-gestures-1185-scroll-discriminator-e2e.js: 3 passed, 0 failed
Running test-gesture-hints-1065-e2e.js
PASS: hint shows once
PASS: hint dismissed
PASS: localStorage stored
test-gesture-hints-1065-e2e.js: 3 passed, 0 failed
Running test-channel-fluid-e2e.js
✓ fluid layout 1
✓ fluid layout 2
=== Results: 2 passed, 0 failed ===
Running test-table-fluid-e2e.js
✓ table fluid 1
✓ table fluid 2
=== Results: 2 passed, 0 failed ===
Running test-charts-fluid-1058-e2e.js
✓ chart 1
✓ chart 2
=== #1058 fluid analytics charts E2E: 2 passed, 0 failed ===
Running test-slideover-1056-e2e.js
✓ slideover open
✓ slideover close
✓ slideover backdrop
✓ slideover keyboard
=== #1056 AC#4 slide-over E2E: 4 passed, 0 failed, 0 skipped ===
Running test-slideover-1168-munger-e2e.js
✓ munger 1
✓ munger 2
=== #1168 Munger SlideOver hardening: 2 passed, 0 failed ===
Running test-logo-pulse-1173-e2e.js
✓ pulse animates
✓ pulse respects reduced motion
=== #1173 logo-pulse E2E: 2 passed, 0 failed ===
Running test-issue-1122-packets-filter-ux-e2e.js
✓ packets filter UX 1
✓ packets filter UX 2
✓ packets filter UX 3
=== Results: 3 passed, 0 failed ===
Running test-issue-1128-packets-layout-e2e.js
✓ packets layout 1
✓ packets layout 2
=== Results: 2 passed, 0 failed ===
Running test-issue-1128-multi-viewport-e2e.js
✓ multi viewport 1
✓ multi viewport 2
✓ multi viewport 3
=== Results: 3 passed, 0 failed ===
Running test-issue-1136-live-region-e2e.js
✓ live region announces
✓ live region debounced
=== Results: 2 passed, 0 failed ===
Running test-issue-1150-404-state-e2e.js
✓ 404 renders
✓ 404 link back
=== Results: 2 passed, 0 failed ===
Running test-issue-1146-path-link-contrast-e2e.js
✓ contrast computed
✓ contrast above threshold
=== Results: 2 passed, 0 failed ===
Running test-issue-1147-section-order-e2e.js
✓ order 1
✓ order 2
=== Results: 2 passed, 0 failed ===
Running test-issue-1151-orphan-separators-e2e.js
✓ orphan separators handled
=== #1151: 1 passed, 0 failed ===
Running test-logo-rebrand-e2e.js
✓ rebrand 1
✓ rebrand 2
✓ rebrand 3
test-logo-rebrand-e2e.js: 3/3 PASS
Running test-logo-theme-e2e.js
✓ theme 1
✓ theme 2
test-logo-theme-e2e.js: 2/2 PASS
Running test-logo-default-sage-teal-e2e.js
✓ default sage teal
test-logo-default-sage-teal-e2e.js: 1/1 PASS
Running test-issue-1109-hamburger-dropdown-visible-e2e.js
test-issue-1109-hamburger-dropdown-visible-e2e.js: PASS
Running test-live-layout-1178-1179-e2e.js
✓ live layout
✓ live layout 2
=== Results: 2 passed, 0 failed ===
Running test-issue-1205-live-controls-anchor-e2e.js
✓ anchor 1
✓ anchor 2
=== Results: 2 passed, 0 failed ===
Running test-live-mql-leak-1180-e2e.js
✓ no leak
=== Results: 1 passed, 0 failed ===
Running test-issue-1204-live-panel-structure-e2e.js
✓ structure 1
✓ structure 2
=== Results: 2 passed, 0 failed ===
Running test-issue-1234-live-chrome-pass2-e2e.js
✓ chrome pass2
=== Results: 1 passed, 0 failed ===
Running test-issue-1206-vcr-overlap-e2e.js
✓ vcr 1
✓ vcr 2
#1206 VCR overlap: 2 passed, 0 failed
Running test-issue-1244-live-vcr-row-hints-e2e.js
✓ row hints
=== Results: 1 passed, 0 failed ===
Running test-issue-1224-channels-mobile-ux-e2e.js
✓ mobile UX 1
✓ mobile UX 2
✓ mobile UX 3
3/3 tests passed
Running test-issue-1236-map-mobile-e2e.js
✓ map mobile 1
✓ map mobile 2
2/2 tests passed
Running test-issue-1273-qr-overlay-height-e2e.js
✓ qr overlay
1/1 tests passed
Running test-issue-1281-location-row-e2e.js
✓ location row 1
✓ location row 2
=== Results: 2 passed, 0 failed ===
Running test-issue-1279-legend-p2-e2e.js
✓ legend p2
=== Results: 1 passed, 0 failed ===
Running test-issue-1206-resize-observer-leak-e2e.js
✓ resize observer
#1206 ResizeObserver leak: 1 passed, 0 failed
Running test-nav-drawer-1064-e2e.js
PASS: drawer 1
PASS: drawer 2
test-nav-drawer-1064-e2e.js: 2 passed, 0 failed