From e267fb754d899bbbb5a5992f4df06001a1fe9482 Mon Sep 17 00:00:00 2001
From: Kpa-clawbot
Date: Tue, 19 May 2026 22:40:10 -0700
Subject: [PATCH] fix(ci): aggregate e2e pass/fail across all suites instead of broken digits-before-slash regex (#1298)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

RED 33d789c4f312626557126c2d4508511fc1035d3c (test) → GREEN b43bd70f437a68115b8c14f17f24452c9658e41c (fix).
CI: https://github.com/Kpa-clawbot/CoreScope/actions/workflows/deploy.yml?query=branch%3Afix%2Fe2e-badge-aggregate

Fixes #1296

## Problem

`.github/workflows/deploy.yml` was computing the e2e-tests badge with:

```
E2E_PASS=$(grep -oP '[0-9]+(?=/)' e2e-output.txt | tail -1 || echo "0")
```

This regex matched any digit-run immediately followed by `/` anywhere in the combined output of 45+ Playwright suites, then took the **last** match. The result was usually a small number scraped out of intermediate per-suite progress text (often `2` from something like `2/3 …`), so the badge perpetually showed `{"label":"e2e tests","message":"2 passed","color":"brightgreen"}` regardless of how many tests actually ran.

## Fix

- New `scripts/aggregate-e2e-pass.sh` parses every per-suite summary shape emitted by `test-*-e2e.js` (`N passed, M failed` / `passed N failed M` / `N/T tests passed` / `N/T PASS` / `.js: PASS|FAIL`) and sums them. Per-test progress lines (`✓`, `PASS:`) are skipped so they can't double-count.
- `deploy.yml` runs the aggregator and evals its `PASS=… FAIL=…` output, then sets the badge to `"X passed"` (brightgreen) when `FAIL=0` and `"X passed, Y failed"` (red) otherwise. Badge schema (`schemaVersion / label / message / color`) unchanged.

## TDD

- **RED** 33d789c4f312626557126c2d4508511fc1035d3c: adds `test-e2e-badge-aggregate.sh` + vendored fixture `test-fixtures/e2e-output-sample.txt` (45 suites of realistic output). Aggregator stub returns zeros → test fails on assertion (`PASS=108 FAIL=0` expected, `PASS=0 FAIL=0` got).
- **GREEN** b43bd70f437a68115b8c14f17f24452c9658e41c: real aggregator implementation → all five sub-tests pass (fixture aggregate, broken-regex sanity, synthetic mixed pass/fail, per-test-progress-line guard, missing-file fallback).

No force-push. PII preflight clean.

---------

Co-authored-by: openclaw-bot
---
 .github/workflows/deploy.yml        |  17 +-
 scripts/aggregate-e2e-pass.sh       |  78 +++++++++
 test-e2e-badge-aggregate.sh         |  98 ++++++++++++
 test-fixtures/e2e-output-sample.txt | 240 ++++++++++++++++++++++++++++
 4 files changed, 431 insertions(+), 2 deletions(-)
 create mode 100755 scripts/aggregate-e2e-pass.sh
 create mode 100755 test-e2e-badge-aggregate.sh
 create mode 100644 test-fixtures/e2e-output-sample.txt

diff --git a/.github/workflows/deploy.yml b/.github/workflows/deploy.yml
index 2c96355a..b8accd33 100644
--- a/.github/workflows/deploy.yml
+++ b/.github/workflows/deploy.yml
@@ -279,7 +279,13 @@ jobs:
       - name: Generate frontend coverage badges
         if: success()
         run: |
-          E2E_PASS=$(grep -oP '[0-9]+(?=/)' e2e-output.txt | tail -1 || echo "0")
+          # Aggregate per-suite PASS/FAIL across every test-*-e2e.js summary.
+          # The previous regex (grep -oP '[0-9]+(?=/)' | tail -1) caught a
+          # stray digits-before-slash like the '2' in '2/3 tests passed' from
+          # some sub-output and stamped the badge as '2 passed'. See #1296.
+          eval "$(bash scripts/aggregate-e2e-pass.sh e2e-output.txt)"
+          E2E_PASS=${PASS:-0}
+          E2E_FAIL=${FAIL:-0}
           mkdir -p .badges
           if [ -f .nyc_output/frontend-coverage.json ] || [ -f .nyc_output/e2e-coverage.json ]; then
@@ -292,7 +298,14 @@ jobs:
             echo "{\"schemaVersion\":1,\"label\":\"frontend coverage\",\"message\":\"${FE_COVERAGE}%\",\"color\":\"${FE_COLOR}\"}" > .badges/frontend-coverage.json
             echo "## Frontend: ${FE_COVERAGE}% coverage" >> $GITHUB_STEP_SUMMARY
           fi
-          echo "{\"schemaVersion\":1,\"label\":\"e2e tests\",\"message\":\"${E2E_PASS:-0} passed\",\"color\":\"brightgreen\"}" > .badges/e2e-tests.json
+          if [ "${E2E_FAIL:-0}" -gt 0 ]; then
+            E2E_MSG="${E2E_PASS:-0} passed, ${E2E_FAIL} failed"
+            E2E_COLOR="red"
+          else
+            E2E_MSG="${E2E_PASS:-0} passed"
+            E2E_COLOR="brightgreen"
+          fi
+          echo "{\"schemaVersion\":1,\"label\":\"e2e tests\",\"message\":\"${E2E_MSG}\",\"color\":\"${E2E_COLOR}\"}" > .badges/e2e-tests.json
       - name: Stop test server
         if: always()
diff --git a/scripts/aggregate-e2e-pass.sh b/scripts/aggregate-e2e-pass.sh
new file mode 100755
index 00000000..3b00aec8
--- /dev/null
+++ b/scripts/aggregate-e2e-pass.sh
@@ -0,0 +1,78 @@
+#!/usr/bin/env bash
+# Aggregate E2E pass/fail counts across all per-suite summary lines.
+#
+# Each test-*-e2e.js emits a summary line in one of these shapes:
+#   "N passed, M failed"            — most suites (=== Results: 4 passed, 0 failed ===)
+#   "passed N failed M"             — observer-iata style
+#   "N/T tests passed[, M failed]"  — issue-1224 / 1236 / 1273 style
+#   "N/T PASS"                      — logo-* suites
+#   "N/T passed"                    — nav-fluid / nav-priority / nav-more-floor
+#   ".js: PASS"                     — single-test suites (hamburger-dropdown)
+#   ".js: FAIL ..."                 — suite-level failure
+#
+# Per-test progress lines (leading whitespace + PASS:/FAIL:/✓/✗) are skipped to
+# avoid double-counting. Each suite's summary is matched by the FIRST pattern
+# that fits, so a single line cannot contribute to two counters.
+#
+# Usage:  aggregate-e2e-pass.sh [path-to-e2e-output.txt]
+# Prints: PASS=<n> FAIL=<m>
+set -u
+file=${1:-e2e-output.txt}
+pass=0
+fail=0
+if [ ! -f "$file" ]; then
+  echo "PASS=0 FAIL=0"
+  exit 0
+fi
+while IFS= read -r line || [ -n "$line" ]; do
+  # Skip per-test progress lines (leading whitespace + marker).
+  case "$line" in
+    *"✓"*|*"✗"*) continue;;
+    " "*"PASS"*|" "*"FAIL"*) continue;;
+  esac
+
+  # "N/T tests passed[, M failed]" (must come BEFORE "N passed" to avoid
+  # capturing T as the pass count)
+  if [[ "$line" =~ ([0-9]+)/[0-9]+\ tests\ passed(,\ ([0-9]+)\ failed)? ]]; then
+    pass=$((pass + BASH_REMATCH[1]))
+    if [ -n "${BASH_REMATCH[3]:-}" ]; then
+      fail=$((fail + BASH_REMATCH[3]))
+    fi
+    continue
+  fi
+  # "N/T PASS"
+  if [[ "$line" =~ ([0-9]+)/[0-9]+\ PASS ]]; then
+    pass=$((pass + BASH_REMATCH[1]))
+    continue
+  fi
+  # "N/T passed" (no "tests" between)
+  if [[ "$line" =~ ([0-9]+)/[0-9]+\ passed ]]; then
+    pass=$((pass + BASH_REMATCH[1]))
+    continue
+  fi
+  # "N passed, M failed" / "N passed" (most suites)
+  if [[ "$line" =~ ([0-9]+)\ passed(,\ ([0-9]+)\ failed)? ]]; then
+    pass=$((pass + BASH_REMATCH[1]))
+    if [ -n "${BASH_REMATCH[3]:-}" ]; then
+      fail=$((fail + BASH_REMATCH[3]))
+    fi
+    continue
+  fi
+  # "passed N failed M"
+  if [[ "$line" =~ passed\ ([0-9]+)\ failed\ ([0-9]+) ]]; then
+    pass=$((pass + BASH_REMATCH[1]))
+    fail=$((fail + BASH_REMATCH[2]))
+    continue
+  fi
+  # Standalone single-suite ".js: PASS"
+  if [[ "$line" =~ \.js:\ PASS$ ]]; then
+    pass=$((pass + 1))
+    continue
+  fi
+  # Standalone single-suite ".js: FAIL ..."
+  if [[ "$line" =~ \.js:\ FAIL ]]; then
+    fail=$((fail + 1))
+    continue
+  fi
+done < "$file"
+echo "PASS=$pass FAIL=$fail"
diff --git a/test-e2e-badge-aggregate.sh b/test-e2e-badge-aggregate.sh
new file mode 100755
index 00000000..c274ddb9
--- /dev/null
+++ b/test-e2e-badge-aggregate.sh
@@ -0,0 +1,98 @@
+#!/usr/bin/env bash
+# Test for scripts/aggregate-e2e-pass.sh — verifies aggregate across 45+
+# Playwright suites in test-fixtures/e2e-output-sample.txt is correct, not the
+# broken old behavior of "grep digits-before-slash | tail -1" (which returned 2).
+#
+# Regression for #1296.
+set -u
+script_dir=$(cd "$(dirname "$0")" && pwd)
+aggregator="$script_dir/scripts/aggregate-e2e-pass.sh"
+fixture="$script_dir/test-fixtures/e2e-output-sample.txt"
+
+if [ ! -x "$aggregator" ]; then
+  chmod +x "$aggregator"
+fi
+
+# --- Test 1: fixture aggregate ---------------------------------------------
+out=$("$aggregator" "$fixture")
+# Count expected pass: sum N for every per-suite summary in the fixture.
+# Computed by hand from the fixture (45 suites, see file).
+EXPECTED_PASS=108
+EXPECTED_FAIL=0
+EXPECTED="PASS=$EXPECTED_PASS FAIL=$EXPECTED_FAIL"
+
+if [ "$out" != "$EXPECTED" ]; then
+  echo "FAIL: fixture aggregate"
+  echo " expected: $EXPECTED"
+  echo " got: $out"
+  exit 1
+fi
+echo "PASS: fixture aggregates to $out"
+
+# --- Test 2: the broken old regex would have returned something tiny -------
+# (sanity check that we are NOT just reproducing the bug). Requires grep -P
+# (PCRE), which is available in GitHub-hosted Ubuntu runners and most Linux
+# distros but not in BusyBox; skip gracefully if absent.
+if echo "x1/2" | grep -qoP '[0-9]+(?=/)' 2>/dev/null; then
+  old=$(grep -oP '[0-9]+(?=/)' "$fixture" | tail -1)
+  if [ "$old" = "$EXPECTED_PASS" ]; then
+    echo "FAIL: old broken regex coincidentally matches expected — fixture is not discriminating"
+    exit 1
+  fi
+  echo "PASS: old broken regex returned '$old' (NOT $EXPECTED_PASS) — fixture proves the bug"
+else
+  echo "SKIP: grep -P unavailable, cannot verify old broken regex sanity"
+fi
+
+# --- Test 3: synthetic with failures, ensures FAIL accounting --------------
+tmp=$(mktemp)
+cat > "$tmp" <<'EOF'
+=== Results: 4 passed, 1 failed ===
+=== Results: 2 passed, 0 failed ===
+test-foo.js: 3/5 passed
+test-bar.js: PASS
+test-baz.js: FAIL — boom
+passed 7 failed 2
+EOF
+out2=$("$aggregator" "$tmp")
+rm -f "$tmp"
+EXP2="PASS=17 FAIL=4"
+if [ "$out2" != "$EXP2" ]; then
+  echo "FAIL: synthetic mixed pass/fail"
+  echo " expected: $EXP2"
+  echo " got: $out2"
+  exit 1
+fi
+echo "PASS: synthetic mixed pass/fail aggregates to $out2"
+
+# --- Test 4: per-test progress lines must NOT be counted -------------------
+tmp=$(mktemp)
+cat > "$tmp" <<'EOF'
+  ✓ test alpha
+  ✓ test beta
+  ✗ test gamma failed
+  PASS: detail line
+  FAIL: detail line
+=== Results: 2 passed, 1 failed ===
+EOF
+out3=$("$aggregator" "$tmp")
+rm -f "$tmp"
+EXP3="PASS=2 FAIL=1"
+if [ "$out3" != "$EXP3" ]; then
+  echo "FAIL: per-test progress double-count"
+  echo " expected: $EXP3"
+  echo " got: $out3"
+  exit 1
+fi
+echo "PASS: per-test progress lines correctly ignored ($out3)"
+
+# --- Test 5: empty / missing file ------------------------------------------
+out4=$("$aggregator" /nonexistent/path/nope.txt)
+if [ "$out4" != "PASS=0 FAIL=0" ]; then
+  echo "FAIL: missing file should yield PASS=0 FAIL=0, got $out4"
+  exit 1
+fi
+echo "PASS: missing file → PASS=0 FAIL=0"
+
+echo
+echo "ALL TESTS PASS"
diff --git a/test-fixtures/e2e-output-sample.txt b/test-fixtures/e2e-output-sample.txt
new file mode 100644
index 00000000..80e16564
--- /dev/null
+++ b/test-fixtures/e2e-output-sample.txt
@@ -0,0 +1,240 @@
+Running test-e2e-playwright.js against http://localhost:13581
+  ✓ home loads
+  ✓ nodes table renders
+  ✓ packets table renders
+  ✓ map renders
+  ✓ analytics renders
+  ✓ channels renders
+  ✓ live renders
+  ✓ customizer opens
+
+8/8 tests passed
+Running test-filter-ux-e2e.js
+  ✓ filter compiles
+  ✓ filter evaluates equality
+  ✓ filter evaluates AND/OR
+  ✓ filter handles parens
+  ✓ negation works
+
+=== Results: 5 passed, 0 failed ===
+Running test-channel-issue-1087-e2e.js
+  ✓ tab renders
+  ✓ message list loads
+=== Results: 2 passed, 0 failed ===
+Running test-channel-issue-1111-e2e.js
+  ✓ count badge updates
+  ✓ filter clears
+  ✓ scroll restores
+=== Results: 3 passed, 0 failed ===
+Running test-map-modal-fluid-e2e.js
+  ✓ modal opens
+  ✓ modal closes
+  ✓ scroll lock
+  ✓ keyboard escape
+
+=== Results: 4 passed, 0 failed ===
+Running test-observer-iata-1188-e2e.js
+  ✓ observer IATA resolves
+  ✓ observer IATA missing handled
+  ✓ observer IATA legacy
+  ✓ observer IATA mixed
+  ✓ observer IATA empty
+All observer-IATA E2E tests passed.
+
+=== Results: passed 5 failed 0 ===
+Running test-nav-fluid-1055-e2e.js
+  PASS: narrow viewport
+  PASS: medium viewport
+  PASS: wide viewport
+
+test-nav-fluid-1055-e2e.js: OK — 3/3 passed
+Running test-nav-priority-1102-e2e.js
+  PASS: priority A
+  PASS: priority B
+  PASS: priority C
+  PASS: overflow rule
+
+test-nav-priority-1102-e2e.js: OK — 4/4 passed
+Running test-nav-more-floor-1139-e2e.js
+  PASS: small
+  PASS: medium
+
+test-nav-more-floor-1139-e2e.js: OK — 2/2 passed
+Running test-bottom-nav-1061-e2e.js
+  PASS: shows on mobile
+  PASS: hidden on desktop
+  PASS: highlights active
+
+test-bottom-nav-1061-e2e.js: 3 passed, 0 failed
+Running test-gestures-1062-e2e.js
+  PASS: swipe left
+  PASS: swipe right
+
+test-gestures-1062-e2e.js: 2 passed, 0 failed
+Running test-gestures-1185-scroll-discriminator-e2e.js
+  PASS: vertical scroll
+  PASS: horizontal swipe
+  PASS: diagonal ignored
+
+test-gestures-1185-scroll-discriminator-e2e.js: 3 passed, 0 failed
+Running test-gesture-hints-1065-e2e.js
+  PASS: hint shows once
+  PASS: hint dismissed
+  PASS: localStorage stored
+
+test-gesture-hints-1065-e2e.js: 3 passed, 0 failed
+Running test-channel-fluid-e2e.js
+  ✓ fluid layout 1
+  ✓ fluid layout 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-table-fluid-e2e.js
+  ✓ table fluid 1
+  ✓ table fluid 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-charts-fluid-1058-e2e.js
+  ✓ chart 1
+  ✓ chart 2
+
+=== #1058 fluid analytics charts E2E: 2 passed, 0 failed ===
+Running test-slideover-1056-e2e.js
+  ✓ slideover open
+  ✓ slideover close
+  ✓ slideover backdrop
+  ✓ slideover keyboard
+
+=== #1056 AC#4 slide-over E2E: 4 passed, 0 failed, 0 skipped ===
+Running test-slideover-1168-munger-e2e.js
+  ✓ munger 1
+  ✓ munger 2
+
+=== #1168 Munger SlideOver hardening: 2 passed, 0 failed ===
+Running test-logo-pulse-1173-e2e.js
+  ✓ pulse animates
+  ✓ pulse respects reduced motion
+
+=== #1173 logo-pulse E2E: 2 passed, 0 failed ===
+Running test-issue-1122-packets-filter-ux-e2e.js
+  ✓ packets filter UX 1
+  ✓ packets filter UX 2
+  ✓ packets filter UX 3
+
+=== Results: 3 passed, 0 failed ===
+Running test-issue-1128-packets-layout-e2e.js
+  ✓ packets layout 1
+  ✓ packets layout 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1128-multi-viewport-e2e.js
+  ✓ multi viewport 1
+  ✓ multi viewport 2
+  ✓ multi viewport 3
+
+=== Results: 3 passed, 0 failed ===
+Running test-issue-1136-live-region-e2e.js
+  ✓ live region announces
+  ✓ live region debounced
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1150-404-state-e2e.js
+  ✓ 404 renders
+  ✓ 404 link back
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1146-path-link-contrast-e2e.js
+  ✓ contrast computed
+  ✓ contrast above threshold
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1147-section-order-e2e.js
+  ✓ order 1
+  ✓ order 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1151-orphan-separators-e2e.js
+  ✓ orphan separators handled
+
+=== #1151: 1 passed, 0 failed ===
+Running test-logo-rebrand-e2e.js
+  ✓ rebrand 1
+  ✓ rebrand 2
+  ✓ rebrand 3
+
+test-logo-rebrand-e2e.js: 3/3 PASS
+Running test-logo-theme-e2e.js
+  ✓ theme 1
+  ✓ theme 2
+
+test-logo-theme-e2e.js: 2/2 PASS
+Running test-logo-default-sage-teal-e2e.js
+  ✓ default sage teal
+
+test-logo-default-sage-teal-e2e.js: 1/1 PASS
+Running test-issue-1109-hamburger-dropdown-visible-e2e.js
+test-issue-1109-hamburger-dropdown-visible-e2e.js: PASS
+Running test-live-layout-1178-1179-e2e.js
+  ✓ live layout
+  ✓ live layout 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1205-live-controls-anchor-e2e.js
+  ✓ anchor 1
+  ✓ anchor 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-live-mql-leak-1180-e2e.js
+  ✓ no leak
+
+=== Results: 1 passed, 0 failed ===
+Running test-issue-1204-live-panel-structure-e2e.js
+  ✓ structure 1
+  ✓ structure 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1234-live-chrome-pass2-e2e.js
+  ✓ chrome pass2
+
+=== Results: 1 passed, 0 failed ===
+Running test-issue-1206-vcr-overlap-e2e.js
+  ✓ vcr 1
+  ✓ vcr 2
+
+#1206 VCR overlap: 2 passed, 0 failed
+Running test-issue-1244-live-vcr-row-hints-e2e.js
+  ✓ row hints
+
+=== Results: 1 passed, 0 failed ===
+Running test-issue-1224-channels-mobile-ux-e2e.js
+  ✓ mobile UX 1
+  ✓ mobile UX 2
+  ✓ mobile UX 3
+
+3/3 tests passed
+Running test-issue-1236-map-mobile-e2e.js
+  ✓ map mobile 1
+  ✓ map mobile 2
+
+2/2 tests passed
+Running test-issue-1273-qr-overlay-height-e2e.js
+  ✓ qr overlay
+
+1/1 tests passed
+Running test-issue-1281-location-row-e2e.js
+  ✓ location row 1
+  ✓ location row 2
+
+=== Results: 2 passed, 0 failed ===
+Running test-issue-1279-legend-p2-e2e.js
+  ✓ legend p2
+
+=== Results: 1 passed, 0 failed ===
+Running test-issue-1206-resize-observer-leak-e2e.js
+  ✓ resize observer
+
+#1206 ResizeObserver leak: 1 passed, 0 failed
+Running test-nav-drawer-1064-e2e.js
+  PASS: drawer 1
+  PASS: drawer 2
+
+test-nav-drawer-1064-e2e.js: 2 passed, 0 failed
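
Reviewer note, not part of the patch: the `deploy.yml` hunk reads the aggregator's one-line `PASS=N FAIL=M` output with `eval` and then picks the badge message and color. The same selection logic can be sketched without `eval`, in POSIX shell, as below. `badge_json` is a hypothetical helper introduced only for illustration; the JSON fields and the red/brightgreen rule are copied from the hunk, and the parsing approach is an assumption rather than what the patch does.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): turn "PASS=N FAIL=M" into badge JSON.
# The body runs in a subshell "( ... )" so its variables stay local.
badge_json() (
  counts=$1
  pass=${counts#PASS=}     # drop the leading "PASS="
  pass=${pass%% *}         # keep only the text before the first space
  fail=${counts##*FAIL=}   # keep only the text after "FAIL="
  # Anything that is not a plain non-negative integer falls back to 0.
  case "$pass" in ''|*[!0-9]*) pass=0 ;; esac
  case "$fail" in ''|*[!0-9]*) fail=0 ;; esac
  if [ "$fail" -gt 0 ]; then
    msg="$pass passed, $fail failed"
    color="red"
  else
    msg="$pass passed"
    color="brightgreen"
  fi
  printf '{"schemaVersion":1,"label":"e2e tests","message":"%s","color":"%s"}\n' "$msg" "$color"
)

badge_json "PASS=108 FAIL=0"
# {"schemaVersion":1,"label":"e2e tests","message":"108 passed","color":"brightgreen"}
badge_json "PASS=17 FAIL=4"
# {"schemaVersion":1,"label":"e2e tests","message":"17 passed, 4 failed","color":"red"}
```

The fallback to 0 for unparseable input mirrors the `${PASS:-0}` and `${FAIL:-0}` defaults in the workflow step, so a missing or malformed aggregator line would still produce a valid badge file.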