mirror of
https://github.com/Kpa-clawbot/meshcore-analyzer.git
synced 2026-07-04 05:12:04 +00:00
c9b98cb15f
## Summary Fixes #1498. Roots out the actual WS-vs-REST race that has made `test-channels-ws-batch-e2e.js` flaky on master for ~2 weeks. ## Root cause `selectChannel()` and `refreshMessages()` unconditionally replace the in-memory `messages` array with the REST response. Any WebSocket-pushed messages appended between `selectedHash` assignment (when the chat view opens) and the REST resolution were silently stomped. The flaky test was a real-world manifestation: when the synthetic `processWSBatch` injection happened to land BEFORE the in-flight `/channels/<hash>/messages` fetch resolved, the (effectively empty) fixture REST response wiped it out. This is a production bug too — real users would lose any live message that arrived during channel load. ## Why the three prior PRs missed it - **#1499** — added a 500ms `waitForTimeout` before injection. Often enough to let the REST fetch resolve first, but not under any added load. - **#1502** — skipped the test instead of diagnosing. - **#1511** — re-enabled with a "wait by hash, not index" predicate. That fixed the symptom of `messages[length-1]` being some unrelated packet, but did nothing for the underlying race where the WS-pushed message gets wiped entirely by the REST replacement. None of the three PRs reproduced the failure locally. The hypothesis "closure over stale messages" in the test comment was never substantiated. ## Fix Stamp WS-pushed messages with `_fromWS=true` and add a `mergeWsAppendedIntoRest()` helper that preserves WS-pushed messages whose `packetHash` isn't already present in the REST response. Applied to all three REST replacement sites: - `selectChannel()` REST path - `decryptAndRender()` (encrypted channel path) - `refreshMessages()` (background poll) ## Tests Added `test-channels-ws-race-1498-e2e.js`. Deterministically forces the race by stubbing `fetch` to delay the `/channels/<hash>/messages` response 800ms, injects a WS message during the delay, asserts it survives the late REST resolution. - Red commit (`9dfc4b08`): test added against unfixed master HEAD → fails with `WS message stomped by REST fetch — messages after fetch: {"present":false,"count":0,"hashes":[]}`. - Green commit (`8f336591`): applies the fix → passes. Verified the red commit actually fails when the production change is reverted (TDD discipline check). ## Local repro stats Used the instrumented frontend (`public-instrumented/`) which exposes the race more reliably than the raw `public/` build (slower JS load widens the WS-vs-REST window). - Before fix: 29/30 pass (1 reproduced "injected message not found" failure — identical to CI). The new race test: 0/50 pass. - After fix: original `test-channels-ws-batch-e2e.js` — **50/50 pass**. New `test-channels-ws-race-1498-e2e.js` — **50/50 pass**. ## CI Wired the new race test into `.github/workflows/deploy.yml` right after the existing `test-channels-ws-batch-e2e.js` invocation. ## Preflight `bash ~/.openclaw/skills/pr-preflight/scripts/run-all.sh origin/master` → all gates pass (PII, branch scope, red commit, CSS vars, LIKE-on-JSON, sync migration, all warnings). Browser verified: the fix was validated end-to-end against the local fixture server (`http://localhost:13581`) using the headless Chromium the CI uses. E2E assertion added: `test-channels-ws-race-1498-e2e.js` (deterministic race regression). --------- Co-authored-by: bot <bot@local> Co-authored-by: corescope-bot <bot@corescope.local>