Files
meshcore-analyzer/.github
Kpa-clawbot c9b98cb15f fix(#1498): preserve WS-pushed messages across REST replacements (#1513)
## Summary

Fixes #1498. Roots out the actual WS-vs-REST race that has made
`test-channels-ws-batch-e2e.js` flaky on master for ~2 weeks.

## Root cause

`selectChannel()` and `refreshMessages()` unconditionally replace the
in-memory `messages` array with the REST response. Any WebSocket-pushed
messages appended between `selectedHash` assignment (when the chat view
opens) and the REST resolution were silently stomped. The flaky test
was a real-world manifestation: when the synthetic `processWSBatch`
injection happened to land BEFORE the in-flight
`/channels/<hash>/messages` fetch resolved, the (effectively empty)
fixture REST response wiped it out. This is a production bug too —
real users would lose any live message that arrived during channel
load.

## Why the three prior PRs missed it

- **#1499** — added a 500ms `waitForTimeout` before injection. Often
  enough to let the REST fetch resolve first, but not under any added
  load.
- **#1502** — skipped the test instead of diagnosing.
- **#1511** — re-enabled with a "wait by hash, not index" predicate.
  That fixed the symptom of `messages[length-1]` being some unrelated
  packet, but did nothing for the underlying race where the WS-pushed
  message gets wiped entirely by the REST replacement.

None of the three PRs reproduced the failure locally. The hypothesis
"closure over stale messages" in the test comment was never
substantiated.

## Fix

Stamp WS-pushed messages with `_fromWS=true` and add a
`mergeWsAppendedIntoRest()` helper that preserves WS-pushed messages
whose `packetHash` isn't already present in the REST response. Applied
to all three REST replacement sites:

- `selectChannel()` REST path
- `decryptAndRender()` (encrypted channel path)
- `refreshMessages()` (background poll)

## Tests

Added `test-channels-ws-race-1498-e2e.js`. Deterministically forces
the race by stubbing `fetch` to delay the
`/channels/<hash>/messages` response 800ms, injects a WS message
during the delay, asserts it survives the late REST resolution.

- Red commit (`9dfc4b08`): test added against unfixed master HEAD →
  fails with `WS message stomped by REST fetch — messages after fetch:
  {"present":false,"count":0,"hashes":[]}`.
- Green commit (`8f336591`): applies the fix → passes.

Verified the red commit actually fails when the production change is
reverted (TDD discipline check).

## Local repro stats

Used the instrumented frontend (`public-instrumented/`) which exposes
the race more reliably than the raw `public/` build (slower JS load
widens the WS-vs-REST window).

- Before fix: 29/30 pass (1 reproduced "injected message not found"
  failure — identical to CI). The new race test: 0/50 pass.
- After fix: original `test-channels-ws-batch-e2e.js` — **50/50 pass**.
  New `test-channels-ws-race-1498-e2e.js` — **50/50 pass**.

## CI

Wired the new race test into `.github/workflows/deploy.yml` right
after the existing `test-channels-ws-batch-e2e.js` invocation.

## Preflight

`bash ~/.openclaw/skills/pr-preflight/scripts/run-all.sh origin/master`
→ all gates pass (PII, branch scope, red commit, CSS vars,
LIKE-on-JSON, sync migration, all warnings).

Browser verified: the fix was validated end-to-end against the local
fixture server (`http://localhost:13581`) using the headless Chromium
the CI uses.

E2E assertion added: `test-channels-ws-race-1498-e2e.js` (deterministic
race regression).

---------

Co-authored-by: bot <bot@local>
Co-authored-by: corescope-bot <bot@corescope.local>
2026-05-31 11:29:15 -07:00
..