Patch 3 (ble_gap.c): Handle BLE_ERR_CONN_ESTABLISHMENT (574) unconditionally.
NimBLE handled 574 only under BLE_PERIODIC_ADV_WITH_RESPONSES (disabled on
ESP32), so ble_gap_master_failed() was never called. This left the
master GAP state stuck in BLE_GAP_OP_M_CONN, permanently blocking scan and
advertising. Also clean up master state in the default case instead of
assert(0).
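For reference, 574 is the host-level form of the code, not the raw HCI status. The sketch below shows the arithmetic, assuming the NimBLE header values BLE_HS_ERR_HCI_BASE = 0x200 and BLE_ERR_CONN_ESTABLISHMENT = 0x3E. The constants and the hostErrorFromHciStatus() helper are local stand-ins written for illustration, not NimBLE symbols.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of values assumed to match the NimBLE headers.
constexpr int kBleHsErrHciBase = 0x200;                  // assumed BLE_HS_ERR_HCI_BASE
constexpr std::uint8_t kBleErrConnEstablishment = 0x3E;  // assumed BLE_ERR_CONN_ESTABLISHMENT

// Hypothetical helper: maps an HCI status byte to a host error code by
// adding the HCI base offset. A status of 0 (success) stays 0.
constexpr int hostErrorFromHciStatus(std::uint8_t hci_status) {
    return hci_status == 0 ? 0 : kBleHsErrHciBase + hci_status;
}

// 0x200 + 0x3E = 512 + 62 = 574
static_assert(hostErrorFromHciStatus(kBleErrConnEstablishment) == 574,
              "host code for connection-establishment failure");
```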
Patch 4 (NimBLEDevice.cpp): Expose host reset reason via global volatile int.
NimBLE's onReset callback logs the reason code only through ESP_LOG (serial
UART). This patch adds a global, nimble_host_reset_reason, which the BLE loop
polls so the reason also appears in UDP log output for remote soak test
monitoring.
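A minimal sketch of the polling pattern, assuming the loop clears the value after reading it (the commit does not say whether it does). Only nimble_host_reset_reason comes from the patch. onHostReset() and takeResetReason() are hypothetical names.

```cpp
#include <cassert>

// Written from the host reset callback, read from the BLE loop.
volatile int nimble_host_reset_reason = 0;

// Hypothetical stand-in for the patched onReset callback.
void onHostReset(int reason) {
    nimble_host_reset_reason = reason;
}

// Hypothetical poll step for the BLE loop: returns the pending reason and
// clears it, or returns 0 when no reset has been recorded since the last poll.
int takeResetReason() {
    int reason = nimble_host_reset_reason;
    if (reason != 0) {
        nimble_host_reset_reason = 0;
    }
    return reason;
}
```

Note that volatile only prevents the compiler from caching the value. It is not a synchronization primitive, and a second reset between two polls overwrites the first reason. std::atomic<int> with exchange() would be a stricter alternative.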
NimBLEPlatform.cpp: Escalate persistent scan failures to full stack recovery.
After 3 consecutive enterErrorRecovery() rounds fail to restore scanning (30
total scan failures), escalate to recoverBLEStack() (clean reboot) instead
of looping indefinitely in a broken state.
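The escalation logic can be sketched as a pair of counters. This is an illustration, not the code in NimBLEPlatform.cpp: the class and enum names are hypothetical, the 10-failures-per-round threshold is inferred from 30 failures over 3 rounds, and it assumes the third failed round escalates directly and that a successful scan resets both counters.

```cpp
#include <cassert>

enum class RecoveryAction { None, ErrorRecovery, FullStackRecovery };

// Hypothetical tracker for consecutive scan failures.
class ScanFailureEscalator {
public:
    // Call on every failed scan attempt; returns the action to take.
    RecoveryAction onScanFailure() {
        if (++failures_in_round_ < kFailuresPerRound) {
            return RecoveryAction::None;
        }
        failures_in_round_ = 0;
        if (++failed_rounds_ < kMaxRounds) {
            return RecoveryAction::ErrorRecovery;   // would call enterErrorRecovery()
        }
        failed_rounds_ = 0;
        return RecoveryAction::FullStackRecovery;   // would call recoverBLEStack()
    }

    // Call when scanning works again; the failure streak is over.
    void onScanSuccess() {
        failures_in_round_ = 0;
        failed_rounds_ = 0;
    }

private:
    static constexpr int kFailuresPerRound = 10;  // inferred: 30 failures / 3 rounds
    static constexpr int kMaxRounds = 3;
    int failures_in_round_ = 0;
    int failed_rounds_ = 0;
};
```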
Validated with a 17+ hour soak test: the device recovers from desyncs and
maintains 3 active BLE connections with a stable heap (~43K).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Subscribe loopTask and BLE task to the ESP32 Task Watchdog (10s timeout)
to detect and recover from silent hangs. Per-step WDT feeds in the main
loop prevent false triggers from cumulative slow operations.
Fix BLE mutex starvation that blocked the main loop for 3-6s:
- Move processDiscoveredPeers() out of performMaintenance() so _mutex
  is not held during blocking NimBLE connect calls
- Use try_lock() in send_outgoing() to skip sends when BLE task has
  the mutex, rather than blocking (Reticulum retransmits)
- Switch BLE data writes to write-without-response (non-blocking)
- Add WDT feeds to all NimBLE blocking wait loops
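The try_lock() change can be sketched with the standard library as follows. This is an illustration under assumptions, not the project's send_outgoing(): ble_mutex stands in for _mutex, and the sendOutgoing() signature with a counter is hypothetical.

```cpp
#include <cassert>
#include <mutex>

std::mutex ble_mutex;  // stand-in for the BLE interface's _mutex

// Attempts a send without ever blocking the caller.
// Returns true if the "send" ran, false if it was skipped because another
// task currently holds the mutex. A skipped packet is left for the upper
// layer to retransmit.
bool sendOutgoing(int& sent_count) {
    std::unique_lock<std::mutex> lock(ble_mutex, std::try_to_lock);
    if (!lock.owns_lock()) {
        return false;  // mutex busy: skip instead of waiting
    }
    ++sent_count;      // placeholder for the real write
    return true;       // lock is released when `lock` goes out of scope
}
```

The trade-off is latency for reliability at this layer: the main loop never stalls on the mutex, at the cost of occasionally dropping a send that must be recovered elsewhere.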
Replace NimBLE soft-reset recovery with an immediate reboot: calling deinit()
during sync failures caused CORRUPT HEAP panics. With atomic file
persistence, data survives reboots reliably.
Reduce loop task stack from 49KB to 16KB (measured peak ~6KB).
Add NimBLE PHY update null guard to patch_nimble.py.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
NimBLE crash fix:
- Patch ble_hs.c assert(0) in BLE_HS_SYNC_STATE_BRINGUP timer handler
  via pre-build script (patch_nimble.py). The assert fires when a timer
  callback races with host re-sync. The race itself is harmless, but the
  assert kills the ESP32 and corrupts any file writes in progress.
Persistence fixes (in microReticulum submodule):
- Atomic save: write to temp file then rename, protecting existing data
- Fast persist: 5s after dirty flag instead of waiting 60s interval
- Corrupt file recovery: delete invalid files, recover from temp files
- INFO-level logging for load/save visibility
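The atomic save pattern (write to a temp file, then rename) can be sketched with the C++ standard library. This is not the microReticulum code: atomicSave(), readAll(), and the ".tmp" suffix are hypothetical, and it assumes a filesystem where rename() replaces an existing destination, as POSIX does. Embedded filesystems may need the old file removed first.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Writes `data` to `path` via a temp file so a crash mid-write never
// truncates the existing file. Returns false on any failure.
bool atomicSave(const std::string& path, const std::string& data) {
    const std::string tmp = path + ".tmp";
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!out) return false;
        out << data;
        out.flush();
        if (!out) {                       // write failed: discard the temp file
            out.close();
            std::remove(tmp.c_str());
            return false;
        }
    }                                     // stream closed before rename
    if (std::rename(tmp.c_str(), path.c_str()) != 0) {
        std::remove(tmp.c_str());
        return false;
    }
    return true;
}

// Small helper used only to check the result.
std::string readAll(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

If the process dies before the rename, the original file is untouched and the leftover temp file is what the corrupt-file recovery step can inspect. flush() does not force data to the storage medium, so a real implementation may also need a sync call.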
Other:
- Wrap LXMF announce in try/catch for crash safety
- Call Identity::should_persist_data() from main loop
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>