pyxis

mirror of https://github.com/torlando-tech/pyxis.git synced 2026-05-17 04:45:15 +00:00

Author	SHA1	Message	Date
Torlando	70b8df052d	Merge pull request #12 from torlando-tech/feature/splash-screen Add boot splash screen with Pyxis constellation logo	2026-03-04 18:38:43 -05:00
torlando-tech	ff00c1d783	Clean up splash preprocessor structure and add include warning Consolidate #ifndef/#ifdef into single #ifdef/#else/#endif block. Add warning comment to generated header about static linkage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 18:19:49 -05:00
torlando-tech	bae59ff424	Fix unused BG_COLOR warning and make show_splash() private Move BG_COLOR inline into #ifndef block to avoid unused variable when HAS_SPLASH_IMAGE is defined. Make show_splash() private since it's only called internally from init_hardware_only(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 17:48:38 -05:00
Torlando	3c81eeb5be	Merge pull request #11 from torlando-tech/fix/ble-wdt-stability Fix Task WDT crashes from LVGL priority starvation	2026-03-04 17:07:27 -05:00
torlando-tech	ed8c08109f	Add ble_hs_synced() guard to notifyAll() Matches the guard already on notify() to prevent use-after-free of _tx_char during a NimBLE host reset. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 16:20:06 -05:00
torlando-tech	5b2a1ab53e	Skip redundant fill_screen when full-screen splash image is available Saves one full 320x240 SPI screen write before the splash renders. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 16:09:49 -05:00
torlando-tech	4ba97057c5	Remove no-op esp_task_wdt_reset() calls from NimBLEPlatform BLE task is no longer subscribed to WDT, so these 23 calls were silently returning ESP_ERR_NOT_FOUND. Removes dead code and the now-unused esp_task_wdt.h include. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 15:32:44 -05:00
torlando-tech	2a1b98f8f1	Fix _initialized never set in Display::init() Prevents double PSRAM allocation and LVGL driver re-registration if init() were called more than once. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 14:39:49 -05:00
Torlando	2f61b80567	Update lib/tdeck_ui/Hardware/TDeck/Display.cpp Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-03-04 14:20:44 -05:00
Torlando	a93f7c258f	Apply suggestion from @greptile-apps[bot] Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-03-04 14:19:48 -05:00
Torlando	f24d80688a	Merge pull request #10 from torlando-tech/feature/sd-spi-bus-sharing Add shared SPI bus mutex for SD card coexistence	2026-03-04 14:18:22 -05:00
torlando-tech	0f85f4dd69	Add Pyxis logo to README Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 14:12:36 -05:00
torlando-tech	27668a7515	Add PYXIS text to splash and render full-screen 320x240 Scale constellation to 80% and shift down to make room for title text. Generate splash at full display resolution instead of 160x160 centered. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 14:12:25 -05:00
torlando-tech	30dc48086c	Add splash screen icon and build-time SVG-to-RGB565 generator - pyxis-icon.svg: Pyxis constellation icon (3 stars with connecting lines) - generate_splash.py: PlatformIO pre-build script that renders the SVG to a 160x160 RGB565 PROGMEM header (SplashImage.h) using cairosvg + Pillow - .gitignore: Exclude generated SplashImage.h - platformio.ini: Add generate_splash.py to both build environments Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 14:12:25 -05:00
torlando-tech	a4a1aacdd8	Show boot splash within 1s of power-on instead of after 20s+ init Move Display::init_hardware_only() and POWER_EN to right after serial banner, before GPS/WiFi/SD/Reticulum init. Add 150ms delay after POWER_EN HIGH so ST7789V power rail stabilizes before SPI commands (without this, SWRESET is sent to an unpowered chip and silently lost). Splash now visible for entire boot period (~18s) until LVGL takes over. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 14:12:25 -05:00
torlando-tech	c80e63dee9	Fix Task WDT crashes: LVGL priority starvation + BLE WDT false positives Two root causes for frequent device reboots: 1. LVGL task (priority 2) starved loopTask (priority 1) on core 1. During heavy screen rendering, loopTask couldn't run for 30+ seconds, triggering the Task WDT. Fixed by lowering LVGL to priority 1 so FreeRTOS round-robins both tasks fairly. 2. BLE task was registered with the 30s Task WDT, but blocking NimBLE GATT operations (connect + service discovery + subscribe + read) can legitimately take 30-60s total. Removed BLE task from WDT since NimBLE has its own internal ~30s timeouts per GATT operation. Also added ble_hs_synced() guards to write(), read(), notify(), writeCharacteristic(), discoverServices(), and enableNotifications() to prevent use-after-free on stale NimBLE client pointers during host resets. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 14:11:51 -05:00
torlando-tech	0608af6d38	Unify all SPI peripherals on global FSPI to fix pin conflicts Display and LoRa were creating separate SPIClass(HSPI) instances which claimed GPIO pins via the matrix, preventing SD card (on FSPI) from accessing MISO after Display init. Now all three peripherals use the global SPI (FSPI) instance, eliminating GPIO routing conflicts. - Display: use &SPI instead of new SPIClass(HSPI) - SX1262Interface: use &SPI instead of new SPIClass(HSPI) - SDAccess: enable format_if_empty for unformatted cards Verified on device: SD (128GB SDHC), display, and LoRa all coexist. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 09:41:35 -05:00
torlando-tech	b4afa6d3f7	Fix SD card SPI init: use FSPI before Display claims HSPI SD card was unresponsive (MISO stuck 0xFF) because Display's HSPI peripheral had already claimed the GPIO pins via the matrix, preventing FSPI from routing MISO. Fix by initializing SD card BEFORE Display, using the global SPI (FSPI) instance — matching LilyGo's reference code. - Move SD card init before display init in boot sequence - Use global SPI (FSPI) instead of Display's SPIClass(HSPI) - Lower SPI frequency to 800kHz matching LilyGo example - Drive all CS lines (display, LoRa, SD) high before SD init - Add MISO=38 to Display's SPI.begin for post-init bus sharing - Add Display::get_spi() accessor for future shared use Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 09:29:47 -05:00
torlando-tech	d03f0b308f	Add shared SPI bus mutex for SD card, display, and LoRa coexistence The T-Deck Plus shares HSPI across the display (CS=12), LoRa (CS=9), and SD card (CS=39). Previously SD logging was disabled because SD.begin() reconfigured the SPI bus and blanked the display. This introduces a FreeRTOS mutex created in main.cpp and injected into Display, SX1262Interface, and a new SDAccess class so all three peripherals serialize their SPI transactions safely. - Add SDAccess class wrapping SD.begin() and file ops with mutex - Add set_spi_mutex() to Display and SX1262Interface - Wrap Display flush, fill, draw, and power ops in mutex - Refactor SDLogger to use SDAccess mutex instead of owning SD.begin() - Wire up mutex creation and injection order in setup() Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 00:19:10 -05:00
Torlando	d9411fb4bb	Merge pull request #7 from torlando-tech/ble-stability-audit BLE stability: fix desync crash loops and scan recovery v0.2.2	2026-03-03 23:39:38 -05:00
Torlando	e3cf0c75bf	Merge pull request #8 from torlando-tech/fix/flasher-versioned-releases Fix web flasher versioned releases (CORS)	2026-03-03 23:39:17 -05:00
torlando-tech	fd62d5042f	Use exact match for HTTP status code check grep -q 200 could match 2001 or other superstrings; -qx anchors to the full line which is correct for curl's %{http_code} output. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 23:33:10 -05:00
Torlando	5971771ed2	Merge pull request #5 from davidcranor/fix/cross-platform-build Fix cross-platform build: replace ${PROJECT_DIR} with relative paths	2026-03-03 23:29:38 -05:00
Torlando	1d0a58de64	Merge pull request #9 from torlando-tech/ci/build-check Add CI build check for pull requests	2026-03-03 21:26:33 -05:00
torlando-tech	1c5dd9936a	Add CI build check for pull requests Runs both tdeck and tdeck-bluedroid builds on every PR to main. Uploads firmware binaries as artifacts on the tdeck build for easy testing by reviewers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 21:17:14 -05:00
torlando-tech	4f22776971	Add tone as explicit dependency of tdeck_ui UIManager.cpp includes Tone.h, so tdeck_ui should declare this dependency rather than relying on implicit global discovery. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 21:05:42 -05:00
torlando-tech	8ebb637a78	Add landing page at GitHub Pages root from README.md Generates a styled HTML page from README.md and deploys it to the root of the GitHub Pages site, fixing the 404 at torlando-tech.github.io/pyxis/. Includes navigation links to the web flasher, GitHub repo, and releases. Closes #6 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 20:59:56 -05:00
torlando-tech	ee87edef7a	Fix web flasher versioned releases failing with CORS error GitHub release download URLs redirect to release-assets.githubusercontent.com which doesn't return Access-Control-Allow-Origin headers. The browser blocks cross-origin fetches from the GitHub Pages flasher, causing "Failed to fetch" for any versioned release while the latest dev build (same-origin) works fine. Fix: Deploy versioned firmware binaries to GitHub Pages alongside the dev build at firmware/releases/{tag}/, so all versions are fetched same-origin. The CI workflow now downloads existing release assets and deploys them to Pages with keep_files: true to preserve across deploys. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 20:57:38 -05:00
torlando-tech	46ce057a1e	BLE stability: host-controller resync, stuck GAP conn cancel, scan diagnostics After a 574 connection failure, the NimBLE controller's scan state can become corrupted (returning rc=530 / Invalid HCI Params) even after the host re-syncs. This led to scan failure escalation and device reboots. Key fixes: - Add ble_gap_conn_cancel() to enterErrorRecovery() — stuck GAP master connection operations were blocking all subsequent scans - Add ble_hs_sched_reset(BLE_HS_ECONTROLLER) in error recovery to force a full host-controller resynchronization after desync - Proactively cancel stale GAP connections before scan start - Reduce SCAN_FAIL_RECOVERY_THRESHOLD from 10 to 5 for faster recovery - Enhanced scan failure logging with GAP state diagnostics - Move ESP reset reason logging after WiFi init for UDP log visibility - Suppress connection candidate log spam when at max connections Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 19:57:55 -05:00
torlando-tech	2cc9441f0a	BLE stability: desync connect cooldown prevents crash-on-connect Add 30-second cooldown after NimBLE host desync recovery before allowing new connection attempts. During desync, client->connect() blocks waiting for a host-task completion event that never arrives, causing WDT crashes. The cooldown skips connection attempts while the host is desynced or recently recovered. Also adds ESP reset reason logging at boot to diagnose crash types (WDT, panic, brownout, etc.) in soak test logs. Soak test results: Run 3 (before) had 17 reboots in ~4 hours with a 12-crash-in-14-minutes loop. Run 4 (after) has 1 early reboot then 19+ hours of continuous uptime with the same desync frequency. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 18:34:40 -05:00
davidcranor	827ff2eb42	Fix cross-platform build: replace ${PROJECT_DIR} with relative paths platformio.ini: - Replace -I${PROJECT_DIR}/lib, -I${PROJECT_DIR}/deps/... with relative paths (-Ilib, -Ideps/...) in both tdeck-bluedroid and tdeck environments; ${PROJECT_DIR} is mangled on Windows inside build_flags, causing include paths to resolve inside the PlatformIO builder directory instead of the project root - Remove hardcoded -I.pio/libdeps/tdeck/TinyGPSPlus/src and -I.pio/libdeps/tdeck/NimBLE-Arduino/src; these paths reference generated cache, break on fresh clones, and are redundant with lib_ldf_mode = deep+ - Fix OTA upload_command: replace python3 with $PYTHONEXE so it resolves to PlatformIO's bundled Python on Windows, macOS, and Linux src/main.cpp, lib/tdeck_ui/UI/LXMF/UIManager.cpp: - Change #include "tone/Tone.h" to #include "Tone.h"; PlatformIO automatically adds -Ilib/tone for local libraries, making the subdirectory prefix unnecessary and broken when -Ilib is not effective Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-03 15:19:32 -05:00
torlando-tech	74d832fb63	NimBLE patches: fix 574 stuck GAP state, add desync diagnostics Patch 3 (ble_gap.c): Handle BLE_ERR_CONN_ESTABLISHMENT (574) unconditionally. NimBLE only handled 574 under BLE_PERIODIC_ADV_WITH_RESPONSES (disabled on ESP32), causing ble_gap_master_failed() to never be called. This left the master GAP state stuck in BLE_GAP_OP_M_CONN, permanently blocking scan and advertising. Also clean up master state in the default case instead of assert(0). Patch 4 (NimBLEDevice.cpp): Expose host reset reason via global volatile int. NimBLE's onReset callback logs the reason code through ESP_LOG (serial UART only). This patch adds nimble_host_reset_reason that the BLE loop polls to capture the reason in UDP log output for remote soak test monitoring. NimBLEPlatform.cpp: Escalate persistent scan failures to full stack recovery. After 3 consecutive enterErrorRecovery() rounds fail to restore scanning (30 total scan failures), escalate to recoverBLEStack() (clean reboot) instead of looping indefinitely in a broken state. Validated with 17+ hour soak test: device recovers from desyncs and maintains 3 active BLE connections with stable heap (~43K). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 12:49:41 -05:00
torlando-tech	8d23c03e3b	Fix conversation list showing hashes instead of display names after restart After boot, the conversation list called recall_app_data() once during initial load. If announces hadn't arrived yet (or known destinations hadn't been loaded with app_data), conversations showed raw hashes permanently until the user navigated away and back. Add a lazy name resolution check to update_status() (called every 3s): if any conversations have unresolved names, try recall_app_data() again and refresh the list when a display name becomes available. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 00:24:30 -05:00
torlando-tech	d6d4eb2c9c	BLE stability: defer disconnect processing, fix data races, harden operations Critical fixes for NimBLE host task / BLE loop task concurrency: - Defer all disconnect map cleanup from NimBLE callbacks to loop task via SPSC ring buffer, preventing iterator invalidation and use-after-free - Defer enterErrorRecovery() from callback context to loop task - Add WDT feed in enterErrorRecovery() host-sync polling loop Operational hardening: - Cache NimBLERemoteCharacteristic* pointers in write() to avoid repeated service/characteristic lookups per fragment - Add isConnected() checks before GATT operations (read, enableNotifications) - Validate peer address in notification callback to guard against handle reuse - Skip stuck-state detector during CONNECTING/CONN_STARTING states - Expire stale pending data entries after HANDSHAKE_TIMEOUT (30s) - Read actual connection RSSI via ble_gap_conn_rssi() for peripheral connections instead of hardcoding 0 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 00:15:24 -05:00
Torlando	774f055362	Merge pull request #4 from torlando-tech/torlando-tech-patch-1 Revise README for LXMF and LXST client details	2026-03-02 23:30:05 -05:00
Torlando	7b2a4b80c8	Revise README for LXMF and LXST client details Updated README to reflect changes in features and stability.	2026-03-02 23:29:55 -05:00
torlando-tech	609a3bc62b	LXMF propagation sync, manual node entry, and status improvements Propagation sync (microReticulum submodule): - Fix msgpack interop: send nil (not 0) for per_transfer_limit so Python server doesn't reject all messages as exceeding "0 KB limit" - Fix Resource response routing: extract request_id from packed data when not present in Resource advertisement, route to pending request callback instead of generic concluded handler - Fix Link::request() to manually build packed arrays, avoiding Bytes::to_msgpack() BIN-wrapping that breaks protocol interop UI enhancements: - PropagationNodesScreen: manual node entry via 32-char hex hash in search field, with paste support and radio button selection - StatusScreen: display stamp cost from propagation node - UIManager: NVS persistence for selected propagation node, proactive path request on node selection, sync state machine with timeout handling Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> v0.2.1	2026-03-02 23:03:32 -05:00
Torlando	7d687ea8a8	Merge pull request #3 from torlando-tech/flasher-version-picker Add firmware version picker to web flasher v0.2.0	2026-02-25 16:58:27 -05:00
torlando-tech	fd6fb4bcda	Add firmware version picker to web flasher Lets users choose between the latest dev build and tagged GitHub releases. The dropdown queries the GitHub Releases API on page load and swaps firmware fetch paths between Pages-relative and release-asset URLs. CI now attaches all 4 firmware files to releases (bootloader, partitions, boot_app0, firmware) so full installs work from any release version. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 16:36:03 -05:00
Torlando	1a6c9aee44	Merge pull request #2 from torlando-tech/transport-pools-psram Stability fixes, BLE PSRAM pools, LXST voice calls	2026-02-25 11:35:11 -05:00
torlando-tech	ea586112e3	CI: auto-build and deploy firmware to GitHub Pages on every push to main Rewrite release-firmware.yml to build the tdeck (NimBLE) env on pushes to main (versioned as dev-<sha>) and on v* tags. Remove checked-in firmware binaries from git tracking — CI now generates and deploys them to Pages. Release creation is conditional on v* tags only. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 10:57:14 -05:00
torlando-tech	6744eb136d	LXST voice call stability: fix hangup crash, signal queue, TX pump, mic tuning - Fix use-after-free crash on hangup: set _call_state=IDLE before deleting _lxst_audio, preventing pump_call_tx() (runs without LVGL lock) from accessing freed memory - Replace single-slot _call_signal_pending with 8-element ring buffer queue to prevent signal loss when CONNECTING+ESTABLISHED arrive in rapid succession - Extract TX pump into pump_call_tx() called right after reticulum->loop() for low-latency audio TX without LVGL lock dependency (was buried at step 10) - Tune ES7210 mic gain to 21dB (was 15dB) to improve Codec2 input level without ADC clipping that occurred at 24dB - I2S capture: use APLL for accurate 8kHz clock, direct 8kHz sampling (no more 16→8kHz decimation), DMA 16x64 for encode burst headroom - Reduce Reticulum log verbosity to LOG_INFO (was LOG_TRACE) - BLE: add ble_hs_sched_reset() tiered recovery before reboot on desync, widen supervision timeout to 4.0s for WiFi coexistence - Add UDP multicast log broadcasting and OTA flash support Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 10:57:14 -05:00
torlando-tech	ddd19a04db	Fix LXST TX audio wire format to match Columba's expected batch size Columba's native OboePlaybackEngine ring buffer expects exactly frameSamples (1600 for Codec2 3200 mode) decoded samples per writeEncodedPacket call = 10 sub-frames of 160 samples each. Changes: - Batch exactly 10 sub-frames per fixarray element (82 bytes each: codec_type + mode_header + 10*8 raw bytes) - Up to 2 batches per msgpack packet, matching Columba C2C format - Proper fixarray wrapping for multi-batch, bare bin8 for single - Add codec_type byte (0x02) prefix per batch element - Respond to PREFERRED_PROFILE negotiation with LBW (Codec2 3200) - Add capture diagnostics (raw PCM peaks, I2S dump, rate logging) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 10:57:14 -05:00
torlando-tech	6e47cb808b	Increase playback buffer for jitter-free LXST RX audio PCM_RING_FRAMES 16→50 (320ms→1000ms capacity) and PREBUFFER_FRAMES 3→15 (60ms→300ms prebuffer) to match LXST-kt's buffering strategy. Interop test suite confirms zero underruns with ±100ms jitter at these settings. Also adds tests/interop/ with 48 Python tests verifying wire format, codec round-trip, and pipeline compatibility between Pyxis, Python LXST, and LXST-kt implementations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 10:57:07 -05:00
torlando-tech	e263e1e7a6	Add OTA flashing and wireless UDP log broadcasting ArduinoOTA enables wireless firmware uploads (pio run -e tdeck-ota -t upload). UDP log callback via RNS::setLogCallback sends all log lines plus Serial.printf diagnostics to multicast group 239.0.99.99:9999 for untethered monitoring. Includes safety guards: UDP suspended during WiFi transitions, reentrancy protection, and WiFi status check before each send. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 17:43:48 -05:00
torlando-tech	bb58c69d38	Add core dump partition for crash analysis Carve 64KB from SPIFFS for a coredump partition at 0x7F0000. On any crash (panic, WDT, assert), the ESP32 writes CPU state and backtrace to flash, readable on next boot with espcoredump. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 16:57:05 -05:00
torlando-tech	5949cd97ff	Reduce BLE desync reboot tolerance from 5min to 90s with connections A desynced NimBLE host can't actually communicate over existing connections, so they're effectively zombies. Waiting 5 minutes left the device unresponsive. 90s gives enough time for self-recovery while avoiding prolonged dead states. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 16:39:43 -05:00
torlando-tech	4e1f379d94	Persist only contacts to flash, mark on send/receive Only destinations that have exchanged messages are written to SPIFFS. UIManager marks destinations as persistent on send_message() and on_message_received(). Reduces persist time from 40-50s to <1s. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 13:17:28 -05:00
torlando-tech	e343caf2d2	Stability: WDT yield, BLE mutex fixes, time-based desync recovery Reduces crash rate from every 60-85s to 1 reboot per 6+ minutes. Zero WDT triggers in 10-minute stability test. BLE mutex fixes (BLEInterface.cpp): - Release _mutex before blocking GATT ops in onConnected() and onServicesDiscovered() — prevents 5-30s main-loop stalls during service discovery, notification subscribe, identity exchange - Non-blocking try_lock() for peerCount(), getConnectedPeerSummaries(), get_stats() — returns empty/default if BLE task holds mutex - Write-without-response in initiateHandshake() WDT and persistence (main.cpp, sdkconfig.defaults, microReticulum): - 30s WDT timeout (up from 10s) for SPIFFS flash I/O headroom - Register Identity::set_persist_yield_callback() to feed WDT every 5 entries during save_known_destinations() (70+ entries = 30-50s) - WDT feeds between reticulum and identity persist calls BLE host desync recovery (NimBLEPlatform): - Time-based desync tracking instead of aggressive counter-based reboot - 60s tolerance without connections, 5 minutes with active connections (data still flows over existing BLE mesh links) - Remove immediate recoverBLEStack() from 574 handler and enterErrorRecovery() — let startScan() manage reboot decision - Increase CONNECTION_COOLDOWN from 3s to 10s to reduce 574 risk - Increase SCAN_FAIL_RECOVERY_THRESHOLD from 5 to 10 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 12:30:30 -05:00
torlando-tech	3ca27f53f6	Task watchdog, BLE mutex fixes, NimBLE crash-safe recovery Subscribe loopTask and BLE task to the ESP32 Task Watchdog (10s timeout) to detect and recover from silent hangs. Per-step WDT feeds in the main loop prevent false triggers from cumulative slow operations. Fix BLE mutex starvation that blocked the main loop for 3-6s: - Move processDiscoveredPeers() out of performMaintenance() so _mutex is not held during blocking NimBLE connect calls - Use try_lock() in send_outgoing() to skip sends when BLE task has the mutex, rather than blocking (Reticulum retransmits) - Switch BLE data writes to write-without-response (non-blocking) - Add WDT feeds to all NimBLE blocking wait loops Replace NimBLE soft-reset recovery with immediate reboot — deinit() during sync failures caused CORRUPT HEAP panics. With atomic file persistence, data survives reboots reliably. Reduce loop task stack from 49KB to 16KB (measured peak ~6KB). Add NimBLE PHY update null guard to patch_nimble.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 10:45:43 -05:00
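
Note on the LXST audio figures in ddd19a04db and 6e47cb808b: the batch size (82 bytes), the 1600-sample batch, and the buffer durations (320ms, 1000ms, 60ms, 300ms) all follow from a 20ms sub-frame at 8kHz. The sketch below re-derives them as a sanity check. The function and constant names are illustrative and are not taken from the Pyxis source. The one-byte width assumed for mode_header is inferred from the stated 82-byte total, not confirmed by the commit messages.

```python
# Worked check of the LXST audio numbers quoted in the commit messages.
# All names here are hypothetical; only the numeric values come from the log.

SAMPLE_RATE_HZ = 8000        # "direct 8kHz sampling" (6744eb136d)
SAMPLES_PER_SUBFRAME = 160   # "10 sub-frames of 160 samples each" (ddd19a04db)
SUBFRAMES_PER_BATCH = 10
CODEC2_BITRATE_BPS = 3200    # Codec2 3200 mode


def subframe_ms() -> float:
    """Duration of one 160-sample sub-frame at 8 kHz."""
    return SAMPLES_PER_SUBFRAME * 1000 / SAMPLE_RATE_HZ


def subframe_bytes() -> int:
    """Encoded size of one sub-frame: bitrate * duration, in bytes."""
    bits = CODEC2_BITRATE_BPS * SAMPLES_PER_SUBFRAME // SAMPLE_RATE_HZ
    return bits // 8


def batch_bytes() -> int:
    """codec_type + mode_header (assumed 1 byte each) + 10 raw sub-frames."""
    return 1 + 1 + SUBFRAMES_PER_BATCH * subframe_bytes()


def batch_samples() -> int:
    """Decoded samples per batch."""
    return SUBFRAMES_PER_BATCH * SAMPLES_PER_SUBFRAME


def buffer_ms(frames: int) -> float:
    """Capacity of a buffer holding `frames` sub-frames."""
    return frames * subframe_ms()


if __name__ == "__main__":
    print("sub-frame:", subframe_ms(), "ms,", subframe_bytes(), "bytes")
    print("batch:", batch_samples(), "samples,", batch_bytes(), "bytes")
    for name, frames in [("PCM_RING_FRAMES old", 16), ("PCM_RING_FRAMES new", 50),
                         ("PREBUFFER_FRAMES old", 3), ("PREBUFFER_FRAMES new", 15)]:
        print(name, "=", frames, "frames ->", buffer_ms(frames), "ms")
```

Under these assumptions the derived values agree with the figures in the commit messages, which suggests the PCM_RING_FRAMES and PREBUFFER_FRAMES counts are expressed in 20ms sub-frames.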


96 Commits