Commit Graph

117 Commits

Author SHA1 Message Date
torlando-tech 8518a5d1b3 Fix map markers not clearing on cease and add telemetry RX logging
Clear peer markers when locations list is empty (e.g. after cease signal)
instead of returning early, which left stale markers on the map. Add debug
logging around COLUMBA_META field handling to trace cease signal flow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 10:58:34 -04:00
torlando-tech ae234e81af Fix receiving LXMF telemetry: field unpacking, silent handling, map markers
- Fix Bytes({key_int}) bug in LXMessage::unpack_from_bytes() that caused
  incoming field keys to be empty, making fields_get() always miss
- Telemetry-only messages (fields present, no body) now skip chat bubble,
  notification sound, and message store entirely
- Implement position_peer_markers() with proper lat/lon to screen coords
- Add display name resolution for map markers via recall_app_data()
- Store peer locations for repositioning on pan/zoom

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 11:33:50 -05:00
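
Note on ae234e81af: the lat/lon to screen-coordinate step can be sketched as below. This is a minimal illustration, not the project's position_peer_markers() code. It assumes standard Web Mercator maths, 256-pixel OSM raster tiles, and a map centre drawn in the middle of the screen. The names PixelPos, latlon_to_world_px and peer_to_screen are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical types and names for illustration; not taken from the Pyxis source.
struct PixelPos { double x; double y; };

constexpr double kPi = 3.14159265358979323846;
constexpr int kTileSize = 256;  // assumed: standard OSM raster tile edge in pixels

// Web Mercator: global pixel position of a lat/lon at a given zoom level.
PixelPos latlon_to_world_px(double lat_deg, double lon_deg, int zoom) {
    assert(zoom >= 0 && zoom <= 22);
    const double world = static_cast<double>(1u << zoom) * kTileSize;  // world size in pixels
    const double lat_rad = lat_deg * kPi / 180.0;
    const double x = (lon_deg + 180.0) / 360.0 * world;
    const double y = (1.0 - std::asinh(std::tan(lat_rad)) / kPi) / 2.0 * world;
    return {x, y};
}

// Screen position of a peer when the map centre sits at the middle of the screen.
PixelPos peer_to_screen(double peer_lat, double peer_lon,
                        double center_lat, double center_lon,
                        int zoom, int screen_w, int screen_h) {
    const PixelPos peer = latlon_to_world_px(peer_lat, peer_lon, zoom);
    const PixelPos center = latlon_to_world_px(center_lat, center_lon, zoom);
    return {screen_w / 2.0 + (peer.x - center.x),
            screen_h / 2.0 + (peer.y - center.y)};
}
```

Because the result depends only on the stored peer position, the current centre and the zoom, keeping the peer locations around is enough to recompute marker positions after a pan or zoom.
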
torlando-tech 8d4a047c14 Fix GPS time sync: double-mktime corruption, missing retry, no DST
Three issues caused wildly incorrect device time:

1. mktime() was called before setting TZ=UTC, corrupting the time
   struct. The second mktime() then operated on mutated values.
   Fix: set TZ=UTC0 before the single mktime() call.

2. GPS cold start takes minutes but boot only waited 15s. If GPS
   missed the window, time was never set. Fix: retry in main loop
   once GPS reports a valid fix.

3. GPS timezone used a raw longitude offset with no DST rules, so
   EDT was shown as EST (off by 1 hour). Fix: use proper POSIX TZ
   strings with DST rules for US timezones.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 10:06:05 -05:00
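
Note on 8d4a047c14: fixes 1 and 3 can be illustrated with a small sketch. It assumes a POSIX-style C library that provides setenv(), tzset() and localtime_r(). The helper names are hypothetical, and the US Eastern rule string "EST5EDT,M3.2.0,M11.1.0" is used only as an example of a POSIX TZ string with DST rules. It is not the firmware's actual code.

```cpp
#include <cassert>
#include <stdlib.h>
#include <time.h>

// Hypothetical helper: convert a UTC calendar time (as reported by GPS) to Unix time.
// TZ is set to UTC0 before the single mktime() call, so no local offset is applied.
time_t gps_utc_to_epoch(int year, int month, int day, int hour, int min, int sec) {
    assert(month >= 1 && month <= 12);
    setenv("TZ", "UTC0", 1);
    tzset();
    struct tm t = {};
    t.tm_year = year - 1900;
    t.tm_mon = month - 1;
    t.tm_mday = day;
    t.tm_hour = hour;
    t.tm_min = min;
    t.tm_sec = sec;
    t.tm_isdst = 0;  // UTC has no daylight saving time
    return mktime(&t);
}

// Hypothetical helper: break an epoch down in a zone given by a POSIX TZ string.
struct tm local_time_in(time_t epoch, const char* posix_tz) {
    setenv("TZ", posix_tz, 1);
    tzset();
    struct tm out = {};
    localtime_r(&epoch, &out);
    return out;
}
```

With the rule string above, a timestamp on 2026-03-06 falls in standard time and one on 2026-03-14 falls in daylight time, which is the one-hour difference that a fixed longitude-based offset cannot express.
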
torlando-tech cab566a69e Fix telemetry sending to Columba: field keys, encoding, and expiry
Three bugs prevented Pyxis telemetry from appearing on Columba's map:

1. Bytes({0x02}) called Bytes(size_t capacity=2) instead of creating a
   1-byte buffer containing 0x02. Both field keys were empty, so the
   second fields_set() overwrote the first — FIELD_TELEMETRY was missing
   from the wire payload. Fixed with Bytes(&key, 1).

2. FIELD_COLUMBA_META was encoded as msgpack but Columba expects JSON
   (json.loads after .decode('utf-8')). Changed to manual JSON string.

3. expires was sent as Unix seconds but Columba compares against
   System.currentTimeMillis() (milliseconds). Locations were immediately
   deleted as expired. Now sends expires_ms = end_time * 1000.

Also: set OPPORTUNISTIC delivery method on telemetry/cease messages,
add pyxis_log() diagnostics for telemetry pipeline, and manage tile
download task lifecycle (start on show, stop on hide).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 09:57:50 -05:00
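
Note on cab566a69e: bugs 1 and 3 can be reproduced in isolation. The Bytes class below is a hypothetical stand-in that mirrors only the two constructors named in the commit message. It is not the real class, and expires_ms() is likewise an illustrative helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a byte buffer with a capacity constructor.
class Bytes {
public:
    Bytes(std::size_t capacity) { data_.reserve(capacity); }  // reserves space, holds zero bytes
    Bytes(const std::uint8_t* src, std::size_t len) {
        assert(src != nullptr || len == 0);
        data_.assign(src, src + len);                         // copies len bytes
    }
    std::size_t size() const { return data_.size(); }
    const std::vector<std::uint8_t>& data() const { return data_; }
private:
    std::vector<std::uint8_t> data_;
};

// Bug 1: {0x02} initialises the size_t parameter, so this is Bytes(capacity = 2).
Bytes broken_key() { return Bytes({0x02}); }

// Fix: pass a pointer and a length to get a 1-byte buffer containing 0x02.
Bytes fixed_key() {
    const std::uint8_t key = 0x02;
    return Bytes(&key, 1);
}

// Bug 3: a receiver comparing against milliseconds needs the expiry in milliseconds.
std::uint64_t expires_ms(std::uint64_t end_time_s) { return end_time_s * 1000ULL; }
```

In this sketch broken_key() yields an empty buffer, so two different field keys built that way compare equal and the second fields_set() replaces the first, as the commit message describes.
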
torlando-tech 0d3e1bae30 Fix tile download TLS memory and map rendering performance
- Route mbedtls allocations to PSRAM via mbedtls_platform_set_calloc_free()
  to fix TLS handshake failure (-32512 SSL alloc) with only ~36KB internal heap
- Use WiFiClientSecure with setInsecure() for HTTPS tile downloads from OSM
- Move tile downloads to background FreeRTOS task to avoid blocking UI thread
- Add incremental tile loading (one PNG decode per update cycle) to prevent
  LVGL mutex timeout from decoding 4 tiles synchronously
- Enable LVGL image cache (LV_IMG_CACHE_DEF_SIZE=8) so decoded PNGs stay
  in PSRAM and don't re-decode on every redraw
- Add touch drag panning for map navigation via LV_EVENT_PRESSING

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 19:08:28 -05:00
torlando-tech 1a89466b66 Add on-demand tile downloading with SD card caching
TileDownloader fetches OSM raster tiles over HTTPS when not found
on SD card, saves them for permanent offline cache. MapScreen calls
ensure_tile() before each LVGL load — tiles download once, then
load from SD on subsequent views.

Default tile server: tile.openstreetmap.org (configurable via
set_tile_url). Proper User-Agent set per OSM usage policy.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 14:58:19 -05:00
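
Note on 1a89466b66: the check-cache-then-download flow can be sketched as below. The function names, the SD directory layout and the callback signatures are assumptions for illustration. Only the z/x/y.png URL scheme of tile.openstreetmap.org is standard. The real TileDownloader and ensure_tile() may look different.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper: OSM-style tile URL, <base>/<z>/<x>/<y>.png
std::string tile_url(const std::string& base, int z, int x, int y) {
    assert(z >= 0 && x >= 0 && y >= 0);
    return base + "/" + std::to_string(z) + "/" + std::to_string(x) + "/" +
           std::to_string(y) + ".png";
}

// Hypothetical helper: cache path mirroring the URL layout under an SD card root.
std::string tile_cache_path(const std::string& root, int z, int x, int y) {
    return tile_url(root, z, x, y);
}

// Returns true if the tile is available locally after the call.
// exists(path) and download(url, path) are injected so the flow stays testable.
bool ensure_tile_sketch(const std::string& path, const std::string& url,
                        const std::function<bool(const std::string&)>& exists,
                        const std::function<bool(const std::string&, const std::string&)>& download) {
    if (exists(path)) {
        return true;             // cache hit: no network access
    }
    return download(url, path);  // cache miss: fetch once and save to the cache
}
```

The second call for the same tile takes the cache-hit branch, which matches the "download once, then load from SD" behaviour described in the commit message.
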
torlando-tech e9e091114a Merge remote-tracking branch 'origin/main' into feature/map-telemetry
2026-03-05 14:50:20 -05:00
Torlando fd5e9316f9 Merge pull request #13 from torlando-tech/fix/ble-map-race-condition
Fix cross-thread race condition on BLE connection maps
2026-03-05 14:49:53 -05:00
torlando-tech e2df70161b Protect _clients lookup in discoverServices with _conn_mutex
The unprotected _clients.find() could race with
processPendingDisconnects() erasing from the map concurrently.
Mutex is released before the blocking getService() GATT call.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 12:30:52 -05:00
torlando-tech 602d8f7083 Improve write() cache-miss warning to indicate discovery dependency
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 12:28:35 -05:00
torlando-tech 74def922a7 Guard enableNotifications against missing connection entry
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 12:05:50 -05:00
torlando-tech 8013597a5f Avoid redundant mutex retry in discoverServices error path
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 11:59:05 -05:00
torlando-tech d9883c9e36 Report failure on mutex timeout in discoverServices
Previously, a mutex timeout left characteristic caches empty but
still signalled success to callers, making all GATT ops silently
fail for the connection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 11:46:43 -05:00
torlando-tech 8c0dd227f4 Add missing mutex timeout warning in updateConnectionMTU
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 11:44:30 -05:00
torlando-tech 20a072d258 Fix TOCTOU in disconnect, stale cache in discovery, silent onConnect failures
- Re-check hasActiveWriteOperations() after acquiring mutex in
  processPendingDisconnects() to close race where write() registers
  an op between the pre-mutex check and mutex acquisition
- Move cached char pointer writes inside connection-exists guard in
  discoverServices() to prevent dangling pointers on handle reuse
- Add WARNING logs to both onConnect callbacks on mutex timeout

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 11:01:55 -05:00
torlando-tech 9bd075c91e Cache char pointers in discoverServices, defer disconnect, add mutex timeout logs
- Move getService()/getCharacteristic() out of mutex-held paths in
  writeCharacteristic(), read(), enableNotifications() by caching all
  three char pointers (RX, TX, Identity) during discoverServices()
- Replace 5-second spin-wait in processPendingDisconnects() with
  non-blocking deferral: break if GATT ops in flight, retry next loop
- Add WARNING logs to all read-path helpers on mutex timeout

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 01:50:46 -05:00
torlando-tech 334f024179 Close TOCTOU gap and protect onConnect map insertions
- Move beginWriteOperation() before xSemaphoreGive(_conn_mutex) in
  write(), writeCharacteristic(), read(), and enableNotifications()
  so the active-op counter is incremented while the mutex is still
  held. This closes the window where processPendingDisconnects()
  could observe hasActiveWriteOperations()==false and delete the
  client before the GATT caller has registered its operation.
- Add _conn_mutex around _connections/_clients insertions in both
  server and client onConnect() callbacks, preventing concurrent
  map insertions from corrupting the red-black tree.
- Protect updateConnectionMTU() with _conn_mutex.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 00:16:58 -05:00
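
Note on 334f024179: the ordering fix can be modelled without any BLE code. In the sketch below std::mutex stands in for the FreeRTOS _conn_mutex and an integer counter stands in for the active-operation tracking. ConnectionTable and its method names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <mutex>

// Hypothetical, simplified model of the active-operation guard.
class ConnectionTable {
public:
    void add(int handle) {
        std::lock_guard<std::mutex> lock(mutex_);
        clients_[handle] = true;
    }

    // Caller side: register the in-flight operation while the mutex is still held.
    // The blocking call itself happens after this function returns, without the lock.
    bool begin_op(int handle) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (clients_.find(handle) == clients_.end()) {
            return false;
        }
        ++active_ops_;  // incremented under the lock, so the disconnect path cannot miss it
        return true;
    }

    void end_op() {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(active_ops_ > 0);
        --active_ops_;
    }

    // Disconnect side: check under the same lock and defer rather than wait.
    bool process_pending_disconnect(int handle) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (active_ops_ > 0) {
            return false;  // operation in flight: retry on the next loop iteration
        }
        return clients_.erase(handle) > 0;
    }

private:
    std::mutex mutex_;
    std::map<int, bool> clients_;
    int active_ops_ = 0;
};
```

Because the counter changes only while the mutex is held, the disconnect path either sees the operation registered or runs entirely before the lookup. There is no window between the two.
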
torlando-tech fffd8ec79e Add beginWriteOperation() guards to all blocking GATT methods
writeCharacteristic(), read(), and enableNotifications() resolve
characteristic pointers under _conn_mutex then call blocking GATT
ops after releasing it — same pattern as write(). Without the
active-operation guard, processPendingDisconnects() could delete
the client (and its child characteristics) during the GATT call.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 23:54:01 -05:00
torlando-tech 71bc4ae82b Address Greptile review: fix use-after-free and unprotected accessors
- Defer NimBLEDevice::deleteClient() in processPendingDisconnects()
  until after releasing _conn_mutex and waiting for any active write
  operations to complete. Prevents use-after-free when write() holds
  a child NimBLERemoteCharacteristic* pointer across the mutex boundary.
- Add _conn_mutex protection to getConnectionCount(), isConnectedTo(),
  and isDeviceConnected() which read _connections without synchronization.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 23:05:47 -05:00
torlando-tech 9174d27183 Fix cross-thread race condition on BLE connection maps
send_outgoing() on loopTask (core 1) calls write() which reads
_connections, _clients, and _cached_rx_chars maps, while
processPendingDisconnects() on the BLE task (core 0) erases from
them — with no synchronization. This causes std::map red-black tree
corruption, manifesting as LoadProhibited crashes in map rotate/insert
operations (EXCVADDR=0x00000008).

Protect all map accesses in write(), writeCharacteristic(), read(),
enableNotifications(), getConnection(), getConnections(), and
processPendingDisconnects() with _conn_mutex. The mutex is released
before any blocking GATT operations (writeValue, readValue, subscribe)
to avoid holding it during 10-30s NimBLE timeouts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 22:54:29 -05:00
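
Note on 9174d27183: the "look up under the mutex, release before blocking" shape can be sketched as follows. std::mutex stands in for _conn_mutex, an int stands in for the cached characteristic, and ClientMap with its methods is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <mutex>

// Hypothetical model: a shared map guarded by one mutex.
class ClientMap {
public:
    void insert(int handle, int char_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        chars_[handle] = char_id;
    }

    void erase(int handle) {
        std::lock_guard<std::mutex> lock(mutex_);
        chars_.erase(handle);
    }

    // Look up under the mutex, copy out what is needed, release, then block.
    template <typename BlockingSend>
    bool write(int handle, BlockingSend send) {
        assert(handle >= 0);
        int char_id = 0;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = chars_.find(handle);
            if (it == chars_.end()) {
                return false;
            }
            char_id = it->second;  // copy; the iterator must not be used after unlock
        }                          // mutex released here
        return send(char_id);      // long-running call runs without the mutex
    }

    // Test helper: reports whether the mutex is currently free.
    bool mutex_is_free() {
        if (!mutex_.try_lock()) {
            return false;
        }
        mutex_.unlock();
        return true;
    }

private:
    std::mutex mutex_;
    std::map<int, int> chars_;
};
```

Copying a plain value out is safe on its own. Copying a raw pointer is only safe if the pointee outlives the blocking call, which is the gap the later beginWriteOperation() commits address.
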
torlando-tech cff41d4fa0 Add offline map display and Sideband-compatible telemetry location sharing
- LVGL PNG decoder (lodepng) + SD card filesystem driver for loading OSM tiles
- MapScreen with 2x2 tile grid, GPS marker, peer location markers, pan/zoom
- 5th nav button (GPS icon) on conversation list for map access
- TelemetryCodec: Sideband/Columba-compatible LXMF telemetry encode/decode
- TelemetryManager: per-peer sharing sessions with duration/expiry, SPIFFS persistence
- ChatScreen location share button with duration picker (15min/1hr/4hr/indefinite)
- UIManager integration: telemetry send/receive via LXMF fields, map marker updates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 21:05:28 -05:00
Torlando 70b8df052d Merge pull request #12 from torlando-tech/feature/splash-screen
Add boot splash screen with Pyxis constellation logo
2026-03-04 18:38:43 -05:00
torlando-tech ff00c1d783 Clean up splash preprocessor structure and add include warning
Consolidate #ifndef/#ifdef into single #ifdef/#else/#endif block.
Add warning comment to generated header about static linkage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 18:19:49 -05:00
torlando-tech bae59ff424 Fix unused BG_COLOR warning and make show_splash() private
Move BG_COLOR inline into #ifndef block to avoid unused variable
when HAS_SPLASH_IMAGE is defined. Make show_splash() private since
it's only called internally from init_hardware_only().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 17:48:38 -05:00
Torlando 3c81eeb5be Merge pull request #11 from torlando-tech/fix/ble-wdt-stability
Fix Task WDT crashes from LVGL priority starvation
2026-03-04 17:07:27 -05:00
torlando-tech ed8c08109f Add ble_hs_synced() guard to notifyAll()
Matches the guard already on notify() to prevent use-after-free
of _tx_char during a NimBLE host reset.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 16:20:06 -05:00
torlando-tech 5b2a1ab53e Skip redundant fill_screen when full-screen splash image is available
Saves one full 320x240 SPI screen write before the splash renders.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 16:09:49 -05:00
torlando-tech 4ba97057c5 Remove no-op esp_task_wdt_reset() calls from NimBLEPlatform
BLE task is no longer subscribed to WDT, so these 23 calls were
silently returning ESP_ERR_NOT_FOUND. Removes dead code and the
now-unused esp_task_wdt.h include.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 15:32:44 -05:00
torlando-tech 2a1b98f8f1 Fix _initialized never set in Display::init()
Prevents double PSRAM allocation and LVGL driver re-registration
if init() were called more than once.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 14:39:49 -05:00
Torlando 2f61b80567 Update lib/tdeck_ui/Hardware/TDeck/Display.cpp
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-03-04 14:20:44 -05:00
Torlando a93f7c258f Apply suggestion from @greptile-apps[bot]
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-03-04 14:19:48 -05:00
Torlando f24d80688a Merge pull request #10 from torlando-tech/feature/sd-spi-bus-sharing
Add shared SPI bus mutex for SD card coexistence
2026-03-04 14:18:22 -05:00
torlando-tech 0f85f4dd69 Add Pyxis logo to README
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 14:12:36 -05:00
torlando-tech 27668a7515 Add PYXIS text to splash and render full-screen 320x240
Scale constellation to 80% and shift down to make room for title text.
Generate splash at full display resolution instead of 160x160 centered.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 14:12:25 -05:00
torlando-tech 30dc48086c Add splash screen icon and build-time SVG-to-RGB565 generator
- pyxis-icon.svg: Pyxis constellation icon (3 stars with connecting lines)
- generate_splash.py: PlatformIO pre-build script that renders the SVG to
  a 160x160 RGB565 PROGMEM header (SplashImage.h) using cairosvg + Pillow
- .gitignore: Exclude generated SplashImage.h
- platformio.ini: Add generate_splash.py to both build environments

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 14:12:25 -05:00
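
Note on 30dc48086c: the generator itself is a Python script, but the RGB565 packing it targets can be shown in a few lines. This sketch uses the common 5-6-5 bit layout. The function names are hypothetical, and byte order on the wire is not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pack 8-bit-per-channel RGB into RGB565: 5 bits red, 6 bits green, 5 bits blue.
constexpr std::uint16_t rgb888_to_rgb565(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return static_cast<std::uint16_t>(((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3));
}

// Bytes needed for a width x height RGB565 image (2 bytes per pixel).
inline std::size_t rgb565_bytes(std::size_t width, std::size_t height) {
    assert(width > 0 && height > 0);
    return width * height * 2;
}
```

By this arithmetic a 160x160 image takes 51,200 bytes and a full 320x240 frame takes 153,600 bytes.
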
torlando-tech a4a1aacdd8 Show boot splash within 1s of power-on instead of after 20s+ init
Move Display::init_hardware_only() and POWER_EN to right after serial
banner, before GPS/WiFi/SD/Reticulum init. Add 150ms delay after
POWER_EN HIGH so ST7789V power rail stabilizes before SPI commands
(without this, SWRESET is sent to an unpowered chip and silently lost).

Splash now visible for entire boot period (~18s) until LVGL takes over.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 14:12:25 -05:00
torlando-tech c80e63dee9 Fix Task WDT crashes: LVGL priority starvation + BLE WDT false positives
Two root causes for frequent device reboots:

1. LVGL task (priority 2) starved loopTask (priority 1) on core 1.
   During heavy screen rendering, loopTask couldn't run for 30+ seconds,
   triggering the Task WDT. Fixed by lowering LVGL to priority 1 so
   FreeRTOS round-robins both tasks fairly.

2. BLE task was registered with the 30s Task WDT, but blocking NimBLE
   GATT operations (connect + service discovery + subscribe + read) can
   legitimately take 30-60s total. Removed BLE task from WDT since
   NimBLE has its own internal ~30s timeouts per GATT operation.

Also added ble_hs_synced() guards to write(), read(), notify(),
writeCharacteristic(), discoverServices(), and enableNotifications()
to prevent use-after-free on stale NimBLE client pointers during
host resets.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 14:11:51 -05:00
torlando-tech 0608af6d38 Unify all SPI peripherals on global FSPI to fix pin conflicts
Display and LoRa were creating separate SPIClass(HSPI) instances which
claimed GPIO pins via the matrix, preventing SD card (on FSPI) from
accessing MISO after Display init. Now all three peripherals use the
global SPI (FSPI) instance, eliminating GPIO routing conflicts.

- Display: use &SPI instead of new SPIClass(HSPI)
- SX1262Interface: use &SPI instead of new SPIClass(HSPI)
- SDAccess: enable format_if_empty for unformatted cards

Verified on device: SD (128GB SDHC), display, and LoRa all coexist.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 09:41:35 -05:00
torlando-tech b4afa6d3f7 Fix SD card SPI init: use FSPI before Display claims HSPI
SD card was unresponsive (MISO stuck 0xFF) because Display's HSPI
peripheral had already claimed the GPIO pins via the matrix, preventing
FSPI from routing MISO. Fix by initializing SD card BEFORE Display,
using the global SPI (FSPI) instance — matching LilyGo's reference code.

- Move SD card init before display init in boot sequence
- Use global SPI (FSPI) instead of Display's SPIClass(HSPI)
- Lower SPI frequency to 800kHz matching LilyGo example
- Drive all CS lines (display, LoRa, SD) high before SD init
- Add MISO=38 to Display's SPI.begin for post-init bus sharing
- Add Display::get_spi() accessor for future shared use

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 09:29:47 -05:00
torlando-tech d03f0b308f Add shared SPI bus mutex for SD card, display, and LoRa coexistence
The T-Deck Plus shares HSPI across the display (CS=12), LoRa (CS=9),
and SD card (CS=39). Previously SD logging was disabled because
SD.begin() reconfigured the SPI bus and blanked the display.

This introduces a FreeRTOS mutex created in main.cpp and injected into
Display, SX1262Interface, and a new SDAccess class so all three
peripherals serialize their SPI transactions safely.

- Add SDAccess class wrapping SD.begin() and file ops with mutex
- Add set_spi_mutex() to Display and SX1262Interface
- Wrap Display flush, fill, draw, and power ops in mutex
- Refactor SDLogger to use SDAccess mutex instead of owning SD.begin()
- Wire up mutex creation and injection order in setup()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 00:19:10 -05:00
Torlando d9411fb4bb Merge pull request #7 from torlando-tech/ble-stability-audit
BLE stability: fix desync crash loops and scan recovery
v0.2.2
2026-03-03 23:39:38 -05:00
Torlando e3cf0c75bf Merge pull request #8 from torlando-tech/fix/flasher-versioned-releases
Fix web flasher versioned releases (CORS)
2026-03-03 23:39:17 -05:00
torlando-tech fd62d5042f Use exact match for HTTP status code check
grep -q 200 could match 2001 or other superstrings; -qx anchors the
match to the full line, which is correct for curl's %{http_code} output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 23:33:10 -05:00
Torlando 5971771ed2 Merge pull request #5 from davidcranor/fix/cross-platform-build
Fix cross-platform build: replace ${PROJECT_DIR} with relative paths
2026-03-03 23:29:38 -05:00
Torlando 1d0a58de64 Merge pull request #9 from torlando-tech/ci/build-check
Add CI build check for pull requests
2026-03-03 21:26:33 -05:00
torlando-tech 1c5dd9936a Add CI build check for pull requests
Runs both tdeck and tdeck-bluedroid builds on every PR to main.
Uploads firmware binaries as artifacts on the tdeck build for
easy testing by reviewers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 21:17:14 -05:00
torlando-tech 4f22776971 Add tone as explicit dependency of tdeck_ui
UIManager.cpp includes Tone.h, so tdeck_ui should declare this
dependency rather than relying on implicit global discovery.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 21:05:42 -05:00
torlando-tech 8ebb637a78 Add landing page at GitHub Pages root from README.md
Generates a styled HTML page from README.md and deploys it to the root
of the GitHub Pages site, fixing the 404 at torlando-tech.github.io/pyxis/.
Includes navigation links to the web flasher, GitHub repo, and releases.

Closes #6

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:59:56 -05:00
torlando-tech ee87edef7a Fix web flasher versioned releases failing with CORS error
GitHub release download URLs redirect to release-assets.githubusercontent.com
which doesn't return Access-Control-Allow-Origin headers. The browser blocks
cross-origin fetches from the GitHub Pages flasher, causing "Failed to fetch"
for any versioned release while the latest dev build (same-origin) works fine.

Fix: Deploy versioned firmware binaries to GitHub Pages alongside the dev
build at firmware/releases/{tag}/, so all versions are fetched same-origin.
The CI workflow now downloads existing release assets and deploys them to
Pages with keep_files: true to preserve across deploys.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:57:38 -05:00
torlando-tech 46ce057a1e BLE stability: host-controller resync, stuck GAP conn cancel, scan diagnostics
After a connection failure (rc=574), the NimBLE controller's scan state can
become corrupted (returning rc=530 / Invalid HCI Params) even after the
host re-syncs. This led to scan failure escalation and device reboots.

Key fixes:
- Add ble_gap_conn_cancel() to enterErrorRecovery() — stuck GAP master
  connection operations were blocking all subsequent scans
- Add ble_hs_sched_reset(BLE_HS_ECONTROLLER) in error recovery to force
  a full host-controller resynchronization after desync
- Proactively cancel stale GAP connections before scan start
- Reduce SCAN_FAIL_RECOVERY_THRESHOLD from 10 to 5 for faster recovery
- Enhanced scan failure logging with GAP state diagnostics
- Move ESP reset reason logging after WiFi init for UDP log visibility
- Suppress connection candidate log spam when at max connections

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 19:57:55 -05:00