SMP router specs

Evgeny @ SimpleX Chat
2026-03-12 11:29:18 +00:00
parent 8b39610ff4
commit f7be44981a
24 changed files with 75 additions and 67 deletions
+8
@@ -182,6 +182,14 @@ Before finishing a module doc, ask:
If any answer reveals a problem, fix it and repeat from question 1. Only finish when a full pass produces no changes.
## Terminology — the spec as translation boundary
The protocol documents (`protocol/overview-tjr.md`, `protocol/simplex-messaging.md`, `protocol/agent-protocol.md`) define the canonical terminology. Code uses different names for some of the same concepts. The spec is where the translation happens.
The most important distinction: SimpleX protocol routers are referred to as "servers" in code. The term "server" was adopted historically because SimpleX routers were implemented as Linux-based software that is deployed in the same way as servers. But the similarity is entirely formal. Functionally, servers serve responses to the requests of their users, which is why the term "server" was adopted for computers and software that provide Internet services. SimpleX protocol routers don't serve responses; they route packets between endpoints, and they have no concept of a user. Functionally they are similar to Internet Protocol routers, but with a resource-based addressing scheme. Further, SimpleX protocol routers are hardware and software agnostic. SimpleX protocols are open and documented, so they can be implemented in any language and run on different architectures. For example, [SimpleGo](https://simplego.dev) is a prototype implementation of the SimpleX protocol stack in C for a microcontroller architecture.
**The rule**: use protocol terms for concepts, code terms for identifiers. Write "router" when describing the network node's role, `SMPServer` or `Server.hs` when referencing code. Similarly, write "router identity" for the concept (called "server key hash" or "fingerprint" in code). When the distinction matters, bridge explicitly: "the SMP router (implemented by the `Server` module)" or "the `SMPServer` type (representing a router address)."
## Exclusions
- **Individual migration files** (M20XXXXXX_*.hs): Self-describing SQL. No per-migration docs.
+7 -7
@@ -8,7 +8,7 @@
## Overview
This module implements the client side of the `Protocol` typeclass — connecting to servers, sending commands, receiving responses, and managing connection lifecycle. It is generic over `Protocol v err msg`, instantiated for SMP as `SMPClient` (= `ProtocolClient SMPVersion ErrorType BrokerMsg`). The SMP proxy protocol (PRXY/PFWD/RFWD) is also implemented here.
This module implements the client side of the `Protocol` typeclass — connecting to SMP routers, sending commands, receiving responses, and managing connection lifecycle. It is generic over `Protocol v err msg`, instantiated for SMP as `SMPClient` (= `ProtocolClient SMPVersion ErrorType BrokerMsg`). The SMP proxy protocol (PRXY/PFWD/RFWD) is also implemented here.
## Four concurrent threads — teardown semantics
@@ -36,9 +36,9 @@ The double-check pattern (`swapTVar pending False` + `tryTakeTMVar`) handles the
`timeoutErrorCount` is reset to 0 in `getResponse` when a response arrives and in `receive` on every TLS read. The monitor uses this count to decide when to drop the connection.
## processMsg — server events vs expired responses
## processMsg — router events vs expired responses
When `corrId` is empty, the message is an `STEvent` (server-initiated). When non-empty and the request was already expired (`wasPending` is `False`), the response becomes `STResponse` — not discarded, but forwarded to `msgQ` with the original command context. Entity ID mismatch is `STUnexpectedError`.
When `corrId` is empty, the message is an `STEvent` (router-initiated). When non-empty and the request was already expired (`wasPending` is `False`), the response becomes `STResponse` — not discarded, but forwarded to `msgQ` with the original command context. Entity ID mismatch is `STUnexpectedError`.
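The classification described above can be sketched as a pure function. This is a minimal illustration, not the module's code: the types are simplified stand-ins, `DeliverToCaller` is a hypothetical name for the pending-request case (which the text does not describe), and the order in which the entity ID and `wasPending` are checked is an assumption.

```haskell
-- Simplified stand-ins for the real types; only the decision logic is modeled.
data MsgKind = STEvent | STResponse | STUnexpectedError
  deriving (Eq, Show)

data Outcome
  = DeliverToCaller       -- hypothetical: request still pending, the caller gets the response
  | ForwardToMsgQ MsgKind -- message is written to msgQ, tagged with its kind
  deriving (Eq, Show)

-- corrId: correlation ID from the wire (empty for router-initiated events)
-- entityMatches: whether the entity ID matches the original request
-- wasPending: whether the request was still pending when the response arrived
classify :: String -> Bool -> Bool -> Outcome
classify corrId entityMatches wasPending
  | null corrId = ForwardToMsgQ STEvent
  | not entityMatches = ForwardToMsgQ STUnexpectedError -- check order is an assumption
  | wasPending = DeliverToCaller
  | otherwise = ForwardToMsgQ STResponse -- expired request: forwarded, not discarded

main :: IO ()
main = do
  print (classify "" True False)
  print (classify "c1" True True)
  print (classify "c1" True False)
  print (classify "c1" False True)
```

The point of the last guard is that a late response still reaches the application through `msgQ` instead of being dropped.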
## nonBlockingWriteTBQueue — fork on full
@@ -46,7 +46,7 @@ If `tryWriteTBQueue` returns `False`, a new thread is forked for the blocking wr
## Batch commands do not expire
See comment on `sendBatch`. Batched commands are written with `Nothing` as the request parameter — the send thread skips the `pending` flag check. Individual commands use `Just r` and the send thread checks `pending` after dequeue. The coupling: if the server stops responding, batched commands can block the send queue indefinitely since they have no timeout-based expiry.
See comment on `sendBatch`. Batched commands are written with `Nothing` as the request parameter — the send thread skips the `pending` flag check. Individual commands use `Just r` and the send thread checks `pending` after dequeue. The coupling: if the router stops responding, batched commands can block the send queue indefinitely since they have no timeout-based expiry.
## monitor — quasi-periodic adaptive ping
@@ -68,7 +68,7 @@ See comment above `proxySMPCommand` for the 9 error scenarios (0-9) mapping each
## forwardSMPTransmission — proxy-side forwarding
Used by the proxy server to forward `RFWD` to the destination relay. Uses `cbEncryptNoPad`/`cbDecryptNoPad` (no padding) with the session secret from the proxy-relay connection. Response nonce is `reverseNonce` of the request nonce.
Used by the proxy router to forward `RFWD` to the destination relay. Uses `cbEncryptNoPad`/`cbDecryptNoPad` (no padding) with the session secret from the proxy-relay connection. Response nonce is `reverseNonce` of the request nonce.
## authTransmission — dual auth with service signature
@@ -80,6 +80,6 @@ The service signature is only added when the entity authenticator is non-empty.
`action` stores a `Weak ThreadId` (via `mkWeakThreadId`) to the main client thread. `closeProtocolClient` dereferences and kills it. The weak reference allows the thread to be garbage collected if all other references are dropped.
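The weak-reference pattern can be sketched with `base` alone. This is an illustrative sketch, not the client's code: `killWeak` and `demo` are hypothetical names, and the sleeping thread stands in for the main client thread.

```haskell
import Control.Concurrent (ThreadId, forkFinally, killThread, mkWeakThreadId, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import System.Mem.Weak (Weak, deRefWeak)

-- Dereference the weak pointer and kill the thread if it is still alive.
-- Returns False when the thread object was already garbage collected.
killWeak :: Weak ThreadId -> IO Bool
killWeak w = deRefWeak w >>= maybe (pure False) (\tid -> killThread tid >> pure True)

demo :: IO Bool
demo = do
  done <- newEmptyMVar
  -- A long-sleeping thread stands in for the main client thread.
  tid <- forkFinally (threadDelay 60000000) (\_ -> putMVar done ())
  w <- mkWeakThreadId tid
  killed <- killWeak w
  takeMVar done -- wait until the thread's finalizer has run
  pure killed

main :: IO ()
main = demo >>= print
```

Holding only a `Weak ThreadId` means the stored reference does not by itself keep the thread object alive, which is why the dereference can return `Nothing`.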
## writeSMPMessage — server-side event injection
## writeSMPMessage — router-side event injection
`writeSMPMessage` writes directly to `msgQ` as `STEvent`, bypassing the entire command/response pipeline. This is used by the server to inject MSG events into the subscription response path.
`writeSMPMessage` writes directly to `msgQ` as `STEvent`, bypassing the entire command/response pipeline. This is used by the router to inject MSG events into the subscription response path.
@@ -6,9 +6,9 @@
## Overview
This is the "small agent" — used only in servers (SMP proxy, notification server) to manage client connections to other SMP servers. The "big agent" in `Simplex.Messaging.Agent` + `Simplex.Messaging.Agent.Client` serves client applications and adds the full messaging agent layer. See [Two agent layers](../../../../TOPICS.md) topic.
This is the "small agent" — used only in routers (SMP proxy, notification router) to manage client connections to other SMP routers. The "big agent" in `Simplex.Messaging.Agent` + `Simplex.Messaging.Agent.Client` serves client applications and adds the full messaging agent layer. See [Two agent layers](../../../../TOPICS.md) topic.
`SMPClientAgent` manages `SMPClient` connections via `smpClients :: TMap SMPServer SMPClientVar` (one per SMP server), tracks active and pending subscriptions, and handles automatic reconnection. It is parameterized by `Party` (`p`) and uses the `ServiceParty` constraint to support both `RecipientService` and `NotifierService` modes.
`SMPClientAgent` manages `SMPClient` connections via `smpClients :: TMap SMPServer SMPClientVar` (one per router), tracks active and pending subscriptions, and handles automatic reconnection. It is parameterized by `Party` (`p`) and uses the `ServiceParty` constraint to support both `RecipientService` and `NotifierService` modes.
## Dual subscription model
@@ -19,7 +19,7 @@ Four TMap fields track subscriptions in two dimensions:
| **Service** | `activeServiceSubs` (TMap SMPServer (TVar (Maybe (ServiceSub, SessionId)))) | `pendingServiceSubs` (TMap SMPServer (TVar (Maybe ServiceSub))) |
| **Queue** | `activeQueueSubs` (TMap SMPServer (TMap QueueId (SessionId, C.APrivateAuthKey))) | `pendingQueueSubs` (TMap SMPServer (TMap QueueId C.APrivateAuthKey)) |
See comments on `activeServiceSubs` and `pendingServiceSubs` for the coexistence rules. Key constraint: only one service subscription per server. Active subs store the `SessionId` that established them.
See comments on `activeServiceSubs` and `pendingServiceSubs` for the coexistence rules. Key constraint: only one service subscription per router. Active subs store the `SessionId` that established them.
## SessionVar compare-and-swap — core concurrency safety
@@ -27,11 +27,11 @@ See comments on `activeServiceSubs` and `pendingServiceSubs` for the coexistence
## removeClientAndSubs — outside-STM lookup optimization
See comment on `removeClientAndSubs`. Subscription TVar references are obtained outside STM (via `TM.lookupIO`), then modified inside `atomically`. This is safe because the invariant is that subscription TVar entries for a server are never deleted from the outer TMap, only their contents change. Moving lookups inside the STM transaction would cause excessive re-evaluation under contention.
See comment on `removeClientAndSubs`. Subscription TVar references are obtained outside STM (via `TM.lookupIO`), then modified inside `atomically`. This is safe because the invariant is that subscription TVar entries for a router are never deleted from the outer TMap, only their contents change. Moving lookups inside the STM transaction would cause excessive re-evaluation under contention.
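A minimal sketch of the outside-STM lookup, assuming a simplified map shape. `Subs` and `clearSubs` are hypothetical names, and `readTVarIO` plus `M.lookup` stand in for the project's `TM.lookupIO` helper.

```haskell
import Control.Concurrent.STM (TVar, atomically, newTVarIO, readTVarIO, swapTVar)
import qualified Data.Map.Strict as M

-- Outer map: router name -> TVar holding that router's subscriptions.
-- Invariant assumed from the text: entries are never deleted from the outer map.
type Subs = TVar (M.Map String (TVar [Int]))

-- Find the inner TVar without a transaction, then modify it atomically.
-- The transaction only touches the inner TVar, so unrelated changes to the
-- outer map cannot force it to re-run.
clearSubs :: Subs -> String -> IO [Int]
clearSubs outer srv = do
  m <- readTVarIO outer
  case M.lookup srv m of
    Nothing -> pure []
    Just v -> atomically (swapTVar v [])

demo :: IO ([Int], [Int])
demo = do
  inner <- newTVarIO [1, 2]
  outer <- newTVarIO (M.fromList [("smp.example.com", inner)])
  removed <- clearSubs outer "smp.example.com"
  left <- readTVarIO inner
  pure (removed, left)

main :: IO ()
main = demo >>= print
```

The lookup result cannot go stale in a harmful way precisely because of the no-deletion invariant; without it, the inner `TVar` could be orphaned between the lookup and the transaction.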
## Disconnect preserves others' subscriptions
`updateServiceSub` only moves active→pending when `sessId` matches the disconnected client (see its comment). If a new client already established different subscriptions on the same server, those are preserved. Queue subs use `M.partition` to split by SessionId — only matching subs move to pending, non-matching remain active.
`updateServiceSub` only moves active→pending when `sessId` matches the disconnected client (see its comment). If a new client already established different subscriptions on the same router, those are preserved. Queue subs use `M.partition` to split by SessionId — only matching subs move to pending, non-matching remain active.
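The queue-subscription split can be illustrated with `Data.Map.partition`. This is a sketch under simplified assumptions: the type synonyms use `String` in place of the real ID and key types, and `splitBySession` is a hypothetical name.

```haskell
import qualified Data.Map.Strict as M

type SessionId = String
type QueueId = String
type AuthKey = String

-- Split active queue subs by session: subs created by the disconnected
-- session move to pending (dropping the SessionId), the rest stay active.
splitBySession ::
  SessionId ->
  M.Map QueueId (SessionId, AuthKey) ->
  (M.Map QueueId AuthKey, M.Map QueueId (SessionId, AuthKey))
splitBySession sessId active =
  let (matching, others) = M.partition ((== sessId) . fst) active
   in (M.map snd matching, others)

main :: IO ()
main = do
  let active = M.fromList [("q1", ("s1", "k1")), ("q2", ("s2", "k2"))]
  print (splitBySession "s1" active)
```

Here `q2` belongs to a newer session `s2`, so it remains active when the client with session `s1` disconnects.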
## Pending never reset to Nothing on disconnect
@@ -63,7 +63,7 @@ When serviceId and sessionId match the existing active subscription, queue count
## CAServiceUnavailable — cascade to queue resubscription
When `smpSubscribeService` detects service ID or role mismatch with the connection, it fires `CAServiceUnavailable`. See comment on `CAServiceUnavailable` for the full implication: the app must resubscribe all queues individually, creating new associations. This can happen if the SMP server reassigns service IDs (e.g., after downgrade and upgrade).
When `smpSubscribeService` detects service ID or role mismatch with the connection, it fires `CAServiceUnavailable`. See comment on `CAServiceUnavailable` for the full implication: the app must resubscribe all queues individually, creating new associations. This can happen if the SMP router reassigns service IDs (e.g., after downgrade and upgrade).
## getPending — polymorphic over STM/IO
@@ -89,4 +89,4 @@ During reconnection, `reconnectSMPClient` reads current active queue subs (outsi
## addSubs_ — left-biased union
`addSubs_` uses `TM.union` which delegates to `M.union` (left-biased). If a queue subscription already exists, the new auth key from the incoming map wins. Service subs use `writeTVar` (overwrite) since only one service sub exists per server.
`addSubs_` uses `TM.union` which delegates to `M.union` (left-biased). If a queue subscription already exists, the new auth key from the incoming map wins. Service subs use `writeTVar` (overwrite) since only one service sub exists per router.
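The left bias of `M.union` is easy to check in isolation. In this sketch `addSubs` is a hypothetical name, and placing the incoming map as the left operand is an inference from the statement that the new key wins, not a quote of the real code.

```haskell
import qualified Data.Map.Strict as M

type QueueId = String
type AuthKey = String

-- M.union prefers values from its left argument on key collisions,
-- so passing the incoming map first makes new auth keys win.
addSubs :: M.Map QueueId AuthKey -> M.Map QueueId AuthKey -> M.Map QueueId AuthKey
addSubs incoming existing = M.union incoming existing

main :: IO ()
main =
  print $
    addSubs
      (M.fromList [("q1", "new")])
      (M.fromList [("q1", "old"), ("q2", "k2")])
-- prints: fromList [("q1","new"),("q2","k2")]
```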
@@ -12,8 +12,8 @@ Short links encode connection data in two encrypted blobs: fixed data (2048 byte
Two distinct HKDF derivations with different info strings:
- **contactShortLinkKdf**: `HKDF("", linkKey, "SimpleXContactLink", 56)` → splits into 24-byte LinkId + 32-byte SbKey. The LinkId is used as the server-side identifier.
- **invShortLinkKdf**: `HKDF("", linkKey, "SimpleXInvLink", 32)` → 32-byte SbKey only. No LinkId because invitation links don't use server-side lookup.
- **contactShortLinkKdf**: `HKDF("", linkKey, "SimpleXContactLink", 56)` → splits into 24-byte LinkId + 32-byte SbKey. The LinkId is used as the router-side identifier.
- **invShortLinkKdf**: `HKDF("", linkKey, "SimpleXInvLink", 32)` → 32-byte SbKey only. No LinkId because invitation links don't use router-side lookup.
## Fixed padding lengths
@@ -29,7 +29,7 @@ The NTF protocol reuses SMP's transport infrastructure but with reduced paramete
## Same ALPN/legacy fallback pattern as SMP
`ntfServerHandshake` uses the same pattern as `smpServerHandshake`: if ALPN is not negotiated (`getSessionALPN` returns `Nothing`), the server offers only `legacyServerNTFVRange` (v1 only).
`ntfServerHandshake` uses the same pattern as `smpServerHandshake`: if ALPN is not negotiated (`getSessionALPN` returns `Nothing`), the notification router offers only `legacyServerNTFVRange` (v1 only).
## NTF handshake uses SMP shared types
+4 -4
@@ -8,11 +8,11 @@
## Overview
This module defines the SMP protocol's type-level structure, wire encoding, and transport batching. It does not implement the server or client — those are in [Server.hs](./Server.md) and [Client.hs](./Client.md). The protocol spec governs the command semantics; this doc focuses on non-obvious implementation choices.
This module defines the SMP protocol's type-level structure, wire encoding, and transport batching. It does not implement the router or client — those are in [Server.hs](./Server.md) and [Client.hs](./Client.md). The protocol spec governs the command semantics; this doc focuses on non-obvious implementation choices.
## Two separate version scopes
SMP client protocol version (`SMPClientVersion`, 4 versions) is separate from SMP relay protocol version (`SMPVersion`, up to version 19, defined in [Transport.hs](./Transport.md)). The client version governs client-to-client concerns (binary encoding, multi-host addresses, SKEY command, short links). The relay version governs client-to-server wire format, transport encryption, and command availability. See comment above `SMPClientVersion` data declaration for version history.
SMP client protocol version (`SMPClientVersion`, 4 versions) is separate from SMP relay protocol version (`SMPVersion`, up to version 19, defined in [Transport.hs](./Transport.md)). The client version governs client-to-client concerns (binary encoding, multi-host addresses, SKEY command, short links). The relay version governs client-to-router wire format, transport encryption, and command availability. See comment above `SMPClientVersion` data declaration for version history.
## maxMessageLength — version-dependent
@@ -57,7 +57,7 @@ The `MsgFlags` parser consumes the `notification` Bool then calls `A.takeTill (=
## BrokerErrorType NETWORK — detail loss
The `NETWORK` variant of `BrokerErrorType` encodes as just `"NETWORK"` (detail dropped), with a `TODO once all upgrade` comment. The parser falls back to `NEFailedError` when the `NetworkError` detail can't be parsed (`_smpP <|> pure NEFailedError`). This means a newer server's detailed network error is seen as `NEFailedError` by older clients.
The `NETWORK` variant of `BrokerErrorType` encodes as just `"NETWORK"` (detail dropped), with a `TODO once all upgrade` comment. The parser falls back to `NEFailedError` when the `NetworkError` detail can't be parsed (`_smpP <|> pure NEFailedError`). This means a newer router's detailed network error is seen as `NEFailedError` by older clients.
## Version-dependent encoding — scope
@@ -65,4 +65,4 @@ The `NETWORK` variant of `BrokerErrorType` encodes as just `"NETWORK"` (detail d
## SUBS/NSUBS — asymmetric defaulting
When the server parses `SUBS`/`NSUBS` from a client using a version older than `rcvServiceSMPVersion`, both count and hash default (`-1` and `mempty`). For the response side (`SOKS`/`ENDS` via `serviceRespP`), count is still parsed from the wire — only hash defaults to `mempty`. This asymmetry means command-side and response-side parsing have different fallback behavior for the same version boundary.
When the router parses `SUBS`/`NSUBS` from a client using a version older than `rcvServiceSMPVersion`, both count and hash default (`-1` and `mempty`). For the response side (`SOKS`/`ENDS` via `serviceRespP`), count is still parsed from the wire — only hash defaults to `mempty`. This asymmetry means command-side and response-side parsing have different fallback behavior for the same version boundary.
+4 -4
@@ -1,6 +1,6 @@
# Simplex.Messaging.Server
> SMP server: client handling, subscription lifecycle, message delivery, proxy forwarding, control port.
> SMP router (`Server` module): client handling, subscription lifecycle, message delivery, proxy forwarding, control port.
**Source**: [`Server.hs`](../../../../src/Simplex/Messaging/Server.hs)
@@ -8,7 +8,7 @@
## Overview
The server runs as `raceAny_` over many threads — any thread exit stops the entire server. The thread set includes: one `serverThread` per subscription type (SMP, NTF), a notification delivery thread, a pending events thread, a proxy agent receiver, a SIGINT handler, plus per-transport listener threads and optional expiration/stats/prometheus/control-port threads. `E.finally` ensures `stopServer` runs on any exit.
The router runs as `raceAny_` over many threads — any thread exit stops the entire router process. The thread set includes: one `serverThread` per subscription type (SMP, NTF), a notification delivery thread, a pending events thread, a proxy agent receiver, a SIGINT handler, plus per-transport listener threads and optional expiration/stats/prometheus/control-port threads. `E.finally` ensures `stopServer` runs on any exit.
## serverThread — subscription lifecycle with split STM
@@ -51,7 +51,7 @@ When the signature algorithm doesn't match the queue key, verification runs with
## Service subscription — hash-based drift detection
See comment on `sharedSubscribeService`. The client sends expected `(count, idsHash)`. The server reads the actual values from storage, then computes `subsChange = subtractServiceSubs currSubs subs'` — the **difference** between what the client's session currently tracks and the new values. This difference (not the absolute values) is passed to `serverThread` via `CSService` to adjust `totalServiceSubs`. Using differences prevents double-counting when a service resubscribes.
See comment on `sharedSubscribeService`. The client sends expected `(count, idsHash)`. The router reads the actual values from storage, then computes `subsChange = subtractServiceSubs currSubs subs'` — the **difference** between what the client's session currently tracks and the new values. This difference (not the absolute values) is passed to `serverThread` via `CSService` to adjust `totalServiceSubs`. Using differences prevents double-counting when a service resubscribes.
Stats classification: exactly one of `srvSubOk`/`srvSubMore`/`srvSubFewer`/`srvSubDiff` is incremented per subscription. `count == -1` is a special case for old NTF servers.
@@ -91,7 +91,7 @@ See `noSubscriptions`. The idle client disconnect thread only checks expiration
## clientDisconnected — ordered cleanup
On disconnect: (1) set `connected = False`, (2) atomically swap out all subscriptions, (3) cancel subscription threads, (4) if server is still active: delete client from server map, update queue and service subscribers. Service subscription cleanup (`updateServiceSubs`) subtracts the client's accumulated `(count, idsHash)` from `totalServiceSubs`. End threads are swapped out and killed.
On disconnect: (1) set `connected = False`, (2) atomically swap out all subscriptions, (3) cancel subscription threads, (4) if router is still active: delete client from `serverClients` map, update queue and service subscribers. Service subscription cleanup (`updateServiceSubs`) subtracts the client's accumulated `(count, idsHash)` from `totalServiceSubs`. End threads are swapped out and killed.
## Control port — single auth, no downgrade
+2 -2
@@ -16,7 +16,7 @@ SMP ports are parsed first. When explicit WebSocket ports are provided, they are
## iniDBOptions — schema creation disabled at CLI
When reading database options from INI, `createSchema` is always set to `False` regardless of INI content. This enforces a security invariant: database schemas must be created manually or by migration, never automatically by the server.
When reading database options from INI, `createSchema` is always set to `False` regardless of INI content. This enforces a security invariant: database schemas must be created manually or by migration, never automatically by the router.
## createServerX509_ — external tool dependency
@@ -24,7 +24,7 @@ Certificate generation shells out to `openssl` commands via `readCreateProcess`,
## checkSavedFingerprint — startup invariant
Fingerprint is extracted from the CA certificate and saved during init. On every server start, the saved fingerprint is compared against the current certificate. Mismatch → startup failure. See [Main.md#initializeserver--fingerprint-invariant](./Main.md#initializeserver--fingerprint-invariant).
Fingerprint is extracted from the CA certificate and saved during init. On every router start, the saved fingerprint is compared against the current certificate. Mismatch → startup failure. See [Main.md#initializeserver--fingerprint-invariant](./Main.md#initializeserver--fingerprint-invariant).
## genOnline — existing certificate dependency
@@ -1,6 +1,6 @@
# Simplex.Messaging.Server.Control
> Control port protocol types and encoding for server administration.
> Control port protocol types and encoding for router administration.
**Source**: [`Control.hs`](../../../../../src/Simplex/Messaging/Server/Control.hs)
@@ -1,12 +1,12 @@
# Simplex.Messaging.Server.Env.STM
> Server environment, configuration, client state, subscription types, and storage initialization.
> Router environment, configuration, client state, subscription types, and storage initialization.
**Source**: [`Env/STM.hs`](../../../../../../src/Simplex/Messaging/Server/Env/STM.hs)
## Overview
This module defines the server's shared state (`Env`, `Server`, `Client`) and the subscription model types. Most non-obvious patterns are about concurrency safety — preventing STM contention while maintaining consistency. Key patterns are documented in [Server.md](../Server.md) where they're used; this doc covers patterns specific to the type definitions and initialization.
This module defines the router's shared state (`Env`, `Server`, `Client`) and the subscription model types. Most non-obvious patterns are about concurrency safety — preventing STM contention while maintaining consistency. Key patterns are documented in [Server.md](../Server.md) where they're used; this doc covers patterns specific to the type definitions and initialization.
## SubscribedClients — TVar-of-Maybe pattern
@@ -26,7 +26,7 @@ See comment on `deleteSubcribedClient`. The TVar lookup is in a separate IO read
## insertServerClient — connected check
`insertServerClient` checks `connected` inside the STM transaction before inserting. If the client was already marked disconnected (race with cleanup), the insert is skipped and returns `False`. This prevents resurrecting a disconnected client in the server map.
`insertServerClient` checks `connected` inside the STM transaction before inserting. If the client was already marked disconnected (race with cleanup), the insert is skipped and returns `False`. This prevents resurrecting a disconnected client in the `serverClients` map.
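The connected check can be sketched as a single STM action. This is a simplified model, not the module's definition: the `Client` record has only two fields, the client map is an `IntMap`, and `insertClient` is a hypothetical name.

```haskell
import Control.Concurrent.STM (STM, TVar, atomically, modifyTVar', newTVarIO, readTVar, readTVarIO)
import Control.Monad (when)
import qualified Data.IntMap.Strict as IM

-- Simplified client record; the real one carries queues, subscriptions, etc.
data Client = Client {clientId :: Int, connected :: TVar Bool}

-- Insert only if the client is still connected. The check and the insert
-- share one transaction, so cleanup cannot slip in between them.
insertClient :: TVar (IM.IntMap Client) -> Client -> STM Bool
insertClient clients c = do
  ok <- readTVar (connected c)
  when ok $ modifyTVar' clients (IM.insert (clientId c) c)
  pure ok

demo :: IO (Bool, Bool, [Int])
demo = do
  clients <- newTVarIO IM.empty
  live <- Client 1 <$> newTVarIO True
  gone <- Client 2 <$> newTVarIO False -- already marked disconnected
  r1 <- atomically (insertClient clients live)
  r2 <- atomically (insertClient clients gone)
  ids <- IM.keys <$> readTVarIO clients
  pure (r1, r2, ids)

main :: IO ()
main = demo >>= print
```

If the check were done in a separate transaction, a disconnect between the check and the insert could leave a dead client in the map.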
## SupportedStore — compile-time storage validation
@@ -34,7 +34,7 @@ Type family with `(Int ~ Bool, TypeError ...)` for invalid combinations. The uns
## newEnv — initialization order
Store initialization order matters: (1) create message store (loads store log for STM backends), (2) create notification store (empty TMap), (3) generate TLS credentials, (4) compute server identity from fingerprint, (5) create stats, (6) create proxy agent. The store log load (`loadStoreLog`) calls `readWriteQueueStore` which reads the existing log, replays it to build state, then opens a new log for writing. `setStoreLog` attaches the write log to the store.
Store initialization order matters: (1) create message store (loads store log for STM backends), (2) create notification store (empty TMap), (3) generate TLS credentials, (4) compute router identity from fingerprint, (5) create stats, (6) create proxy agent. The store log load (`loadStoreLog`) calls `readWriteQueueStore` which reads the existing log, replays it to build state, then opens a new log for writing. `setStoreLog` attaches the write log to the store.
HTTPS credentials are validated: must be at least 4096-bit RSA (`public_size >= 512` bytes). The check explicitly notes that Let's Encrypt ECDSA uses "insecure curve p256."
@@ -1,6 +1,6 @@
# Simplex.Messaging.Server.Information
> Server public information types (config, operator, hosting) for the server info page.
> Router public information types (config, operator, hosting) for the router info page.
**Source**: [`Information.hs`](../../../../../src/Simplex/Messaging/Server/Information.hs)
@@ -1,12 +1,12 @@
# Simplex.Messaging.Server.Main
> Server CLI entry point: dispatches Init, Start, Delete, Journal, and Database commands.
> Router CLI entry point: dispatches Init, Start, Delete, Journal, and Database commands.
**Source**: [`Main.hs`](../../../../../src/Simplex/Messaging/Server/Main.hs)
## Overview
This is the CLI dispatcher for the SMP server. It parses INI configuration, validates storage mode combinations, and dispatches to the appropriate command handler. The most complex logic is storage configuration validation and migration between storage modes.
This is the CLI dispatcher for the SMP router. It parses INI configuration, validates storage mode combinations, and dispatches to the appropriate command handler. The most complex logic is storage configuration validation and migration between storage modes.
## Storage mode compatibility — state machine
@@ -1,6 +1,6 @@
# Simplex.Messaging.Server.Main.Init
> Server initialization: INI file content generation, default settings, and CLI option structures.
> Router initialization: INI file content generation, default settings, and CLI option structures.
**Source**: [`Main/Init.hs`](../../../../../../src/Simplex/Messaging/Server/Main/Init.hs)
@@ -1,6 +1,6 @@
# Simplex.Messaging.Server.MsgStore.Postgres
> PostgreSQL message store: server-side stored procedures for message operations, COPY protocol for bulk import.
> PostgreSQL message store: router-side stored procedures for message operations, COPY protocol for bulk import.
**Source**: [`Postgres.hs`](../../../../../../src/Simplex/Messaging/Server/MsgStore/Postgres.hs)
@@ -14,7 +14,7 @@ All associated types (`StoreMonad`, `MsgQueue`, `StoreQueue`, `QueueStore`, `Msg
## tryDelPeekMsg — atomic delete-and-peek
Deletes the current message AND peeks the next one in a single `isolateQueue` call. This atomicity is critical for the ACK flow: the server needs to know if there's a next message to deliver immediately after acknowledging the current one, without a window where a concurrent SEND could interleave.
Deletes the current message AND peeks the next one in a single `isolateQueue` call. This atomicity is critical for the ACK flow: the router needs to know if there's a next message to deliver immediately after acknowledging the current one, without a window where a concurrent SEND could interleave.
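The delete-and-peek idea can be modeled with an in-memory queue. This sketch uses an STM transaction as a stand-in for the isolation that `isolateQueue` provides; the `Msg` type, the list-backed queue, and the behavior on an ID mismatch are all assumptions made for illustration.

```haskell
import Control.Concurrent.STM (STM, TVar, atomically, newTVarIO, readTVar, writeTVar)
import Data.Maybe (listToMaybe)

data Msg = Msg {msgId :: Int, body :: String}
  deriving (Eq, Show)

-- Delete the head message if its ID matches, and return it together with
-- the next message, all in one transaction. No concurrent writer can
-- observe the queue between the delete and the peek.
delPeek :: TVar [Msg] -> Int -> STM (Maybe Msg, Maybe Msg)
delPeek q mId = do
  msgs <- readTVar q
  case msgs of
    m : rest
      | msgId m == mId -> do
          writeTVar q rest
          pure (Just m, listToMaybe rest)
    _ -> pure (Nothing, listToMaybe msgs) -- mismatch: nothing deleted (assumption)

demo :: IO (Maybe Msg, Maybe Msg)
demo = do
  q <- newTVarIO [Msg 1 "a", Msg 2 "b"]
  atomically (delPeek q 1)

main :: IO ()
main = demo >>= print
```

Doing the two steps separately would open the window the text warns about, where a concurrent SEND lands between the delete and the peek.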
## withIdleMsgQueue — journal-specific lifecycle
@@ -22,7 +22,7 @@ For Journal store, the message queue file handle is closed after the action if i
## unsafeWithAllMsgQueues — CLI-only
Explicitly unsafe: iterates all queues including those not in active memory. Only safe before server start or in CLI commands. During normal operation, Journal store may have queues on disk but not loaded — this function would load them, interfering with the lazy-loading lifecycle.
Explicitly unsafe: iterates all queues including those not in active memory. Only safe before router start or in CLI commands. During normal operation, Journal store may have queues on disk but not loaded — this function would load them, interfering with the lazy-loading lifecycle.
## snapshotTQueue visibility gap
@@ -6,7 +6,7 @@
## storeNtf — outside-STM lookup with STM fallback
`storeNtf` uses `TM.lookupIO` outside STM, then falls back to `TM.lookup` inside STM if the notifier entry doesn't exist. This is the same outside-STM lookup pattern used in Server.hs and Client/Agent.hs — avoids transaction re-evaluation from unrelated map changes. The double-check inside STM prevents races when two messages arrive concurrently for a new notifier.
`storeNtf` uses `TM.lookupIO` outside STM, then falls back to `TM.lookup` inside STM if the notifier entry doesn't exist. This is the same outside-STM lookup pattern used in the router (`Server.hs`) and `Client/Agent.hs` — avoids transaction re-evaluation from unrelated map changes. The double-check inside STM prevents races when two messages arrive concurrently for a new notifier.
## deleteExpiredNtfs — last-is-earliest optimization
@@ -1,6 +1,6 @@
# Simplex.Messaging.Server.Prometheus
> Prometheus text exposition format for server metrics, with histogram gap-filling and derived aggregations.
> Prometheus text exposition format for router metrics, with histogram gap-filling and derived aggregations.
**Source**: [`Prometheus.hs`](../../../../../src/Simplex/Messaging/Server/Prometheus.hs)
@@ -30,7 +30,7 @@ Batch queue lookups (`getQueues_`) read the entire TVar map once with `readTVarI
## closeQueueStore — non-atomic shutdown
`closeQueueStore` clears TMaps in separate `atomically` calls, not one transaction. Concurrent operations during shutdown could see partially cleared state. This is acceptable because the store log is closed first, and the server should not be processing new requests during shutdown.
`closeQueueStore` clears TMaps in separate `atomically` calls, not one transaction. Concurrent operations during shutdown could see partially cleared state. This is acceptable because the store log is closed first, and the router should not be processing new requests during shutdown.
## addQueueLinkData — conditional idempotency
@@ -1,6 +1,6 @@
# Simplex.Messaging.Server.Stats
> Server statistics: counters, rolling period tracking, delivery time histograms, proxy stats, service stats.
> Router statistics: counters, rolling period tracking, delivery time histograms, proxy stats, service stats.
**Source**: [`Stats.hs`](../../../../../src/Simplex/Messaging/Server/Stats.hs)
@@ -36,4 +36,4 @@ In `logServerStats` (Server.hs), each counter is read and reset via `atomicSwapI
## setPeriodStats — not thread safe
See comment on `setPeriodStats`. Uses `writeIORef` (not atomic). Only safe during server startup when no other threads are running. If called concurrently, period data could be corrupted.
See comment on `setPeriodStats`. Uses `writeIORef` (not atomic). Only safe during router startup when no other threads are running. If called concurrently, period data could be corrupted.
@@ -17,11 +17,11 @@ The `.start` temp backup file provides crash recovery during compaction. The seq
3. Write compacted state to new file
4. Rename `.start` to timestamped backup, remove old backups
If the server crashes during step 3, the next startup detects `.start` and restores from it instead of the incomplete new file. Any partially-written current file is preserved as `.bak`. The comment says "do not terminate" during compaction — there is no safe interrupt point between steps 2 and 4.
If the router crashes during step 3, the next startup detects `.start` and restores from it instead of the incomplete new file. Any partially-written current file is preserved as `.bak`. The comment says "do not terminate" during compaction — there is no safe interrupt point between steps 2 and 4.
## removeStoreLogBackups — layered retention policy
Backup retention is layered: (1) keep all backups newer than 24 hours, (2) of the rest, keep at least 3, (3) of those eligible for deletion, only delete backups older than 21 days. This means a server with infrequent restarts accumulates many backups (only cleaned on startup), while a frequently-restarting server keeps a rolling window. Backup timestamps come from ISO 8601 suffixes parsed from filenames.
Backup retention is layered: (1) keep all backups newer than 24 hours, (2) of the rest, keep at least 3, (3) of those eligible for deletion, only delete backups older than 21 days. This means a router with infrequent restarts accumulates many backups (only cleaned on startup), while a frequently-restarting router keeps a rolling window. Backup timestamps come from ISO 8601 suffixes parsed from filenames.
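
The three layers can be sketched as a pure function over backup ages. This is an illustration under stated assumptions, not the implementation of `removeStoreLogBackups`: ages are given in whole hours rather than parsed from filename timestamps, and "keep at least 3" is assumed to mean the three newest of the older backups. `backupsToDelete` is a hypothetical name.

```haskell
import Data.List (partition, sortOn)

-- | Given (file name, age in hours) pairs, return the names that would be deleted.
backupsToDelete :: [(FilePath, Integer)] -> [FilePath]
backupsToDelete backups = [name | (name, age) <- candidates, age > 21 * 24]
  where
    -- (1) everything newer than 24 hours is always kept
    (_recent, older) = partition ((< 24) . snd) backups
    -- (2) of the rest, the 3 newest are kept (assumed ordering: newest first)
    candidates = drop 3 (sortOn snd older)
    -- (3) the comprehension above deletes only candidates older than 21 days

main :: IO ()
main =
  print $
    backupsToDelete
      [("a", 1), ("b", 30), ("c", 100), ("d", 200), ("e", 600), ("f", 300)]
```

In this example `a` is protected by the 24-hour rule, `b`, `c` and `d` by the minimum count, and `f` by the 21-day threshold, so only `e` is removed.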
## QueueRec StrEncoding — backward-compatible parsing
+2 -2

@@ -1,12 +1,12 @@
# Simplex.Messaging.Server.Web
> Static site generation, serving (HTTP, HTTPS, HTTP/2), and template rendering for the server info page.
> Static site generation, serving (HTTP, HTTPS, HTTP/2), and template rendering for the router info page.
**Source**: [`Web.hs`](../../../../../src/Simplex/Messaging/Server/Web.hs)
## attachStaticFiles — reusing Warp internals for TLS connections
`attachStaticFiles` receives already-established TLS connections (which passed TLS handshake and ALPN check in the SMP transport layer) and runs Warp's HTTP handler on them. It manually calls `WI.withII`, `WT.attachConn`, `WI.registerKillThread`, and `WI.serveConnection` — internal Warp APIs. This couples the server to Warp internals and could break on Warp library updates.
`attachStaticFiles` receives already-established TLS connections (which passed TLS handshake and ALPN check in the SMP transport layer) and runs Warp's HTTP handler on them. It manually calls `WI.withII`, `WT.attachConn`, `WI.registerKillThread`, and `WI.serveConnection` — internal Warp APIs. This couples the router to Warp internals and could break on Warp library updates.
## serveStaticPageH2 — path traversal protection
+13 -13
@@ -10,7 +10,7 @@
This is the core transport module. It defines:
- The `Transport` typeclass abstracting over TLS and WebSocket connections
- The SMP handshake protocol (server and client sides)
- The SMP handshake protocol (router and client sides)
- Optional block encryption using HKDF-derived symmetric key chains (v11+)
- Version negotiation with backward-compatible extensions
@@ -29,8 +29,8 @@ In practice (Server.hs), the SMP proxy uses `proxiedSMPRelayVRange` to cap the d
## withTlsUnique — different API calls yield same value
`withTlsUnique` extracts the tls-unique channel binding (RFC 5929) using a type-level dispatch:
- **Server** (`STServer`): `T.getPeerFinished` — the peer's (client's) Finished message
- **Client** (`STClient`): `T.getFinished` — own (client's) Finished message
- **Router side** (`STServer`): `T.getPeerFinished` — the peer's (client's) Finished message
- **Client side** (`STClient`): `T.getFinished` — own (client's) Finished message
Both calls yield the client's Finished message. If the result is `Nothing`, the connection is closed immediately (`closeTLS cxt >> ioe_EOF`).
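
The type-level dispatch can be illustrated with a singleton GADT. This is a simplified sketch: `Conn`, `ownFinished`, `peerFinished` and `tlsUnique` are hypothetical stand-ins for the TLS context and the `T.getFinished` / `T.getPeerFinished` calls, and the peer types are reduced to the minimum needed here.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

data TransportPeer = TServer | TClient

-- Singleton: the constructor determines the type-level peer.
data STransportPeer (p :: TransportPeer) where
  STServer :: STransportPeer 'TServer
  STClient :: STransportPeer 'TClient

-- Hypothetical stand-in for a TLS context holding both Finished messages.
data Conn = Conn
  { ownFinished :: Maybe String,
    peerFinished :: Maybe String
  }

-- Router side reads the peer's Finished message, client side reads its own.
tlsUnique :: STransportPeer p -> Conn -> Maybe String
tlsUnique STServer = peerFinished
tlsUnique STClient = ownFinished

-- The same TLS session seen from both ends: both sides obtain the client's Finished message.
bothAgree :: Bool
bothAgree = tlsUnique STServer routerSide == tlsUnique STClient clientSide
  where
    routerSide = Conn {ownFinished = Just "router-finished", peerFinished = Just "client-finished"}
    clientSide = Conn {ownFinished = Just "client-finished", peerFinished = Just "router-finished"}

main :: IO ()
main = print bothAgree
```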
@@ -41,31 +41,31 @@ Two TLS parameter sets:
- **`defaultSupportedParams`**: ChaCha20-Poly1305 ciphers only, Ed448/Ed25519 signatures only, X448/X25519 groups. Per the protocol spec: "TLS_CHACHA20_POLY1305_SHA256 cipher suite, ed25519 EdDSA algorithms for signatures, x25519 ECDHE groups for key exchange."
- **`defaultSupportedParamsHTTPS`**: extends `defaultSupportedParams` with `ciphersuite_strong`, additional groups, and additional hash/signature combinations. The source comment says: "A selection of extra parameters to accomodate browser chains."
In the SMP server (Server.hs), when HTTP credentials are configured, `defaultSupportedParamsHTTPS` is used for all connections on that port (not selected per-connection). When no HTTP credentials are configured, `defaultSupportedParams` is used.
In the SMP router (`Server.hs`), when HTTP credentials are configured, `defaultSupportedParamsHTTPS` is used for all connections on that port (not selected per-connection). When no HTTP credentials are configured, `defaultSupportedParams` is used.
## SMP handshake flow
Per the [protocol spec](../../../../protocol/simplex-messaging.md#transport-handshake), the handshake is a two-message exchange (three if service certs are used):
1. **Server → Client**: `paddedRouterHello` containing `smpVersionRange`, `sessionIdentifier` (tls-unique), and `routerCertKey` (certificate chain + X25519 key signed by the server's certificate)
2. **Client → Server**: `paddedClientHello` containing agreed `smpVersion`, `keyHash` (router identity — CA certificate fingerprint), optional `clientKey`, `proxyRouter` flag, and optional `clientService`
3. **Server → Client** (service only): `paddedRouterHandshakeResponse` containing assigned `serviceId` or `handshakeError`
1. **Router → Client**: `paddedRouterHello` containing `smpVersionRange`, `sessionIdentifier` (tls-unique), and `routerCertKey` (certificate chain + X25519 key signed by the router's certificate)
2. **Client → Router**: `paddedClientHello` containing agreed `smpVersion`, `keyHash` (router identity — CA certificate fingerprint), optional `clientKey`, `proxyRouter` flag, and optional `clientService`
3. **Router → Client** (service only): `paddedRouterHandshakeResponse` containing assigned `serviceId` or `handshakeError`
The client verifies `sessionIdentifier` matches its own tls-unique (`when (sessionId /= sessId) $ throwE TEBadSession`). The server verifies `keyHash` matches its CA fingerprint (`when (keyHash /= kh) $ throwE $ TEHandshake IDENTITY`).
The client verifies `sessionIdentifier` matches its own tls-unique (`when (sessionId /= sessId) $ throwE TEBadSession`). The router verifies `keyHash` matches its CA fingerprint (`when (keyHash /= kh) $ throwE $ TEHandshake IDENTITY`).
Per the protocol spec: "For TLS transport client should assert that sessionIdentifier is equal to tls-unique channel binding defined in RFC 5929."
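
The two verification steps quoted above can be sketched in isolation. The sketch uses plain `Either` instead of `throwE` so that it depends only on `base`; `checkSession` and `checkIdentity` are hypothetical names, and identifiers are modelled as `String` for brevity.

```haskell
import Control.Monad (when)

data HandshakeError = IDENTITY deriving (Eq, Show)

data TransportError = TEBadSession | TEHandshake HandshakeError deriving (Eq, Show)

-- Client side: the session identifier sent by the router must equal
-- the tls-unique value the client computed for its own TLS session.
checkSession :: String -> String -> Either TransportError ()
checkSession sessId sessionId = when (sessionId /= sessId) $ Left TEBadSession

-- Router side: the key hash sent by the client must equal
-- the fingerprint of the router's own CA certificate.
checkIdentity :: String -> String -> Either TransportError ()
checkIdentity kh keyHash = when (keyHash /= kh) $ Left (TEHandshake IDENTITY)

main :: IO ()
main = do
  print (checkSession "tls-unique-1" "tls-unique-1")
  print (checkIdentity "fingerprint-a" "fingerprint-b")
```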
### legacyServerSMPRelayVRange when no ALPN
If ALPN is not negotiated (`getSessionALPN c` returns `Nothing`), the server offers `legacyServerSMPRelayVRange` (v6 only) instead of the full version range. Per the protocol spec: "If the client does not confirm this protocol name, the router would fall back to v6 of SMP protocol." The spec notes: "This is added to allow support of older clients without breaking backward compatibility and to extend or modify handshake syntax."
If ALPN is not negotiated (`getSessionALPN c` returns `Nothing`), the router offers `legacyServerSMPRelayVRange` (v6 only) instead of the full version range. Per the protocol spec: "If the client does not confirm this protocol name, the router would fall back to v6 of SMP protocol." The spec notes: "This is added to allow support of older clients without breaking backward compatibility and to extend or modify handshake syntax."
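
A minimal sketch of the fallback. `VersionRange`, `legacyRange` and `offeredRange` are hypothetical simplifications; the upper bound of the full range and the ALPN string in the example are placeholders, not values taken from the source.

```haskell
data VersionRange = VersionRange {minVersion :: Int, maxVersion :: Int}
  deriving (Eq, Show)

-- v6 only, as described for `legacyServerSMPRelayVRange`.
legacyRange :: VersionRange
legacyRange = VersionRange 6 6

-- Without a negotiated ALPN the legacy range is offered, otherwise the full range.
offeredRange :: VersionRange -> Maybe String -> VersionRange
offeredRange full = maybe legacyRange (const full)

main :: IO ()
main = do
  let full = VersionRange 6 16 -- placeholder upper bound
  print (offeredRange full Nothing)
  print (offeredRange full (Just "example-alpn")) -- placeholder ALPN value
```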
### Service certificate handshake extension
When `clientService` is present in the client handshake, the server performs additional verification:
When `clientService` is present in the client handshake, the router performs additional verification:
- The TLS client certificate chain must exactly match the certificate chain in the handshake message (`getPeerCertChain c == cc`)
- The signed X25519 public key is verified against the leaf certificate's key (`getCertVerifyKey leafCert` then `C.verifyX509`)
- On success, the server sends `SMPServerHandshakeResponse` with a `serviceId`
- On failure, the server sends `SMPServerHandshakeError` before raising the error
- On success, the router sends `SMPServerHandshakeResponse` with a `serviceId`
- On failure, the router sends `SMPServerHandshakeError` before raising the error
Per the protocol spec (v16+): "`clientService` provides long-term service client certificate for high-volume services using SMP router (chat relays, notification routers, high traffic bots). The router responds with a third handshake message containing the assigned service ID."
@@ -86,7 +86,7 @@ The protocol spec version history (v11) describes this as "additional encryption
## smpTHandleClient — chain key swap
`smpTHandleClient` applies `swap` to the chain key pair before creating `TSbChainKeys`. The code comment states: "swap is needed to use client's sndKey as server's rcvKey and vice versa."
`smpTHandleClient` applies `swap` to the chain key pair before creating `TSbChainKeys`. The code comment states: "swap is needed to use client's sndKey as server's rcvKey and vice versa." (Here "server" is the code's term for the router side of the transport.)
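
The effect of the swap can be shown with a toy model. `ChainKey`, `Keys`, `mkKeys` and the order of the derived pair are assumptions made for illustration; only `swap` from `Data.Tuple` is taken from the text.

```haskell
import Data.Tuple (swap)

newtype ChainKey = ChainKey String deriving (Eq, Show)

data Keys = Keys {sndKey :: ChainKey, rcvKey :: ChainKey} deriving (Eq, Show)

mkKeys :: (ChainKey, ChainKey) -> Keys
mkKeys (s, r) = Keys {sndKey = s, rcvKey = r}

-- Both sides derive the same ordered pair from the shared secret (assumed order).
derived :: (ChainKey, ChainKey)
derived = (ChainKey "k1", ChainKey "k2")

-- The router side uses the pair as derived; the client side swaps it.
routerKeys, clientKeys :: Keys
routerKeys = mkKeys derived
clientKeys = mkKeys (swap derived)

-- What one side sends with, the other side receives with.
keysMatch :: Bool
keysMatch = sndKey clientKeys == rcvKey routerKeys && rcvKey clientKeys == sndKey routerKeys

main :: IO ()
main = print keysMatch
```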
## Proxy version downgrade logic
@@ -1,6 +1,6 @@
# Simplex.Messaging.Transport.Server
> TLS server: socket lifecycle, client acceptance, SNI credential switching, socket leak detection.
> TLS listener: socket lifecycle, client acceptance, SNI credential switching, socket leak detection.
**Source**: [`Transport/Server.hs`](../../../../../src/Simplex/Messaging/Transport/Server.hs)
@@ -19,10 +19,10 @@
## SNI credential switching
`supportedTLSServerParams` selects TLS credentials based on SNI:
- **No SNI**: uses `credential` (the primary server credential)
- **No SNI**: uses `credential` (the primary router credential)
- **SNI present**: uses `sniCredential` (when configured)
The `sniCredUsed` TVar records whether SNI triggered credential switching. In the SMP server (Server.hs), when `sniUsed` is `True`, the connection is dispatched to the HTTP handler instead of the SMP handler.
The `sniCredUsed` TVar records whether SNI triggered credential switching. In the SMP router (`Server.hs`), when `sniUsed` is `True`, the connection is dispatched to the HTTP handler instead of the SMP handler.
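
The selection and dispatch described above can be sketched as two pure functions. This is an assumption-laden simplification: the real code records the flag in a `TVar`, while here it is returned as part of the result, and `Credential`, `Handler`, `selectCredential` and `dispatch` are hypothetical names. The case of SNI being present without a configured `sniCredential` is assumed to fall back to the primary credential.

```haskell
newtype Credential = Credential String deriving (Eq, Show)

data Handler = SMPHandler | HTTPHandler deriving (Eq, Show)

-- Returns the credential to present and whether SNI triggered the switch.
selectCredential :: Credential -> Maybe Credential -> Maybe String -> (Credential, Bool)
selectCredential primary sniCredential sniHost =
  case (sniHost, sniCredential) of
    (Just _, Just cred) -> (cred, True)
    _ -> (primary, False)

-- Connections that used the SNI credential go to the HTTP handler.
dispatch :: Bool -> Handler
dispatch sniUsed = if sniUsed then HTTPHandler else SMPHandler

main :: IO ()
main = do
  let primary = Credential "router"
      web = Just (Credential "web")
  print (dispatch . snd $ selectCredential primary web Nothing)
  print (dispatch . snd $ selectCredential primary web (Just "example.com"))
```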
## startTCPServer — address resolution
@@ -1,6 +1,6 @@
# Simplex.Messaging.Transport.Shared
> Certificate chain parsing and X.509 validation utilities shared between client and server.
> Certificate chain parsing and X.509 validation utilities shared between client and router.
**Source**: [`Transport/Shared.hs`](../../../../../src/Simplex/Messaging/Transport/Shared.hs)
@@ -19,10 +19,10 @@
| 4 | `CCValid {leafCert, idCert, _, caCert}` | "with network certificate" |
| 5+ | `CCLong` | (rejected) |
The protocol spec defines supported chain lengths of 2, 3, and 4 certificates (see [Router certificate](../../../../protocol/simplex-messaging.md#router-certificate)). In all `CCValid` cases, `idCert` is the certificate whose fingerprint is compared against the server address key hash, and `caCert` is used as the X.509 trust anchor.
The protocol spec defines supported chain lengths of 2, 3, and 4 certificates (see [Router certificate](../../../../protocol/simplex-messaging.md#router-certificate)). In all `CCValid` cases, `idCert` is the certificate whose fingerprint is compared against the router identity (key hash in the queue URI), and `caCert` is used as the X.509 trust anchor.
In the 4-cert case, index 2 is skipped (`_`) — it is present in the chain but not used as either the identity or the trust anchor.
## x509validate — FQHN check disabled
`x509validate` sets `checkFQHN = False`. The protocol spec identifies servers by certificate fingerprint (key hash in the server address), not by domain name. The validation uses a fresh `ValidationCache` (`ValidationCacheUnknown` for all lookups, no-op store) — each connection validates independently.
`x509validate` sets `checkFQHN = False`. The protocol spec identifies routers by certificate fingerprint (key hash in the queue URI), not by domain name. The validation uses a fresh `ValidationCache` (`ValidationCacheUnknown` for all lookups, no-op store) — each connection validates independently.