Files
Evgeny @ SimpleX Chat f7be44981a SMP router specs
2026-03-12 18:13:12 +00:00

3.6 KiB

Simplex.Messaging.Server.Stats

Router statistics: counters, rolling period tracking, delivery time histograms, proxy stats, service stats.

Source: Stats.hs

Overview

All stats are IORef-based, not STM — individual increments are atomic (atomicModifyIORef'_) but multi-field reads are not transactional. getServerStatsData reads 30+ IORefs sequentially — the resulting snapshot is temporally smeared, not a point-in-time atomic view.

PeriodStats — rolling window with boundary-only reset

PeriodStats maintains three IORef IntSet (day, week, month). updatePeriodStats hashes the entity ID and inserts into all three periods. periodStatCounts resets a period's IntSet only when the period boundary is reached (day 1 of that period). At non-boundary times, it returns "" (empty string) — the data is kept accumulating but not reported.

See comment on periodCount. At day boundary (periodCount 1 ref), the day set is atomically swapped to empty and its size returned. Week resets on Monday (day 1 of week), month on the 1st. Periods are independent — day reset does NOT affect week/month accumulation. Each period counts unique queue hashes that were active during that period.

Disabled metrics — performance trade-offs

See comments on qSubNoMsg and subscribedQueues in the source. qSubNoMsg is disabled because counting "subscription with no message" creates too many STM transactions. subscribedQueues is disabled because maintaining PeriodStats-style IntSets for all subscribed queues uses too much memory. Both fields are omitted from the stats output entirely. The parser handles old log files that contain these fields: qSubNoMsg is silently skipped via skipInt, and subscribedQueues is parsed but replaced with empty data.

TimeBuckets — ceil-aligned bucketing with precision loss

updateTimeBuckets quantizes delivery-to-acknowledgment times into sparse buckets. Exact for 0-5s, then ceil-aligned: 6-30s → 5s buckets, 31-60s → 10s, 61-180s → 30s, 180+s → 60s. The toBucket formula uses - ((- n) \div` m) * mfor ceiling division.sumTimeandmaxTime` preserve exact values; only the histogram is lossy.

Serialization backward compatibility — silent data coercion

The strP parser for ServerStatsData handles multiple format generations. Old format qDeleted= is read as (value, 0, 0)qDeletedNew and qDeletedSecured default to 0. qSubNoMsg is parsed and silently discarded (skipInt). subscribedQueues is parsed but replaced with empty data. Data loaded from old formats is coerced, not reconstructed — precision is permanently lost.

Serialization typo — internally consistent

The field _srvAssocUpdated is serialized as "assocUpdatedt=" (extra 't') in ServiceStatsData encoding. The parser expects the same misspelling. Both sides are consistent, so it works — but external systems expecting assocUpdated= will fail to parse.

atomicSwapIORef for stats logging

In logServerStats (Server.hs), each counter is read and reset via atomicSwapIORef ref 0. This is lock-free but means counters are zeroed after each logging interval — values represent delta since last log, not cumulative totals. qCount and msgCount are exceptions: they're read-only (via readIORef) because they track absolute current values, not deltas.

setPeriodStats — not thread safe

See comment on setPeriodStats. Uses writeIORef (not atomic). Only safe during router startup when no other threads are running. If called concurrently, period data could be corrupted.