From 40b814c4233f07f427854b092d60cdf28dd2ba8c Mon Sep 17 00:00:00 2001 From: Narasimha-sc <166327228+Narasimha-sc@users.noreply.github.com> Date: Tue, 21 Apr 2026 09:13:36 +0000 Subject: [PATCH] support-bot: plans adversarial review --- .../20260207-support-bot-implementation.md | 299 ++++++++++++++++-- .../plans/20260207-support-bot.md | 2 +- 2 files changed, 272 insertions(+), 29 deletions(-) diff --git a/apps/simplex-support-bot/plans/20260207-support-bot-implementation.md b/apps/simplex-support-bot/plans/20260207-support-bot-implementation.md index d1c76d9b56..22ff5fc7e1 100644 --- a/apps/simplex-support-bot/plans/20260207-support-bot-implementation.md +++ b/apps/simplex-support-bot/plans/20260207-support-bot-implementation.md @@ -38,8 +38,9 @@ SimpleX Chat support bot — standalone Node.js app using `simplex-chat-nodejs` ``` apps/simplex-support-bot/ -├── package.json # deps: simplex-chat, @simplex-chat/types, async-mutex +├── package.json # deps: simplex-chat, @simplex-chat/types, async-mutex; devDeps: vitest, @types/node ├── tsconfig.json # ES2022, strict, Node16 module resolution +├── vitest.config.ts # test runner config, path aliases for mocks ├── src/ │ ├── index.ts # Entry: parse config, init instance, run │ ├── config.ts # CLI arg parsing, ID:name validation, Config type @@ -48,6 +49,11 @@ apps/simplex-support-bot/ │ ├── grok.ts # GrokApiClient: xAI API wrapper, system prompt, history │ ├── messages.ts # All user-facing message templates │ └── util.ts # isWeekend, profileMutex, logging helpers +├── bot.test.ts # Vitest suite (129 tests, 28 describes) +├── test/ +│ └── __mocks__/ +│ ├── simplex-chat.js # MockChatApi + utility re-exports +│ └── simplex-chat-types.js # enum re-exports for tests └── data/ # SQLite databases (created at runtime) ``` @@ -63,12 +69,41 @@ The Grok system-prompt / context file is supplied at runtime via `--context-file | `--team-group` | Yes | — | `name` | Team group display name (auto-created if absent, resolved by persisted ID 
on restarts) | | `--auto-add-team-members` / `-a` | No | `""` | `ID:name,...` | Comma-separated team member contacts. Validated at startup — exits on mismatch. | | `--context-file` | Required when `GROK_API_KEY` set | — | path | Grok system-prompt file (SimpleX documentation context). `parseConfig` throws if `GROK_API_KEY` is set without this flag. | -| `--timezone` | No | `"UTC"` | IANA tz | For weekend detection (24h vs 48h). Weekend = Sat 00:00 – Sun 23:59 in this tz. | -| `--complete-hours` | No | `3` | number | Hours of customer inactivity after last team/Grok reply before auto-completing a conversation (✅) | -| `--card-flush-seconds` | No | `300` | number | Seconds between card dashboard update flushes | +| `--timezone` | No | `"UTC"` | IANA tz | For weekend detection (24h vs 48h). Weekend = Sat 00:00 – Sun 23:59 in this tz. `parseConfig` validates the value by constructing a probe `Intl.DateTimeFormat` and throws with a clear error on `RangeError` (invalid IANA zone) — bot exits before init. | +| `--complete-hours` | No | `3` | integer ≥ 0 | Hours of customer inactivity after last team/Grok reply before auto-completing a conversation (✅). `parseConfig` rejects non-numeric, negative, or `NaN` values with a fail-fast error. `0` is allowed and disables auto-complete. | +| `--card-flush-seconds` | No | `300` | integer ≥ 0 | Seconds between card dashboard update flushes. `parseConfig` rejects non-numeric, negative, or `NaN` values with a fail-fast error. `0` is allowed and disables periodic flush (card updates still occur on explicit `scheduleUpdate` callers but never auto-drain). | **Env vars:** `GROK_API_KEY` (optional) — xAI API key. 
If unset or empty, the bot starts with Grok support fully disabled: it logs `"No GROK_API_KEY provided, disabling Grok support"`, skips Grok profile/contact setup and event handler registration, omits `/grok` from the bot command list, drops the `/grok` clause from customer-facing messages, and treats any `/grok` the customer still types as an unknown command. +**Numeric argument validation:** `parseConfig` MUST validate every numeric flag (`--complete-hours`, `--card-flush-seconds`) using a helper that throws on non-finite or negative results, rather than raw `parseInt`: + +```typescript +function parseNonNegativeInt(raw: string, flag: string): number { + const n = parseInt(raw, 10) + if (!Number.isFinite(n) || n < 0) { + throw new Error(`${flag} must be a non-negative integer, got "${raw}"`) + } + return n +} + +const completeHours = parseNonNegativeInt(optionalArg(args, "--complete-hours", "3"), "--complete-hours") +const cardFlushSeconds = parseNonNegativeInt(optionalArg(args, "--card-flush-seconds", "300"), "--card-flush-seconds") +``` + +Rationale: `parseInt("foo", 10)` returns `NaN`, and `NaN * 3600_000 === NaN`. Every subsequent comparison (`now - lastTeamGrokTime >= completeMs`) is `false`, so the feature silently becomes a no-op — auto-complete never fires, cards never auto-refresh — and the operator has no signal that they typo'd a flag. Failing fast at startup surfaces the typo before customers interact. `0` is explicitly allowed as a valid "disable" setting. + +**Timezone validation:** `parseConfig` MUST validate `--timezone` by constructing a probe `Intl.DateTimeFormat`: + +```typescript +try { + new Intl.DateTimeFormat("en-US", {timeZone: timezone, weekday: "short"}) +} catch (err) { + throw new Error(`--timezone "${timezone}" is not a valid IANA time zone: ${(err as Error).message}`) +} +``` + +Rationale: `isWeekend` is called from `queueMessage` and `teamAddedMessage` — both run on the hot customer message path. 
`new Intl.DateTimeFormat(..., {timeZone: timezone, ...})` throws `RangeError: Invalid time zone specified` at every call. Without startup validation, a typo in `--timezone` turns every `/grok`, `/team`, or first-customer-message dispatch into an unhandled error that crashes the per-item handler (though the outer try/catch in `onNewChatItems` contains it, customers receive no reply at all). Validating once at startup surfaces the typo in the operator's console before any customer interaction.

```typescript
interface Config {
  dbPrefix: string
@@ -92,10 +127,58 @@ Only two keys. All other state is persisted in the group's `customData` (per-con

**Grok contact resolution** (state-file lookup always runs; contact establishment only when enabled):
1. Read `grokContactId` from state file → validate via `apiListContacts` → set `config.grokContactId` (this always runs, even when `grokApiKey === null`, so the one-way gate can identify and remove Grok members from groups)
-2. If not found and `grokEnabled`: main profile creates one-time invite link, Grok profile connects, wait `contactConnected` (60s), persist new contact ID
+2. If not found and `grokEnabled`: main profile creates one-time invite link, Grok profile connects, wait for a `contactConnected` event filtered by profile identity (60s — see "Grok contact identification" below), persist the resulting `contactId` atomically before proceeding.
3. If unavailable (with Grok otherwise enabled), bot runs but `/grok` returns "temporarily unavailable"
4. If `grokApiKey === null`: the Grok profile is not resolved or created, no invite link is issued — but `config.grokContactId` is still set from the state file if the contact exists.

### Grok contact identification

`grokContactId` is written once and used forever — it is the single identifier for every subsequent Grok check (one-way gate, `onMemberConnected` skip, `isGrok` in card rendering).
Identification MUST be narrowly scoped so that the `contactId` stored is unambiguously Grok's and no other contact completing a handshake in the 60s establishment window can be latched by mistake.

Use the predicate form of `ChatApi.wait`. The signature (defined in `node_modules/simplex-chat/src/api.ts:217`) is:

```typescript
wait<K extends ChatEvent["type"]>(
  event: K,
  predicate: ((event: ChatEvent & {type: K}) => boolean) | undefined,
  timeout: number,
): Promise<(ChatEvent & {type: K}) | undefined>
```

The implementation (api.ts:234) keeps the subscriber attached when the predicate returns `false`, so non-matching events are silently discarded and the wait continues until a matching event arrives or the timeout fires.

Identification accepts only a `contactConnected` event observed by the MAIN profile (the profile whose `apiCreateLink` issued the invite, and whose `contactId` we persist and later pass to `apiAddMember`) whose connecting contact's profile `displayName` equals the Grok profile's displayName:

```typescript
const grokProfileName = grokUser.profile.displayName // "Grok" or "Grok AI"
const evt = await chat.wait(
  "contactConnected",
  (e) =>
    e.user.userId === mainUser.userId &&
    e.contact.profile.displayName === grokProfileName,
  60_000,
)
if (!evt) {
  console.error(
    `Timeout waiting for Grok contact (60s, displayName="${grokProfileName}"). ` +
      `Check SMP relay availability or re-run after clearing state. Exiting.`
  )
  process.exit(1)
}
config.grokContactId = evt.contact.contactId
state.grokContactId = config.grokContactId
writeState(stateFilePath, state) // atomic: tmp-file + rename (see §13 state persistence)
log(`Grok contact established: ID=${config.grokContactId} (displayName="${grokProfileName}")`)
```

Filter rationale:
- `e.user.userId === mainUser.userId` selects the main profile's view of the handshake.
Both profiles observe the handshake (the Grok-side event describes the main profile as the `contact`); only the main-side event carries the `contactId` we need for subsequent `apiAddMember` calls. +- `e.contact.profile.displayName === grokProfileName` accepts only the contact whose profile matches the Grok profile just created/updated. This rejects stray inbound contacts (late business-request acceptance, operator test DM, a reconnect of an existing contact) that may complete in the same 60s window. The displayName is read from `evt.contact.profile`, which is `LocalProfile` (see `@simplex-chat/types/src/types.ts:2867`). + +`grokProfileName` is captured from `grokUser.profile.displayName` immediately before the wait, so whichever name the Grok profile was created/updated with earlier in startup is the exact string matched here. + +Single-tenant deployment caveat: if a human contact happens to set its SimpleX displayName to the literal `"Grok"` (or `"Grok AI"`) and completes a handshake with the main profile in the 60s window, the displayName filter alone cannot distinguish them. MVP is single-tenant and Grok's profile is created by the bot itself, so this is not expected in practice; deployments that need stronger guarantees can add a second filter (e.g. `e.contact.profile.image === grokImage` — the bot knows the exact image bytes it assigned to the Grok profile). + +Persistence: `writeState` is atomic (tmp-file + `fs.renameSync`, see §13 "State persistence") so a crash between identification and persistence cannot corrupt the state file. `state.grokContactId` is flushed to disk BEFORE proceeding to bot event wiring — if the process dies after wiring but before persistence, the next startup would issue a second invite link and leave the first Grok contact orphaned in the database. + **Team group resolution** (auto-create): 1. Read `teamGroupId` from state file → validate via group list 2. 
If not found: create with `apiNewGroup`, persist new group ID @@ -115,6 +198,8 @@ Per-conversation state is stored in the group's `customData` and written at the **WELCOME detection:** customData has no `state` field until the bot handles the first transition. `deriveState` returns `WELCOME` precisely when `customData.state` is absent. +**Type vs. persisted state.** The `ConversationState` union in `cards.ts` enumerates all five conceptual states (`WELCOME | QUEUE | GROK | TEAM-PENDING | TEAM`) so event handlers and composition can reason about them uniformly. However, `WELCOME` is NEVER written to `customData.state` — the runtime invariant is "persisted state ∈ {QUEUE, GROK, TEAM-PENDING, TEAM}; absence of the `state` field derives as WELCOME". The `isConversationState` guard in `cards.ts` rejects `WELCOME` on read to preserve this invariant (any stale `state: "WELCOME"` from a crashed transition is treated as absent). Do NOT introduce a separate `PersistedState` type in MVP — the invariant is small enough to enforce at two choke points: `getRawCustomData` on read and the dispatch handlers on write. + **State-write matrix:** | Bot-observed event | `customData.state` written | @@ -188,7 +273,7 @@ Card is two messages. **Message 1 (card text):** **Message count:** All messages in chat history except the bot's own (`groupSnd` from main profile). -**Message preview:** Last several messages, most recent last, separated by ` / `. Newlines in message text are replaced with spaces to prevent card layout bloat from spam. The customer's display name is also sanitized (newlines → spaces) for the card header, but the `/join` command uses the raw name so it matches the actual group profile. Newest messages are prioritized — when the total exceeds ~1000 chars, the oldest messages are truncated (with `[truncated]` prepended) while the newest are always shown. 
When truncation occurs, the first visible message is guaranteed to have a sender prefix even if it was a continuation in the original sequence. Each message is prefixed with the sender's name (`Name: message`) on the first message in a consecutive run from that sender - subsequent messages from the same sender omit the prefix until a different sender's message appears. Sender identification: Grok contact is detected by `grokContactId` and labeled "Grok"; the customer is identified by matching `memberId` to the group's `customerId` and labeled with their display name; all other members use their `memberProfile.displayName`. Bot's own messages (`groupSnd`) are excluded. Each message truncated to ~200 chars. Media-only messages show type labels: `[image]`, `[file]`, `[voice]`, `[video]`. +**Message preview:** Last several messages, most recent last, separated by ` / `. Newlines in message text are replaced with spaces to prevent card layout bloat from spam. The customer's display name is also sanitized (newlines → spaces) for the card header, but the `/join` command uses the raw name so it matches the actual group profile. Newest messages are prioritized — when the total exceeds ~500 chars (`maxTotal = 500` in `composeCard`), the oldest messages are truncated (with `[truncated]` prepended) while the newest are always shown. When truncation occurs, the first visible message is guaranteed to have a sender prefix even if it was a continuation in the original sequence. Each message is prefixed with the sender's name (`Name: message`) on the first message in a consecutive run from that sender - subsequent messages from the same sender omit the prefix until a different sender's message appears. Sender identification: Grok contact is detected by `grokContactId` and labeled "Grok"; the customer is identified by matching `memberId` to the group's `customerId` and labeled with their display name; all other members use their `memberProfile.displayName`. 
Bot's own messages (`groupSnd`) are excluded. Each message truncated to ~200 chars. Media-only messages show type labels: `[image]`, `[file]`, `[voice]`, `[video]`. **Join command:** `/join groupId:name` — `groupId` is the customer group's ID, `name` is the customer's display name. Names with spaces single-quoted: `/join 42:'First Last'`. @@ -203,17 +288,17 @@ Card is two messages. **Message 1 (card text):** **Update** (delete + repost) — on every subsequent event (new customer msg, team/Grok reply, state change, agent join): 1. Read `{cardItemId, joinItemId}` from `customData` -2. Delete old card + join command via `apiDeleteChatItems([Group, teamGroupId], [cardItemId, joinItemId], "broadcast")` — ignore errors -3. Post new card text + `/join` command as two messages → get new IDs -4. Overwrite `customData` with new `{cardItemId, joinItemId}` +2. Delete old card + join command via `apiDeleteChatItems([Group, teamGroupId], [cardItemId, joinItemId], "broadcast")`. Per `simplex-chat/src/api.ts:436-445` the call either returns `T.ChatItemDeletion[]` (possibly empty if the items no longer exist) or throws `ChatCommandError`. Both outcomes are acceptable: the surrounding `try { ... } catch { /* log and continue */ }` allows execution to proceed whether the items were still present, already gone, or the server returned a transient error. +3. Post new card text + `/join` command as two messages via `apiSendMessages` → get new IDs. **On failure** the partial-failure policy below applies: log, re-queue this groupId into `pendingUpdates`, return without writing `customData`. +4. Write `{cardItemId, joinItemId, complete?}` to `customData` via `mergeCustomData`. **On failure** the tracking-write policy below applies. **Debouncing:** Card updates debounced globally — pending changes flushed every `cardFlushSeconds` seconds (default 300, configurable via `--card-flush-seconds`). Within a batch, each group's card reposted at most once with latest state. 
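The batch semantics above can be sketched as a Set-backed scheduler. This is illustrative only — the class and callback names here are hypothetical; in the plan this logic lives in `CardManager`'s `pendingUpdates` / `scheduleUpdate` / `flush`:

```typescript
// Minimal sketch of the global card-update debounce. Names are hypothetical;
// the real CardManager owns pendingUpdates, scheduleUpdate, and flush.
class CardDebouncer {
  private pending = new Set<number>() // groupIds with a card update due

  private timer?: ReturnType<typeof setInterval>

  constructor(
    private flushSeconds: number,
    private flushOne: (groupId: number) => Promise<void>,
  ) {}

  start(): void {
    if (this.flushSeconds === 0) return // 0 disables the periodic flush
    this.timer = setInterval(() => void this.flush(), this.flushSeconds * 1000)
  }

  scheduleUpdate(groupId: number): void {
    // Set semantics give the "at most once per batch" guarantee: scheduling
    // the same group many times within one window coalesces to one repost.
    this.pending.add(groupId)
  }

  async flush(): Promise<void> {
    const batch = [...this.pending]
    this.pending.clear()
    for (const groupId of batch) {
      try {
        await this.flushOne(groupId)
      } catch {
        this.pending.add(groupId) // re-queue; retried on the next flush
      }
    }
  }

  destroy(): void {
    if (this.timer) clearInterval(this.timer)
  }
}
```

Using a `Set` for the pending queue is what yields "reposted at most once with latest state": repeated `scheduleUpdate` calls for the same group collapse into one entry, and the card is rendered from current state at flush time, not at schedule time.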
**Wait time rules:** Time since the customer's last unanswered message. For ✅ (auto-completed) conversations, the wait field shows the literal string "done". If customer sends a follow-up, wait time resets to count from that message. -**Auto-complete:** A conversation is marked ✅ when `completeHours` (default 3h, configurable via `--complete-hours`) have passed since the last team/Grok message **without any customer reply**. The card debounce flush (every 15 min) checks elapsed time and transitions to ✅ when the threshold is met. Customer follow-up at any point — including after ✅ — reverts to the derived active icon (👋/💬/⏰ for team states, 🟡/🔴 for queue), and wait time resets from that message. +**Auto-complete:** A conversation is marked ✅ when `completeHours` (default 3h, configurable via `--complete-hours`) have passed since the last team/Grok message **without any customer reply**. The card debounce flush (every 300 seconds / 5 min, configurable via `--card-flush-seconds`) checks elapsed time and transitions to ✅ when the threshold is met. Customer follow-up at any point — including after ✅ — reverts to the derived active icon (👋/💬/⏰ for team states, 🟡/🔴 for queue), and wait time resets from that message. -**Card icon state machine (TEAM states):** +**Card icon derivation (TEAM states) — computed at each card render by comparing the timestamps of the most recent customer and team/Grok messages in the group; nothing about the icon is stored:** ``` Team added, no reply yet → 👋 Team replied → 💬 @@ -226,6 +311,67 @@ Customer sends after ✅ → back to 💬 or ⏰ (derived from wait ti **Restart recovery:** On startup, `CardManager.refreshAllCards()` lists all groups, finds those with `customData.cardItemId` set and `customData.complete` not set, sorts by `cardItemId` ascending (higher ID = more recently updated), and re-posts them oldest-first so the most recently active cards end up at the bottom of the team group. 
Completed cards (`complete: true`) and old/pre-bot groups (no `customData`) are skipped. Old card messages are deleted before reposting; deletion failures (e.g., >24h old) are silently ignored. Individual card failures are caught and logged without aborting the batch. +### Partial-failure and retry policy + +`createCard` and `updateCard` perform a multi-step sequence (delete + send + customData write). To design the correct policy we MUST be explicit about which failures the SimpleX core already handles for us vs. which surface to the bot: + +**SimpleX core semantics** (per `simplex-chat/src/api.ts` JSDoc): +- `apiSendMessages` — "Network usage: background". The call returns `newChatItems` once the chat item is CREATED LOCALLY (written to SQLite) and the SMP broadcast is QUEUED. The core's background machinery retries relay delivery transparently — **the bot never observes a transient relay failure from `apiSendMessages`**. A thrown `ChatCommandError` means the local create step itself failed: permission denied, chat does not exist, invalid content, DB locked/corrupted. +- `apiDeleteChatItems` — "Network usage: background". Same pattern: local delete + queued broadcast + core-managed delivery retry. A thrown error means the local delete step failed (item not found, permission, DB error). +- `apiSetGroupCustomData` — "Network usage: **no**". Pure local SQLite write, no SMP involvement at all. A thrown error means a local DB error. + +Consequence: failures surfaced to the bot are **terminal local errors** (bad state, DB problem, permission change), not transient network blips. Retrying the same operation against the same DB/relay state will usually hit the same error. Retry value comes from the narrow slice of genuinely transient local conditions — a brief SQLite lock held by a concurrent write, a race with group-state mutation elsewhere in the same process — where the next attempt sees a different state. 
+ +This reshapes the policy: the bot does not need aggressive retry for "network" reasons (core handles that), and compensating actions for customData-write failure are rarely useful (if the pure-local customData write fails, the retry's customData write will almost certainly fail for the same reason). The bot needs a light safety net: re-queue on any step failure, let the flush loop try again at most once per `cardFlushSeconds`, and on persistent failure accept that operator intervention is needed. + +Policy (applies to both `createCard` and `updateCard`): + +**Any step fails** — whether step 2 (delete), step 3 (send), or step 4 (customData write): +- Log via `logError` with `{groupId, step, err}` so the operator can diagnose the underlying cause (permission change, DB corruption, bot removed from team group, etc). +- Re-add `groupId` to `pendingUpdates` via `this.scheduleUpdate(groupId)`. +- Return. Do NOT attempt compensating actions (no compensating delete for tracking-write failure — the scenario where send succeeds locally but customData write fails requires the SQLite DB to be healthy-then-unhealthy between two synchronous calls in the same transaction window, which is not a realistic transient state; the retry path handles any resulting duplicate by reading the stale `cardItemId` and deleting it on the next update attempt). + +**Flush dispatch** — the current `flush` loop calls `updateCard` unconditionally and `updateCard` returns early when `customData.cardItemId` is unset. This silently drops the retry path for a failed `createCard` — the group is in `pendingUpdates` but nothing will ever create a card for it. 
Replace with a single `flushOne(groupId)` that reads `customData` once and dispatches to create or update:

```typescript
private async flushOne(groupId: number): Promise<void> {
  const groupInfo = await this.getGroupInfo(groupId)
  if (!groupInfo) return // group deleted
  const customData = this.deriveCustomData(groupInfo)
  if (customData.complete) return // ✅ conversations don't auto-repost
  if (typeof customData.cardItemId === "number") {
    await this.updateCard(groupId, groupInfo)
  } else {
    await this.createCard(groupId, groupInfo)
  }
}

async flush(): Promise<void> {
  const groups = [...this.pendingUpdates]
  this.pendingUpdates.clear()
  for (const groupId of groups) {
    try { await this.flushOne(groupId) }
    catch (err) {
      logError(`flush failed for group ${groupId}`, err)
      this.scheduleUpdate(groupId) // re-queue on any thrown error
    }
  }
}
```

Retry behavior for each failure point under this design:

| Failure point | `customData` after failure | Retry's `flushOne` path | Retry outcome if condition cleared |
|---|---|---|---|
| `createCard` send fails | `cardItemId` absent | create-path | fresh card posted, `customData` written |
| `updateCard` delete fails | old `cardItemId` still set | update-path | delete retried (idempotent — see below) + send + write |
| `updateCard` send fails (delete succeeded) | old (now-deleted) `cardItemId` still set | update-path | delete retried against stale ID — tolerated (see below) — then send + write |
| `updateCard` write fails (send succeeded, duplicate may exist) | old `cardItemId` still set, new card orphaned in team group | update-path | delete retried against stale old ID — tolerated — new card posted, tracking written; **leaked** new-card from the failed attempt persists until operator removes it |

**Delete idempotency on retry** — `apiDeleteChatItems` against already-deleted IDs returns either an empty `ChatItemDeletion[]` or throws `ChatCommandError`. The step-2 `try { ...
} catch { logError(...) }` swallows both; execution proceeds to step 3. Do NOT escalate a step-2 error to the partial-failure policy — that would create a retry loop for a permanent condition (items past the 24h deletion window will throw on every retry forever).

**Persistent failures** — if the underlying condition is not transient (bot removed from team group, DB corruption, permission revoked), every retry hits the same error and the group stays in `pendingUpdates` indefinitely, logging at each flush. MVP accepts this — the operator-visible log stream makes the problem diagnosable. A bounded-retry-with-backoff-and-giveup strategy can be added later without changing the failure-point table above.

### Card implementation

```typescript
@@ -390,6 +536,51 @@ async function withProfile<T>(userId: number, fn: () => Promise<T>): Promise<T>
 Grok HTTP API calls are made **outside** the mutex to avoid blocking.
+**Per-group customData mutex** — `mergeCustomData` and `clearCustomData` must be serialized per customer group. `mergeCustomData` has two awaits (read via `getRawCustomData` → `apiListGroups`, then write via `apiSetGroupCustomData`); between them the event loop runs, so two concurrent async chains operating on the same `groupId` can both read the same snapshot, both produce a merged object, and the second write clobbers the first's patch.
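The lost-update window can be demonstrated with a minimal in-memory model. Everything here is a hypothetical stand-in: `read`/`write` model `getRawCustomData`/`apiSetGroupCustomData`, and `runExclusive` models the per-group `Mutex` from async-mutex:

```typescript
// Minimal model of the read-merge-write race on customData.
type CustomData = Record<string, unknown>
let stored: CustomData = {}

const read = async (): Promise<CustomData> => ({...stored})        // stand-in for getRawCustomData
const write = async (d: CustomData): Promise<void> => { stored = d } // stand-in for apiSetGroupCustomData

// Unserialized merge: the await between read and write yields to the event
// loop, so a concurrent merge can read the same stale snapshot.
async function unsafeMerge(patch: CustomData): Promise<void> {
  const current = await read()
  await write({...current, ...patch})
}

// Promise-chain mutex, the shape async-mutex's runExclusive provides per group.
let chain: Promise<unknown> = Promise.resolve()
function runExclusive<T>(fn: () => Promise<T>): Promise<T> {
  const run = chain.then(() => fn())
  chain = run.then(() => undefined, () => undefined) // keep the chain alive on failure
  return run
}

async function safeMerge(patch: CustomData): Promise<void> {
  return runExclusive(async () => {
    const current = await read()
    await write({...current, ...patch})
  })
}
```

Running `unsafeMerge({state: "GROK"})` and `unsafeMerge({cardItemId: 7})` concurrently leaves only one key in `stored` (both chains read the empty snapshot before either writes); the `safeMerge` variant preserves both.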
Concrete call sites that can overlap on one `groupId`:
- `processMainChatItem` writing `state` transitions (WELCOME→QUEUE, WELCOME→GROK, QUEUE→GROK, one-way gate →TEAM)
- `activateGrok`'s `revertStateOnFail` (fire-and-forget) racing with subsequent customer messages
- `activateTeam` writing `TEAM-PENDING` racing with `/grok` or another `/team` on the same group
- `CardManager.flush → updateCard` writing `{cardItemId, joinItemId, complete}` racing with dispatch writing `state`
- `createCard` writing `{cardItemId, joinItemId}` immediately after dispatch writes `state`

The CAS-on-state inside `revertStateOnFail` guards only the `state` key — other keys (`cardItemId`, `joinItemId`, `complete`) can still be lost when spread from a stale snapshot.

Implementation:

```typescript
// In CardManager (Mutex from "async-mutex")
private customDataMutexes = new Map<number, Mutex>()

private getCustomDataMutex(groupId: number): Mutex {
  let m = this.customDataMutexes.get(groupId)
  if (!m) { m = new Mutex(); this.customDataMutexes.set(groupId, m) }
  return m
}

async mergeCustomData(groupId: number, patch: Partial<CardData>): Promise<void> {
  return this.getCustomDataMutex(groupId).runExclusive(async () => {
    const current = (await this.getRawCustomData(groupId)) ?? {}
    const merged = {...current, ...patch}
    for (const key of Object.keys(merged) as (keyof CardData)[]) {
      if (merged[key] === undefined) delete merged[key]
    }
    await this.withMainProfile(() => this.chat.apiSetGroupCustomData(groupId, merged))
  })
}

async clearCustomData(groupId: number): Promise<void> {
  return this.getCustomDataMutex(groupId).runExclusive(() =>
    this.withMainProfile(() => this.chat.apiSetGroupCustomData(groupId))
  )
}
```

Nesting rule: the per-group customData mutex is the **outer** lock; `profileMutex` (via `withMainProfile`) is the **inner** lock.
Never acquire them in the opposite order, and never hold the customData mutex while calling an external (non-SimpleX) async function — this prevents cross-group deadlock and keeps the critical section short. + +Cleanup: entries in `customDataMutexes` are bounded by the number of customer groups. Removing the entry on `onLeftMember(customer)` is sufficient (the group's `customData` is also cleared at that point). Skip this refinement in MVP if acceptable — a long-running bot with many customers accumulates a few bytes per group. + **Profile images:** Both profiles have base64-encoded JPEG profile pictures (128x128, quality 85, under the 12,500-char data URI limit enforced by iOS/Android clients) set via the `image` field in `T.Profile`. The images are defined as `data:image/jpg;base64,...` string constants in `index.ts`. The main profile image is passed to `bot.run()` which handles update-on-change automatically. The Grok profile image is passed to `apiCreateActiveUser()` on first run; on subsequent runs, the bot compares the current profile against the desired one using `util.fromLocalProfile()` and calls `apiUpdateProfile()` if any field differs — this sends the update to all Grok contacts. **Startup sequence:** @@ -405,7 +596,7 @@ Grok HTTP API calls are made **outside** the mutex to avoid blocking. 9. Validate `--auto-add-team-members` (`-a`) if provided 10. Register Grok event handlers on `chat` (filtered by `event.user === grokUserId`) 10b. Refresh stale cards: `CardManager.refreshAllCards()` — lists all groups, skips those with `customData.complete` or no `customData.cardItemId`, sorts remaining by `cardItemId` ascending, re-posts oldest-first so newest cards land at the bottom of team group -11. On SIGINT/SIGTERM → delete invite link, exit +11. On SIGINT/SIGTERM → `clearTimeout(inviteLinkTimer)` (noop if already deleted), `cards.destroy()` (stops the card-flush interval), `deleteInviteLink()` (profileMutex-gated `apiDeleteGroupLink`), `process.exit(0)`. 
Signal handler is reentrant-safe: an `inviteLinkDeleted` flag prevents double-deletion; `clearTimeout`/`clearInterval` are no-op on undefined. **Grok event registration** (same ChatApi, filtered by profile): @@ -459,8 +650,7 @@ chat.on("connectedToGroupMember", (evt) => { // 4. Skip groupSnd (own messages) // 5. Identify sender via businessChat.customerId // 6. Team member message → check if first team text (trigger one-way gate: remove Grok, disable /grok), schedule card update -// 7. Team member or Grok reaction → schedule card update (auto-complete) -// 8. Customer message → derive state, dispatch: +// 7. Customer message → derive state, dispatch: // - WELCOME: create card, send queue msg (or handle /grok first msg → WELCOME→GROK, skip queue) // - QUEUE: /grok → invite Grok; /team → add ALL configured team members; else schedule card update // - GROK: /team → add ALL configured team members (Grok stays); else schedule card update @@ -516,10 +706,10 @@ Type signatures affected: **Main profile side (invite + failure detection):** 0. Send `grokInvitingMessage` ("Inviting Grok, please wait...") -1. `apiAddMember(groupId, grokContactId, Member)` → get `member.memberId`. If `groupDuplicateMember` (customer sent `/grok` again before join completed), silent return — the in-flight activation handles the outcome. -2. Store `pendingGrokJoins.set(memberId, mainGroupId)` -3. Drain `bufferedGrokInvitations` — if the `receivedGroupInvitation` event arrived during step 1's await (race condition), process it now. -4. Set `grokInitialResponsePending.add(groupId)` — suppresses per-message responses from `onGrokNewChatItems` for this group until the initial combined response completes. Without this gate, the message backlog arriving via `newChatItems` would trigger individual per-message responses racing with the initial combined response — producing duplicate replies (e.g., 3 replies for 2 messages). +1. 
**Set `grokInitialResponsePending.add(groupId)` FIRST** — the gate must be raised before any operation that could make Grok recognizable to `onGrokNewChatItems`. Specifically: before `apiAddMember`, before `pendingGrokJoins` is set, and before `bufferedGrokInvitations` is drained (which populates `reverseGrokMap`). Without this ordering, the sequence `apiAddMember → pendingGrokJoins.set → drain → reverseGrokMap.set → gate.add` contains a window where `reverseGrokMap` identifies the group as a Grok-active group but the gate is still DOWN. A customer message arriving in that window triggers a per-message response concurrent with the initial combined response — producing duplicate Grok replies. Every error path below MUST clear the gate. +2. `apiAddMember(groupId, grokContactId, Member)` → get `member.memberId`. On `groupDuplicateMember` (customer sent `/grok` again before join completed), **clear the gate** and silent return — the in-flight activation handles the outcome. On any other error, clear the gate, revert state, send `grokUnavailableMessage`. +3. Store `pendingGrokJoins.set(memberId, mainGroupId)` +4. Drain `bufferedGrokInvitations` — if the `receivedGroupInvitation` event arrived during step 2's await (race condition), process it now. (The gate is already up from step 1, so `onGrokNewChatItems` suppresses any per-message responses during drain and the subsequent join.) 5. `waitForGrokJoin(120s)` — awaits resolver from Grok profile's `connectedToGroupMember` (step 8 below) 6. Timeout → notify customer (`grokUnavailableMessage`), send queue message if was WELCOME→GROK, fall back to QUEUE (CAS-guarded: only if `customData.state` is still `GROK` — a concurrent `/team` that switched to `TEAM-PENDING` is respected), clear `grokInitialResponsePending` @@ -532,7 +722,7 @@ Type signatures affected: 10. If no customer messages found (visible history disabled or API failed), send generic greeting asking customer to repeat their question 11. 
Call Grok HTTP API (outside mutex) 12. Send response via `apiSendTextMessage` (through mutex with Grok profile) -13. Clear `grokInitialResponsePending` (via `finally` block — runs on success, failure, or early return). After this, per-message responses from `onGrokNewChatItems` resume normally for subsequent customer messages. +13. Clear `grokInitialResponsePending` (via `finally` block — runs on success, failure, or early return). After this, per-message responses from `onGrokNewChatItems` resume normally for subsequent customer messages. Note: because the gate is raised at step 1 (before any other work), the `finally` block MUST be wired to cover every code path from step 1 onward — including the `groupDuplicateMember` silent-return and all revert/timeout branches — otherwise per-message responses stay suppressed indefinitely for the affected group. ```typescript const pendingGrokJoins = new Map() // memberId → mainGroupId @@ -551,12 +741,12 @@ Grok profile's `onGrokNewChatItems` handler: 3. Only trigger for `groupRcv` **text** messages from customer (identified via `businessChat.customerId`) 4. Ignore: non-text messages (images, files, voice — card update handled by main profile), bot messages, own messages (`groupSnd`), team member messages 5. Read last 100 messages from own view (customer → `user`, own → `assistant`) -6. Call Grok HTTP API (serialized per group — queue if call in flight) +6. Call Grok HTTP API — different groups' calls run concurrently (see "Cross-group Grok parallelism" below). Per-group serialization of overlapping in-flight calls is NOT implemented in MVP (see §20.6). 7. Send response into group **Per-message error:** Send error message in group ("Sorry, I couldn't process that. Please try again or send /team for a human team member."), stay GROK. Customer can retry. 
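The gate-first ordering and the mandatory `finally` cleanup described in the steps above can be condensed into a self-contained sketch. All names below (`apiAddMember`, `waitForGrokJoin`, `sendInitialCombinedResponse`, `GroupDuplicateMemberError`) are simplified stand-ins for the real ChatApi surface, not its actual signatures:

```typescript
// Self-contained sketch of the activateGrok control flow: the suppression
// gate goes up before any other work and comes down in `finally` on every
// path. All helpers are simplified stand-ins, not the real ChatApi surface.
const grokInitialResponsePending = new Set<number>()
const trace: string[] = []

class GroupDuplicateMemberError extends Error {}
let grokAlreadyInvited = false // stand-in for the server-side duplicate check

async function apiAddMember(groupId: number): Promise<{ memberId: string }> {
  if (grokAlreadyInvited) throw new GroupDuplicateMemberError("duplicate member")
  grokAlreadyInvited = true
  return { memberId: `grok-${groupId}` }
}
async function waitForGrokJoin(memberId: string): Promise<void> {
  trace.push(`joined:${memberId}`)
}
async function sendInitialCombinedResponse(groupId: number): Promise<void> {
  trace.push(`initial:${groupId}`)
}

async function activateGrok(groupId: number): Promise<void> {
  grokInitialResponsePending.add(groupId) // step 1: raise the gate FIRST
  try {
    const member = await apiAddMember(groupId) // step 2
    await waitForGrokJoin(member.memberId)     // step 5
    await sendInitialCombinedResponse(groupId) // steps 9-12
  } catch (e) {
    if (e instanceof GroupDuplicateMemberError) return // in-flight activation wins
    throw e // real code: revert state, send grokUnavailableMessage
  } finally {
    grokInitialResponsePending.delete(groupId) // step 13: every path clears the gate
  }
}
```

Note that the `finally` clause fires on the success path, on the duplicate-member early return, and on any rethrown error, so the gate can never be left up for a group.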
-**Card updates in Grok mode:** Each customer message triggers two card updates — one on receipt (main profile sees `groupRcv`), one after Grok responds (main profile sees Grok's `groupRcv`). Both go through the 15-min debounce. +**Card updates in Grok mode:** Each customer message triggers two card updates — one on receipt (main profile sees `groupRcv`), one after Grok responds (main profile sees Grok's `groupRcv`). Both go through the 300-second debounce (default `--card-flush-seconds`). ### Grok removal @@ -597,6 +787,55 @@ If `GROK_API_KEY` is set but `--context-file` is missing, `parseConfig` throws a Customer messages always in `user` role, never `system`. +### Grok HTTP request timeout + +Every `fetch` to `api.x.ai/v1/chat/completions` MUST pass an `AbortSignal.timeout(60_000)` (60-second default). Without a timeout, a stuck TCP connection or an unresponsive server blocks the awaiting call indefinitely; because `processGrokChatItem` runs under the Grok profile's sequential event dispatch, a single hung call stalls per-message responses for ALL customer groups using Grok — and the same hang in `activateGrok`'s initial-response path leaves `grokInitialResponsePending` stuck (gate never released) until the process is killed. + +Implementation in `GrokApiClient.chatRaw`: + +```typescript +const response = await fetch("https://api.x.ai/v1/chat/completions", { + method: "POST", + headers: { ... }, + body: JSON.stringify({ ... }), + signal: AbortSignal.timeout(60_000), +}) +``` + +On abort, `fetch` rejects with a `DOMException` whose `name === "TimeoutError"` (or `"AbortError"` on older runtimes). Callers treat this identically to other `chat()` failures: +- `processGrokChatItem` → sends `grokErrorMessage` to the customer group, conversation stays GROK. +- `activateGrok` initial-response path → logs, sends `grokUnavailableMessage`, lets the `finally` block clear `grokInitialResponsePending`. 
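A minimal sketch of the caller-side classification, assuming only the standard `AbortSignal.timeout` semantics; the helper name `isGrokTimeout` is hypothetical, not part of the actual `GrokApiClient`:

```typescript
// Caller-side classification of an aborted Grok fetch. `isGrokTimeout` is a
// hypothetical helper name, not part of the actual GrokApiClient.
function isGrokTimeout(e: unknown): boolean {
  const name = (e as { name?: string } | null)?.name
  return name === "TimeoutError" || name === "AbortError"
}

// Reproduce the rejection shape without a network call: AbortSignal.timeout
// aborts with a DOMException whose name is "TimeoutError".
async function demo(): Promise<string> {
  try {
    await new Promise<never>((_, reject) => {
      const signal = AbortSignal.timeout(10)
      signal.addEventListener("abort", () => reject(signal.reason))
    })
  } catch (e) {
    return isGrokTimeout(e) ? "timeout" : "other-error"
  }
  return "no-timeout"
}
```

Checking `name` rather than `instanceof DOMException` keeps the helper portable across runtimes where the abort reason may not be a `DOMException` subclass.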
+
+Rationale for 60s: typical xAI responses return in 1–10s; a 60s ceiling accommodates cold-start / heavy-load latencies while still bounding worst-case per-customer wait. Not exposed as a CLI flag in MVP — a later iteration can add `--grok-timeout-seconds` if operator tuning is needed.
+
+### Cross-group Grok parallelism
+
+`onGrokNewChatItems` MUST dispatch per-group work concurrently. A naïve `for (const ci of lastPerGroup.values()) { await this.processGrokChatItem(ci) }` serializes calls across unrelated customer groups — if xAI takes 3s per call and five customers message in one event batch, customer #5 waits ~15s instead of ~3s. This is pure latency amplification with no ordering benefit (the groups are independent; within-group order is already preserved by batch deduplication picking the last message).
+
+Implementation:
+
+```typescript
+async onGrokNewChatItems(evt: CEvt.NewChatItems): Promise<void> {
+  const lastPerGroup = new Map()
+  for (const ci of evt.chatItems) {
+    // filter: groupRcv, customer text, not bot/team
+    // keep last per groupId
+  }
+  await Promise.allSettled(
+    [...lastPerGroup.values()].map((ci) => this.processGrokChatItem(ci)),
+  )
+}
+```
+
+Why `Promise.allSettled` (not `Promise.all`): one group's Grok API failure MUST NOT cancel or reject pending work for other groups. Each `processGrokChatItem` already handles its own errors (sends `grokErrorMessage`, logs); the outer handler only needs to wait until all per-group tasks finish before returning control to the event dispatcher.
+
+Concurrency bound: the number of distinct customer groups that have new Grok-eligible messages in a single event batch — typically ≤ the SimpleX batch-delivery size, practically small. No global semaphore needed in MVP. If xAI rate limits become a concern, add a shared semaphore later; orthogonal to this fix.
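The failure-isolation claim can be demonstrated with a self-contained sketch; `processGrokChatItem` here is a stand-in that fails for one group only:

```typescript
// Failure isolation with Promise.allSettled: every group's handler runs to
// completion even when one group's Grok call fails. processGrokChatItem is a
// stand-in that fails only for group 2.
const replies: string[] = []

async function processGrokChatItem(groupId: number): Promise<void> {
  if (groupId === 2) throw new Error("simulated xAI failure")
  replies.push(`reply:${groupId}`)
}

async function dispatch(groupIds: number[]): Promise<string[]> {
  const results = await Promise.allSettled(groupIds.map((g) => processGrokChatItem(g)))
  return results.map((r) => r.status)
}
```

`dispatch([1, 2, 3])` resolves to `["fulfilled", "rejected", "fulfilled"]` and groups 1 and 3 still receive replies; with `Promise.all` the outer await would reject instead, propagating one group's failure into the shared event handler.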
+ +Ordering guarantees preserved: +- Within a group, batch deduplication still picks only the latest message and earlier messages appear in the history context via `apiGetChat`. +- Across groups, there is no ordering requirement — each customer group is an independent conversation. +- The per-group gate (`grokInitialResponsePending`) still serializes against `activateGrok`'s initial response; this is a group-local check unaffected by cross-group parallelism. + ## 11. Team Group Commands | Command | Effect | @@ -835,7 +1074,7 @@ npx ts-node src/index.ts \ 7. Team member sends text → verify Grok removed, `/grok` rejected, card → 💬 8. `/grok` in TEAM → verify "team mode" rejection 9. `/team` when already invited → verify "already invited" message -10. Card debouncing: multiple rapid events → verify single card update per 15min flush +10. Card debouncing: multiple rapid events → verify single card update per 300s flush (default) 11. `/join` from team group → verify team member added to customer group, promoted to Owner 12. `/join` with non-business group → verify error 13. Weekend → verify "48 hours" @@ -983,7 +1222,7 @@ async function reachTeam(groupId?) // reachTeamPending → add team membe Called as: `const p = simulateGrokJoinSuccess(); await bot.onNewChatItems(...); await p;` -### 20.4 Test Catalog (129 tests, 27 suites) +### 20.4 Test Catalog (129 tests, 28 suites) #### 1. Welcome & First Message (4 tests) - first message → queue reply + card created with /join command @@ -1022,6 +1261,10 @@ Called as: `const p = simulateGrokJoinSuccess(); await bot.onNewChatItems(...); - customer text in TEAM → no bot reply, card update scheduled - /grok in TEAM-PENDING → invite Grok if not present +#### 5b. One-Way Gate with Grok Disabled (2 tests) +- team text removes Grok even when grokApi is null +- Grok does not respond when disabled even if grokContactId is set + #### 6. 
Team Member Lifecycle (6 tests) - team member connected → promoted to Owner - customer connected → NOT promoted @@ -1178,11 +1421,11 @@ Called as: `const p = simulateGrokJoinSuccess(); await bot.onNewChatItems(...); ### 20.6 Test Coverage Notes **Covered vs plan catalog:** -- §20.4 items 1-13, 15, 17-27 fully covered (129 tests across 27 suites) -- §20.4 item 14 (Weekend Detection) — not unit-tested; `isWeekend` depends on `Intl.DateTimeFormat(new Date())`, would need clock mocking -- §20.4 item 16 (Profile Mutex) — not unit-tested; mutex serialization is verified implicitly through all other tests (MockChatApi tracks activeUserId) -- §20.4 item 19 (Startup & State Persistence) — not unit-tested; tests `index.ts` startup which requires native ChatApi. Integration test only. This includes `deleteInviteLink` (profileMutex + `apiSetActiveUser` before `apiDeleteGroupLink`), the conditional `apiUpdateGroupProfile` (compare `fullGroupPreferences` before calling), and the best-effort `apiCreateGroupLink` (catch + log on SMP relay failure) — all are in startup code and cannot be covered by MockChatApi-based tests. +- §20.4 suites 1-13, 15, 17-27 plus 5b fully covered (129 tests across 28 suites) +- Weekend detection (`util.isWeekend`) — not unit-tested; depends on `Intl.DateTimeFormat(new Date())`, would need clock mocking. Not present in the §20.4 catalog. +- Profile Mutex serialization — not a standalone suite in §20.4; verified implicitly through all other tests (MockChatApi tracks activeUserId). +- Startup & state persistence (`index.ts` path) — not unit-tested; requires native ChatApi. Integration-test only. Includes `deleteInviteLink` (profileMutex + `apiSetActiveUser` before `apiDeleteGroupLink`), the conditional `apiUpdateGroupProfile` (compare `fullGroupPreferences` before calling), and the best-effort `apiCreateGroupLink` (catch + log on SMP relay failure). Not in §20.4 catalog. 
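If weekend detection is ever promoted to a unit-tested suite, injecting the clock is simpler than faking timers. A sketch under that assumption (the real `util.isWeekend` takes no arguments and reads `new Date()` internally; the `now` and `timeZone` parameters here are illustrative additions):

```typescript
// Clock-injectable weekend check. Assumption: the real util.isWeekend reads
// `new Date()` internally; `now` and `timeZone` are illustrative parameters
// that make the weekend branch unit-testable without fake timers.
function isWeekend(now: Date = new Date(), timeZone = "UTC"): boolean {
  const weekday = new Intl.DateTimeFormat("en-US", { timeZone, weekday: "short" }).format(now)
  return weekday === "Sat" || weekday === "Sun"
}
```

For example, `isWeekend(new Date("2026-04-18T12:00:00Z"))` returns true (a Saturday) and `isWeekend(new Date("2026-04-21T12:00:00Z"))` returns false (a Tuesday), so tests can pin dates without touching the system clock.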
**Known plan items NOT implemented (conscious gaps, not test gaps):** - Per-group Grok API call serialization (plan §10) — not implemented or tested -- Team member replacement on leave after sending (plan §15) — not implemented +- Team member replacement on leave after sending — out of MVP scope. No plan section currently asserts it as a requirement; if added later, specify in SPEC §4.2 "Team replies" and implementation plan §15 "Error Handling" together. diff --git a/apps/simplex-support-bot/plans/20260207-support-bot.md b/apps/simplex-support-bot/plans/20260207-support-bot.md index c24b68563e..b4d0563dda 100644 --- a/apps/simplex-support-bot/plans/20260207-support-bot.md +++ b/apps/simplex-support-bot/plans/20260207-support-bot.md @@ -462,7 +462,7 @@ In Grok mode, each customer message triggers two card updates — one on receipt If the Grok HTTP API call fails or times out for a per-message request, the Grok profile sends an error message into the group: "Sorry, I couldn't process that. Please try again or send /team for a human team member." Grok remains in the group and the state stays GROK — the customer can retry by sending another message. -Grok API calls are serialized per customer group — if a new customer message arrives while a Grok API call is in flight, it is queued and processed after the current call completes. This ensures Grok's history includes its own prior response before handling the next message. +Grok API calls are NOT serialized per customer group in the MVP. If a new customer message arrives while a Grok API call is in flight, a second call runs concurrently — `apiGetChat` is re-read at the start of each call so history converges eventually, but two rapid messages in the same group can produce interleaved context. Cross-group calls run concurrently by design (see implementation plan §10 "Cross-group Grok parallelism"). Per-group serialization is a planned future improvement. #### Grok removal