mirror of
https://github.com/simplex-chat/simplex-chat.git
synced 2026-05-11 11:04:56 +00:00
support-bot: plans adversarial review
@@ -38,8 +38,9 @@ SimpleX Chat support bot — standalone Node.js app using `simplex-chat-nodejs`

```
apps/simplex-support-bot/
├── package.json          # deps: simplex-chat, @simplex-chat/types, async-mutex; devDeps: vitest, @types/node
├── tsconfig.json         # ES2022, strict, Node16 module resolution
├── vitest.config.ts      # test runner config, path aliases for mocks
├── src/
│   ├── index.ts          # Entry: parse config, init instance, run
│   ├── config.ts         # CLI arg parsing, ID:name validation, Config type
@@ -48,6 +49,11 @@ apps/simplex-support-bot/
│   ├── grok.ts           # GrokApiClient: xAI API wrapper, system prompt, history
│   ├── messages.ts       # All user-facing message templates
│   └── util.ts           # isWeekend, profileMutex, logging helpers
├── bot.test.ts           # Vitest suite (129 tests, 28 describes)
├── test/
│   └── __mocks__/
│       ├── simplex-chat.js        # MockChatApi + utility re-exports
│       └── simplex-chat-types.js  # enum re-exports for tests
└── data/                 # SQLite databases (created at runtime)
```

@@ -63,12 +69,41 @@ The Grok system-prompt / context file is supplied at runtime via `--context-file

| `--team-group` | Yes | — | `name` | Team group display name (auto-created if absent, resolved by persisted ID on restarts) |
| `--auto-add-team-members` / `-a` | No | `""` | `ID:name,...` | Comma-separated team member contacts. Validated at startup — exits on mismatch. |
| `--context-file` | Required when `GROK_API_KEY` set | — | path | Grok system-prompt file (SimpleX documentation context). `parseConfig` throws if `GROK_API_KEY` is set without this flag. |
| `--timezone` | No | `"UTC"` | IANA tz | For weekend detection (24h vs 48h). Weekend = Sat 00:00 – Sun 23:59 in this tz. `parseConfig` validates the value by constructing a probe `Intl.DateTimeFormat` and throws with a clear error on `RangeError` (invalid IANA zone) — the bot exits before init. |
| `--complete-hours` | No | `3` | integer ≥ 0 | Hours of customer inactivity after the last team/Grok reply before auto-completing a conversation (✅). `parseConfig` rejects non-numeric, negative, or `NaN` values with a fail-fast error. `0` is allowed and disables auto-complete. |
| `--card-flush-seconds` | No | `300` | integer ≥ 0 | Seconds between card dashboard update flushes. `parseConfig` rejects non-numeric, negative, or `NaN` values with a fail-fast error. `0` is allowed and disables the periodic flush (card updates still occur on explicit `scheduleUpdate` calls but never auto-drain). |

**Env vars:** `GROK_API_KEY` (optional) — xAI API key. If unset or empty, the bot starts with Grok support fully disabled: it logs `"No GROK_API_KEY provided, disabling Grok support"`, skips Grok profile/contact setup and event handler registration, omits `/grok` from the bot command list, drops the `/grok` clause from customer-facing messages, and treats any `/grok` the customer still types as an unknown command.
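
The unset-vs-empty rule above can be sketched as a small normalization helper (a hypothetical sketch — `resolveGrokApiKey` is not a name from the codebase):

```typescript
// Hedged sketch (helper name is hypothetical): normalize GROK_API_KEY so that
// unset and empty/whitespace-only values both disable Grok support.
function resolveGrokApiKey(env: Record<string, string | undefined>): string | null {
  const raw = env["GROK_API_KEY"]?.trim()
  if (!raw) {
    console.log("No GROK_API_KEY provided, disabling Grok support")
    return null
  }
  return raw
}

// grokEnabled would then gate profile setup, event registration, and whether
// /grok appears in the bot command list.
const grokEnabled = resolveGrokApiKey(process.env) !== null
```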

**Numeric argument validation:** `parseConfig` MUST validate every numeric flag (`--complete-hours`, `--card-flush-seconds`) using a helper that throws on non-finite or negative results, rather than raw `parseInt`:

```typescript
function parseNonNegativeInt(raw: string, flag: string): number {
  const n = parseInt(raw, 10)
  if (!Number.isFinite(n) || n < 0) {
    throw new Error(`${flag} must be a non-negative integer, got "${raw}"`)
  }
  return n
}

const completeHours = parseNonNegativeInt(optionalArg(args, "--complete-hours", "3"), "--complete-hours")
const cardFlushSeconds = parseNonNegativeInt(optionalArg(args, "--card-flush-seconds", "300"), "--card-flush-seconds")
```

Rationale: `parseInt("foo", 10)` returns `NaN`, and `NaN * 3600_000 === NaN`. Every subsequent comparison (`now - lastTeamGrokTime >= completeMs`) is `false`, so the feature silently becomes a no-op — auto-complete never fires, cards never auto-refresh — and the operator has no signal that they typo'd a flag. Failing fast at startup surfaces the typo before customers interact. `0` is explicitly allowed as a valid "disable" setting.

**Timezone validation:** `parseConfig` MUST validate `--timezone` by constructing a probe `Intl.DateTimeFormat`:

```typescript
try {
  new Intl.DateTimeFormat("en-US", {timeZone: timezone, weekday: "short"})
} catch (err) {
  throw new Error(`--timezone "${timezone}" is not a valid IANA time zone: ${(err as Error).message}`)
}
```

Rationale: `isWeekend` is called from `queueMessage` and `teamAddedMessage` — both run on the hot customer message path. `new Intl.DateTimeFormat(..., {timeZone: <invalid>, ...})` throws `RangeError: Invalid time zone specified` at every call. Without startup validation, a typo in `--timezone` turns every `/grok`, `/team`, or first-customer-message dispatch into an unhandled error that crashes the per-item handler (the outer try/catch in `onNewChatItems` contains it, but customers receive no reply at all). Validating once at startup surfaces the typo in the operator's console before any customer interaction.
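
A possible shape for `isWeekend` (a sketch, not the actual `util.ts` implementation) shows why the probe uses the same `Intl` options as the hot path:

```typescript
// Sketch of a possible isWeekend(date, timezone). An invalid zone throws the
// same RangeError here that the startup probe above is designed to catch early.
function isWeekend(date: Date, timezone: string): boolean {
  const weekday = new Intl.DateTimeFormat("en-US", {
    timeZone: timezone,
    weekday: "short",
  }).format(date)
  return weekday === "Sat" || weekday === "Sun"
}
```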

```typescript
interface Config {
  dbPrefix: string
  // …
```

@@ -92,10 +127,58 @@ Only two keys. All other state is persisted in the group's `customData` (per-con

**Grok contact resolution** (state-file lookup always runs; contact establishment only when enabled):
1. Read `grokContactId` from state file → validate via `apiListContacts` → set `config.grokContactId` (this always runs, even when `grokApiKey === null`, so the one-way gate can identify and remove Grok members from groups)
2. If not found and `grokEnabled`: main profile creates a one-time invite link, the Grok profile connects, wait for a `contactConnected` event filtered by profile identity (60s — see "Grok contact identification" below), persist the resulting `contactId` atomically before proceeding.
3. If unavailable (with Grok otherwise enabled), the bot runs but `/grok` returns "temporarily unavailable"
4. If `grokApiKey === null`: the Grok profile is not resolved or created and no invite link is issued — but `config.grokContactId` is still set from the state file if the contact exists.

### Grok contact identification

`grokContactId` is written once and used forever — it is the single identifier for every subsequent Grok check (one-way gate, `onMemberConnected` skip, `isGrok` in card rendering). Identification MUST be narrowly scoped so that the `contactId` stored is unambiguously Grok's and no other contact completing a handshake in the 60s establishment window can be latched by mistake.

Use the predicate form of `ChatApi.wait`. The signature (defined in `node_modules/simplex-chat/src/api.ts:217`) is:

```typescript
wait<K extends CEvt.Tag>(
  event: K,
  predicate: ((event: ChatEvent & {type: K}) => boolean) | undefined,
  timeout: number,
): Promise<ChatEvent & {type: K} | undefined>
```

The implementation (api.ts:234) keeps the subscriber attached when the predicate returns `false`, so non-matching events are silently discarded and the wait continues until a matching event arrives or the timeout fires.

Identification accepts only a `contactConnected` event observed by the MAIN profile (the profile whose `apiCreateLink` issued the invite, and whose `contactId` we persist and later pass to `apiAddMember`) whose connecting contact's profile `displayName` equals the Grok profile's displayName:

```typescript
const grokProfileName = grokUser.profile.displayName // "Grok" or "Grok AI"
const evt = await chat.wait(
  "contactConnected",
  (e) =>
    e.user.userId === mainUser.userId &&
    e.contact.profile.displayName === grokProfileName,
  60_000,
)
if (!evt) {
  console.error(
    `Timeout waiting for Grok contact (60s, displayName="${grokProfileName}"). ` +
      `Check SMP relay availability or re-run after clearing state. Exiting.`,
  )
  process.exit(1)
}
config.grokContactId = evt.contact.contactId
state.grokContactId = config.grokContactId
writeState(stateFilePath, state) // atomic: tmp-file + rename (see §13 state persistence)
log(`Grok contact established: ID=${config.grokContactId} (displayName="${grokProfileName}")`)
```

Filter rationale:
- `e.user.userId === mainUser.userId` selects the main profile's view of the handshake. Both profiles observe the handshake (the Grok-side event describes the main profile as the `contact`); only the main-side event carries the `contactId` we need for subsequent `apiAddMember` calls.
- `e.contact.profile.displayName === grokProfileName` accepts only the contact whose profile matches the Grok profile just created/updated. This rejects stray inbound contacts (late business-request acceptance, an operator test DM, a reconnect of an existing contact) that may complete in the same 60s window. The displayName is read from `evt.contact.profile`, which is a `LocalProfile` (see `@simplex-chat/types/src/types.ts:2867`).

`grokProfileName` is captured from `grokUser.profile.displayName` immediately before the wait, so whichever name the Grok profile was created/updated with earlier in startup is the exact string matched here.

Single-tenant deployment caveat: if a human contact happens to set its SimpleX displayName to the literal `"Grok"` (or `"Grok AI"`) and completes a handshake with the main profile in the 60s window, the displayName filter alone cannot distinguish them. MVP is single-tenant and Grok's profile is created by the bot itself, so this is not expected in practice; deployments that need stronger guarantees can add a second filter (e.g. `e.contact.profile.image === grokImage` — the bot knows the exact image bytes it assigned to the Grok profile).

Persistence: `writeState` is atomic (tmp-file + `fs.renameSync`, see §13 "State persistence"), so a crash between identification and persistence cannot corrupt the state file. `state.grokContactId` is flushed to disk BEFORE proceeding to bot event wiring — otherwise, if the process died after wiring but before persistence, the next startup would issue a second invite link and leave the first Grok contact orphaned in the database.

**Team group resolution** (auto-create):
1. Read `teamGroupId` from state file → validate via group list
2. If not found: create with `apiNewGroup`, persist new group ID

@@ -115,6 +198,8 @@ Per-conversation state is stored in the group's `customData` and written at the

**WELCOME detection:** customData has no `state` field until the bot handles the first transition. `deriveState` returns `WELCOME` precisely when `customData.state` is absent.

**Type vs. persisted state.** The `ConversationState` union in `cards.ts` enumerates all five conceptual states (`WELCOME | QUEUE | GROK | TEAM-PENDING | TEAM`) so event handlers and composition can reason about them uniformly. However, `WELCOME` is NEVER written to `customData.state` — the runtime invariant is "persisted state ∈ {QUEUE, GROK, TEAM-PENDING, TEAM}; absence of the `state` field derives as WELCOME". The `isConversationState` guard in `cards.ts` rejects `WELCOME` on read to preserve this invariant (any stale `state: "WELCOME"` from a crashed transition is treated as absent). Do NOT introduce a separate `PersistedState` type in MVP — the invariant is small enough to enforce at two choke points: `getRawCustomData` on read and the dispatch handlers on write.
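
The invariant can be sketched in a few lines (exact names follow the prose above but are assumptions about `cards.ts`):

```typescript
// Sketch of the WELCOME-derivation invariant described above.
type ConversationState = "WELCOME" | "QUEUE" | "GROK" | "TEAM-PENDING" | "TEAM"

const PERSISTED_STATES = ["QUEUE", "GROK", "TEAM-PENDING", "TEAM"] as const

function isConversationState(v: unknown): v is ConversationState {
  // Deliberately rejects "WELCOME": persisted state must never be WELCOME.
  return PERSISTED_STATES.includes(v as (typeof PERSISTED_STATES)[number])
}

function deriveState(customData: {state?: unknown}): ConversationState {
  // Absent, invalid, or stale "WELCOME" values all derive as WELCOME.
  return isConversationState(customData.state) ? customData.state : "WELCOME"
}
```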

**State-write matrix:**

| Bot-observed event | `customData.state` written |
|---|---|

@@ -188,7 +273,7 @@ Card is two messages. **Message 1 (card text):**

**Message count:** All messages in chat history except the bot's own (`groupSnd` from main profile).

**Message preview:** Last several messages, most recent last, separated by ` / `. Newlines in message text are replaced with spaces to prevent card layout bloat from spam. The customer's display name is also sanitized (newlines → spaces) for the card header, but the `/join` command uses the raw name so it matches the actual group profile. Newest messages are prioritized — when the total exceeds ~500 chars (`maxTotal = 500` in `composeCard`), the oldest messages are truncated (with `[truncated]` prepended) while the newest are always shown. When truncation occurs, the first visible message is guaranteed to have a sender prefix even if it was a continuation in the original sequence. Each message is prefixed with the sender's name (`Name: message`) on the first message in a consecutive run from that sender; subsequent messages from the same sender omit the prefix until a different sender's message appears. Sender identification: the Grok contact is detected by `grokContactId` and labeled "Grok"; the customer is identified by matching `memberId` to the group's `customerId` and labeled with their display name; all other members use their `memberProfile.displayName`. The bot's own messages (`groupSnd`) are excluded. Each message is truncated to ~200 chars. Media-only messages show type labels: `[image]`, `[file]`, `[voice]`, `[video]`.
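
The sanitization, sender-run prefixing, and per-message cap can be sketched as a pure function (a simplified sketch: the real `composeCard` additionally handles the ~500-char total, `[truncated]`, and media labels):

```typescript
// Sketch of the per-message preview rules: replace newlines with spaces, cap
// each message at ~200 chars, prefix only the first message of a consecutive
// run from the same sender, join with " / ".
interface PreviewMsg { sender: string; text: string }

function composePreview(msgs: PreviewMsg[], maxPerMsg = 200): string {
  const parts: string[] = []
  let prevSender: string | undefined
  for (const m of msgs) {
    const text = m.text.replace(/\n/g, " ").slice(0, maxPerMsg)
    parts.push(m.sender === prevSender ? text : `${m.sender}: ${text}`)
    prevSender = m.sender
  }
  return parts.join(" / ")
}
```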

**Join command:** `/join groupId:name` — `groupId` is the customer group's ID, `name` is the customer's display name. Names with spaces single-quoted: `/join 42:'First Last'`.

@@ -203,17 +288,17 @@ Card is two messages. **Message 1 (card text):**

**Update** (delete + repost) — on every subsequent event (new customer msg, team/Grok reply, state change, agent join):
1. Read `{cardItemId, joinItemId}` from `customData`
2. Delete old card + join command via `apiDeleteChatItems([Group, teamGroupId], [cardItemId, joinItemId], "broadcast")`. Per `simplex-chat/src/api.ts:436-445` the call either returns `T.ChatItemDeletion[]` (possibly empty if the items no longer exist) or throws `ChatCommandError`. Both outcomes are acceptable: the surrounding `try { ... } catch { /* log and continue */ }` allows execution to proceed whether the items were still present, already gone, or the server returned a transient error.
3. Post new card text + `/join` command as two messages via `apiSendMessages` → get new IDs. **On failure** the partial-failure policy below applies: log, re-queue this groupId into `pendingUpdates`, return without writing `customData`.
4. Write `{cardItemId, joinItemId, complete?}` to `customData` via `mergeCustomData`. **On failure** the tracking-write policy below applies.

**Debouncing:** Card updates debounced globally — pending changes flushed every `cardFlushSeconds` seconds (default 300, configurable via `--card-flush-seconds`). Within a batch, each group's card reposted at most once with latest state.

**Wait time rules:** Time since the customer's last unanswered message. For ✅ (auto-completed) conversations, the wait field shows the literal string "done". If the customer sends a follow-up, wait time resets to count from that message.

**Auto-complete:** A conversation is marked ✅ when `completeHours` (default 3h, configurable via `--complete-hours`) have passed since the last team/Grok message **without any customer reply**. The card debounce flush (every 300 seconds / 5 min, configurable via `--card-flush-seconds`) checks elapsed time and transitions to ✅ when the threshold is met. Customer follow-up at any point — including after ✅ — reverts to the derived active icon (👋/💬/⏰ for team states, 🟡/🔴 for queue), and wait time resets from that message.
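
The elapsed-time check run by the flush loop can be sketched as (parameter names are illustrative; the real check reads timestamps from the group's history):

```typescript
// Sketch of the auto-complete decision: completeHours === 0 disables the
// feature, a customer message after the last team/Grok reply blocks it, and
// otherwise elapsed time is compared against completeHours in milliseconds.
function shouldAutoComplete(
  nowMs: number,
  lastTeamGrokMs: number | undefined,
  lastCustomerMs: number | undefined,
  completeHours: number,
): boolean {
  if (completeHours === 0) return false          // 0 = auto-complete disabled
  if (lastTeamGrokMs === undefined) return false // nobody has replied yet
  if (lastCustomerMs !== undefined && lastCustomerMs > lastTeamGrokMs) return false
  return nowMs - lastTeamGrokMs >= completeHours * 3600_000
}
```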

**Card icon derivation (TEAM states)** — computed at each card render by comparing the timestamps of the most recent customer and team/Grok messages in the group; nothing about the icon is stored:

```
Team added, no reply yet → 👋
Team replied             → 💬
```

@@ -226,6 +311,67 @@ Customer sends after ✅ → back to 💬 or ⏰ (derived from wait ti

**Restart recovery:** On startup, `CardManager.refreshAllCards()` lists all groups, finds those with `customData.cardItemId` set and `customData.complete` not set, sorts by `cardItemId` ascending (higher ID = more recently updated), and re-posts them oldest-first so the most recently active cards end up at the bottom of the team group. Completed cards (`complete: true`) and old/pre-bot groups (no `customData`) are skipped. Old card messages are deleted before reposting; deletion failures (e.g., >24h old) are silently ignored. Individual card failures are caught and logged without aborting the batch.
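
The selection and ordering rules can be sketched as a pure function (`GroupRow` and `cardsToRefresh` are hypothetical names for illustration):

```typescript
// Sketch of refreshAllCards selection: keep groups with a cardItemId and no
// complete flag, sort by cardItemId ascending (lowest = oldest, reposted
// first), return groupIds in repost order.
interface GroupRow {
  groupId: number
  customData?: {cardItemId?: number; complete?: boolean}
}

function cardsToRefresh(groups: GroupRow[]): number[] {
  return groups
    .filter((g) => typeof g.customData?.cardItemId === "number" && g.customData?.complete !== true)
    .sort((a, b) => a.customData!.cardItemId! - b.customData!.cardItemId!)
    .map((g) => g.groupId)
}
```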

### Partial-failure and retry policy

`createCard` and `updateCard` perform a multi-step sequence (delete + send + customData write). To design the correct policy we MUST be explicit about which failures the SimpleX core already handles for us vs. which surface to the bot:

**SimpleX core semantics** (per `simplex-chat/src/api.ts` JSDoc):
- `apiSendMessages` — "Network usage: background". The call returns `newChatItems` once the chat item is CREATED LOCALLY (written to SQLite) and the SMP broadcast is QUEUED. The core's background machinery retries relay delivery transparently — **the bot never observes a transient relay failure from `apiSendMessages`**. A thrown `ChatCommandError` means the local create step itself failed: permission denied, chat does not exist, invalid content, DB locked/corrupted.
- `apiDeleteChatItems` — "Network usage: background". Same pattern: local delete + queued broadcast + core-managed delivery retry. A thrown error means the local delete step failed (item not found, permission, DB error).
- `apiSetGroupCustomData` — "Network usage: **no**". Pure local SQLite write, no SMP involvement at all. A thrown error means a local DB error.

Consequence: failures surfaced to the bot are **terminal local errors** (bad state, DB problem, permission change), not transient network blips. Retrying the same operation against the same DB/relay state will usually hit the same error. Retry value comes from the narrow slice of genuinely transient local conditions — a brief SQLite lock held by a concurrent write, a race with group-state mutation elsewhere in the same process — where the next attempt sees a different state.

This reshapes the policy: the bot does not need aggressive retry for "network" reasons (the core handles those), and compensating actions for customData-write failure are rarely useful (if the pure-local customData write fails, the retry's customData write will almost certainly fail for the same reason). The bot needs a light safety net: re-queue on any step failure, let the flush loop try again at most once per `cardFlushSeconds`, and on persistent failure accept that operator intervention is needed.

Policy (applies to both `createCard` and `updateCard`):

**Any step fails** — whether step 2 (delete), step 3 (send), or step 4 (customData write):
- Log via `logError` with `{groupId, step, err}` so the operator can diagnose the underlying cause (permission change, DB corruption, bot removed from team group, etc).
- Re-add `groupId` to `pendingUpdates` via `this.scheduleUpdate(groupId)`.
- Return. Do NOT attempt compensating actions (no compensating delete for tracking-write failure — the scenario where send succeeds locally but the customData write fails requires the SQLite DB to be healthy-then-unhealthy between two synchronous calls in the same transaction window, which is not a realistic transient state; the retry path handles any resulting duplicate by reading the stale `cardItemId` and deleting it on the next update attempt).

**Flush dispatch** — the current `flush` loop calls `updateCard` unconditionally, and `updateCard` returns early when `customData.cardItemId` is unset. This silently drops the retry path for a failed `createCard` — the group is in `pendingUpdates` but nothing will ever create a card for it. Replace with a single `flushOne(groupId)` that reads `customData` once and dispatches to create or update:

```typescript
private async flushOne(groupId: number): Promise<void> {
  const groupInfo = await this.getGroupInfo(groupId)
  if (!groupInfo) return // group deleted
  const customData = this.deriveCustomData(groupInfo)
  if (customData.complete) return // ✅ conversations don't auto-repost
  if (typeof customData.cardItemId === "number") {
    await this.updateCard(groupId, groupInfo)
  } else {
    await this.createCard(groupId, groupInfo)
  }
}

async flush(): Promise<void> {
  const groups = [...this.pendingUpdates]
  this.pendingUpdates.clear()
  for (const groupId of groups) {
    try {
      await this.flushOne(groupId)
    } catch (err) {
      logError(`flush failed for group ${groupId}`, err)
      this.scheduleUpdate(groupId) // re-queue on any thrown error
    }
  }
}
```

Retry behavior for each failure point under this design:

| Failure point | `customData` after failure | Retry's `flushOne` path | Retry outcome if condition cleared |
|---|---|---|---|
| `createCard` send fails | `cardItemId` absent | create-path | fresh card posted, `customData` written |
| `updateCard` delete fails | old `cardItemId` still set | update-path | delete retried (idempotent — see below) + send + write |
| `updateCard` send fails (delete succeeded) | old (now-deleted) `cardItemId` still set | update-path | delete retried against stale ID — tolerated (see below) — then send + write |
| `updateCard` write fails (send succeeded, duplicate may exist) | old `cardItemId` still set, new card orphaned in team group | update-path | delete retried against stale old ID — tolerated — new card posted, tracking written; **leaked** new card from the failed attempt persists until operator removes it |

**Delete idempotency on retry** — `apiDeleteChatItems` against already-deleted IDs returns either an empty `ChatItemDeletion[]` or throws `ChatCommandError`. The step-2 `try { ... } catch { logError(...) }` swallows both; execution proceeds to step 3. Do NOT escalate a step-2 error to the partial-failure policy — that would create a retry loop for a permanent condition (items past the 24h deletion window will throw on every retry forever).

**Persistent failures** — if the underlying condition is not transient (bot removed from team group, DB corruption, permission revoked), every retry hits the same error and the group stays in `pendingUpdates` indefinitely, logging at each flush. MVP accepts this — the operator-visible log stream makes the problem diagnosable. A bounded-retry-with-backoff-and-giveup strategy can be added later without changing the failure-point table above.
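
One possible shape for that later refinement (NOT part of MVP; `RetryBudget` is a hypothetical name) is a per-group failure counter that gives up after a cap:

```typescript
// Sketch of a bounded-retry budget: count consecutive failures per group and
// report when the cap is reached so the caller can stop re-queueing.
class RetryBudget {
  private failures = new Map<number, number>()
  constructor(private maxAttempts: number) {}

  // Returns true while the group still has retry budget; false = give up.
  recordFailure(groupId: number): boolean {
    const n = (this.failures.get(groupId) ?? 0) + 1
    this.failures.set(groupId, n)
    return n < this.maxAttempts
  }

  recordSuccess(groupId: number): void {
    this.failures.delete(groupId) // any success resets the budget
  }
}
```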

### Card implementation

@@ -390,6 +536,51 @@ async function withProfile<T>(userId: number, fn: () => Promise<T>): Promise<T>

Grok HTTP API calls are made **outside** the mutex to avoid blocking.

**Per-group customData mutex** — `mergeCustomData` and `clearCustomData` must be serialized per customer group. `mergeCustomData` has two awaits (read via `getRawCustomData` → `apiListGroups`, then write via `apiSetGroupCustomData`); between them the event loop runs, so two concurrent async chains operating on the same `groupId` can both read the same snapshot, both produce a merged object, and the second write clobbers the first's patch.

Concrete call sites that can overlap on one `groupId`:
- `processMainChatItem` writing `state` transitions (WELCOME→QUEUE, WELCOME→GROK, QUEUE→GROK, one-way gate →TEAM)
- `activateGrok`'s `revertStateOnFail` (fire-and-forget) racing with subsequent customer messages
- `activateTeam` writing `TEAM-PENDING` racing with `/grok` or another `/team` on the same group
- `CardManager.flush → updateCard` writing `{cardItemId, joinItemId, complete}` racing with dispatch writing `state`
- `createCard` writing `{cardItemId, joinItemId}` immediately after dispatch writes `state`

The CAS-on-state inside `revertStateOnFail` guards only the `state` key — other keys (`cardItemId`, `joinItemId`, `complete`) can still be lost when spread from a stale snapshot.

Implementation:

```typescript
// In CardManager (Mutex from "async-mutex")
private customDataMutexes = new Map<number, Mutex>()

private getCustomDataMutex(groupId: number): Mutex {
  let m = this.customDataMutexes.get(groupId)
  if (!m) { m = new Mutex(); this.customDataMutexes.set(groupId, m) }
  return m
}

async mergeCustomData(groupId: number, patch: Partial<CardData>): Promise<void> {
  return this.getCustomDataMutex(groupId).runExclusive(async () => {
    const current = (await this.getRawCustomData(groupId)) ?? {}
    const merged = {...current, ...patch}
    for (const key of Object.keys(merged) as (keyof CardData)[]) {
      if (merged[key] === undefined) delete merged[key]
    }
    await this.withMainProfile(() => this.chat.apiSetGroupCustomData(groupId, merged))
  })
}

async clearCustomData(groupId: number): Promise<void> {
  return this.getCustomDataMutex(groupId).runExclusive(() =>
    this.withMainProfile(() => this.chat.apiSetGroupCustomData(groupId))
  )
}
```

Nesting rule: the per-group customData mutex is the **outer** lock; `profileMutex` (via `withMainProfile`) is the **inner** lock. Never acquire them in the opposite order, and never hold the customData mutex while calling an external (non-SimpleX) async function — this prevents cross-group deadlock and keeps the critical section short.

Cleanup: entries in `customDataMutexes` are bounded by the number of customer groups. Removing the entry on `onLeftMember(customer)` is sufficient (the group's `customData` is also cleared at that point). This refinement can be skipped in MVP — a long-running bot with many customers accumulates only a few bytes per group.

**Profile images:** Both profiles have base64-encoded JPEG profile pictures (128x128, quality 85, under the 12,500-char data URI limit enforced by iOS/Android clients) set via the `image` field in `T.Profile`. The images are defined as `data:image/jpg;base64,...` string constants in `index.ts`. The main profile image is passed to `bot.run()`, which handles update-on-change automatically. The Grok profile image is passed to `apiCreateActiveUser()` on first run; on subsequent runs, the bot compares the current profile against the desired one using `util.fromLocalProfile()` and calls `apiUpdateProfile()` if any field differs — this sends the update to all Grok contacts.
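
The update-on-change comparison can be sketched as a field-by-field diff (a sketch: `ProfileFields`, `profilesDiffer`, and the field list are assumptions based on the prose, not the actual `util.ts` code):

```typescript
// Sketch: compare only the fields the bot controls; the caller would invoke
// apiUpdateProfile() only when this returns true.
interface ProfileFields { displayName: string; fullName: string; image?: string }

function profilesDiffer(current: ProfileFields, desired: ProfileFields): boolean {
  return (
    current.displayName !== desired.displayName ||
    current.fullName !== desired.fullName ||
    (current.image ?? "") !== (desired.image ?? "")
  )
}
```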

**Startup sequence:**

@@ -405,7 +596,7 @@ Grok HTTP API calls are made **outside** the mutex to avoid blocking.

9. Validate `--auto-add-team-members` (`-a`) if provided
10. Register Grok event handlers on `chat` (filtered by `event.user === grokUserId`)
10b. Refresh stale cards: `CardManager.refreshAllCards()` — lists all groups, skips those with `customData.complete` or no `customData.cardItemId`, sorts remaining by `cardItemId` ascending, re-posts oldest-first so the newest cards land at the bottom of the team group
11. On SIGINT/SIGTERM → `clearTimeout(inviteLinkTimer)` (no-op if already deleted), `cards.destroy()` (stops the card-flush interval), `deleteInviteLink()` (profileMutex-gated `apiDeleteGroupLink`), `process.exit(0)`. The signal handler is reentrant-safe: an `inviteLinkDeleted` flag prevents double-deletion; `clearTimeout`/`clearInterval` are no-ops on undefined.
|
||||
|
||||
**Grok event registration** (same ChatApi, filtered by profile):
|
||||
|
||||
@@ -459,8 +650,7 @@ chat.on("connectedToGroupMember", (evt) => {
|
||||
// 4. Skip groupSnd (own messages)
|
||||
// 5. Identify sender via businessChat.customerId
|
||||
// 6. Team member message → check if first team text (trigger one-way gate: remove Grok, disable /grok), schedule card update
|
||||
// 7. Team member or Grok reaction → schedule card update (auto-complete)
|
||||
// 8. Customer message → derive state, dispatch:
|
||||
// 7. Customer message → derive state, dispatch:
|
||||
// - WELCOME: create card, send queue msg (or handle /grok first msg → WELCOME→GROK, skip queue)
|
||||
// - QUEUE: /grok → invite Grok; /team → add ALL configured team members; else schedule card update
// - GROK: /team → add ALL configured team members (Grok stays); else schedule card update
@@ -516,10 +706,10 @@ Type signatures affected:
**Main profile side (invite + failure detection):**
0. Send `grokInvitingMessage` ("Inviting Grok, please wait...")
1. **Set `grokInitialResponsePending.add(groupId)` FIRST** — the gate must be raised before any operation that could make Grok recognizable to `onGrokNewChatItems`. Specifically: before `apiAddMember`, before `pendingGrokJoins` is set, and before `bufferedGrokInvitations` is drained (which populates `reverseGrokMap`). Without this ordering, the sequence `apiAddMember → pendingGrokJoins.set → drain → reverseGrokMap.set → gate.add` contains a window where `reverseGrokMap` identifies the group as a Grok-active group but the gate is still DOWN. A customer message arriving in that window triggers a per-message response concurrent with the initial combined response — producing duplicate Grok replies. Every error path below MUST clear the gate.
2. `apiAddMember(groupId, grokContactId, Member)` → get `member.memberId`. On `groupDuplicateMember` (customer sent `/grok` again before join completed), **clear the gate** and silent return — the in-flight activation handles the outcome. On any other error, clear the gate, revert state, send `grokUnavailableMessage`.
3. Store `pendingGrokJoins.set(memberId, mainGroupId)`
4. Drain `bufferedGrokInvitations` — if the `receivedGroupInvitation` event arrived during step 2's await (race condition), process it now. (The gate is already up from step 1, so `onGrokNewChatItems` suppresses any per-message responses during drain and the subsequent join.)
5. `waitForGrokJoin(120s)` — awaits resolver from Grok profile's `connectedToGroupMember` (step 8 below)
6. Timeout → notify customer (`grokUnavailableMessage`), send queue message if was WELCOME→GROK, fall back to QUEUE (CAS-guarded: only if `customData.state` is still `GROK` — a concurrent `/team` that switched to `TEAM-PENDING` is respected), clear `grokInitialResponsePending`
@@ -532,7 +722,7 @@ Type signatures affected:
10. If no customer messages found (visible history disabled or API failed), send generic greeting asking customer to repeat their question
11. Call Grok HTTP API (outside mutex)
12. Send response via `apiSendTextMessage` (through mutex with Grok profile)
13. Clear `grokInitialResponsePending` (via `finally` block — runs on success, failure, or early return). After this, per-message responses from `onGrokNewChatItems` resume normally for subsequent customer messages. Note: because the gate is raised at step 1 (before any other work), the `finally` block MUST be wired to cover every code path from step 1 onward — including the `groupDuplicateMember` silent-return and all revert/timeout branches — otherwise per-message responses stay suppressed indefinitely for the affected group.
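The gate-first ordering and the every-path `finally` can be sketched as follows; `addMember` and the error shape are stand-ins for the real simplex-chat calls, not the bot's actual code:

```typescript
const grokInitialResponsePending = new Set<number>()

// addMember stands in for apiAddMember; it may reject with { type: "groupDuplicateMember" }.
async function activateGrokSketch(
  groupId: number,
  addMember: (groupId: number) => Promise<void>,
  sendInitialResponse: (groupId: number) => Promise<void>,
): Promise<void> {
  grokInitialResponsePending.add(groupId) // step 1: raise the gate FIRST
  try {
    await addMember(groupId) // step 2
    // steps 3-5 elided: pendingGrokJoins, drain buffered invitations, wait for join
    await sendInitialResponse(groupId) // steps 7-12 elided into one call
  } catch {
    // groupDuplicateMember → silent return; other errors → revert + notify customer.
    // Either way, the finally below clears the gate.
  } finally {
    grokInitialResponsePending.delete(groupId) // gate cleared on EVERY path
  }
}
```

Because the gate is raised before any await, there is no window in which the group is Grok-recognizable with the gate down, and the `finally` guarantees the gate never stays stuck.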
```typescript
const pendingGrokJoins = new Map<string, number>() // memberId → mainGroupId
```

@@ -551,12 +741,12 @@ Grok profile's `onGrokNewChatItems` handler:
3. Only trigger for `groupRcv` **text** messages from customer (identified via `businessChat.customerId`)
4. Ignore: non-text messages (images, files, voice — card update handled by main profile), bot messages, own messages (`groupSnd`), team member messages
5. Read last 100 messages from own view (customer → `user`, own → `assistant`)
6. Call Grok HTTP API — different groups' calls run concurrently (see "Cross-group Grok parallelism" below). Per-group serialization of overlapping in-flight calls is NOT implemented in MVP (see §20.6).
7. Send response into group
**Per-message error:** Send error message in group ("Sorry, I couldn't process that. Please try again or send /team for a human team member."), stay GROK. Customer can retry.
**Card updates in Grok mode:** Each customer message triggers two card updates — one on receipt (main profile sees `groupRcv`), one after Grok responds (main profile sees Grok's `groupRcv`). Both go through the 300-second debounce (default `--card-flush-seconds`).
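The debounce itself can be sketched as a dirty-set flushed on a fixed interval — repeated schedules for one group within a window coalesce into a single card update. A sketch, with `flushSeconds` standing in for `--card-flush-seconds` (default 300); the class and method names are assumptions:

```typescript
// Coalesces card-update requests: many schedule() calls per group → one update per flush.
class CardUpdateDebouncer {
  private dirty = new Set<number>()
  private timer: ReturnType<typeof setInterval>

  constructor(
    private updateCard: (groupId: number) => Promise<void>,
    flushSeconds = 300, // default --card-flush-seconds
  ) {
    this.timer = setInterval(() => void this.flush(), flushSeconds * 1000)
  }

  schedule(groupId: number): void {
    this.dirty.add(groupId) // idempotent within a flush window
  }

  async flush(): Promise<void> {
    const pending = [...this.dirty]
    this.dirty.clear()
    // One group's failed update must not block the others.
    await Promise.allSettled(pending.map((g) => this.updateCard(g)))
  }

  destroy(): void {
    clearInterval(this.timer) // cards.destroy() in the shutdown path
  }
}
```

Under this shape, the two updates per Grok exchange (customer message + Grok reply) collapse into one card re-post if they land in the same window.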
### Grok removal
@@ -597,6 +787,55 @@ If `GROK_API_KEY` is set but `--context-file` is missing, `parseConfig` throws a
Customer messages always in `user` role, never `system`.
### Grok HTTP request timeout
Every `fetch` to `api.x.ai/v1/chat/completions` MUST pass an `AbortSignal.timeout(60_000)` (60-second default). Without a timeout, a stuck TCP connection or an unresponsive server blocks the awaiting call indefinitely; because `processGrokChatItem` runs under the Grok profile's sequential event dispatch, a single hung call stalls per-message responses for ALL customer groups using Grok — and the same hang in `activateGrok`'s initial-response path leaves `grokInitialResponsePending` stuck (gate never released) until the process is killed.
Implementation in `GrokApiClient.chatRaw`:
```typescript
const response = await fetch("https://api.x.ai/v1/chat/completions", {
method: "POST",
headers: { ... },
body: JSON.stringify({ ... }),
signal: AbortSignal.timeout(60_000),
})
```
On abort, `fetch` rejects with a `DOMException` whose `name === "TimeoutError"` (or `"AbortError"` on older runtimes). Callers treat this identically to other `chat()` failures:
- `processGrokChatItem` → sends `grokErrorMessage` to the customer group, conversation stays GROK.
- `activateGrok` initial-response path → logs, sends `grokUnavailableMessage`, lets the `finally` block clear `grokInitialResponsePending`.
Rationale for 60s: typical xAI responses return in 1–10s; a 60s ceiling accommodates cold-start / heavy-load latencies while still bounding worst-case per-customer wait. Not exposed as a CLI flag in MVP — a later iteration can add `--grok-timeout-seconds` if operator tuning is needed.
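Callers can classify the rejection with a small predicate — a sketch; the helper name is an assumption, and the name check covers both the current `"TimeoutError"` and the older `"AbortError"` spelling:

```typescript
// Returns true when an error from fetch looks like an AbortSignal.timeout expiry.
function isFetchTimeout(e: unknown): boolean {
  if (typeof e !== "object" || e === null || !("name" in e)) return false
  const name = (e as { name: unknown }).name
  return name === "TimeoutError" || name === "AbortError"
}
```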
### Cross-group Grok parallelism
`onGrokNewChatItems` MUST dispatch per-group work concurrently. A naïve `for (const ci of lastPerGroup.values()) { await this.processGrokChatItem(ci) }` serializes calls across unrelated customer groups — if xAI takes 3s per call and five customers message in one event batch, customer #5 waits ~15s instead of ~3s. This is pure latency amplification with no ordering benefit (the groups are independent; within-group order is already preserved by batch deduplication picking the last message).
Implementation:
```typescript
async onGrokNewChatItems(evt: CEvt.NewChatItems): Promise<void> {
const lastPerGroup = new Map<number, T.AChatItem>()
for (const ci of evt.chatItems) {
// filter: groupRcv, customer text, not bot/team
// keep last per groupId
}
await Promise.allSettled(
[...lastPerGroup.values()].map((ci) => this.processGrokChatItem(ci)),
)
}
```
Why `Promise.allSettled` (not `Promise.all`): one group's Grok API failure MUST NOT cancel or reject pending work for other groups. Each `processGrokChatItem` already handles its own errors (sends `grokErrorMessage`, logs); the outer handler only needs to wait until all per-group tasks finish before returning control to the event dispatcher.
Concurrency bound: the number of distinct customer groups that have new Grok-eligible messages in a single event batch — typically ≤ the SimpleX batch-delivery size, practically small. No global semaphore needed in MVP. If xAI rate limits become a concern, add a shared semaphore later; orthogonal to this fix.
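If that bound ever needs enforcing, the shared semaphore could look like the following minimal counting semaphore — a sketch of the later improvement, not MVP code:

```typescript
// Minimal counting semaphore: at most `limit` callers inside run() at once.
class Semaphore {
  private active = 0
  private waiters: Array<() => void> = []

  constructor(private readonly limit: number) {}

  async run<T>(task: () => Promise<T>): Promise<T> {
    // Re-check after every wake-up: a caller that arrived in between
    // may already have taken the freed slot.
    while (this.active >= this.limit) {
      await new Promise<void>((resolve) => this.waiters.push(resolve))
    }
    this.active++
    try {
      return await task()
    } finally {
      this.active--
      this.waiters.shift()?.() // wake one waiter, if any
    }
  }
}
```

Each `processGrokChatItem` call would then be wrapped as `sem.run(() => this.processGrokChatItem(ci))` inside the `Promise.allSettled` map.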
Ordering guarantees preserved:
- Within a group, batch deduplication still picks only the latest message, and earlier messages appear in the history context via `apiGetChat`.
- Across groups, there is no ordering requirement — each customer group is an independent conversation.
- The per-group gate (`grokInitialResponsePending`) still serializes against `activateGrok`'s initial response; this is a group-local check unaffected by cross-group parallelism.
## 11. Team Group Commands
| Command | Effect |
@@ -835,7 +1074,7 @@ npx ts-node src/index.ts \
7. Team member sends text → verify Grok removed, `/grok` rejected, card → 💬
8. `/grok` in TEAM → verify "team mode" rejection
9. `/team` when already invited → verify "already invited" message
10. Card debouncing: multiple rapid events → verify single card update per 300s flush (default)
11. `/join` from team group → verify team member added to customer group, promoted to Owner
12. `/join` with non-business group → verify error
13. Weekend → verify "48 hours"
@@ -983,7 +1222,7 @@ async function reachTeam(groupId?) // reachTeamPending → add team membe
Called as: `const p = simulateGrokJoinSuccess(); await bot.onNewChatItems(...); await p;`
### 20.4 Test Catalog (129 tests, 28 suites)
#### 1. Welcome & First Message (4 tests)
- first message → queue reply + card created with /join command
@@ -1022,6 +1261,10 @@ Called as: `const p = simulateGrokJoinSuccess(); await bot.onNewChatItems(...);
- customer text in TEAM → no bot reply, card update scheduled
- /grok in TEAM-PENDING → invite Grok if not present
#### 5b. One-Way Gate with Grok Disabled (2 tests)
- team text removes Grok even when grokApi is null
- Grok does not respond when disabled even if grokContactId is set
#### 6. Team Member Lifecycle (6 tests)
- team member connected → promoted to Owner
- customer connected → NOT promoted
@@ -1178,11 +1421,11 @@ Called as: `const p = simulateGrokJoinSuccess(); await bot.onNewChatItems(...);
### 20.6 Test Coverage Notes
**Covered vs plan catalog:**
- §20.4 suites 1-13, 15, 17-27 plus 5b fully covered (129 tests across 28 suites)
- Weekend detection (`util.isWeekend`) — not unit-tested; depends on `Intl.DateTimeFormat(new Date())`, would need clock mocking. Not present in the §20.4 catalog.
- Profile Mutex serialization — not a standalone suite in §20.4; verified implicitly through all other tests (MockChatApi tracks activeUserId).
- Startup & state persistence (`index.ts` path) — not unit-tested; requires native ChatApi. Integration-test only. Includes `deleteInviteLink` (profileMutex + `apiSetActiveUser` before `apiDeleteGroupLink`), the conditional `apiUpdateGroupProfile` (compare `fullGroupPreferences` before calling), and the best-effort `apiCreateGroupLink` (catch + log on SMP relay failure). Not in §20.4 catalog.
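If weekend detection is ever unit-tested, injecting the date avoids clock mocking entirely. A sketch of the `isWeekend` shape with an explicit `date` parameter — the real helper's signature may differ:

```typescript
// Weekend = Saturday or Sunday in the configured IANA timezone (--timezone).
// Passing `date` explicitly makes this testable without clock mocking.
function isWeekend(timezone: string, date: Date = new Date()): boolean {
  const weekday = new Intl.DateTimeFormat("en-US", {
    timeZone: timezone,
    weekday: "short",
  }).format(date)
  return weekday === "Sat" || weekday === "Sun"
}
```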
**Known plan items NOT implemented (conscious gaps, not test gaps):**
- Per-group Grok API call serialization (plan §10) — not implemented or tested
- Team member replacement on leave after sending — out of MVP scope. No plan section currently asserts it as a requirement; if added later, specify in SPEC §4.2 "Team replies" and implementation plan §15 "Error Handling" together.
@@ -462,7 +462,7 @@ In Grok mode, each customer message triggers two card updates — one on receipt
If the Grok HTTP API call fails or times out for a per-message request, the Grok profile sends an error message into the group: "Sorry, I couldn't process that. Please try again or send /team for a human team member." Grok remains in the group and the state stays GROK — the customer can retry by sending another message.
Grok API calls are NOT serialized per customer group in the MVP. If a new customer message arrives while a Grok API call is in flight, a second call runs concurrently — `apiGetChat` is re-read at the start of each call so history converges eventually, but two rapid messages in the same group can produce interleaved context. Cross-group calls run concurrently by design (see implementation plan §10 "Cross-group Grok parallelism"). Per-group serialization is a planned future improvement.
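When per-group serialization is eventually implemented, the usual shape is a per-key promise chain — a sketch of that future improvement, not the MVP's code:

```typescript
// Serializes tasks per key (groupId): tasks for the same group run in arrival
// order; tasks for different groups run concurrently.
class PerKeySerializer<K> {
  private tails = new Map<K, Promise<void>>()

  run(key: K, task: () => Promise<void>): Promise<void> {
    const prev = this.tails.get(key) ?? Promise.resolve()
    // Chain after the previous task; swallow its error so one failed Grok call
    // doesn't poison the chain for later messages in the same group.
    const next = prev.catch(() => {}).then(task)
    const tail = next.catch(() => {})
    this.tails.set(key, tail)
    tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key) // drop drained chains
    })
    return next
  }
}
```

Wrapping each Grok call as `serializer.run(groupId, () => grokCall(...))` would guarantee that a group's history includes the prior response before the next message is handled, while keeping cross-group parallelism intact.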
#### Grok removal