desktop: prevent duplicate launches

Acquires a file lock in dataDir and listens on a loopback ServerSocket.
A second launch signals the running instance to restore its window and
exits silently. See plans/2026-05-13-desktop-single-instance.md.
shum
2026-05-13 18:51:28 +00:00
parent 7497b90e7c
commit f0d7532a91
5 changed files with 793 additions and 0 deletions
@@ -0,0 +1,194 @@
package chat.simplex.common
import chat.simplex.common.platform.Log
import chat.simplex.common.platform.TAG
import chat.simplex.common.platform.dataDir
import java.io.IOException
import java.net.InetAddress
import java.net.InetSocketAddress
import java.net.ServerSocket
import java.net.Socket
import java.nio.channels.FileChannel
import java.nio.channels.FileLock
import java.nio.channels.OverlappingFileLockException
import java.nio.charset.StandardCharsets
import java.nio.file.AtomicMoveNotSupportedException
import java.nio.file.Files
import java.nio.file.StandardCopyOption
import java.nio.file.StandardOpenOption.CREATE
import java.nio.file.StandardOpenOption.READ
import java.nio.file.StandardOpenOption.WRITE
import javax.swing.SwingUtilities
import kotlin.concurrent.thread
// Held for the process lifetime. Module-level var deliberately so the FileLock
// isn't garbage-collected — and FileLock itself pins its FileChannel, so a
// single reference here keeps both alive.
private var lockHandle: FileLock? = null
// Explicit IPv4 loopback. InetAddress.getLoopbackAddress() may return ::1 on
// dual-stack systems where IPv6 is preferred, and Windows Defender's loopback
// exemption is most reliable for the 127.0.0.0/8 family.
private val LOOPBACK: InetAddress = InetAddress.getByAddress(byteArrayOf(127, 0, 0, 1))
// Bound on the bytes we'll read from a signal — long enough for plausible
// future single-token commands, short enough that an adversarial same-UID
// process can't OOM us by streaming without a newline.
private const val MAX_SIGNAL_BYTES = 256
// Returns true if this process owns the single-instance lock (caller proceeds
// with normal startup). Returns false if another instance already owns it
// (caller must return from main without further init).
fun acquireSingleInstanceOrSignalAndExit(): Boolean {
dataDir.mkdirs()
val lockFile = dataDir.resolve("simplex.lock").toPath()
val channel: FileChannel = try {
FileChannel.open(lockFile, READ, WRITE, CREATE)
} catch (e: IOException) {
// Filesystem doesn't allow opening the lock file. Proceed without
// single-instance enforcement — no worse than today's behaviour.
Log.w(TAG, "single-instance: cannot open $lockFile: ${e.stackTraceToString()}")
return true
}
val held: FileLock? = try {
// Lock exactly one byte. Zero-arg tryLock() locks [0, Long.MAX_VALUE) and
// is rejected by some SMB/NFS implementations (JDK-6674134).
channel.tryLock(0L, 1L, false)
} catch (e: OverlappingFileLockException) {
// Unreachable by construction (function is called once at process start),
// but if it ever fires the caller can't safely proceed as the singleton:
// we don't know who holds the lock and we haven't started a listener.
// Fail closed — close the channel and behave like a second instance.
Log.w(TAG, "single-instance: overlapping lock on $lockFile")
channel.close()
return false
} catch (e: IOException) {
Log.w(TAG, "single-instance: tryLock failed on $lockFile: ${e.stackTraceToString()}")
channel.close()
return true
}
if (held == null) {
channel.close()
signalShowAndReturn()
return false
}
lockHandle = held
startSingleInstanceListener()
return true
}
private const val LISTENER_THREAD_NAME = "simplex-single-instance"
private fun startSingleInstanceListener() {
// Drop any stale simplex.port left by a previous primary BEFORE binding the
// new ServerSocket. A second instance arriving between our lock acquisition
// and our writePortFile() would otherwise read the old port and signal SHOW
// to whatever process now owns it. With the file gone, that race exits the
// signaller silently via readPortWithRetry returning null.
try { Files.deleteIfExists(dataDir.resolve("simplex.port").toPath()) } catch (_: IOException) {}
val server = try {
ServerSocket(0, 0, LOOPBACK)
} catch (e: IOException) {
// Ephemeral port range starved (rare, observed behind some VPNs). Lock is
// still held, so duplicate launches see the lock but no port to signal —
// they retry and exit silently. Worst case: user clicks the tray icon.
Log.w(TAG, "single-instance: ServerSocket bind failed: ${e.stackTraceToString()}")
return
}
writePortFile(server.localPort)
thread(name = LISTENER_THREAD_NAME, isDaemon = true) {
while (true) {
val socket = try {
server.accept()
} catch (e: IOException) {
Log.w(TAG, "single-instance: accept() failed: ${e.stackTraceToString()}")
return@thread
}
try {
socket.soTimeout = 1000
// Single bounded read: enough for any sensible command, capped so a
// hostile client can't grow a buffer to OOM. Slow senders that drip
// bytes lose, but our own signaller writes the full payload in one
// call so this is fine in practice.
val buf = ByteArray(MAX_SIGNAL_BYTES)
val read = socket.getInputStream().read(buf)
if (read > 0) {
val line = String(buf, 0, read, StandardCharsets.UTF_8).substringBefore('\n').trimEnd('\r')
Log.i(TAG, "single-instance: received $line")
// Only SHOW is recognised today. Future commands (e.g. open-URL) will
// extend this with new top-level branches; unknown lines are ignored.
if (line == "SHOW") {
SwingUtilities.invokeLater { showWindow() }
}
}
} catch (e: IOException) {
Log.w(TAG, "single-instance: read failed: ${e.stackTraceToString()}")
} finally {
try { socket.close() } catch (_: IOException) {}
}
}
}
}
private fun signalShowAndReturn() {
val port = readPortWithRetry() ?: return
try {
Socket().use { sock ->
sock.connect(InetSocketAddress(LOOPBACK, port), 1000)
val out = sock.getOutputStream()
out.write("SHOW\n".toByteArray(StandardCharsets.UTF_8))
out.flush()
}
} catch (e: IOException) {
// First instance is starting, shutting down, or stuck. Doing nothing is
// strictly less harmful than spawning a duplicate that will fail on the
// SQLite lock. The stale-port-after-crash case lands here too — handled.
Log.w(TAG, "single-instance: SHOW signal failed: ${e.stackTraceToString()}")
}
}
private fun readPortWithRetry(): Int? {
val portFile = dataDir.resolve("simplex.port").toPath()
repeat(2) { attempt ->
val raw = try {
Files.readString(portFile, StandardCharsets.UTF_8).trim()
} catch (e: IOException) {
null
}
val parsed = raw?.toIntOrNull()
if (parsed != null && parsed in 1..65535) return parsed
// First-instance may still be writing the port during startup; one retry.
if (attempt == 0) Thread.sleep(200)
}
return null
}
private fun writePortFile(port: Int) {
val portFile = dataDir.resolve("simplex.port").toPath()
// Files.createTempFile creates with O_CREAT|O_EXCL and a random name in
// dataDir. A same-UID attacker can't pre-plant a symlink at this path to
// make our subsequent write truncate their chosen target — they don't
// know the random suffix, and EXCL refuses to open an existing path.
val tmp = try {
Files.createTempFile(dataDir.toPath(), "simplex.port.", ".tmp")
} catch (e: IOException) {
Log.w(TAG, "single-instance: createTempFile failed: ${e.stackTraceToString()}")
return
}
var moved = false
try {
Files.writeString(tmp, port.toString(), StandardCharsets.UTF_8)
try {
Files.move(tmp, portFile, StandardCopyOption.ATOMIC_MOVE)
} catch (e: AtomicMoveNotSupportedException) {
// Exotic filesystem. Fall back to plain move; the reader retries on
// parse failure so a brief window of a half-written port file is fine.
Files.move(tmp, portFile, StandardCopyOption.REPLACE_EXISTING)
}
moved = true
} catch (e: IOException) {
Log.w(TAG, "single-instance: writing port file failed: ${e.stackTraceToString()}")
} finally {
if (!moved) try { Files.deleteIfExists(tmp) } catch (_: IOException) {}
}
}
@@ -0,0 +1,65 @@
package chat.simplex.app
import java.nio.channels.FileChannel
import java.nio.channels.OverlappingFileLockException
import java.nio.file.Files
import java.nio.file.StandardOpenOption.CREATE
import java.nio.file.StandardOpenOption.READ
import java.nio.file.StandardOpenOption.WRITE
import kotlin.test.Test
import kotlin.test.assertFailsWith
import kotlin.test.assertNotNull
// Pins the JDK FileLock semantics the single-instance machinery relies on.
// Cross-process contention (the path that returns null) cannot be exercised
// from inside one JVM — within the same JVM, a second tryLock on an already
// locked region throws OverlappingFileLockException instead. The production
// code in SingleInstance.kt catches that exception and fails closed, so this
// pair of tests covers both observable JDK behaviours we depend on: the
// exception itself, and the release/reacquire round-trip.
class SingleInstanceTest {
@Test
fun overlappingLockOnSameRegionThrowsWithinOneJvm() = withTempLockDir { lockPath ->
val first = FileChannel.open(lockPath, READ, WRITE, CREATE)
val firstLock = first.tryLock(0L, 1L, false)
assertNotNull(firstLock, "first acquirer must get the lock")
val second = FileChannel.open(lockPath, READ, WRITE, CREATE)
assertFailsWith<OverlappingFileLockException> {
second.tryLock(0L, 1L, false)
}
second.close()
firstLock.release()
first.close()
}
@Test
fun releasedLockCanBeReacquired() = withTempLockDir { lockPath ->
val first = FileChannel.open(lockPath, READ, WRITE, CREATE)
val firstLock = first.tryLock(0L, 1L, false)
assertNotNull(firstLock)
firstLock.release()
first.close()
val second = FileChannel.open(lockPath, READ, WRITE, CREATE)
val secondLock = second.tryLock(0L, 1L, false)
assertNotNull(secondLock, "after release, a fresh acquirer must succeed")
secondLock.release()
second.close()
}
// Creates a temp directory, runs the block with a lock-file path inside, and
// cleans the directory afterwards. File.deleteOnExit() is unreliable for
// non-empty directories — would leak a temp dir on every test run.
private fun withTempLockDir(block: (java.nio.file.Path) -> Unit) {
val tmp = Files.createTempDirectory("simplex-singleinstance-test")
try {
block(tmp.resolve("simplex.lock"))
} finally {
Files.walk(tmp).sorted(Comparator.reverseOrder()).forEach {
try { Files.delete(it) } catch (_: java.io.IOException) {}
}
}
}
}
@@ -8,6 +8,7 @@ import androidx.compose.runtime.*
import androidx.compose.ui.ExperimentalComposeUiApi
import androidx.compose.ui.Modifier
import androidx.compose.ui.input.pointer.*
import chat.simplex.common.acquireSingleInstanceOrSignalAndExit
import chat.simplex.common.model.ChatController.appPrefs
import chat.simplex.common.model.size
import chat.simplex.common.platform.*
@@ -19,6 +20,7 @@ import kotlinx.coroutines.*
import java.io.File
fun main() {
if (!acquireSingleInstanceOrSignalAndExit()) return
// Disable hardware acceleration
//System.setProperty("skiko.renderApi", "SOFTWARE")
initHaskell()
@@ -0,0 +1,402 @@
# Desktop single instance — implementation plan
Companion to the design at `plans/2026-05-13-desktop-single-instance.md`. Read that first.
## What
Three small commits that build the feature incrementally. After each commit the build is green and the app still runs; the first commit already prevents the worst symptom (duplicate process hitting the SQLite lock), and the third makes the window-restore UX work end-to-end.
## Why
We split this way so each commit is reviewable and revertable on its own. The order is chosen so that the build stays green and so that the first commit alone is a worthwhile bugfix even if the rest is held back.
## How
### Pre-flight
- On branch `sh/tray-followup` (current). It is ahead of `stable` by the tray-followup commits and the two `plans/` commits for the single-instance spec.
- Confirm the desktop build is green before changing anything: `cd apps/multiplatform && ./gradlew :common:desktopMainClasses` — should succeed.
- Confirm tests run: `./gradlew desktopTest`; `SemVerTest` should pass.
- Read `plans/2026-05-13-desktop-single-instance.md` end to end.
---
### Task 1 — File lock + early-exit signaller (no IPC yet)
**Files**
- Create: `apps/multiplatform/common/src/desktopMain/kotlin/chat/simplex/common/SingleInstance.desktop.kt`
- Modify: `apps/multiplatform/desktop/src/jvmMain/kotlin/chat/simplex/desktop/Main.kt` (entry-point)
- Create: `apps/multiplatform/common/src/desktopTest/kotlin/chat/simplex/app/SingleInstanceTest.kt`
**What to add.** A module-level lock acquisition that runs before any Haskell / DB init. The function returns `true` if this process owns the lock (proceed with normal startup), `false` if another process already owns it (caller exits `main`). No IPC yet — the second instance simply exits silently. This already fixes the SQLite-contention crash that motivated the work.
```kotlin
package chat.simplex.common
import chat.simplex.common.platform.Log
import chat.simplex.common.platform.TAG
import chat.simplex.common.platform.dataDir
import java.io.IOException
import java.nio.channels.FileChannel
import java.nio.channels.FileLock
import java.nio.channels.OverlappingFileLockException
import java.nio.file.StandardOpenOption.CREATE
import java.nio.file.StandardOpenOption.READ
import java.nio.file.StandardOpenOption.WRITE
// Held for the process lifetime. Module-level `var`s deliberately, so the
// FileChannel isn't garbage-collected — a GC'd FileChannel releases the lock.
private var lockChannel: FileChannel? = null
private var lockHandle: FileLock? = null
// Returns true if this process owns the single-instance lock (caller proceeds
// with normal startup). Returns false if another instance already owns it
// (caller must return from main without further init).
fun acquireSingleInstanceOrSignalAndExit(): Boolean {
dataDir.mkdirs()
val lockFile = dataDir.resolve("simplex.lock").toPath()
val channel: FileChannel = try {
FileChannel.open(lockFile, READ, WRITE, CREATE)
} catch (e: IOException) {
// Filesystem doesn't allow opening the lock file at all — proceed without
// single-instance enforcement. No worse than today's behavior.
Log.w(TAG, "single-instance: cannot open $lockFile: ${e.message}")
return true
}
val held: FileLock? = try {
// Lock exactly one byte — NOT the zero-arg form, which locks
// [0, Long.MAX_VALUE) and is rejected by some SMB/NFS impls (JDK-6674134).
channel.tryLock(0L, 1L, false)
} catch (e: OverlappingFileLockException) {
// Same JVM trying to lock twice. Treat as "we hold it" — cheaper than crashing.
Log.w(TAG, "single-instance: overlapping lock on $lockFile")
return true
} catch (e: IOException) {
Log.w(TAG, "single-instance: tryLock failed on $lockFile: ${e.message}")
channel.close()
return true
}
if (held == null) {
// Another instance owns the lock. IPC signalling comes in Task 3.
channel.close()
return false
}
lockChannel = channel
lockHandle = held
return true
}
```
**Wire into `Main.kt`.** Add the check as the very first thing in `main()`:
```kotlin
fun main() {
if (!acquireSingleInstanceOrSignalAndExit()) return
// Disable hardware acceleration
//System.setProperty("skiko.renderApi", "SOFTWARE")
initHaskell()
runMigrations()
setupUpdateChecker()
initApp()
tmpDir.deleteRecursively()
tmpDir.mkdir()
return showApp()
}
```
Import: `chat.simplex.common.acquireSingleInstanceOrSignalAndExit`.
**Test.** Two unit tests pin the JDK semantics the production code relies on. Cross-process contention (the path where `tryLock` returns `null`) can NOT be exercised from inside a single JVM — within one JVM, a second `tryLock` on an already-locked region throws `OverlappingFileLockException` rather than returning `null`. The production code's `catch (OverlappingFileLockException)` branch handles exactly that within-JVM case, so we test both: the exception, and the release/reacquire round-trip.
In `SingleInstanceTest.kt`:
```kotlin
package chat.simplex.app
import java.nio.channels.FileChannel
import java.nio.channels.OverlappingFileLockException
import java.nio.file.Files
import java.nio.file.StandardOpenOption.CREATE
import java.nio.file.StandardOpenOption.READ
import java.nio.file.StandardOpenOption.WRITE
import kotlin.test.Test
import kotlin.test.assertFailsWith
import kotlin.test.assertNotNull
class SingleInstanceTest {
@Test
fun overlappingLockOnSameRegionThrowsWithinOneJvm() {
val tmp = Files.createTempDirectory("simplex-singleinstance-test").toFile()
tmp.deleteOnExit()
val lockPath = tmp.toPath().resolve("simplex.lock")
val first = FileChannel.open(lockPath, READ, WRITE, CREATE)
val firstLock = first.tryLock(0L, 1L, false)
assertNotNull(firstLock, "first acquirer must get the lock")
val second = FileChannel.open(lockPath, READ, WRITE, CREATE)
assertFailsWith<OverlappingFileLockException> {
second.tryLock(0L, 1L, false)
}
second.close()
firstLock.release()
first.close()
}
@Test
fun releasedLockCanBeReacquired() {
val tmp = Files.createTempDirectory("simplex-singleinstance-test").toFile()
tmp.deleteOnExit()
val lockPath = tmp.toPath().resolve("simplex.lock")
val first = FileChannel.open(lockPath, READ, WRITE, CREATE)
val firstLock = first.tryLock(0L, 1L, false)
assertNotNull(firstLock)
firstLock.release()
first.close()
val second = FileChannel.open(lockPath, READ, WRITE, CREATE)
val secondLock = second.tryLock(0L, 1L, false)
assertNotNull(secondLock, "after release, a fresh acquirer must succeed")
secondLock.release()
second.close()
}
}
```
Note: the cross-process `null`-return path is reached only when an actually-separate JVM holds the lock — that's the production scenario, covered by the manual smoke test below, not by unit tests.
**Verify.**
- Build: `./gradlew :common:desktopMainClasses` — succeeds.
- Test: `./gradlew desktopTest`; `SingleInstanceTest.overlappingLockOnSameRegionThrowsWithinOneJvm` and `SingleInstanceTest.releasedLockCanBeReacquired` both pass. These pin the JDK semantics our code relies on; they do *not* exercise `acquireSingleInstanceOrSignalAndExit()` directly. That goes through the manual smoke test below, by design (a JVM-level integration test would require spawning a second JVM, which is heavyweight for what the manual test catches cheaply).
- Run the desktop app twice from a terminal (`./gradlew :desktop:run` in one terminal, then the same in another). The second process should exit cleanly (well under a second after JVM startup) without opening a window. The first remains running. No SQLite lock error in the second process's logs.
- Confirm `simplex.lock` exists in `dataDir` (Linux: `~/.local/share/simplex/`). It is empty.
**Commit.** `desktop: single-instance file lock`
---
### Task 2 — IPC listener on the first instance
**Files**
- Modify: `apps/multiplatform/common/src/desktopMain/kotlin/chat/simplex/common/SingleInstance.desktop.kt`
**What to add.** When the lock acquisition succeeds, spin up a daemon thread that binds a `ServerSocket` on `127.0.0.1:0`, writes the chosen port to `simplex.port` atomically, and logs each received line. Acting on `SHOW` comes in Task 3 — for now the listener just proves the IPC plumbing works without invoking any UI code.
Add to `SingleInstance.desktop.kt`. Merge these `import` lines into the existing import block at the top of the file rather than duplicating; the same applies in Task 3.
```kotlin
import java.io.BufferedReader
import java.io.InputStreamReader
import java.net.InetAddress
import java.net.ServerSocket
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import java.nio.file.StandardCopyOption
import kotlin.concurrent.thread
private const val LISTENER_THREAD_NAME = "simplex-single-instance"
private fun startSingleInstanceListener() {
val server = try {
ServerSocket(0, 0, InetAddress.getLoopbackAddress())
} catch (e: IOException) {
// Ephemeral port range starved (rare, observed behind some VPNs). Lock is
// still held, so duplicate launches see the lock but no port to signal —
// they retry and exit silently. Worst case: user clicks the tray icon.
Log.w(TAG, "single-instance: ServerSocket bind failed: ${e.message}")
return
}
writePortFile(server.localPort)
thread(name = LISTENER_THREAD_NAME, isDaemon = true) {
while (true) {
val socket = try {
server.accept()
} catch (e: IOException) {
Log.w(TAG, "single-instance: accept() failed: ${e.message}")
return@thread
}
try {
socket.soTimeout = 1000
val line = BufferedReader(InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8))
.readLine()
Log.i(TAG, "single-instance: received $line")
// SHOW handler comes in Task 3.
} catch (e: IOException) {
Log.w(TAG, "single-instance: read failed: ${e.message}")
} finally {
try { socket.close() } catch (_: IOException) {}
}
}
}
}
private fun writePortFile(port: Int) {
val portFile = dataDir.resolve("simplex.port").toPath()
val tmp = dataDir.resolve("simplex.port.tmp").toPath()
try {
Files.writeString(tmp, port.toString(), StandardCharsets.UTF_8)
try {
Files.move(tmp, portFile, StandardCopyOption.ATOMIC_MOVE)
} catch (e: java.nio.file.AtomicMoveNotSupportedException) {
// Exotic filesystem. Fall back to plain move; the reader retries on
// parse failure so a brief window of a half-written port file is fine.
Files.move(tmp, portFile)
}
} catch (e: IOException) {
Log.w(TAG, "single-instance: writing port file failed: ${e.message}")
}
}
```
Use `Log.i` for the "received" line. It already exists in the codebase and has the same shape as `Log.w` (see `DesktopTray.kt:42` for a `Log.w` call).
Then in `acquireSingleInstanceOrSignalAndExit()`, just before `return true`, call:
```kotlin
lockChannel = channel
lockHandle = held
startSingleInstanceListener()
return true
```
**Verify.**
- Build: `./gradlew :common:desktopMainClasses` — succeeds.
- Run desktop app once: `./gradlew :desktop:run`.
- Confirm `simplex.port` appears in `dataDir` and contains a port number (e.g. `54231`).
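- From another terminal: `printf 'SHOW\n' | nc 127.0.0.1 $(cat ~/.local/share/simplex/simplex.port)` (Linux/macOS; on Windows use any TCP client that can send a line, such as `ncat`. PowerShell's `Test-NetConnection` only checks that the port accepts connections and sends no payload).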
- In the app's log output, see `single-instance: received SHOW`. App keeps running normally.
- Send garbage: `printf 'WHATEVER\n' | nc 127.0.0.1 $(cat ~/.local/share/simplex/simplex.port)` — log shows `received WHATEVER`. No crash.
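
Where `nc` is unavailable, the same check can be made from Kotlin. The following is a minimal sketch, not part of the codebase: `sendLine` and `roundTrip` are hypothetical names, and the explicit `127.0.0.1` address is an assumption that matches the bind address described in this plan. It connects, writes one newline-terminated line, and closes, which is all the `nc` commands above do.

```kotlin
import java.net.InetAddress
import java.net.InetSocketAddress
import java.net.ServerSocket
import java.net.Socket
import java.nio.charset.StandardCharsets

// Hypothetical helper (not in the codebase): sends one newline-terminated
// line to a listener on IPv4 loopback, then closes the connection.
fun sendLine(port: Int, line: String, timeoutMs: Int = 1000) {
  val loopback = InetAddress.getByAddress(byteArrayOf(127, 0, 0, 1))
  Socket().use { sock ->
    sock.connect(InetSocketAddress(loopback, port), timeoutMs)
    val out = sock.getOutputStream()
    out.write((line + "\n").toByteArray(StandardCharsets.UTF_8))
    out.flush()
  }
}

// Self-check against a throwaway local listener: returns the line the
// listener side actually read, so the helper can be verified without the app.
fun roundTrip(line: String): String {
  val loopback = InetAddress.getByAddress(byteArrayOf(127, 0, 0, 1))
  ServerSocket(0, 0, loopback).use { server ->
    sendLine(server.localPort, line)
    server.accept().use { peer ->
      peer.soTimeout = 1000
      return peer.getInputStream().bufferedReader(StandardCharsets.UTF_8).readLine()
    }
  }
}

fun main() {
  println(roundTrip("SHOW"))
}
```

Against the running app, call `sendLine` with the integer read from `simplex.port` instead of using `roundTrip`.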
**Commit.** `desktop: single-instance IPC listener`
---
### Task 3 — Second instance signals SHOW; listener restores window
**Files**
- Modify: `apps/multiplatform/common/src/desktopMain/kotlin/chat/simplex/common/SingleInstance.desktop.kt`
**What to change.**
1. Have the listener act on `SHOW` by calling the existing `showWindow()` on the AWT EDT.
2. Have the second-instance path read `simplex.port` and send `SHOW\n`.
Replace the listener's "SHOW handler comes in Task 3" comment with:
```kotlin
// Only SHOW is recognised today. Future commands (e.g. open-URL) will extend
// this with new top-level if-branches; unknown commands are intentionally ignored.
if (line == "SHOW") {
javax.swing.SwingUtilities.invokeLater { showWindow() }
}
```
In the early-exit branch of `acquireSingleInstanceOrSignalAndExit()` (where `held == null`), replace `channel.close(); return false` with `channel.close(); signalShowAndReturn(); return false`. Then add the signaller:
```kotlin
import java.net.InetSocketAddress
import java.net.Socket
private fun signalShowAndReturn() {
val port = readPortWithRetry() ?: return
try {
Socket().use { sock ->
sock.connect(InetSocketAddress(InetAddress.getLoopbackAddress(), port), 1000)
sock.getOutputStream().write("SHOW\n".toByteArray(StandardCharsets.UTF_8))
sock.getOutputStream().flush()
}
} catch (e: IOException) {
// First instance is starting, shutting down, or stuck. Doing nothing is
// strictly less harmful than spawning a duplicate that will fail on the
// SQLite lock. The stale-port-after-crash case lands here too — handled.
Log.w(TAG, "single-instance: SHOW signal failed: ${e.message}")
}
}
private fun readPortWithRetry(): Int? {
val portFile = dataDir.resolve("simplex.port").toPath()
repeat(2) { attempt ->
val raw = try {
Files.readString(portFile, StandardCharsets.UTF_8).trim()
} catch (e: IOException) {
null
}
val parsed = raw?.toIntOrNull()
if (parsed != null && parsed in 1..65535) return parsed
// First-instance may still be writing the port during startup; one retry.
if (attempt == 0) Thread.sleep(200)
}
return null
}
```
**Verify.**
- Build: `./gradlew :common:desktopMainClasses` — succeeds.
- Test: `./gradlew desktopTest` — Task 1's test still passes.
- Run the desktop app: `./gradlew :desktop:run`. Click X with "Minimize to tray" enabled → window hides (tray icon visible from the earlier tray work).
- From another terminal: launch the desktop app again (`./gradlew :desktop:run` in the same checkout). The second process exits within ~1 second; the first instance's window comes back to front and gains focus.
- Repeat with the window already visible (not minimized): second launch brings the window to front (already-visible windows still get `toFront()` + `requestFocus()`).
- Crash recovery test: kill the first instance with `kill -9 <pid>` (SIGKILL — no shutdown hooks). Launch again — the second process acquires the released lock and starts normally as the new primary (this proves the OS releases the lock on hard kill, as the spec assumes). With that new primary running, launch a third time → it signals `SHOW` and the new primary's window comes forward. The literal stale-port window (lock released, new primary started but has not yet rewritten `simplex.port`) is far narrower than one keystroke and not reliably reproducible by hand — the silent-exit-on-connect-failure path is exercised by inspection of the code path, not the smoke test.
**Manual smoke test pass on all three platforms** (per the spec's Tests section):
- **Linux** (KDE / GNOME with AppIndicator extension): run from terminal twice; second exits, first comes forward. Minimize to tray, run again, window restores from tray.
- **Windows 11**: same. Confirm Windows Defender doesn't prompt for the loopback `ServerSocket` bind. (It shouldn't, because we bind explicitly to `127.0.0.1`.)
- **macOS**: confirm `open -n /Applications/SimpleX.app` and direct `/Applications/SimpleX.app/Contents/MacOS/SimpleX` both deduplicate. Finder/Dock double-click was already deduplicated by LaunchServices before this change — confirm it still is.
**Commit.** `desktop: single-instance window restore via SHOW IPC`
---
### Task 4 — Audit fixes (post-implementation review)
After Tasks 1–3 landed, four parallel reviews (correctness/concurrency, security, edge-cases/integration, code-quality) ran against the branch. The genuine, cheap-to-fix findings were folded back in as a single follow-up commit. The non-issues and the more invasive suggestions (slow-loris handling, port-file ACL hardening, signal authentication, additional test coverage) were investigated and declined; see the commit message and the design spec for rationale.
**File**
- Rename `SingleInstance.desktop.kt` → `SingleInstance.kt` (the `.desktop.kt` suffix is the project's convention for `expect/actual` files under `platform/`, not for top-level desktop-only files; cf. `DesktopTray.kt`, `DesktopApp.kt`).
**Code changes**
- Drop redundant `lockChannel`. `FileLock` pins its own `FileChannel`, so a single reference to the `FileLock` keeps both alive against GC.
- `e.stackTraceToString()` in `Log.w` (matches `DesktopTray.kt:42`); top-of-file `SwingUtilities` import.
- Pin loopback bind and connect to explicit IPv4 `127.0.0.1` via `InetAddress.getByAddress(byteArrayOf(127, 0, 0, 1))`, instead of `InetAddress.getLoopbackAddress()`, which can return `::1` on dual-stack systems where IPv6 is preferred (and IPv6 loopback isn't always Defender-exempt on Windows).
- Replace `BufferedReader.readLine()` (unbounded) with a single bounded read of 256 bytes. Closes the OOM vector where a same-UID adversary could stream bytes without a newline.
- Delete any stale `simplex.port` at the top of the listener setup, before binding the new `ServerSocket`. Closes the race where a second instance arriving between our lock acquisition and our `writePortFile()` would read the old port and signal `SHOW` to whatever unrelated process now owns it.
- `OverlappingFileLockException` branch: close the channel and return `false` (fail-closed). Branch is unreachable by construction today, but the previous behaviour would silently break enforcement if the function were ever called twice.
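
The bounded-read change can be illustrated in isolation. This is a sketch under stated assumptions: `readSignal` is a hypothetical standalone function (the real listener reads inline from the accepted socket), and `ByteArrayInputStream` stands in for the socket stream. It shows why a sender that never emits a newline cannot grow the buffer: a single `read()` into a fixed array returns at most that array's length.

```kotlin
import java.io.ByteArrayInputStream
import java.io.InputStream
import java.nio.charset.StandardCharsets

const val MAX_SIGNAL_BYTES = 256

// Hypothetical standalone version of the listener's read step.
// Performs exactly one read() into a fixed-size buffer and returns the text
// before the first newline, or null when the peer sent nothing.
fun readSignal(input: InputStream): String? {
  val buf = ByteArray(MAX_SIGNAL_BYTES)
  val read = input.read(buf)
  if (read <= 0) return null
  return String(buf, 0, read, StandardCharsets.UTF_8)
    .substringBefore('\n')
    .trimEnd('\r')
}

fun main() {
  // A well-formed command, with a Windows-style line ending.
  println(readSignal(ByteArrayInputStream("SHOW\r\n".toByteArray())))
  // A flood without a newline is cut off at the buffer size, not accumulated.
  val flood = ByteArray(10_000) { 'A'.code.toByte() }
  println(readSignal(ByteArrayInputStream(flood))?.length)
}
```

The trade-off is the one the audit accepted: a sender that drips its payload across several TCP segments may be read only partially, which is acceptable because the signaller writes the whole command in one call.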
**Verify**
- `./gradlew :common:desktopMainClasses :common:desktopTest` — green; both existing JDK-contract tests still pass.
- `./gradlew :desktop:assemble` — green.
**Commit.** `desktop: single-instance audit fixes`
---
### Task 5 — Round-2 audit fixes
A second parallel-review pass against the audited code found a few additional issues, fixed in one follow-up commit. Three non-issues investigated this round were declined: `ATOMIC_MOVE` without `REPLACE_EXISTING` (empirically verified to replace an existing file), log rate-limiting (pre-existing posture), and function-naming nits (subjective).
**Code changes**
- Replace fixed-name `simplex.port.tmp` with `Files.createTempFile(dataDir.toPath(), "simplex.port.", ".tmp")` (`O_CREAT | O_EXCL` + randomized name). Closes a same-UID symlink-hijack vector where an attacker could pre-plant the fixed temp path as a symlink to e.g. `~/.bashrc`; our subsequent `Files.writeString` follows symlinks and would truncate the target.
- Wrap the port-file write in a `try/finally` that `deleteIfExists(tmp)` when the move never completes — closes an orphaned-`.tmp` leak on partial-failure paths.
- Skip the `SHOW` dispatch and the "received" log when `read() <= 0` (connect-then-close probes from same-UID processes).
- Fix stale `SingleInstance.desktop.kt` reference in the test file's header comment after the rename in Task 4.
- Refactor the test setup into a `withTempLockDir(block)` helper that explicitly walks-and-deletes the temp dir on each test exit. `File.deleteOnExit()` is a no-op for non-empty directories and was leaking a temp dir per test run.
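
The first two items in the list above combine into one pattern: write to a randomly named temp file in the target directory, move it into place, and remove the temp file if the move never happened. A generic sketch of that pattern follows; `writeViaTempFile` is a hypothetical name, and returning a `Boolean` instead of logging is an assumption made to keep the example self-contained.

```kotlin
import java.io.IOException
import java.nio.charset.StandardCharsets
import java.nio.file.AtomicMoveNotSupportedException
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardCopyOption

// Hypothetical generic helper: writes `content` to dir/name via a randomly
// named temp file in the same directory, then moves it into place.
// Returns true on success; on any failure the temp file is removed.
fun writeViaTempFile(dir: Path, name: String, content: String): Boolean {
  val target = dir.resolve(name)
  // createTempFile picks an unpredictable name and fails rather than
  // opening a path that already exists.
  val tmp = try {
    Files.createTempFile(dir, "$name.", ".tmp")
  } catch (e: IOException) {
    return false
  }
  var moved = false
  try {
    Files.writeString(tmp, content, StandardCharsets.UTF_8)
    try {
      Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE)
    } catch (e: AtomicMoveNotSupportedException) {
      // Filesystem without atomic rename: fall back to a plain replacing move.
      Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING)
    }
    moved = true
  } catch (e: IOException) {
    // Nothing to do here; cleanup happens in finally.
  } finally {
    if (!moved) try { Files.deleteIfExists(tmp) } catch (e: IOException) {}
  }
  return moved
}

fun main() {
  val dir = Files.createTempDirectory("simplex-port-sketch")
  println(writeViaTempFile(dir, "simplex.port", "54231"))
  println(Files.readString(dir.resolve("simplex.port")))
}
```

Creating the temp file in the same directory as the target matters: an atomic rename is generally only possible within one filesystem.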
**Verify**
- `./gradlew :common:desktopMainClasses :common:desktopTest :desktop:assemble` — green; tests still pass; `/tmp/simplex-singleinstance-test*` no longer leaks per run.
**Commit.** `desktop: single-instance round-2 audit fixes`
---
### Final pass
- Run the full test suite: `./gradlew desktopTest` — green.
- Confirm `git log --oneline` shows one commit per task since the spec commits (Tasks 1–3, plus the Task 4 and Task 5 follow-ups), in task order.
- Confirm `simplex.lock` and `simplex.port` appear in `dataDir` after a normal run; nothing else.
- Sanity check that `closeBehavior` is still respected (the lock check runs before all the existing init, so it shouldn't interact with `closeBehavior` at all — but confirm by running through the tray plan's manual checks one more time).
If anything fails the smoke test on a specific platform, file a follow-up; the spec's "Failure modes" section enumerates the expected degraded behaviors and which are acceptable.
@@ -0,0 +1,130 @@
# Desktop single instance — restore on duplicate launch
## What
Prevent a second SimpleX desktop process from running concurrently against the same data directory. When the user launches the executable a second time (e.g., clicks the launcher icon while the window is minimized to tray), the running instance restores its window instead of a duplicate spawning.
Three pieces:
1. **Lock acquisition** at process start, before any Haskell / DB init. If the lock is already held, the second invocation becomes a "signaller" that wakes the first and exits.
2. **IPC channel** — a loopback `ServerSocket` on `127.0.0.1` with a random ephemeral port written to a file alongside the lock. Used by the signaller to send a `SHOW` command.
3. **Window restore** — first instance reacts to `SHOW` by calling the existing `showWindow()` in `DesktopTray.kt`, which un-minimizes from the tray and brings the window to front.
Scope: Linux + Windows + macOS. Per-data-directory scope — users with different `XDG_DATA_HOME` settings or portable installs still run side-by-side because each install has its own `dataDir` and therefore its own lock.
## Why
After the tray work landed (#6970), the desktop app can be minimized to the tray. Users who minimize and later click the launcher icon expect the app to come back to focus — but currently a second process spawns. The second process then tries to open the chat databases and either crashes on the SQLite lock or runs in a degraded state, depending on timing.
A survey of comparable apps (Electron / VS Code / JetBrains / Telegram Desktop) confirms the chosen shape is canonical for desktop apps: file lock in the user data directory plus a loopback IPC socket. The differences across apps are details (named pipe vs TCP, port range scan vs ephemeral port). We chose loopback TCP with an ephemeral port and a side-band port file because:
- Cross-platform with no platform-specific branching.
- No firewall prompt on Windows Defender when bound explicitly to `127.0.0.1`.
- Survives ungraceful crash — the OS releases the file lock at process exit; a stale port file is harmless because the signaller's `connect()` fails and the signaller exits silently.
- One lock per data directory is automatic: the lock lives inside `dataDir`, which already differs per `XDG_DATA_HOME` / install.
We deliberately do **not** scan a fixed port range like JetBrains (6942–6991). That pattern is known to fail behind corporate VPNs and virtual NICs that starve the entire range, and produces the "IDE refuses to start" class of bug (YouTrack IJPL-94). Ephemeral port via `ServerSocket(0)` with a port file is strictly better.
We also deliberately do **not** pull in a library (`unique4j`, `JUnique`). The whole feature is ~80 lines of stdlib calls; the dep is not worth it.
### Out of scope
- Carrying file-open / `simplex:` URL arguments through the IPC. The protocol is line-based, so this extends cleanly later if we add deep-link or file-association support.
- macOS `Desktop.setOpenURIHandler` / `setOpenFileHandler` for deep links. Separate feature, not required for "no duplicate instance" — LaunchServices already prevents Finder/Dock duplication on macOS, and the lock catches `open -n` / direct binary launches.
- Cleaning up the port file on graceful shutdown — unnecessary, it is overwritten on the next start, and a stale value is handled by the signaller's failure path.
## How
### Lock and port files
Two files live directly in `dataDir`:
- `simplex.lock` — empty file. The running instance holds a `FileChannel` open and an exclusive `FileLock` on bytes `[0, 1)`. The OS releases the lock when the process exits (clean or crashed). The file itself is never deleted by the app.
- `simplex.port` — current loopback port, plain integer text. Rewritten atomically each time a new instance acquires the lock.
We use `tryLock(0L, 1L, false)`, **not** the zero-arg form. Per JDK-6674134, the zero-arg form locks `[0, Long.MAX_VALUE)` and is rejected by SMB/CIFS and some NFS implementations — locking a single byte is portable.
The `FileLock` is held in a module-level `var` (assigned once after acquisition) for the process lifetime. `FileLock` pins its `FileChannel` internally, so a single reference keeps both alive against GC.
### Entry point in `Main.kt`
```kotlin
fun main() {
  if (!acquireSingleInstanceOrSignalAndExit()) return
  initHaskell()
  runMigrations()
  setupUpdateChecker()
  initApp()
  tmpDir.deleteRecursively()
  tmpDir.mkdir()
  return showApp()
}
```
The single-instance check runs **before** `initHaskell()` and `runMigrations()`. Both touch the chat databases, which the first instance has already locked at the SQLite layer — without the early check we would hit an SQLite-level lock failure that's much harder to recover from.
### New file `SingleInstance.desktop.kt`
In `apps/multiplatform/common/src/desktopMain/kotlin/chat/simplex/common/`.
**`acquireSingleInstanceOrSignalAndExit(): Boolean`**
1. `dataDir.mkdirs()` — the directory is not guaranteed to exist before `initApp()`, and we need to create the lock file inside it.
2. Open `FileChannel` on `simplex.lock` with `READ`, `WRITE`, `CREATE`.
3. Call `channel.tryLock(0L, 1L, false)`.
4. If the lock is `null`: take the second-instance path (see below). Return `false` (caller exits `main`).
5. Lock held: stash the `FileLock` in the module-level ref so it isn't GC'd (it pins its `FileChannel`, so one reference covers both). Start the listener thread. Return `true`.
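The steps above can be sketched roughly as follows. This is a minimal illustration, not the committed implementation: `tryAcquireInstanceLock` and the `LockOutcome` enum are hypothetical names introduced here, `dir` stands in for `dataDir`, and logging plus the listener start are left out.

```kotlin
import java.io.File
import java.io.IOException
import java.nio.channels.FileChannel
import java.nio.channels.FileLock
import java.nio.channels.OverlappingFileLockException
import java.nio.file.StandardOpenOption.CREATE
import java.nio.file.StandardOpenOption.READ
import java.nio.file.StandardOpenOption.WRITE

// Module-level reference: keeps the lock (and the channel it pins) alive
// for the lifetime of the process.
private var lockHandle: FileLock? = null

// Hypothetical outcome type, used only in this sketch.
enum class LockOutcome { PRIMARY, SECONDARY, UNSUPPORTED }

// `dir` stands in for dataDir.
fun tryAcquireInstanceLock(dir: File): LockOutcome {
  dir.mkdirs()
  val path = File(dir, "simplex.lock").toPath()
  val channel = try {
    FileChannel.open(path, READ, WRITE, CREATE)
  } catch (e: IOException) {
    return LockOutcome.UNSUPPORTED
  }
  return try {
    // Lock only byte [0, 1); the zero-arg form locks the whole range.
    val lock = channel.tryLock(0L, 1L, false)
    if (lock == null) {
      // Another process holds the lock. Closing is safe: this JVM holds none.
      channel.close()
      LockOutcome.SECONDARY
    } else {
      lockHandle = lock
      LockOutcome.PRIMARY
    }
  } catch (e: OverlappingFileLockException) {
    // This JVM already holds the lock. The extra channel is deliberately left
    // open: on some systems closing it could release the existing lock.
    LockOutcome.PRIMARY
  } catch (e: IOException) {
    // Filesystem without lock support: proceed unlocked.
    channel.close()
    LockOutcome.UNSUPPORTED
  }
}
```

A caller would map `SECONDARY` to the second-instance path and return `false` from `acquireSingleInstanceOrSignalAndExit()`. The other two outcomes let startup continue.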
**Listener thread (daemon, name `simplex-single-instance`)**
- Before binding, delete any stale `simplex.port` left by a previous primary. This closes the window where a second instance arriving between lock acquisition and `writePortFile()` could read an old port and signal `SHOW` to whatever process now owns it. A signaller arriving in that tiny window now sees no port file and exits silently.
- `ServerSocket(0, 0, LOOPBACK)` where `LOOPBACK` is an explicit IPv4 `127.0.0.1`. `InetAddress.getLoopbackAddress()` may return `::1` on dual-stack systems where IPv6 is preferred, and IPv6 loopback is not always Defender-exempt on Windows.
- Write the assigned port to `simplex.port` atomically: create a randomized-name temp file via `Files.createTempFile(dataDir, "simplex.port.", ".tmp")` (`O_CREAT | O_EXCL`, un-plantable as a symlink by a same-UID attacker), write the port number, then `Files.move(tmp, port, StandardCopyOption.ATOMIC_MOVE)`. Both files live in `dataDir` (same volume), so `ATOMIC_MOVE` is supported on all three platforms. If `AtomicMoveNotSupportedException` is ever observed (exotic filesystem), retry with `REPLACE_EXISTING` — the second-instance read path already retries on parse failure, so a brief window of a half-written port file is tolerable. The temp file is cleaned up in a `finally` if the move never completes.
- Loop on `accept()`. For each connection, set a 1 s read timeout (`socket.soTimeout = 1000`) so a misbehaving client can't hang the thread, then do a **single bounded read** of at most 256 bytes from the input stream and parse the first newline-terminated UTF-8 line. The bound is the OOM guard: `BufferedReader.readLine()` is unbounded and a same-UID adversary could otherwise stream gigabytes without a newline. Our own signaller writes the whole payload (`SHOW\n`) in one call, so a single bounded read covers the legitimate case. If the line equals `"SHOW"`, post `showWindow()` to the AWT EDT via `SwingUtilities.invokeLater`. Close the socket.
- Daemon thread exits with the JVM at process shutdown; the server socket closes with the JVM.
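A sketch of the two listener helpers described above: the atomic port-file write and the bounded signal read. The names `writePortFile` and `readSignalLine` and the `dir` parameter (standing in for `dataDir`) are assumptions for illustration. The real file may structure this differently.

```kotlin
import java.io.File
import java.io.InputStream
import java.nio.charset.StandardCharsets
import java.nio.file.AtomicMoveNotSupportedException
import java.nio.file.Files
import java.nio.file.StandardCopyOption

const val MAX_SIGNAL_BYTES = 256

// Writes the port through a randomized-name temp file, then renames it into place.
fun writePortFile(dir: File, port: Int) {
  val target = File(dir, "simplex.port").toPath()
  val tmp = Files.createTempFile(dir.toPath(), "simplex.port.", ".tmp")
  try {
    Files.write(tmp, port.toString().toByteArray(StandardCharsets.UTF_8))
    try {
      Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE)
    } catch (e: AtomicMoveNotSupportedException) {
      // Exotic filesystem: fall back to a non-atomic replace.
      Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING)
    }
  } finally {
    // No-op once the move has succeeded; cleans up if it never completed.
    Files.deleteIfExists(tmp)
  }
}

// Performs one bounded read and returns the first newline-terminated line,
// or null if nothing was read or no newline arrived within the bound.
fun readSignalLine(input: InputStream): String? {
  val buf = ByteArray(MAX_SIGNAL_BYTES)
  val n = input.read(buf)
  if (n <= 0) return null
  val text = String(buf, 0, n, StandardCharsets.UTF_8)
  val newline = text.indexOf('\n')
  return if (newline < 0) null else text.substring(0, newline)
}
```

In the accept loop, the listener would call `readSignalLine(socket.getInputStream())` after setting `soTimeout` and compare the result with `"SHOW"`. Note that the `Files.move` documentation leaves it implementation-specific whether `ATOMIC_MOVE` replaces an existing target, so behavior on an unusual filesystem is worth confirming in the smoke test.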
**Second-instance path**
1. Read `simplex.port`. If missing or unparseable, sleep 200 ms and retry once — the first instance may still be writing it during a startup race.
2. Construct `Socket()` and call `connect(InetSocketAddress("127.0.0.1", port), 1000)`. The `Socket(host, port)` constructor connects immediately and offers no way to set a connect timeout; the explicit `connect(addr, timeout)` form is required.
3. Write `SHOW\n` (UTF-8), close.
4. Return from `main`. The JVM exits because no non-daemon threads are running.
5. **On any failure** (port file still missing after retry, connect timeout, `IOException`): log via `Log.w(TAG, …)` — same logger the rest of `desktopMain` uses (see `DesktopTray.kt:42`) — and exit silently. The first instance is starting up, shutting down, or in a stuck state; doing nothing is strictly less harmful than spawning a duplicate process that will then fail on the SQLite lock.
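The second-instance path might look roughly like this. It is a sketch under assumptions: `readPortWithRetry` and `signalShow` are hypothetical names, `dir` stands in for `dataDir`, the `1..65535` range check is an addition not stated in the spec, and the `Log.w` call on failure is left to the caller.

```kotlin
import java.io.File
import java.io.IOException
import java.net.InetAddress
import java.net.InetSocketAddress
import java.net.Socket
import java.nio.charset.StandardCharsets

private val LOOPBACK: InetAddress = InetAddress.getByAddress(byteArrayOf(127, 0, 0, 1))

// Reads simplex.port, retrying once after 200 ms if it is missing or unparseable.
fun readPortWithRetry(dir: File): Int? {
  repeat(2) { attempt ->
    val port = try {
      File(dir, "simplex.port").readText(StandardCharsets.UTF_8).trim().toIntOrNull()
    } catch (e: IOException) {
      null
    }
    // Range check is an assumption of this sketch, not part of the spec.
    if (port != null && port in 1..65535) return port
    if (attempt == 0) Thread.sleep(200)
  }
  return null
}

// Returns true if SHOW was written to the primary; false on any failure,
// in which case the caller logs a warning and exits silently.
fun signalShow(dir: File): Boolean {
  val port = readPortWithRetry(dir) ?: return false
  return try {
    Socket().use { socket ->
      // Explicit connect so the 1 s timeout applies.
      socket.connect(InetSocketAddress(LOOPBACK, port), 1000)
      val out = socket.getOutputStream()
      out.write("SHOW\n".toByteArray(StandardCharsets.UTF_8))
      out.flush()
    }
    true
  } catch (e: IOException) {
    false
  }
}
```

Both `ConnectException` and `SocketTimeoutException` are subclasses of `IOException`, so a single catch covers the dead-port and stuck-primary cases.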
### Window restore handler
`showWindow()` at `DesktopTray.kt:50` already does exactly what we need:
```kotlin
fun showWindow() {
  simplexWindowState.windowVisible.value = true
  simplexWindowState.window?.toFront()
  simplexWindowState.window?.requestFocus()
}
```
Reuse it. The IPC thread is not the EDT, so the listener marshals via `SwingUtilities.invokeLater { showWindow() }`. No new public API.
### Failure modes
- **Stale `simplex.port` file after hard crash** — the OS releases the lock, but the port file persists pointing at a dead port. The next launch acquires the lock, deletes the stale port file before binding, and writes a fresh one. The only victim is a signaller that arrives after the new primary has taken the lock but before it has written the new port file: it finds no port file (or, if it read the old one just before deletion, gets `ConnectException`) and exits silently. Acceptable: a single missed signal during a crash-restart window is fine.
- **`tryLock()` throws `OverlappingFileLockException`** — same JVM trying to lock twice. Shouldn't happen by construction; if it does, log and treat as "we hold the lock" (return `true`). Cheaper than crashing.
- **Lock file on a filesystem that doesn't support file locks** — `tryLock` throws `IOException`. (A `null` return cannot be told apart from a lock held by another instance, so it always takes the second-instance path.) On `IOException` we treat this as "no other instance" and proceed without a lock. The user gets exactly the duplicate-instance behavior they have today, no worse. Log a warning.
- **`ServerSocket(0)` fails** (ephemeral port range starved — rare; observed behind certain VPNs). Log, skip the listener, proceed without IPC. The lock is still held, so a duplicate launch sees the lock held but no port to signal — the second instance retries the port file once, fails, exits silently. Worst-case behavior: the user must click the tray icon themselves. No data loss.
- **macOS Finder/Dock double-click** — LaunchServices dedupes by `CFBundleIdentifier` before our code ever runs. The lock fires only for terminal-driven launches (`open -n`, direct `Contents/MacOS/SimpleX`). Cost there is one `tryLock` call, negligible.
### Tests
Unit-level — `kotlin.test`, in `apps/multiplatform/common/src/desktopTest/kotlin/chat/simplex/app/SingleInstanceTest.kt`, matching the existing layout (`SemVerTest.kt` is the reference). Run via `./gradlew desktopTest`. Each test uses a fresh temp directory in place of `dataDir`.
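- Acquire lock, then attempt a second `tryLock` on the same path and assert it fails. From a second channel in the same JVM this surfaces as `OverlappingFileLockException`; `null` is what a separate process would see. Release the first, second acquires.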
- Write port to file atomically and re-read — round-trip integer parse.
Integration (Gradle-level, gated if we want it): spawn two short-lived JVM processes against the same temp dataDir. First binds, writes its port to `simplex.port`, listens, accepts one `SHOW`, exits. Second reads `simplex.port`, connects, sends `SHOW`, exits with status 0. Assert the first observed the payload. The port handoff goes through the file — same code path as production — not a test-only side channel.
Manual smoke test on all three platforms:
- Linux: run twice from a terminal → second exits, window comes forward.
- Linux with tray active and window minimized: same, window restores from tray.
- Windows: same.
- macOS: `open -n /Applications/SimpleX.app` and direct `Contents/MacOS/SimpleX` launches both deduplicate correctly.