mirror of
https://forgejo.ellis.link/continuwuation/continuwuity/
synced 2026-05-13 09:54:56 +00:00
# Performance tuning

While Continuwuity's default config parameters are generally optimised, additional modifications can be made to better utilise your server resources. This is especially helpful for homeservers that have many users and/or are joined to many large federated rooms, and will increasingly be the case as the Matrix network expands.

This page outlines various performance tweaks for Continuwuity and their effects. As always, your mileage may vary according to your setup's specifics. If you have further questions or recommendations, please share them in the community rooms.

## DNS tuning (recommended)

Please see the dedicated [DNS tuning guide](./dns.mdx).

## Cache capacities

If you have unused memory to spare, consider increasing the `cache_capacity_modifier` value to allow more data to be kept in hot memory. This can _**significantly**_ speed up many intensive operations such as state resolution, and also decreases CPU usage and disk I/O. Start with a baseline of `cache_capacity_modifier = 2.0` and tune upwards until you reach a satisfactory level of RAM usage.

On the other hand, if your system doesn't have a lot of RAM, consider decreasing the cache capacity modifier to something smaller than `1.0` to avoid low-memory issues (at the cost of higher load on disk and CPU). This recommendation also applies if your system has very little RAM relative to its number of CPU cores, as cache capacities tend to scale with the number of cores.

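As a minimal sketch of the two cases above, the setting might look like this in `continuwuity.toml`. The `0.5` value is only an illustrative assumption for a low-memory machine, not a tested recommendation; enable one line or the other, not both.

```toml
### in continuwuity.toml ###

# machine with spare RAM: start from the suggested baseline and tune upwards
cache_capacity_modifier = 2.0

# low-memory machine (illustrative value; anything below 1.0 shrinks the caches)
#cache_capacity_modifier = 0.5
```

Watch RAM usage for a while after each change before adjusting the value again.
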
## Disabling some features

You can disable outgoing **typing notifications** and **read receipts** to reduce strain on the CPU and network.

```toml
# disables sending read receipts
allow_outgoing_read_receipts = false
# disables sending typing notifications
allow_outgoing_typing = false
```

Outgoing presence updates are also considered expensive and are disabled by default (`allow_outgoing_presence = false`).

For even more savings, you may wish to disable _all_ processing of typing notifications, read receipts, and presence. This can be done by also disabling the local and incoming events for these features.

<details>

<summary> `continuwuity.toml` </summary>

```toml
# disabling read receipts entirely
allow_local_read_receipts = false
allow_incoming_read_receipts = false
allow_outgoing_read_receipts = false

# disabling typing notifications entirely
allow_local_typing = false
allow_outgoing_typing = false
allow_incoming_typing = false

# disabling presence updates entirely
allow_local_presence = false
allow_incoming_presence = false
allow_outgoing_presence = false
```

</details>

## Tuning database compression

:::warning
These steps MUST be done **before** starting Continuwuity for the first time, as database compression is irreversible.
:::

### Changing the compression algorithm

To reduce CPU usage at the cost of more storage space, consider deploying Continuwuity with the faster and less intensive `lz4` algorithm instead of `zstd` for RocksDB, and disabling WAL compression entirely:

```toml
### in continuwuity.toml ###
rocksdb_compression_algo = "lz4"
rocksdb_wal_compression = "none"
```

This tweak is especially helpful if you have an older or less performant CPU (e.g. a Raspberry Pi) and disk space to spare.

### Increasing bottommost layer compression (`zstd` only)

The bottommost layer of the database usually contains old, read-only data, which makes it a suitable place for further compression. In Continuwuity, you can do this by setting `rocksdb_bottommost_compression = true` and tuning `rocksdb_bottommost_compression_level` to a more compact level than the default one used in `rocksdb_compression_level`. This tweak costs some additional CPU usage, but helps keep your database from growing too large, especially in the long run.

For those using `zstd` compression, the compression level ranges from 1 to 22. For example:

```toml
### in continuwuity.toml ###
rocksdb_compression_algo = "zstd"
rocksdb_compression_level = 32767 # magic number, translates to level 3 on zstd
rocksdb_bottommost_compression = true
rocksdb_bottommost_compression_level = 9 # level 9 on zstd
```

For `lz4` users, the default level (`-1`) is already the most compact. You can only decrease it further to favor compression speed over ratio.

Consult the following documentation for more information on compression tuning and levels:

- [RocksDB compression documentation][rocksdb-compression]
- [RocksDB default compression levels][rocksdb-compression-defaults]
- [Zstd manual][zstd-manual]
- [Lz4 manual][lz4-manual]

[rocksdb-compression]: https://github.com/facebook/rocksdb/wiki/Compression
[rocksdb-compression-defaults]: https://github.com/facebook/rocksdb/blob/main/include/rocksdb/options.h#L208-L217
[zstd-manual]: https://facebook.github.io/zstd/zstd_manual.html
[lz4-manual]: https://github.com/lz4/lz4/blob/release/doc/lz4_manual.html

## Other tweaks

### Using UNIX sockets

If your homeserver and the reverse proxy live on the same machine, you may consider exposing Continuwuity on a UNIX socket instead of a port. This avoids TCP overhead between the two programs.

<details>

<summary>Example config with Caddy</summary>

```toml
### in continuwuity.toml ###

# `address` and `port` have to be commented out first
#address = ["127.0.0.1", "::1"]
#port = 8008
unix_socket_path = "/run/continuwuity/continuwuity.sock"
```

```
### in your Caddyfile ###
https://matrix.example.com {
    reverse_proxy unix//run/continuwuity/continuwuity.sock

    # alternatively, use the http2-plaintext protocol
    # reverse_proxy unix+h2c//run/continuwuity/continuwuity.sock
}
```

</details>

### Increased batch size for notary queries

To speed up initial joins for large rooms, consider increasing `trusted_server_batch_size` to something higher than the default `1024`. Start by doubling it to `2048`, then keep adjusting until you find a suitable value.

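A minimal sketch of that first step, using the doubled value as a starting point. The right value depends on your server and the rooms you join, so treat it as something to find by experiment.

```toml
### in continuwuity.toml ###

# default is 1024; doubled here as a starting point for experimentation
trusted_server_batch_size = 2048
```
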
### Serving .well-knowns manually

Instead of [reverse proxying .well-knowns](./delegation#serving-with-a-reverse-proxy), you can serve them as static responses directly from the reverse proxy. This spares Continuwuity from handling _some_ network requests.

<details>

<summary>Example config with Caddy</summary>

```
### in your Caddyfile ###

https://example.com {

    respond /.well-known/matrix/server 200 {
        body `{"m.server":"matrix.example.com:443"}`
    }

    respond /.well-known/matrix/client 200 {
        body <<JSON
        {
            "m.homeserver": {
                "base_url": "https://matrix.example.com/"
            }
        }
        JSON
    }
}

https://matrix.example.com {
    reverse_proxy 127.0.0.1:6167
}
```

</details>

### Increasing file descriptors

On many Linux systems, the soft limit on open file descriptors defaults to `1024`, which may not be enough for Continuwuity's heavy use of network and disk resources. Consider increasing this number by editing your `limits.conf` file:

```txt title=/etc/security/limits.conf
* soft nofile 32768
* hard nofile 65536
```

You may also need to increase your global file descriptor limit by adding a sysctl parameter like `fs.file-max=1048576` (see the [sysctl tuning section](#sysctl-tunings) for more).

Restart your system, then run `ulimit -Sn` and `ulimit -Hn`. Your soft and hard limits should now be updated.

For Docker, these tweaks correspond to the following `ulimits`:

```yaml title=docker-compose.yml
services:
  homeserver:
    # ...
    ulimits:
      nofile:
        soft: 32768
        hard: 65536
```

### Sysctl tunings

Lastly, consider tuning kernel parameters in your `/etc/sysctl.conf` file (or for systemd distros, `/etc/sysctl.d/99-sysctl.conf`). Refer to external guides such as the [Arch Linux entry on Sysctl][arch-linux-sysctl] and the [sysctl documentation][sysctl-docs] for all possible values.

<details>

<summary>Example sysctl.conf</summary>

This example `/etc/sysctl.conf` is used for a single-user Continuwuity instance, hosted on an 8GB RAM machine with 4 cores. The goal here is to encourage RAM usage and provide adequate buffers for network activity, as the machine runs in a high-latency network environment.

DO NOT copy-paste this directly. Consult it only as a reference example, and apply changes to your system gradually once you understand their effects.

```toml title=/etc/sysctl.conf
# DISK & RAM

## avoid swapping as much as possible (this does not disable swap entirely)
vm.swappiness = 0

## allow more dirty pages to accumulate in the page cache before writing to disk
## can help with reducing disk I/O
vm.dirty_background_ratio=25
vm.dirty_ratio=50

## decrease kernel tendency to reclaim directory/inode caches
vm.vfs_cache_pressure = 50

## increase max file descriptors allowed
fs.file-max=1048576
## increase max kernel threads, can help with performance
kernel.threads-max=100000

# NETWORKING

## increase all network read/write buffers to 8MB
## helps increase backlogs
net.core.rmem_max=8388608
net.core.wmem_max=8388608
net.core.rmem_default=8388608
net.core.wmem_default=8388608

## increase TCP-related memories to match with above, too
net.ipv4.tcp_rmem=4096 131072 8388608
net.ipv4.tcp_wmem=4096 131072 8388608

# enables TCP window scaling
net.ipv4.tcp_window_scaling=1
# enables SYN cookies for protection against SYN floods
net.ipv4.tcp_syncookies=1

# increase range of ports assigned to outbound requests
# can help when there are plenty of outbound connections
net.ipv4.ip_local_port_range = 20000 65535

# enable the modern BBR congestion control algorithm, with fq
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
```

Once you're happy, run `sysctl -p` to apply the changes (or `sysctl --system` if you edited a file under `/etc/sysctl.d/`).

</details>

[arch-linux-sysctl]: https://wiki.archlinux.org/title/Sysctl
[sysctl-docs]: https://www.kernel.org/doc/html/latest/admin-guide/sysctl/