Fixes the symptoms of https://github.com/element-hq/synapse/issues/19315
/ https://github.com/element-hq/synapse/issues/19588 but not the
underlying reason causing the number to grow so large in the first
place.
```
ValueError: Exceeds the limit (4300 digits) for integer string conversion; use sys.set_int_max_str_digits() to increase the limit
```
Copied from the original pull request on [Famedly's Synapse
repo](https://github.com/famedly/synapse/pull/221) (with some edits):
Basing the time interval around a 5 seconds leaves a big window of
waiting especially as this window is doubled each retry, when another
worker could be making progress but can not.
Right now, the retry interval in seconds looks like `[0.2, 5, 10, 20,
40, 80, 160, 320, (continues to double)]` after which logging should
start about excessive times and (relatively quickly) end up with an
extremely large retry interval with an unrealistic expectation past the
heat death of the universe. 1 year in seconds = 31,536,000.
With this change, retry intervals in seconds should look more like:
```
[
0.2,
0.4,
0.8,
1.6,
3.2,
6.4,
12.8,
25.6,
51.2,
60, < never goes higher than this
]
```
Logging about excessive wait times will start at 10 minutes.
<details>
<summary>Previous breakdown when we were using 15 minutes</summary>
```
[
0.2,
0.4,
0.8,
1.6,
3.2,
6.4,
12.8,
25.6,
51.2,
102.4, # 1.7 minutes
204.8, # 3.41 minutes
409.6, # 6.83 minutes
819.2, # 13.65 minutes < logging about excessive times will start here, 13th iteration
900, # 15 minutes < never goes higher than this
]
```
</details>
Further suggested work in this area could be to define the cap, the
retry interval starting point and the multiplier depending on how
frequently this lock should be checked. See data below for reasons why.
Increasing the jitter range may also be a good idea
---------
Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
(cherry picked from commit 3f58bc50df)
Both `__getitem__` and `.user_id` were removed in #19680 to simplify the
event class. However, `EventBase` is exposed to modules who might also
make use of those methods, so let's reinstate them (but otherwise not
reinstate the usage of them in the code).
Follows on from #19473.
We should be recording where we have deleted up to in the same
transaction as we perform the delete, rather than at the end. This code
only starts deleting rows after a month (and the original PR isn't in a
release yet), so no server should have run into this problem yet.
Also let's log more regularly, as the initial set of deletions will
likely take a long time.
Fixes#13043
The usages of the table mostly already correctly handled if we don't
have old entries, as that was needed when we first added the table.
I arbitrarily set the prune time to 30 days. The only use for old
entries is for sync streams that haven't synced since then, and we
should very rarely see sync streams that haven't been used in 30 days.
Reviewable commit-by-commit.
---------
Co-authored-by: Olivier 'reivilibre' <oliverw@element.io>
Co-authored-by: Olivier 'reivilibre' <olivier@librepush.net>
Closes: #19688
Part of: MSC4450 whose Experimental Feature tracking issue is #19691
Add an unstable, namespaced `idp_id` query parameter to `fallback/web` \
This allows clients to specify the identity provider they'd like to log
in with for SSO when they have multiple upstream IdPs associated with
their account.
Previously, Synapse would just pick one arbitrarily. But this was
undesirable as you may want to use a different one at that point in
time. When logging in, the user is able to choose when IdP they use -
during UIA (which uses fallback auth mechanism) they should be able to
do the same.
-----
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Co-authored-by: Andrew Morgan <andrew@amorgan.xyz>
Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
This moves the reference counting from PyO3 into standard Rust types,
allowing the class to be used natively from Rust without needing a
Python runtime.
When we return events to clients we need to annotate them with the
membership of the user at the time of the event, in the `unsigned`
section. We already check the membership at the event during the
visibility checks, and so we annotate events there. However, since this
a per-user field we end up having to clone the event in question.
Instead, let's add a `FilteredEvent` class that is returned by the
visibility checks, which allows returning the membership without editing
the event. This has three benefits:
1. Avoids the clones of the event.
2. Allows us to statically check that we have filtered events before
returning them to clients.
3. We no longer edit `unsigned` data after event deserialization, this
makes it easier to port the event class to Rust.
The last benefit is why we're doing this *now*, however IMV it shouldn't
affect whether we want this change or not.
Reviewable commit-by-commit
---------
Co-authored-by: Olivier 'reivilibre' <oliverw@element.io>
Fixes: #19616
This caused 2+ people trouble now, so worth batting away with a
low-effort change if we can.
Only seen on macOS so far, but nothing stops SQLite being configured in
defensive mode by default on other platforms, so it is not necessarily
entirely specific to macOS.
We *could* also do this for Python < 3.12 but it'd be more effort and I
don't know if it's worth it.
(For context @kegsay says the interpreter with this problem was
installed through `pyenv install`.)
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Follows: #19365
Part of: MSC4354 Sticky Events (experimental feature #19409)
This PR introduces a `spam_checker_spammy` flag, analogous to
`policy_server_spammy`, as an explicit flag
that an event was decided to be spammy by a spam-checker module.
The original Sticky Events PR (#18968) just reused
`policy_server_spammy`, but it didn't sit right with me
because we (at least appear to be experimenting with features that)
allow users to opt-in to seeing
`policy_server_spammy` events (presumably for moderation purposes).
Keeping these flags separate felt best, therefore.
As for why we need this flag: soon soft-failed status won't be
permanent, at least for sticky events.
The spam checker modules currently work by making events soft-failed.
We want to prevent spammy events from getting
reconsidered/un-soft-failed, so it seems like we need
a flag to track spam-checker spamminess *separately* from soft-failed.
Should be commit-by-commit friendly, but is also small.
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
This is to make it easier to port to Rust, as well as making things
conceptually simpler.
Two changes:
1. Remove the `__getitem__` interface on events
2. Remove `.user_id` as an alias of `.sender`.
The spec says `device_keys` may be omitted, but not set to `null`.
This was temporarily allowed as a workaround for misbehaving clients
(see #19023), which have since been fixed.
Fixes#19030
The Rust port of `KNOWN_ROOM_VERSIONS` (#19589) made `__contains__`
strict about key types, raising `TypeError` when called with `None`
instead of returning `False` like a Python dict would.
This broke `/sync` for rooms with a NULL `room_version` in the database.
```
File "/home/synapse/src/synapse/handlers/sync.py", line 2628, in _get_room_changes_for_initial_sync
if event.room_version_id not in KNOWN_ROOM_VERSIONS:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: argument 'key': 'NoneType' object cannot be cast as 'str'
```
# What is done?
- resolves#19403
- Adds build-time Rust compiler detection and captures the rustc
--version value during the build.
- Exposes the captured compiler version from the Rust extension via a
new Python-callable function.
- Exports a new Prometheus metric for rustc version.
# How to test?
- compile `poetry install`
- add `enable_metrics: true` and
```yaml
resources:
- compress: false
names:
- client
- federation
- metrics
```
to homeserver.yaml
- start synapse
- find the rustc version at `http://localhost:8008/_synapse/metrics`
### Pull Request Checklist
<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->
* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
---------
Co-authored-by: Quentin Gliech <quenting@element.io>