Based on #19708.
This is on the path to porting the entire event class to Rust, as
`event.content` will then return the new Rust class `JsonObject`.
This PR adds a pure Rust `JsonObject` class that is a `Mapping`
representing a json-style object. It uses `serde_json::Value` as its
in-memory representation and `pythonize` for conversion when a field is
looked up on the object.
I'm not thrilled with the name, but couldn't think of a better one.
This also adds `JsonObject` handling to the JSON serialisation functions
we use, as well as to the `freeze(..)` function.
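For illustration, the behaviour can be sketched in pure Python (the real class is implemented in Rust; the names and conversion details here are assumptions, not the actual implementation):

```python
from collections.abc import Mapping


class JsonObject(Mapping):
    """Read-only mapping over a JSON-style object (pure-Python sketch).

    The real Rust class holds a serde_json::Value internally and only
    converts a field to a Python object when it is looked up.
    """

    def __init__(self, raw: dict):
        self._raw = raw  # stands in for the serde_json::Value

    def __getitem__(self, key):
        # The Rust version converts via pythonize here.
        return self._raw[key]

    def __iter__(self):
        return iter(self._raw)

    def __len__(self):
        return len(self._raw)
```

Because it implements `Mapping`, existing code that reads `event.content` like a dict keeps working unchanged.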
Reviewable commit-by-commit.
Better to retry more quickly than to have workers wait around. 5 seconds is still a reasonable gap that won't overwhelm anything.
This matters most in cross-worker scenarios. When locks are on the same worker, the lock holder's release signals the other locks (with the same name/key) to try reacquiring the lock immediately. But locks on other workers only re-check on their retry `_timeout_interval`.
Updating to 5 seconds to match the previous intent of the [flawed
code](https://github.com/element-hq/synapse/blob/6100f6e4f7fb0c72f1ae2802683ebc811c0e3a77/synapse/handlers/worker_lock.py#L278),
which appears to have aimed for a maximum retry interval of 5 seconds.
Spawning from
https://github.com/element-hq/synapse/pull/19394#discussion_r3168458070
Fixes the symptoms of https://github.com/element-hq/synapse/issues/19315
/ https://github.com/element-hq/synapse/issues/19588 but not the
underlying reason causing the number to grow so large in the first
place.
```
ValueError: Exceeds the limit (4300 digits) for integer string conversion; use sys.set_int_max_str_digits() to increase the limit
```
Copied from the original pull request on [Famedly's Synapse
repo](https://github.com/famedly/synapse/pull/221) (with some edits):
Basing the retry interval around 5 seconds leaves a big window of waiting, especially as the window doubles on each retry, during which another worker could be making progress but cannot.
Right now, the retry intervals in seconds look like `[0.2, 5, 10, 20, 40, 80, 160, 320, (continues to double)]`, after which logging about excessive wait times should start, and the interval (relatively quickly) becomes absurdly large, with an expected wait past the heat death of the universe. For scale, 1 year is 31,536,000 seconds.
With this change, retry intervals in seconds should look more like:
```
[
0.2,
0.4,
0.8,
1.6,
3.2,
6.4,
12.8,
25.6,
51.2,
60, < never goes higher than this
]
```
Logging about excessive wait times will start at 10 minutes.
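The schedule above is plain exponential backoff with a hard cap; a quick sketch to reproduce it (the function name and parameters are illustrative, not Synapse's actual API, and jitter is omitted):

```python
def retry_intervals(start=0.2, multiplier=2.0, cap=60.0, attempts=10):
    """Exponential backoff capped at `cap` seconds (no jitter shown)."""
    intervals = []
    interval = start
    for _ in range(attempts):
        intervals.append(min(interval, cap))
        interval *= multiplier
    return intervals
```

With the defaults this yields exactly the sequence listed above, flattening out at 60 seconds.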
<details>
<summary>Previous breakdown when we were using 15 minutes</summary>
```
[
0.2,
0.4,
0.8,
1.6,
3.2,
6.4,
12.8,
25.6,
51.2,
102.4, # 1.7 minutes
204.8, # 3.41 minutes
409.6, # 6.83 minutes
819.2, # 13.65 minutes < logging about excessive times will start here, 13th iteration
900, # 15 minutes < never goes higher than this
]
```
</details>
Further suggested work in this area could be to tune the cap, the starting retry interval, and the multiplier based on how frequently the lock should be checked. See the data below for reasons why. Increasing the jitter range may also be a good idea.
---------
Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
(cherry picked from commit 3f58bc50df)
Similar to #19706, let's port the `unsigned` field into a Rust class.
This does change things a bit in that we now define exactly which unsigned fields are allowed to be added to an event, and what actually gets persisted. This should be a no-op though, as we already carefully filter the unsigned fields we allow in from federation, for example.
As a side effect of this cleanup, I think this fixes handling of `unsigned.age` on events received over federation.
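As a rough sketch of the kind of filtering involved (the allow-list contents and function name here are hypothetical, not the actual Rust definition):

```python
# Hypothetical allow-list; the real set is defined by the Rust class.
ALLOWED_FEDERATION_UNSIGNED_FIELDS = {"age", "invite_room_state"}


def filter_federation_unsigned(unsigned: dict) -> dict:
    """Drop any unsigned fields we don't accept from federation."""
    return {
        key: value
        for key, value in unsigned.items()
        if key in ALLOWED_FEDERATION_UNSIGNED_FIELDS
    }
```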
This is another stepping stone in porting the event class fully to Rust.
The new `Signatures` class is relatively simple, as we actually don't
interact with it that much in the code. It does *not* implement
`Mapping` or `MutableMapping` as that takes quite a lot of effort that
we don't need, even though it would be more ergonomic.
This comes from
https://github.com/erikjohnston/rust-signed-json/blob/main/src/json.rs.
We need to be able to serialise canonical JSON in Rust to be able to
calculate event IDs once we port the event class to Rust.
We could instead publish the above as a proper crate, but it feels easier to pull it into Synapse's utils.
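In Python terms, Matrix canonical JSON boils down to sorted keys, minimal separators, and UTF-8 output without ASCII escaping; a minimal sketch:

```python
import json


def canonical_json(value) -> bytes:
    """Encode `value` as Matrix canonical JSON: keys sorted,
    no insignificant whitespace, UTF-8 without ASCII escaping."""
    return json.dumps(
        value,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
```

The Rust port needs an equivalent of this so event IDs (hashes over the canonical encoding) come out byte-identical.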
When upgrading a room to v12, we accidentally ended up mutating the content of the old power levels event. Since we cache events, this meant any future usage of that event would see the wrong content (until it dropped from the cache).
This meant that the creator of the new room would not be able to perform
admin actions on the old room. Any federation requests for the event
would fail the hash checks, since the content had been changed.
All in all, quite a nasty bug.
MSC3266 was merged in Matrix v1.15; let's stabilize it as part of #18731:
1. Add support for the stable `/_matrix/client/v1/room_summary/`
endpoint, keeping both unstable endpoints for compat
2. Remove the experimental `msc3266_enabled` flag
This fixes the bug described in #19713 (and double-checked against the
SDK integration test, which now passes with this change). A sync
response must be returned immediately if a room subscription
configuration change caused a new non-empty response (checked with `if
response` in the code) to be produced.
Fixes #19713.
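The intended control flow can be sketched as follows (hypothetical helper names, not Synapse's actual functions):

```python
def wait_for_sync(build_response, wait_for_changes, timed_out):
    """Return immediately once any non-empty response is produced,
    including one caused only by a room-subscription change."""
    while True:
        response = build_response()
        if response:        # the `if response` check from the fix
            return response
        if timed_out():
            return response  # empty response after the timeout
        wait_for_changes()
```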
Fixes #18844.
---------
Co-authored-by: Erik Johnston <erik@matrix.org>
Currently, Synapse returns `M_FORBIDDEN` when trying to use the account deactivation API if the server admin has disabled displayname changes. This is undesirable, since it prevents GDPR erasure without admin interaction. The admin API seems to work fine, though. This also only seems to affect the deactivate API when the erase flag is true.
Relevant endpoint:
https://spec.matrix.org/latest/client-server-api/#post_matrixclientv3accountdeactivate
This change only removes the check that the configuration allows displayname and avatar changes. If a user is deleting themselves, why should that be denied?
There did not seem to be a basic test for this endpoint that exercises `erase`, so one was added, along with a check for the behaviour described above.
Both `__getitem__` and `.user_id` were removed in #19680 to simplify the event class. However, `EventBase` is exposed to modules, which might also make use of those methods, so let's reinstate them (but not reinstate their usage elsewhere in the code).
Follows on from #19473.
We should be recording where we have deleted up to in the same
transaction as we perform the delete, rather than at the end. This code
only starts deleting rows after a month (and the original PR isn't in a
release yet), so no server should have run into this problem yet.
Also let's log more regularly, as the initial set of deletions will
likely take a long time.
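The same-transaction pattern can be sketched with sqlite3 (the table and column names are made up for illustration):

```python
import sqlite3


def prune_batch(conn: sqlite3.Connection, cutoff: int, batch: int = 1000) -> int:
    """Delete up to `batch` rows and record progress atomically."""
    with conn:  # one transaction for the delete AND the bookkeeping
        row = conn.execute(
            "SELECT max(id) FROM ("
            "  SELECT id FROM entries WHERE id <= ? ORDER BY id LIMIT ?)",
            (cutoff, batch),
        ).fetchone()
        if row[0] is None:
            return 0  # nothing left to delete
        deleted = conn.execute(
            "DELETE FROM entries WHERE id <= ?", (row[0],)
        ).rowcount
        # Written in the same transaction, so a crash can never leave
        # the progress marker ahead of the actual deletions.
        conn.execute("UPDATE progress SET deleted_up_to = ?", (row[0],))
        return deleted
```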
Fixes #13043
The usages of the table mostly already handle missing old entries correctly, as that was needed when we first added the table.
I arbitrarily set the prune time to 30 days. The only use for old
entries is for sync streams that haven't synced since then, and we
should very rarely see sync streams that haven't been used in 30 days.
Reviewable commit-by-commit.
---------
Co-authored-by: Olivier 'reivilibre' <oliverw@element.io>
Co-authored-by: Olivier 'reivilibre' <olivier@librepush.net>