Commit Graph

25757 Commits

Author SHA1 Message Date
Erik Johnston c8fbb004de Use jemalloc 2026-05-19 15:34:40 +01:00
Erik Johnston c36c75ef3c Use jemalloc 2026-05-19 13:53:13 +01:00
Erik Johnston e58c2972fb Correctly handle failing to parse event dict from DB
Now that we do a bit more validation of events, it's possible that an
event persisted in the database may now not pass validation. This
shouldn't happen, but let's handle it correctly by logging and returning
that we couldn't find the event.

This is the same as what we do if we can't parse the JSON.
2026-05-18 09:49:42 +01:00
Erik Johnston a5081d102e Newsfile 2026-05-15 16:37:30 +01:00
Erik Johnston 5542419c28 Fix EventProtocol to work with the Rust class
We have to take a slightly different approach here as we can't subclass
the native Event type.
2026-05-15 16:37:30 +01:00
Erik Johnston 1cc69f1b76 Fix tests after porting EventBase to Rust
Adapt tests that mutated Python event internals (`_event_id`, `_dict`,
direct attribute assignment, `FrozenEventV3(...)` construction) to work
with the new Rust-backed `Event` class:

- Rebuild events via `make_event_from_dict` / `make_test_event` instead
  of patching attributes in place.
- Plumb `rejected_reason` through `_join_rules_event` rather than
  assigning to `rejected_reason` after construction.
- Replace the hand-built event in `test_msc4242_state_dag` with a
  `Mock(spec=EventBase)` since the test only needs a handful of
  attributes.
- Add `# type: ignore` for the deprecated `event.user_id` / `event[key]`
  accessors and for assigning to `event.content`.
- In `make_test_event`, drop the default `room_id` for v11+ create
  events so each gets a distinct hash-derived room ID.
2026-05-15 16:37:30 +01:00
Erik Johnston 8e64822538 Port EventBase hierarchy to the Rust Event class
Replace the abstract `synapse.events.EventBase` and the concrete
`FrozenEvent`, `FrozenEventV2`, `FrozenEventV3`, `FrozenEventV4`, and
`FrozenEventVMSC4242` Python classes with a single Rust-backed
`Event`, exposed via `synapse.synapse_rust.events.Event`. `EventBase`
becomes a `TypeAlias` for `Event` so that the existing type annotations
across the codebase keep working.

Behavioural notes:

- `make_event_from_dict()` now constructs the Rust class. Event IDs for
  v3+ formats are computed in the constructor (instead of lazily on
  first access).
- `clone_event()` is now a single `event.deep_copy()` call. The old
  shallow copy of `unsigned` was effectively a deep copy in practice;
  `deep_copy()` matches that.
- The third-party event-rules callback no longer needs to call
  `event.freeze()` — Events are immutable from Python by construction.
- A small `assert_never` is added in `events_worker.py` to make the
  `redact_behaviour` switch exhaustive now that the type checker can
  see all branches.

All test fixtures that constructed `FrozenEventV3` etc. directly are
updated to construct `Event` instead.
2026-05-15 16:36:57 +01:00
Erik Johnston 6135aaca11 Add Event pyclass to Rust
Adds a single `Event` Rust pyclass that replaces the Python EventBase /
FrozenEventV{1,2,3,4,VMSC4242} hierarchy. The class is added but not yet
wired into Python — callers continue to use the existing Python classes
in this commit; the migration follows in the next commit.

The internals use a `FormattedEvent` over
`EventFormatV{1,2V3,4,VMSC4242}` structs sharing an `EventCommonFields`.
Format-specific behaviour (prev_event_ids, auth_event_ids, room_id
derivation for v12 create events, etc) is encapsulated per variant.
Event IDs are computed in the constructor for v3+ formats; v1/v2 use the
`event_id` field as-is.

Two supporting Rust modules are added at the same time:

- `events::constants` — string constants for event types, top-level
  fields, and per-event-type content fields, used to keep the redaction
  rules and field accessors readable.
- `events::utils` — `redact()`, `compute_event_reference_hash()`, and
  `calculate_event_id()`, ported from `synapse.crypto.event_signing` /
  `synapse.events.utils`.
2026-05-15 15:59:55 +01:00
Erik Johnston e028519772 Add helpers and visibility for the upcoming Event port
Small prerequisites for porting the Python EventBase hierarchy to Rust:

- duration: make `from_milliseconds` const and add an `IntoPyObject` impl
  for owned `SynapseDuration`, so the new Rust `Event.sticky_duration()`
  can return one directly to Python.
- internal_metadata: rename `copy()` to `deep_copy()` (matching the new
  naming used by the rest of the events module) and make `new()` callable
  from sibling modules.
- json_object: expose `object` as a `pub` field and add a `get_field`
  helper so the new Event class can read from it without going through
  Python.
- signatures, unsigned: add `deep_copy()` methods so the new Event class
  can implement its own deep-copy.
2026-05-15 15:59:03 +01:00
Olivier 'reivilibre 3f0f03d536 Revert "Send a SSS response immediately if the config has changed and there are new results to sync (#19714)" (#19784)
Reverts: #19714

Opens: #19783

Closes: https://github.com/element-hq/backend-internal/issues/242

Related: #18880 (the performance problem that is aggravated by #19714)

This reverts commit 2691d0b8b1.

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2026-05-15 10:36:47 +00:00
dependabot[bot] 9ce68a6a4a Bump gitpython from 3.1.47 to 3.1.50 (#19767)
Signed-off-by: dependabot[bot] <support@github.com>
2026-05-15 10:29:39 +00:00
dependabot[bot] 5c8419eed7 Bump authlib from 1.6.11 to 1.6.12 (#19776)
Signed-off-by: dependabot[bot] <support@github.com>
2026-05-15 10:26:51 +00:00
dependabot[bot] cf64199ea0 Bump urllib3 from 2.6.3 to 2.7.0 (#19771)
Signed-off-by: dependabot[bot] <support@github.com>
2026-05-15 10:24:49 +00:00
Erik Johnston ff55aff5b2 Fix up event-construction in tests ahead of the Rust event port (#19781)
When we port the `Event` class to Rust, the constructor will check for
the existence of required fields. To support that, we tidy up the test
code where we construct fake events to add all the required fields.

There should be no behavioural changes.

Review commit-by-commit.
2026-05-15 10:12:42 +01:00
Eric Eastwood b233892a13 Update wait_for_stream_token(...) patterns and fix sync fetching with unbounded token (#19644)
Spawning from trying to find the proper way to wait for a token, see
https://github.com/element-hq/synapse/pull/19558#discussion_r2977673208

- Update `wait_for_stream_token(...)` patterns so
validation/sanitization is handled upstream in usage.
- Fix sync waiting for a bounded token but then using an unbounded token
to fetch data. Noticed while working on adding the new method.

Part of https://github.com/element-hq/synapse/issues/19647
2026-05-14 14:53:16 -05:00
Erik Johnston 86a1e73ef4 Consolidate MSC4242 state DAG checks via a TypeIs helper (#19774)
The reason for the change is to make it easier to support these checks
when porting event class to Rust.

Previously, code that needed to access `prev_state_events` had to
combine a `room_version.msc4242_state_dags` boolean check with an
`isinstance(event, FrozenEventVMSC4242)` cast (or `cast()`) for the type
checker. Introduce `supports_msc4242_state_dag()` in a new
`synapse/events/py_protocol.py` which does both in one step via
`TypeIs[MSC4242Event]`, removing the need to import the concrete
`FrozenEventVMSC4242` class at every call site.

`MSC4242Event` is an `EventBase` subclass used purely for type narrowing
— it's marked with a metaclass that rejects `isinstance()` to make
accidental runtime use loud.
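
The pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the PR: `RoomVersion`, `EventBase` and `ConcreteStateDagEvent` are hypothetical stand-ins, and only the names `supports_msc4242_state_dag`, `MSC4242Event`, `msc4242_state_dags` and `prev_state_events` come from the commit message. `TypeIs` is imported for the type checker only (assumed to come from `typing_extensions`).

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only the type checker needs this; annotations are not evaluated at runtime.
    from typing_extensions import TypeIs


@dataclass(frozen=True)
class RoomVersion:  # hypothetical stand-in
    identifier: str
    msc4242_state_dags: bool = False


@dataclass(frozen=True)
class EventBase:  # hypothetical stand-in
    room_version: RoomVersion


@dataclass(frozen=True)
class ConcreteStateDagEvent(EventBase):  # hypothetical concrete event class
    prev_state_events: tuple[str, ...] = ()


class _RejectIsInstance(type):
    """Metaclass that makes accidental isinstance() checks fail loudly."""

    def __instancecheck__(cls, instance: object) -> bool:
        raise TypeError(f"{cls.__name__} is for type narrowing only")


class MSC4242Event(EventBase, metaclass=_RejectIsInstance):
    """Never instantiated; exists so the type checker knows the attribute."""

    prev_state_events: tuple[str, ...]


def supports_msc4242_state_dag(event: EventBase) -> TypeIs[MSC4242Event]:
    # One runtime check on the room version flag, which also narrows the type.
    return event.room_version.msc4242_state_dags


def prev_state_event_ids(event: EventBase) -> tuple[str, ...]:
    if supports_msc4242_state_dag(event):
        # Inside this branch the checker treats `event` as MSC4242Event.
        return tuple(event.prev_state_events)
    return ()
```

Call sites then need only the helper, and never import the concrete event class.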

No behavioural change: callers continue to gate on the same room version
flag and access the same `prev_state_events` attribute.
2026-05-14 18:02:33 +00:00
Erik Johnston ace8447037 Tidy up Rust RoomVersion structs (#19766)
This is in prep for using the room versions more from Rust.

Main changes:
- Define each room version as a delta to the previous one. This is a
cosmetic change that makes it easier to check that the room version
definitions are correct.
- Move constants to `RoomVersion` constants, like `RoomVersion::V1`, for
convenience.
- Change visibility of various attributes.
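
The "delta to the last one" idea, sketched in Python for illustration only. The actual change is to Rust structs; `RoomVersionSketch`, its field names and the values below are hypothetical and are not Synapse's definitions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RoomVersionSketch:  # hypothetical, not Synapse's RoomVersion
    identifier: str
    event_format: int
    strict_canonical_json: bool = False


# The first version spells out every field...
V1 = RoomVersionSketch(identifier="1", event_format=1)

# ...and each later version lists only what differs from its predecessor.
V2 = replace(V1, identifier="2")
V3 = replace(V2, identifier="3", event_format=2)
```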
2026-05-14 11:21:00 +01:00
Erik Johnston b90a0e9fe9 Use StrCollection for prev_state_events. (#19777)
Convert `prev_state_events` to use `StrCollection` rather than requiring
it to be a mutable list. None of the usages require it to be a proper
list, and besides, events are immutable and therefore so should
`event.prev_state_events`.
2026-05-14 10:29:16 +01:00
Eric Eastwood ff0420a03c Improve `We can't get valid state history.` logging (#19765)
Add `event_id` so you can actually correlate everything together in the logs.
2026-05-13 12:31:38 -05:00
Andrew Morgan 1409dbc229 Merge remote-tracking branch 'origin/release-v1.152' into develop 2026-05-13 17:27:06 +02:00
Denis Kasak 16c17f3a42 Add CVE IDs to changelog for 1.152.1. (#19778)
Since this is just a change log update, I've removed the entire
checklist. Please tell me if this is incorrect.
2026-05-13 15:26:16 +00:00
Olivier 'reivilibre 1b0622fa99 Merge branch 'release-v1.153' into develop 2026-05-13 13:10:18 +01:00
Olivier 'reivilibre f109c25960 1.153.0rc2 v1.153.0rc2 2026-05-13 12:01:11 +01:00
Erik Johnston 5efeac44b2 Handle arbitrary sized integers in unsigned. (#19769)
Handle arbitrary sized integers in `unsigned` (and other Rust objects
that use `serde_json::Value`)
2026-05-13 11:28:06 +01:00
Eric Eastwood b8bd35105f Update WorkerLock tests to better stress the WORKER_LOCK_MAX_RETRY_INTERVAL (#19772)
There is no behavioral change, only a change to the tests. See
https://github.com/element-hq/synapse/pull/19772#discussion_r3222059105
for an explanation of why the tests needed changing (and diff comments).

Follow-up to https://github.com/element-hq/synapse/pull/19394. The test
discussion originally happened in
https://github.com/element-hq/synapse/pull/19394#discussion_r2789673181

This is spawning from thinking about the problem again.
2026-05-12 10:10:09 -05:00
Will Hunt 5c87faf9e9 MSC4452: Preview URL capability (#19715)
Implementation of
https://github.com/matrix-org/matrix-spec-proposals/pull/4452
2026-05-11 12:39:38 +01:00
Olivier 'reivilibre b2d196f3ed Merge branch 'release-v1.153' into develop 2026-05-08 16:20:19 +01:00
Erik Johnston c430c16df4 Port event content to Rust (#19725)
Based on #19708.

This is on the path to porting the entire event class to Rust, as
`event.content` will then return the new Rust class `JsonObject`.

This PR adds a pure Rust `JsonObject` class that is a `Mapping`
representing a json-style object. It uses `serde_json::Value` as its
in-memory representation and `pythonize` for conversion when a field is
looked up on the object.

I'm not thrilled with the name, but couldn't think of a better one.

This also adds `JsonObject` handling to the JSON serialisation functions
we use, as well as to the `freeze(..)` function.

Reviewable commit-by-commit.
2026-05-08 14:19:03 +01:00
Olivier 'reivilibre eb2ae9d3da Tweak changelog v1.153.0rc1 2026-05-08 14:03:41 +01:00
Olivier 'reivilibre 0e508ba80f 1.153.0rc1 2026-05-08 13:22:15 +01:00
Eric Eastwood 8dbbc4000b Commit stray Rust change that keeps popping up (rust/src/canonical_json.rs) (#19763)
(introduced in https://github.com/element-hq/synapse/pull/19739)

Seems like some automatic change from `poetry run ./scripts-dev/lint.sh`
2026-05-08 06:20:25 -05:00
Eric Eastwood 4911296fb5 Force keyword-only args for Duration (prevent footgun) (#19756)
So people have to specify which time unit they want to use.
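
A minimal sketch of the keyword-only idea, assuming a simplified class. `SketchDuration` and its parameters are hypothetical and are not Synapse's `Duration` API.

```python
class SketchDuration:  # hypothetical, for illustration only
    def __init__(
        self,
        *,  # everything after this marker must be passed by keyword
        milliseconds: float = 0,
        seconds: float = 0,
        minutes: float = 0,
    ) -> None:
        self._ms = milliseconds + seconds * 1000 + minutes * 60_000

    def as_millis(self) -> int:
        return int(self._ms)


five_seconds = SketchDuration(seconds=5)  # the unit is explicit at the call site
# SketchDuration(5) raises TypeError, since the unit would be ambiguous.
```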

Spawning from
https://github.com/element-hq/synapse/pull/19394#discussion_r3188418426
2026-05-07 10:38:56 -05:00
Eric Eastwood 2829a146d3 Reduce WORKER_LOCK_MAX_RETRY_INTERVAL to 5 seconds (#19755)
Better to retry more quickly than have workers wait around. 5 seconds is
still a reasonable gap in time to not overwhelm anything.

This matters most in cross-worker scenarios. When locks are on the same
worker and the lock holder releases, we signal to other locks (with
the same name/key) that they should try reacquiring the lock
immediately. But locks on other workers only re-check based on their
retry `_timeout_interval`.

Updating to 5 seconds to match the previous intentions based on the
[flawed
code](https://github.com/element-hq/synapse/blob/6100f6e4f7fb0c72f1ae2802683ebc811c0e3a77/synapse/handlers/worker_lock.py#L278).
We can assume they were trying to have 5 seconds as the max value to
retry.

Spawning from
https://github.com/element-hq/synapse/pull/19394#discussion_r3168458070
2026-05-07 10:36:25 -05:00
Olivier 'reivilibre 92b985cae3 Merge branch 'master' into develop 2026-05-07 15:29:06 +01:00
Olivier 'reivilibre d97b5b9e21 1.152.1 v1.152.1 2026-05-07 13:49:49 +01:00
Olivier 'reivilibre 2d48851438 Prevent pagination ending when a page is full of rejected events (ELEMENTSEC-2025-1636)
Fixes: https://github.com/element-hq/synapse/security/advisories/GHSA-6qf2-7x63-mm6v

Reviewed-on: https://github.com/element-hq/synapse-private/pull/117
2026-05-07 13:26:43 +01:00
Jason Little 0eefdbcb95 fix: Cap WorkerLock timeout intervals to 60 seconds (#19394)
Fixes the symptoms of https://github.com/element-hq/synapse/issues/19315
/ https://github.com/element-hq/synapse/issues/19588 but not the
underlying reason causing the number to grow so large in the first
place.

```
ValueError: Exceeds the limit (4300 digits) for integer string conversion; use sys.set_int_max_str_digits() to increase the limit
```

Copied from the original pull request on [Famedly's Synapse
repo](https://github.com/famedly/synapse/pull/221) (with some edits):

Basing the time interval around 5 seconds leaves a big window of
waiting, especially as this window is doubled on each retry, when another
worker could be making progress but cannot.

Right now, the retry interval in seconds looks like `[0.2, 5, 10, 20,
40, 80, 160, 320, (continues to double)]` after which logging should
start about excessive times and (relatively quickly) end up with an
extremely large retry interval with an unrealistic expectation past the
heat death of the universe. 1 year in seconds = 31,536,000.

With this change, retry intervals in seconds should look more like:

```
[
0.2,
0.4,
0.8,
1.6,
3.2,
6.4,
12.8,
25.6,
51.2,
60, < never goes higher than this
]
```
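
A sequence like the one above can be produced by a capped doubling schedule. The sketch below is an assumption-laden illustration, not the `WorkerLock` code: `retry_intervals` and its parameters are hypothetical names, and jitter is left out.

```python
import itertools
from typing import Iterator


def retry_intervals(
    start: float = 0.2, multiplier: float = 2.0, cap: float = 60.0
) -> Iterator[float]:
    """Yield retry intervals in seconds, growing geometrically up to `cap`."""
    interval = start
    while True:
        yield min(interval, cap)
        # Clamp the stored value too, so it can never grow without bound.
        interval = min(interval * multiplier, cap)


first_ten = list(itertools.islice(retry_intervals(), 10))
```

Clamping the stored interval, and not only the yielded one, is what keeps the number from growing arbitrarily large over many retries.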

Logging about excessive wait times will start at 10 minutes.

<details>
<summary>Previous breakdown when we were using 15 minutes</summary>

```
[
0.2,
0.4,
0.8,
1.6,
3.2,
6.4,
12.8,
25.6,
51.2,
102.4,  # 1.7 minutes
204.8,  # 3.41 minutes
409.6,  # 6.83 minutes
819.2,  # 13.65 minutes  < logging about excessive times will start here, 13th iteration
900,  # 15 minutes < never goes higher than this
]
```
</details>

Further suggested work in this area could be to define the cap, the
retry interval starting point and the multiplier depending on how
frequently this lock should be checked. See data below for reasons why.
Increasing the jitter range may also be a good idea.

---------

Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
(cherry picked from commit 3f58bc50df)
2026-05-07 13:25:04 +01:00
Erik Johnston 23b8fcf85e Port Event.unsigned field to Rust (#19708)
Similar to #19706, let's port the `unsigned` field into a Rust class.

This does change things a bit in that we now define exactly which
unsigned fields are allowed to be added to an event, and what
actually gets persisted. This should be a no-op though, as we carefully
filter which unsigned fields we allow in from federation, for example.

As a side effect of this cleanup, I think this fixes handling
`unsigned.age` on events received over federation.
2026-05-06 18:51:42 +01:00
Erik Johnston 3e6bf10640 Port Event.signatures field to Rust (#19706)
This is another stepping stone in porting the event class fully to Rust.

The new `Signatures` class is relatively simple, as we actually don't
interact with it that much in the code. It does *not* implement
`Mapping` or `MutableMapping` as that takes quite a lot of effort that
we don't need, even though it would be more ergonomic.
2026-05-06 11:38:15 +01:00
Jason Little 3f58bc50df fix: Cap WorkerLock timeout intervals to 60 seconds (#19394)
Fixes the symptoms of https://github.com/element-hq/synapse/issues/19315
/ https://github.com/element-hq/synapse/issues/19588 but not the
underlying reason causing the number to grow so large in the first
place.

```
ValueError: Exceeds the limit (4300 digits) for integer string conversion; use sys.set_int_max_str_digits() to increase the limit
```

Copied from the original pull request on [Famedly's Synapse
repo](https://github.com/famedly/synapse/pull/221) (with some edits):

Basing the time interval around 5 seconds leaves a big window of
waiting, especially as this window is doubled on each retry, when another
worker could be making progress but cannot.

Right now, the retry interval in seconds looks like `[0.2, 5, 10, 20,
40, 80, 160, 320, (continues to double)]` after which logging should
start about excessive times and (relatively quickly) end up with an
extremely large retry interval with an unrealistic expectation past the
heat death of the universe. 1 year in seconds = 31,536,000.

With this change, retry intervals in seconds should look more like:

```
[
0.2, 
0.4, 
0.8, 
1.6, 
3.2, 
6.4, 
12.8, 
25.6, 
51.2, 
60, < never goes higher than this
]
```

Logging about excessive wait times will start at 10 minutes.

<details>
<summary>Previous breakdown when we were using 15 minutes</summary>

```
[
0.2, 
0.4, 
0.8, 
1.6, 
3.2, 
6.4, 
12.8, 
25.6, 
51.2, 
102.4,  # 1.7 minutes
204.8,  # 3.41 minutes
409.6,  # 6.83 minutes
819.2,  # 13.65 minutes  < logging about excessive times will start here, 13th iteration
900,  # 15 minutes < never goes higher than this
]
```
</details>

Further suggested work in this area could be to define the cap, the
retry interval starting point and the multiplier depending on how
frequently this lock should be checked. See data below for reasons why.
Increasing the jitter range may also be a good idea.

---------

Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
2026-05-05 14:40:17 +01:00
Eric Eastwood 6100f6e4f7 Backfill from nearby points past pagination token (#19611)
The juicy details and explanation are in the diff itself.

Split out from https://github.com/element-hq/synapse/pull/18873 in order
to fix paginating from
[MSC3871](https://github.com/matrix-org/matrix-spec-proposals/pull/3871)
gap tokens actually backfilling history. To be clear, this is a good
change to make outside of the
[MSC3871](https://github.com/matrix-org/matrix-spec-proposals/pull/3871)
use case. For example (as the new Complement test shows), this fixes a
problem where if you try to paginate `/messages` from tokens returned by
`/context`, we could fail to backfill anything new and hide away
history.

Also fixes https://github.com/matrix-org/complement/pull/853
2026-05-01 11:42:00 -05:00
dependabot[bot] 697ef33dcb Bump gitpython from 3.1.46 to 3.1.47 (#19731)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-01 09:38:32 +00:00
dependabot[bot] b8d7324373 Bump the minor-and-patches group across 1 directory with 3 updates (#19736)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-01 09:22:27 +00:00
Noah Markert 2e7019ebc8 Expose tombstone status in room details (#19737)
Exposes `tombstoned` and `replacement_room` in room details on admin API
endpoint `GET /_synapse/admin/v1/rooms/<room_id>`. Resolves #18347
2026-04-30 13:37:40 +01:00
dependabot[bot] 8fc23aa665 Bump pillow from 12.1.1 to 12.2.0 (#19686) 2026-04-29 20:16:11 +01:00
Olivier 'reivilibre c376cdd2ee Configure Dependabot to only update Python dependencies in the lockfile. (#19743)
See:
- https://github.com/element-hq/synapse/pull/19742
- https://github.com/element-hq/synapse/pull/19686

(etc)

Documentation
https://docs.github.com/en/code-security/reference/supply-chain-security/dependabot-options-reference#versioning-strategy--

We were considering `lockfile-only` but it sounds like
`increase-if-necessary` would increase the upper bound for us, if we had
one. Let's try it.

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2026-04-29 18:17:53 +01:00
Oleg Girko ed3cafdb73 Partially revert "Bump authlib from 1.6.9 to 1.6.11 (#19703)" (#19742)
The original commit should only have changed the lockfile.

This reverts commit bdb1cf7416 (from
https://github.com/element-hq/synapse/pull/19703).

---------

Co-authored-by: Olivier 'reivilibre <oliverw@matrix.org>
2026-04-29 18:03:58 +01:00
Erik Johnston 76b4fdceed Add a canonical JSON impl (#19739)
This comes from
https://github.com/erikjohnston/rust-signed-json/blob/main/src/json.rs.
We need to be able to serialise canonical JSON in Rust to be able to
calculate event IDs once we port the event class to Rust.
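
For readers unfamiliar with the encoding, here is a rough Python sketch of the core canonical JSON rules (sorted keys, no insignificant whitespace, UTF-8 output). It is not the Rust implementation added in this commit; `encode_canonical_json_sketch` is a hypothetical name, and the sketch does not enforce every restriction on numbers.

```python
import json


def encode_canonical_json_sketch(value: object) -> bytes:
    """Serialise `value` with sorted keys, compact separators, and UTF-8."""
    return json.dumps(
        value,
        sort_keys=True,  # keys in a deterministic order
        separators=(",", ":"),  # no whitespace between tokens
        ensure_ascii=False,  # emit non-ASCII characters directly
        allow_nan=False,  # NaN and Infinity are not valid JSON
    ).encode("utf-8")


encoded = encode_canonical_json_sketch({"b": 1, "a": {"d": "é", "c": []}})
```

Because the same logical object always yields the same bytes, a hash taken over the output is stable, which is what makes it usable as input to event ID calculation.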

We could instead make the above a properly published crate, but it feels
easier to pull it into Synapse utils.
2026-04-28 17:46:03 +01:00
Olivier 'reivilibre 5e7cbfe4ae Merge branch 'master' into develop 2026-04-28 17:16:24 +01:00
Olivier 'reivilibre 16863c87d5 Changelog tweaks v1.152.0 2026-04-28 13:45:53 +01:00