990 Commits

Author SHA1 Message Date
Erik Johnston ff55aff5b2 Fix up event-construction in tests ahead of the Rust event port (#19781)
When we port the `Event` class to Rust, the constructor will check for
the existence of required fields. To support that, we tidy up the test
code where we construct fake events to add all the required fields.

There should be no behavioural changes.

Review commit-by-commit.
2026-05-15 10:12:42 +01:00
Eric Eastwood b233892a13 Update wait_for_stream_token(...) patterns and fix sync fetching with unbounded token (#19644)
Spawning from trying to find the proper way to wait for a token, see
https://github.com/element-hq/synapse/pull/19558#discussion_r2977673208

- Update `wait_for_stream_token(...)` patterns so
validation/sanitization is handled upstream in usage.
- Fix sync waiting for bounded token but using unbounded token to fetch
data. Noticed while working on adding the new method.

Part of https://github.com/element-hq/synapse/issues/19647
2026-05-14 14:53:16 -05:00
Eric Eastwood b8bd35105f Update WorkerLock tests to better stress the WORKER_LOCK_MAX_RETRY_INTERVAL (#19772)
There is no behavioral change, only a change to the tests. See
https://github.com/element-hq/synapse/pull/19772#discussion_r3222059105
for an explanation of why the tests needed changing (and diff comments).

Follow-up to https://github.com/element-hq/synapse/pull/19394. The test
discussion originally happened in
https://github.com/element-hq/synapse/pull/19394#discussion_r2789673181

This is spawning from thinking about the problem again.
2026-05-12 10:10:09 -05:00
Erik Johnston 3e6bf10640 Port Event.signatures field to Rust (#19706)
This is another stepping stone in porting the event class fully to Rust.

The new `Signatures` class is relatively simple, as we actually don't
interact with it that much in the code. It does *not* implement
`Mapping` or `MutableMapping` as that takes quite a lot of effort that
we don't need, even though it would be more ergonomic.
2026-05-06 11:38:15 +01:00
Jason Little 3f58bc50df fix: Cap WorkerLock timeout intervals to 60 seconds (#19394)
Fixes the symptoms of https://github.com/element-hq/synapse/issues/19315
/ https://github.com/element-hq/synapse/issues/19588 but not the
underlying reason causing the number to grow so large in the first
place.

```
ValueError: Exceeds the limit (4300 digits) for integer string conversion; use sys.set_int_max_str_digits() to increase the limit
```

Copied from the original pull request on [Famedly's Synapse
repo](https://github.com/famedly/synapse/pull/221) (with some edits):

Basing the time interval around a 5 seconds leaves a big window of
waiting especially as this window is doubled each retry, when another
worker could be making progress but can not.

Right now, the retry interval in seconds looks like `[0.2, 5, 10, 20,
40, 80, 160, 320, (continues to double)]` after which logging should
start about excessive times and (relatively quickly) end up with an
extremely large retry interval with an unrealistic expectation past the
heat death of the universe. 1 year in seconds = 31,536,000.

With this change, retry intervals in seconds should look more like:

```
[
0.2, 
0.4, 
0.8, 
1.6, 
3.2, 
6.4, 
12.8, 
25.6, 
51.2, 
60, < never goes higher than this
]
```

Logging about excessive wait times will start at 10 minutes.

<details>
<summary>Previous breakdown when we were using 15 minutes</summary>

```
[
0.2, 
0.4, 
0.8, 
1.6, 
3.2, 
6.4, 
12.8, 
25.6, 
51.2, 
102.4,  # 1.7 minutes
204.8,  # 3.41 minutes
409.6,  # 6.83 minutes
819.2,  # 13.65 minutes  < logging about excessive times will start here, 13th iteration
900,  # 15 minutes < never goes higher than this
]
```
</details>

Further suggested work in this area could be to define the cap, the
retry interval starting point and the multiplier depending on how
frequently this lock should be checked. See data below for reasons why.
Increasing the jitter range may also be a good idea

---------

Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
2026-05-05 14:40:17 +01:00
Eric Eastwood 6100f6e4f7 Backfill from nearby points past pagination token (#19611)
The juicy details and explanation are in the diff itself.

Split out from https://github.com/element-hq/synapse/pull/18873 in order
to fix paginating from
[MSC3871](https://github.com/matrix-org/matrix-spec-proposals/pull/3871)
gap tokens actually backfilling history. To be clear, this is a good
change to make outside of the
[MSC3871](https://github.com/matrix-org/matrix-spec-proposals/pull/3871)
use case. For example (as the new Complement test shows), fixes a
problem where if you try to paginate `/messages` from tokens returned by
`/context`, we could fail to backfill anything new and hide away
history.

Also fixes https://github.com/matrix-org/complement/pull/853
2026-05-01 11:42:00 -05:00
Jason Little 93e0497fc3 Avoid a M_FORBIDDEN response when a user tries to erase their account and profile updates are disabled (#19398)
Currently synapse returns `M_FORBIDDEN` when trying to use the account
deactivation API, if the server admin disabled displayname changes. This
is undesirable, since it prevents GDPR erasure without admin
interaction. The admin API seems to work fine though. This also only
seems to affect the deactivate API, when the erase flag is true.

Relevant endpoint:
https://spec.matrix.org/latest/client-server-api/#post_matrixclientv3accountdeactivate

This change only removes the checked for condition that the displayname
and profile avatar are allowed to be changed per the configuration
setting. If a user is deleting themselves, why is that denied?

There did not seem to be a basic test for this endpoint that checks the
`erase` usage, so that was added as well as checking the above mentioned
behavior.
2026-04-23 17:04:48 +01:00
Erik Johnston 2a8285931e Prune old rows in device_lists_changes_in_room table. (#19473)
Fixes #13043

The usages of the table mostly already correctly handled if we don't
have old entries, as that was needed when we first added the table.

I arbitrarily set the prune time to 30 days. The only use for old
entries is for sync streams that haven't synced since then, and we
should very rarely see sync streams that haven't been used in 30 days.

Reviewable commit-by-commit.

---------

Co-authored-by: Olivier 'reivilibre' <oliverw@element.io>
Co-authored-by: Olivier 'reivilibre' <olivier@librepush.net>
2026-04-17 11:54:22 +01:00
Kegan Dougal 15c03b9689 MSC4242: State DAGs (CSAPI) (#19424)
This implements [MSC4242: State
DAGs](https://github.com/matrix-org/matrix-spec-proposals/pull/4242),
without support for federation.

A general overview:
 - It adds a new room version and new event type.
 - It adds a new field `calculated_auth_event_ids` to internal metadata.
- It stores the state DAG via new state DAG edges / forward extremities
tables.
 - It adds new auth rules as per the MSC.
- It uses the new `prev_state_events` field instead of
`prev_event_ids()` when doing state resolution.

Complement tests: https://github.com/matrix-org/complement/pull/841

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: Eric Eastwood <erice@element.io>
2026-04-16 15:46:47 +00:00
Erik Johnston 71781de707 Add a FilteredEvent type to handle per-user data on events (#19640)
When we return events to clients we need to annotate them with the
membership of the user at the time of the event, in the `unsigned`
section. We already check the membership at the event during the
visibility checks, and so we annotate events there. However, since this
a per-user field we end up having to clone the event in question.

Instead, let's add a `FilteredEvent` class that is returned by the
visibility checks, which allows returning the membership without editing
the event. This has three benefits:
1. Avoids the clones of the event.
2. Allows us to statically check that we have filtered events before
returning them to clients.
3. We no longer edit `unsigned` data after event deserialization, this
makes it easier to port the event class to Rust.

The last benefit is why we're doing this *now*, however IMV it shouldn't
affect whether we want this change or not.

Reviewable commit-by-commit

---------

Co-authored-by: Olivier 'reivilibre' <oliverw@element.io>
2026-04-16 09:47:08 +01:00
Erik Johnston 8c1ac41cea Small simplifications to the events class (#19680)
This is to make it easier to port to Rust, as well as making things
conceptually simpler.

Two changes:
1. Remove the `__getitem__` interface on events
2. Remove `.user_id` as an alias of `.sender`.
2026-04-13 17:52:13 +01:00
Erik Johnston 0549307198 Revert "Limit outgoing to_device EDU size to 65536" (#19614)
Reverts element-hq/synapse#18416


Unfortunately, this causes failures on `/sendToDevice` endpoint in
normal circumstances. If a single user has, say, a hundred devices then
we easily go over the limit. This blocks message sending entirely in
encrypted rooms.

cc @MadLittleMods @MatMaul
2026-03-27 10:53:16 +00:00
Olivier 'reivilibre 6c7e05fe20 Allow Synapse to start up even when discovery fails for an OpenID Connect provider. (#19509)
Fixes: #8088

Previously we would perform OIDC discovery on startup,
which involves making HTTP requests to the identity provider(s).

If that took a long time, we would block startup.

If that failed, we would crash startup.

This commit:

- makes the loading happen in the background on startup
- makes an error in the 'preload' non-fatal (though it logs at CRITICAL
for visibility)
- adds a templated error page to show on failed redirects (for
unavailable providers), as otherwise you get a JSON response in your
navigator.
- This involves introducing 2 new exception types to mark other
exceptions and keep the error handling fine-grained.

The machinery was already there to load-on-demand the discovery config,
so when the identity provider
comes back up, the discovery is reattempted and login can succeed.

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2026-03-24 17:39:21 +00:00
Mathieu Velten 7fad50fd76 Limit outgoing to_device EDU size to 65536 (#18416)
If a set of messages exceeds this limit, the messages are split
across several EDUs.

Fix #17035 (should)

There is currently [no official specced limit for
EDUs](https://github.com/matrix-org/matrix-spec/issues/807), but the
consensus seems to be that it would be useful to have one to avoid this
bug by bounding the transaction size.

As a side effect it also limits the size of a single to-device message
to a bit less than 65536.

This should probably be added to the spec similarly to the [message size
limit.](https://spec.matrix.org/v1.14/client-server-api/#size-limits)

Spec PR: https://github.com/matrix-org/matrix-spec/pull/2340

---------

Co-authored-by: mcalinghee <mcalinghee.dev@gmail.com>
Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
2026-03-24 11:22:11 -05:00
Travis Ralston 40d699b1d4 Stable support for MSC4284 policy servers (#19503)
Fixes https://github.com/element-hq/synapse/issues/19494

MSC4284 policy servers

This:
* removes the old `/check` (recommendation) support because it's from an
older design. Policy servers should have updated to `/sign` by now. We
also remove optionality around the policy server's public key because it
was only optional to support `/check`.
* supports the stable `m.room.policy` state event and `/sign` endpoints,
falling back to unstable if required. Note the changes between unstable
and stable:
* Stable `/sign` uses errors instead of an empty signatures block to
indicate refusal.
* Stable `m.room.policy` nests the public key in an object with explicit
key algorithm (always ed25519 for now)
* does *not* introduce tests that the above fallback to unstable works.
If it breaks, we're not going to be sad about an early transition. Tests
can be added upon request, though.
* fixes a bug where the policy server was asked to sign policy server
state events (the events were correctly skipped in `is_event_allowed`,
but `ask_policy_server_to_sign_event` didn't do the same).
* fixes a bug where the original event sender's signature can be deleted
if the sending server is the same as the policy server.
* proxies Matrix-shaped errors from the policy server to the
Client-Server API as `SynapseError`s (a new capability of the stable
API).


Membership event handling (from the issue) is expected to be a different
PR due to the size of changes involved (tracked by
https://github.com/element-hq/synapse/issues/19587).



### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: turt2live <1190097+turt2live@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
2026-03-20 19:34:26 +00:00
Mathieu Velten 699a898b30 Backgrounds membership updates when changing the avatar or the display name (#19311) 2026-03-05 14:46:05 +00:00
Richard van der Hoff b9ea2285b3 Add stable support for MSC4380 invite blocking. (#19431)
MSC4380 has now completed FCP, so we can add stable support for it.

Co-authored-by: Quentin Gliech <quenting@element.io>
2026-02-27 14:47:07 +00:00
Will Hunt 8f42f07bef Remove MSC2697 (legacy dehydrated devices) (#19346)
Fixes #19347 

This deprecates MSC2697 which has been closed since May 2024. As per
#19347 this seems to be a thing we can just rip out. The crypto team
have moved onto MSC3814 and are suggesting that developers who rely on
MSC2697 should use MSC3814 instead.

MSC2697 implementation originally introduced by https://github.com/matrix-org/synapse/pull/8380
2026-01-12 10:32:38 -06:00
Andrew Morgan 1500733f4a Replace usage of deprecated assertEquals with assertEqual (#19345) 2026-01-06 17:30:21 +00:00
Erik Johnston dfd00a986f Fix sliding sync performance slow down for long lived connections. (#19206)
Fixes https://github.com/element-hq/synapse/issues/19175

This PR moves tracking of what lazy loaded membership we've sent to each
room out of the required state table. This avoids that table from
continuously growing, which massively helps performance as we pull out
all matching rows for the connection when we receive a request.

The new table is only read when we have data in a room to send, so we
end up reading a lot fewer rows from the DB. Though we now read from
that table for every room we have events to return in, rather than once
at the start of the request.

For an explanation of how the new table works, see the
[comment](https://github.com/element-hq/synapse/blob/erikj/sss_better_membership_storage2/synapse/storage/schema/main/delta/93/02_sliding_sync_members.sql#L15-L38)
on the table schema.

The table is designed so that we can later prune old entries if we wish,
but that is not implemented in this PR.

Reviewable commit-by-commit.

---------

Co-authored-by: Eric Eastwood <erice@element.io>
2025-12-12 10:02:57 +00:00
Erik Johnston 1bddd25a85 Port Clock functions to use Duration class (#19229)
This changes the arguments in clock functions to be `Duration` and
converts call sites and constants into `Duration`. There are still some
more functions around that should be converted (e.g.
`timeout_deferred`), but we leave that to another PR.

We also changes `.as_secs()` to return a float, as the rounding broke
things subtly. The only reason to keep it (its the same as
`timedelta.total_seconds()`) is for symmetry with `as_millis()`.

Follows on from https://github.com/element-hq/synapse/pull/19223
2025-12-01 13:55:06 +00:00
Andrew Morgan 778897a4e9 Add a unit test that ensures that deleting a device purges the associated refresh token (#19230) 2025-11-28 17:01:15 +00:00
Richard van der Hoff c928347779 Implement MSC4380: Invite blocking (#19203)
MSC4380 aims to be a simplified implementation of MSC4155; the hope is
that we can get it specced and rolled out rapidly, so that we can
resolve the fact that `matrix.org` has enabled MSC4155.

The implementation leans heavily on what's already there for MSC4155.

It has its own `experimental_features` flag. If both MSC4155 and MSC4380
are enabled, and a user has both configurations set, then we prioritise
the MSC4380 one.

Contributed wearing my 🎩 Spec Core Team hat.
2025-11-26 16:12:14 +00:00
Devon Hudson bc42899008 Allow subpaths in MAS endpoints (#19186)
Fixes #19184

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [X] Pull request is based on the develop branch
* [X] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [X] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
2025-11-18 18:45:33 +00:00
Andrew Ferrazzutti fcac7e0282 Write union types as X | Y where possible (#19111)
aka PEP 604, added in Python 3.10
2025-11-06 14:02:33 -06:00
Erik Johnston 5408101d21 Speed up pruning of ratelimiter (#19129)
I noticed this in some profiling. Basically, we prune the ratelimiters
by copying and iterating over every entry every 60 seconds. Instead,
let's use a wheel timer to track when we should potentially prune a
given key, and then we a) check fewer keys, and b) can run more
frequently. Hopefully this should mean we don't have a large pause
everytime we prune a ratelimiter with lots of keys.

Also fixes a bug where we didn't prune entries that were added via
`record_action` and never subsequently updated. This affected the media
and joins-per-room ratelimiter.
2025-11-04 12:44:57 +00:00
Andrew Ferrazzutti fc244bb592 Use type hinting generics in standard collections (#19046)
aka PEP 585, added in Python 3.9

 - https://peps.python.org/pep-0585/
 - https://docs.astral.sh/ruff/rules/non-pep585-annotation/
2025-10-22 16:48:19 -05:00
Andrew Morgan 5fff5a1893 Merge branch 'develop' of github.com:element-hq/synapse into develop 2025-10-01 09:40:38 +01:00
Devon Hudson 396de6544a Cleanly shutdown SynapseHomeServer object (#18828)
This PR aims to allow for a clean shutdown of the `SynapseHomeServer`
object so that it can be fully deleted and cleaned up by garbage
collection without shutting down the entire python process.

Fix https://github.com/element-hq/synapse-small-hosts/issues/50

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: Eric Eastwood <erice@element.io>
2025-10-01 02:42:09 +00:00
Eric Eastwood 5adb08f3c9 Remove MockClock() (#18992)
Spawning from adding some logcontext debug logs in
https://github.com/element-hq/synapse/pull/18966 and since we're not
logging at the `set_current_context(...)` level (see reasoning there),
this removes some usage of `set_current_context(...)`.

Specifically, `MockClock.call_later(...)` doesn't handle logcontexts
correctly. It uses the calling logcontext as the callback context
(wrong, as the logcontext could finish before the callback finishes) and
it didn't reset back to the sentinel context before handing back to the
reactor. It was like this since it was [introduced 10+ years
ago](https://github.com/element-hq/synapse/commit/38da9884e70e8e44bde14c67a7a8a9d49a8b87ac).
Instead of fixing the implementation which would just be a copy of our
normal `Clock`, we can just remove `MockClock`
2025-09-30 11:27:29 -05:00
Andrew Morgan 2aab171042 Remove unstable prefixes for MSC2732
This MSC was accepted in 2022. We shouldn't need to continue supporting the unstable field names.
2025-09-30 17:10:32 +01:00
Eric Eastwood 5143f93dc9 Fix server_name in logging context for multiple Synapse instances in one process (#18868)
### Background

As part of Element's plan to support a light form of vhosting (virtual
host) (multiple instances of Synapse in the same Python process), we're
currently diving into the details and implications of running multiple
instances of Synapse in the same Python process.

"Per-tenant logging" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/48

### Prior art

Previously, we exposed `server_name` by providing a static logging
`MetadataFilter` that injected the values:


https://github.com/element-hq/synapse/blob/205d9e4fc4774850f34971469ae500e70119d17a/synapse/config/logger.py#L216

While this can work fine for the normal case of one Synapse instance per
Python process, this configures things globally and isn't compatible
when we try to start multiple Synapse instances because each subsequent
tenant will overwrite the previous tenant.


### What does this PR do?

We remove the `MetadataFilter` and replace it by tracking the
`server_name` in the `LoggingContext` and expose it with our existing
[`LoggingContextFilter`](https://github.com/element-hq/synapse/blob/205d9e4fc4774850f34971469ae500e70119d17a/synapse/logging/context.py#L584-L622)
that we already use to expose information about the `request`.

This means that the `server_name` value follows wherever we log as
expected even when we have multiple Synapse instances running in the
same process.


### A note on logcontext

Anywhere, Synapse mistakenly uses the `sentinel` logcontext to log
something, we won't know which server sent the log. We've been fixing up
`sentinel` logcontext usage as tracked by
https://github.com/element-hq/synapse/issues/18905

Any further `sentinel` logcontext usage we find in the future can be
fixed piecemeal as normal.


https://github.com/element-hq/synapse/blob/d2a966f922fdc95bc86f7fe55b7b54a9ab3f25c1/docs/log_contexts.md#L71-L81


### Testing strategy

1. Adjust your logging config to include `%(server_name)s` in the format
    ```yaml
    formatters:
        precise:
format: '%(asctime)s - %(server_name)s - %(name)s - %(lineno)d -
%(levelname)s - %(request)s - %(message)s'
    ```
1. Start Synapse: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Make some requests (`curl
http://localhost:8008/_matrix/client/versions`, etc)
1. Open the homeserver logs and notice the `server_name` in the logs as
expected. `unknown_server_from_sentinel_context` is expected for the
`sentinel` logcontext (things outside of Synapse).
2025-09-26 17:10:48 -05:00
Travis Ralston d2a966f922 Use signature support from policy servers when available (#18934)
Opening on Kegan's behalf


[MSC4284](https://github.com/matrix-org/matrix-spec-proposals/pull/4284)
has already been opened accordingly.

---------

Co-authored-by: Kegan Dougal <7190048+kegsay@users.noreply.github.com>
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-25 19:30:24 +00:00
Eric Eastwood 9a88d25f8e Fix run_in_background not be awaited properly causing LoggingContext problems (#18937)
Basically, searching for any instance of `run_in_background(...)` and
making sure we wrap the deferred in `make_deferred_yieldable(...)` if we
try to `await` the result to make it follow the [Synapse logcontext
rules](https://github.com/element-hq/synapse/blob/develop/docs/log_contexts.md).

Turns out, we only have this problem in some tests (phew)

Part of https://github.com/element-hq/synapse/issues/18905
2025-09-22 10:55:45 -05:00
Eric Eastwood 5a9ca1e3d9 Introduce Clock.call_when_running(...) to include logcontext by default (#18944)
Introduce `Clock.call_when_running(...)` to wrap startup code in a
logcontext, ensuring we can identify which server generated the logs.

Background:

>  Ideally, nothing from the Synapse homeserver would be logged against the `sentinel` 
>  logcontext as we want to know which server the logs came from. In practice, this is not 
>  always the case yet especially outside of request handling. 
>   
>  Global things outside of Synapse (e.g. Twisted reactor code) should run in the 
>  `sentinel` logcontext. It's only when it calls into application code that a logcontext 
>  gets activated. This means the reactor should be started in the `sentinel` logcontext, 
>  and any time an awaitable yields control back to the reactor, it should reset the 
>  logcontext to be the `sentinel` logcontext. This is important to avoid leaking the 
>  current logcontext to the reactor (which would then get picked up and associated with 
>  the next thing the reactor does). 
>
> *-- `docs/log_contexts.md`

Also adds a lint to prefer `Clock.call_when_running(...)` over
`reactor.callWhenRunning(...)`

Part of https://github.com/element-hq/synapse/issues/18905
2025-09-22 10:27:59 -05:00
Tulir Asokan d80f515622 Update MSC4190 support (#18946) 2025-09-22 14:45:05 +01:00
reivilibre dfccde9f60 Remove obsolete and experimental /sync/e2ee endpoint. (#18583)
Introduced in: https://github.com/element-hq/synapse/pull/17167

The endpoint was part of experiments for MSC3575 but does not feature in
that MSC.

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2025-09-09 09:28:45 +01:00
Quentin Gliech 537e14169e Support stable endpoint and scopes from the MSC3861 family (#18549)
This adds stable APIs for both MSC2965 and MSC2967
2025-09-02 13:55:12 +02:00
Quentin Gliech 7ed55666b5 Stabilise MAS integration (#18759)
This can be reviewed commit by commit

There are a few improvements over the experimental support:

- authorisation of Synapse <-> MAS requests is simplified, with a single
shared secret, removing the need for provisioning a client on the MAS
side
- the tests actually spawn a real server, allowing us to test the rust
introspection layer
- we now check that the device advertised in introspection actually
exist, making it so that when a user logs out, the tokens are
immediately invalidated, even if the cache doesn't expire
- it doesn't rely on discovery anymore, rather on a static endpoint
base. This means users don't have to override the introspection endpoint
to avoid internet roundtrips
- it doesn't depend on `authlib` anymore, as we simplified a lot the
calls done from Synapse to MAS

We still have to update the MAS documentation about the Synapse setup,
but that can be done later.

---------

Co-authored-by: reivilibre <oliverw@element.io>
2025-08-04 15:48:45 +02:00
reivilibre a31d53b28f Use twisted.internet.testing module in tests instead of deprecated twisted.test.proto_helpers. (#18728)
Follows: #18727

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2025-07-30 12:32:10 +01:00
Eric Eastwood 2c236be058 Refactor Counter metrics to be homeserver-scoped (#18656)
Bulk refactor `Counter` metrics to be homeserver-scoped. We also add
lints to make sure that new `Counter` metrics don't sneak in without
using the `server_name` label (`SERVER_NAME_LABEL`).

All of the "Fill in" commits are just bulk refactor.

Part of https://github.com/element-hq/synapse/issues/18592



### Testing strategy

 1. Add the `metrics` listener in your `homeserver.yaml`
    ```yaml
    listeners:
      # This is just showing how to configure metrics either way
      #
      # `http` `metrics` resource
      - port: 9322
        type: http
        bind_addresses: ['127.0.0.1']
        resources:
          - names: [metrics]
            compress: false
      # `metrics` listener
      - port: 9323
        type: metrics
        bind_addresses: ['127.0.0.1']
    ```
1. Start the homeserver: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Fetch `http://localhost:9322/_synapse/metrics` and/or
`http://localhost:9323/metrics`
1. Observe response includes the `synapse_user_registrations_total`,
`synapse_http_server_response_count_total`, etc metrics with the
`server_name` label
2025-07-25 14:58:47 -05:00
reivilibre 8344c944b1 Add configurable rate limiting for the creation of rooms. (#18514)
Default values will be 1 room per minute, with a burst count of 10.

It's hard to imagine most users will be affected by this default rate,
but it's intentionally non-invasive in case of bots or other users that
need to create rooms at a large rate.
Server admins might want to down-tune this on their deployments.

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2025-07-24 14:08:02 +00:00
Eric Eastwood 98f84256e9 Configure HTTP proxy in file config (#18686)
This PR makes it possible to configure the HTTP proxy on a per-homeserver-tenant basis.

`http_proxy`, `https_proxy`, `no_proxy_hosts`
2025-07-22 10:33:00 -05:00
Quentin Gliech 5ea2cf2484 Move device changes off the main process (#18581)
The main goal of this PR is to handle device list changes onto multiple
writers, off the main process, so that we can have logins happening
whilst Synapse is rolling-restarting.

This is quite an intrusive change, so I would advise to review this
commit by commit; I tried to keep the history as clean as possible.

There are a few things to consider:

- the `device_list_key` in stream tokens becomes a
`MultiWriterStreamToken`, which has a few implications in sync and on
the storage layer
- we had a split between `DeviceHandler` and `DeviceWorkerHandler` for
master vs. worker process. I've kept this split, but making it rather
writer vs. non-writer worker, using method overrides for doing
replication calls when needed
- there are a few operations that need to happen on a single worker at a
time. Instead of using cross-worker locks, for now I made them run on
the first writer on the list

---------

Co-authored-by: Eric Eastwood <erice@element.io>
2025-07-18 09:06:14 +02:00
Eric Eastwood 88785dbaeb Refactor cache metrics to be homeserver-scoped (#18604)
(add `server_name` label to cache metrics).

Part of https://github.com/element-hq/synapse/issues/18592
2025-07-16 16:04:57 -05:00
Eric Eastwood fc10a5ee29 Refactor Measure block metrics to be homeserver-scoped (v2) (#18601)
Refactor `Measure` block metrics to be homeserver-scoped (add
`server_name` label to block metrics).

Part of https://github.com/element-hq/synapse/issues/18592


### Testing strategy

#### See behavior of previous `metrics` listener

 1. Add the `metrics` listener in your `homeserver.yaml`
    ```yaml
    listeners:
      - port: 9323
        type: metrics
        bind_addresses: ['127.0.0.1']
    ```
1. Start the homeserver: `poetry run synapse_homeserver --config-path
homeserver.yaml`
 1. Fetch `http://localhost:9323/metrics`
1. Observe response includes the block metrics
(`synapse_util_metrics_block_count`,
`synapse_util_metrics_block_in_flight`, etc)


#### See behavior of the `http` `metrics` resource

1. Add the `metrics` resource to a new or existing `http` listeners in
your `homeserver.yaml`
    ```yaml
    listeners:
      - port: 9322
        type: http
        bind_addresses: ['127.0.0.1']
        resources:
          - names: [metrics]
            compress: false
    ```
1. Start the homeserver: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Fetch `http://localhost:9322/_synapse/metrics` (it's just a `GET`
request so you can even do in the browser)
1. Observe response includes the block metrics
(`synapse_util_metrics_block_count`,
`synapse_util_metrics_block_in_flight`, etc)
2025-07-15 15:55:23 -05:00
Eric Eastwood d72c278a07 Remove allow_no_prev_events option (MSC2716 cleanup) (#18676)
This option is no longer used
since we backed out the MSC2716 changes in
https://github.com/matrix-org/synapse/pull/15748 and is even mentioned
as a follow-up task in the PR description there.

The `allow_no_prev_events` option was first introduced in
https://github.com/matrix-org/synapse/pull/11243 to support MSC2716 back
in the day.
2025-07-15 15:53:56 -05:00
Krishan a2bee2f255 Add via param to hierarchy enpoint (#18070)
### Pull Request Checklist

Implementation of
[MSC4235](https://github.com/matrix-org/matrix-spec-proposals/pull/4235)
as per suggestion in [pull request
17750](https://github.com/element-hq/synapse/pull/17750#issuecomment-2411248598).

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: Quentin Gliech <quenting@element.io>
2025-06-30 12:42:14 +00:00
Erik Johnston f500c7d982 Speed up MAS token introspection (#18357)
We do this by shoving it into Rust. We believe our python http client is
a bit slow.

Also bumps minimum rust version to 1.81.0, released last September (over
six months ago)

To allow for async Rust, includes some adapters between Tokio in Rust
and the Twisted reactor in Python.
2025-06-16 16:41:35 +01:00
Quentin Gliech 0de7aa9953 Enable flake8-logging and flake8-logging-format rules in Ruff and fix related issues throughout the codebase (#18542)
This can be reviewed commit by commit.

This enables the `flake8-logging` and `flake8-logging-format` rules in
Ruff, as well as logging exception stack traces in a few places where it
makes sense

 - https://docs.astral.sh/ruff/rules/#flake8-logging-log
 - https://docs.astral.sh/ruff/rules/#flake8-logging-format-g

### Linting to avoid pre-formatting log messages

See [`adamchainz/flake8-logging` -> *LOG011 avoid pre-formatting log
messages*](https://github.com/adamchainz/flake8-logging/blob/152db2f167355fb23e401bf68046c57cb128a2ae/README.rst#log011-avoid-pre-formatting-log-messages)

Practically, this means prefer placeholders (`%s`) over f-strings for
logging.

This is because placeholders are passed as args to loggers, so they can
do special handling of them.
For example, Sentry will record the args separately in their logging
integration:
https://github.com/getsentry/sentry-python/blob/c15b390dfe1ca5c01b30dd56b35d693bb50b413c/sentry_sdk/integrations/logging.py#L280-L284

One theoretical small perf benefit is that log levels that aren't
enabled won't get formatted, so it doesn't unnecessarily create
formatted strings
2025-06-13 09:44:18 +02:00