What was done:
1. synapse/logging/context.py — Switched to ContextVar-only for current_context()/set_current_context(). Removed _thread_local. Made Twisted imports conditional. Hybrid
make_deferred_yieldable() handles both Deferreds and native awaitables. Collapsed native function aliases.
2. tests/__init__.py — Removed do_patch() and twisted.trial.util import.
3. tests/unittest.py — Switched base class from twisted.trial.unittest.TestCase to stdlib unittest.TestCase. Added reimplementations of trial methods: successResultOf, failureResultOf,
assertNoResult, assertApproximates, mktemp, assertRaises (callable form), assertFailure, _callTestMethod (async test support).
4. 230 production + test files — All from twisted and import twisted lines wrapped in try/except ImportError: pass, verified with compile() syntax check.
5. pyproject.toml — Twisted and treq commented out from required dependencies. aiohttp added as required dependency.
6. 198 test files — MemoryReactor type hint → typing.Any (from earlier).
Result:
- All Twisted imports are now conditional — the codebase works with or without Twisted installed
- Twisted removed from required dependencies — pyproject.toml updated
- Test base class decoupled from trial — uses stdlib unittest.TestCase
- 96 asyncio-native tests + 518+ production tests verified passing
Updates the error codes to match MSC2666 changes (user ID query param
validation + proper errcode for requesting rooms with self), added the
new `count` field, and stabilized the endpoint.
Companion PR:
https://github.com/element-hq/matrix-authentication-service/pull/5550
to 1) send this flag
and 2) provision users proactively when their lock status changes.
---
Currently Synapse and MAS have two independent user lock
implementations. This PR makes it so that MAS can push its lock status
to Synapse when 'provisioning' the user.
Having the lock status in Synapse is useful for removing users from the
user directory
when they are locked.
There is otherwise no authentication requirement to have it in Synapse;
the enforcement is done
by MAS at token introspection time.
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Fixes: #19540Fixes: #16290 (side effect of the proposed fix)
Closes: #12804 (side effect of the proposed fix)
Introduced in: https://github.com/matrix-org/synapse/pull/8932
---
This PR is a relatively simple simplification of the profile change on
deactivation that appears to remove multiple bugs.
This PR's **primary motivating fix** is #19540: when a user is
deactivated and erased, they would be kept in the user directory. This
bug appears to have been here since #8932 (previously
https://github.com/matrix-org/synapse/pull/8932) (v1.26.0).
The root cause of this bug is that after removing the user from the user
directory, we would immediately update their displayname and avatar to
empty strings (one at a time), which re-inserts
the user into the user directory.
With this PR, we now delete the entire `profiles` row upon user erasure,
which is cleaner (from a 'your database goes back to zero after
deactivating and erasing a user' point of view) and
only needs one database operation (instead of doing displayname then
avatar).
With this PR, we also no longer send the 2 (deferred) `m.room.member`
`join` events to every room to propagate the displayname and avatar_url
changes.
This is good for two reasons:
- the user is about to get parted from those rooms anyway, so this
reduces the number of state events sent per room from 3 to 1. (More
efficient for us in the moment and leaves less litter in the room DAG.)
- it is possible for the displayname/avatar update to be sent **after**
the user parting, which seems as though it could trigger the user to be
re-joined to a public room.
(With that said, although this sounds vaguely familiar in my lossy
memory, I can't find a ticket that actually describes this bug, so this
might be fictional. Edit: #16290 seems to describe this, although the
title is misleading.)
Additionally, as a side effect of the proposed fix (deleting the
`profiles` row), this PR also now deletes custom profile fields upon
user erasure, which is a new feature/bugfix (not sure which) in its own
right.
I do not see a ticket that corresponds to this feature gap, possibly
because custom profile fields are still a niche feature without
mainstream support (to the best of my knowledge).
Tests are included for the primary bugfix and for the cleanup of custom
profile fields.
### `set_displayname` module API change
This change includes a minor _technically_-breaking change to the module
API.
The change concerns `set_displayname` which is exposed to the module API
with a `deactivation: bool = False` flag, matching the internal handler
method it wraps.
I suspect that this is a mistake caused by overly-faithfully piping
through the args from the wrapped method (this Module API was introduced
in
https://github.com/matrix-org/synapse/pull/14629/changes#diff-0b449f6f95672437cf04f0b5512572b4a6a729d2759c438b7c206ea249619885R1592).
The linked PR did the same for `by_admin` originally before it was
changed.
The `deactivation` flag's only purpose is to be piped through to other
Module API callbacks when a module has registered to be notified about
profile changes.
My claim is that it makes no sense for the Module API to have this flag
because it is not the one doing the deactivation, thus it should never
be in a position to set this to `True`.
My proposed change keeps the flag (for function signature
compatibility), but turns it into a no-op (with a `ERROR` log when it's
set to True by the module).
The Module API callback notifying of the module-caused displayname
change will therefore now always have `deactivation = False`.
*Discussed in
[`#synapse-dev:matrix.org`](https://matrix.to/#/!i5D5LLct_DYG-4hQprLzrxdbZ580U9UB6AEgFnk6rZQ/$1f8N6G_EJUI_I_LvplnVAF2UFZTw_FzgsPfB6pbcPKk?via=element.io&via=matrix.org&via=beeper.com)*
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Part of: MSC4354 whose experimental feature tracking issue is
https://github.com/element-hq/synapse/issues/19409
Follows: #19340 (a necessary bugfix for `/event/` to set this metadata)
Partially supersedes: #18968
This PR implements the first batch of work to support MSC4354 Sticky
Events.
Sticky events are events that have been configured with a finite
'stickiness' duration,
capped to 1 hour per current MSC draft.
Whilst an event is sticky, we provide stronger delivery guarantees for
the event, both to
our clients and to remote homeservers, essentially making it reliable
delivery as long as we
have a functional connection to the client/server and until the
stickiness expires.
This PR merely supports creating sticky events and receiving the sticky
TTL metadata in clients.
It is not suitable for trialling sticky events since none of the other
semantics are implemented.
Contains a temporary SQLite workaround due to a bug in our supported
version enforcement: https://github.com/element-hq/synapse/issues/19452
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Co-authored-by: Eric Eastwood <erice@element.io>
Store the JSON content of scheduled delayed events as text instead of a
byte array. This brings it in line with the `event_json` table's `json`
column, and fixes the inability to schedule a delayed event with
non-ASCII characters in its content.
Fixes#19242
Fixes#19347
This deprecates MSC2697 which has been closed since May 2024. As per
#19347 this seems to be a thing we can just rip out. The crypto team
have moved onto MSC3814 and are suggesting that developers who rely on
MSC2697 should use MSC3814 instead.
MSC2697 implementation originally introduced by https://github.com/matrix-org/synapse/pull/8380
Fix /event/ endpoint not transforming event with per-requester metadata
Pass notif_event through filter_events_for_client \
Not aware of an actual issue here, but seems silly to bypass it
Call it filter_and_transform_events_for_client to make it more obvious
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Fixes https://github.com/element-hq/synapse/issues/19175
This PR moves tracking of what lazy loaded membership we've sent to each
room out of the required state table. This avoids that table from
continuously growing, which massively helps performance as we pull out
all matching rows for the connection when we receive a request.
The new table is only read when we have data in a room to send, so we
end up reading a lot fewer rows from the DB. Though we now read from
that table for every room we have events to return in, rather than once
at the start of the request.
For an explanation of how the new table works, see the
[comment](https://github.com/element-hq/synapse/blob/erikj/sss_better_membership_storage2/synapse/storage/schema/main/delta/93/02_sliding_sync_members.sql#L15-L38)
on the table schema.
The table is designed so that we can later prune old entries if we wish,
but that is not implemented in this PR.
Reviewable commit-by-commit.
---------
Co-authored-by: Eric Eastwood <erice@element.io>
Related to https://github.com/element-hq/synapse/issues/17035, when
Synapse receives a request that is larger than the maximum size allowed,
it aborts the connection without ever sending back a HTTP response.
I dug into our usage of twisted and how best to try and report such an
error and this is what I came up with.
It would be ideal to be able to report the status from within
`handleContentChunk` but that is called too early on in the twisted http
handling code, before things have been setup enough to be able to
properly write a response.
I tested this change out locally (both with C-S and S-S apis) and they
do receive a 413 response now in addition to the connection being
closed.
Hopefully this will aid in being able to quickly detect when
https://github.com/element-hq/synapse/issues/17035 is occurring as the
current situation makes it very hard to narrow things down to that
specific issue without making a lot of assumptions.
This PR also responds with more meaningful error codes now in the case
of:
- multiple `Content-Length` headers
- invalid `Content-Length` header value
- request content size being larger than the `Content-Length` value
### Pull Request Checklist
<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->
* [X] Pull request is based on the develop branch
* [X] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [X] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
---------
Co-authored-by: Eric Eastwood <erice@element.io>
This changes the arguments in clock functions to be `Duration` and
converts call sites and constants into `Duration`. There are still some
more functions around that should be converted (e.g.
`timeout_deferred`), but we leave that to another PR.
We also changes `.as_secs()` to return a float, as the rounding broke
things subtly. The only reason to keep it (its the same as
`timedelta.total_seconds()`) is for symmetry with `as_millis()`.
Follows on from https://github.com/element-hq/synapse/pull/19223
We have various constants to try and avoid mistyping of durations, e.g.
`ONE_HOUR_SECONDS * MILLISECONDS_PER_SECOND`, however this can get a
little verbose and doesn't help with typing.
Instead, let's move towards a dedicated `Duration` class (basically a
[`timedelta`](https://docs.python.org/3/library/datetime.html#timedelta-objects)
with helper methods).
This PR introduces the new types and converts all usages of the existing
constants with it. Future PRs may work to move the clock methods to also
use it (e.g. `call_later` and `looping_call`).
Reviewable commit-by-commit.
We add some logic to expire sliding sync connections if they get old or
if there is too much pending data to return.
The values of the constants are picked fairly arbitrarily, these are
currently:
1. More than 100 rooms with pending events if the connection hasn't been
used in over an hour
2. The connection hasn't been used for over a week
Reviewable commit-by-commit
---------
Co-authored-by: Eric Eastwood <erice@element.io>
As per recent proposals in MSC4140, remove authentication for
restarting/cancelling/sending a delayed event, and give each of those
actions its own endpoint. (The original consolidated endpoint is still
supported for backwards compatibility.)
### Pull Request Checklist
<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->
* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
---------
Co-authored-by: Half-Shot <will@half-shot.uk>
This is a normal
problem where we `await` a deferred without wrapping it in
`make_deferred_yieldable(...)`. But I've opted to replace the usage of
`deferLater` with something more standard for the Synapse codebase.
Part of https://github.com/element-hq/synapse/issues/18905
It's unclear why we're only now seeing these failures happen with the
changes from https://github.com/element-hq/synapse/pull/19057
Example failures seen in
https://github.com/element-hq/synapse/actions/runs/18477454390/job/52645183606?pr=19057
```
builtins.AssertionError: Expected `looping_call` callback from the reactor to start with the sentinel logcontext but saw task-_resumable_task-0-IBzAmHUoepQfLnEA. In other words, another task shouldn't have leaked their logcontext to us.
```
It is often useful when investigating a space to get information about
that space and it's children. This PR adds an Admin API to return
information about a space and it's children, regardless of room
membership. Will not fetch information over federation about remote
rooms that the server is not participating in.
Spawning from adding some logcontext debug logs in
https://github.com/element-hq/synapse/pull/18966 and since we're not
logging at the `set_current_context(...)` level (see reasoning there),
this removes some usage of `set_current_context(...)`.
Specifically, `MockClock.call_later(...)` doesn't handle logcontexts
correctly. It uses the calling logcontext as the callback context
(wrong, as the logcontext could finish before the callback finishes) and
it didn't reset back to the sentinel context before handing back to the
reactor. It was like this since it was [introduced 10+ years
ago](https://github.com/element-hq/synapse/commit/38da9884e70e8e44bde14c67a7a8a9d49a8b87ac).
Instead of fixing the implementation which would just be a copy of our
normal `Clock`, we can just remove `MockClock`
### Background
As part of Element's plan to support a light form of vhosting (virtual
host) (multiple instances of Synapse in the same Python process), we're
currently diving into the details and implications of running multiple
instances of Synapse in the same Python process.
"Per-tenant logging" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/48
### Prior art
Previously, we exposed `server_name` by providing a static logging
`MetadataFilter` that injected the values:
https://github.com/element-hq/synapse/blob/205d9e4fc4774850f34971469ae500e70119d17a/synapse/config/logger.py#L216
While this can work fine for the normal case of one Synapse instance per
Python process, this configures things globally and isn't compatible
when we try to start multiple Synapse instances because each subsequent
tenant will overwrite the previous tenant.
### What does this PR do?
We remove the `MetadataFilter` and replace it by tracking the
`server_name` in the `LoggingContext` and expose it with our existing
[`LoggingContextFilter`](https://github.com/element-hq/synapse/blob/205d9e4fc4774850f34971469ae500e70119d17a/synapse/logging/context.py#L584-L622)
that we already use to expose information about the `request`.
This means that the `server_name` value follows wherever we log as
expected even when we have multiple Synapse instances running in the
same process.
### A note on logcontext
Anywhere, Synapse mistakenly uses the `sentinel` logcontext to log
something, we won't know which server sent the log. We've been fixing up
`sentinel` logcontext usage as tracked by
https://github.com/element-hq/synapse/issues/18905
Any further `sentinel` logcontext usage we find in the future can be
fixed piecemeal as normal.
https://github.com/element-hq/synapse/blob/d2a966f922fdc95bc86f7fe55b7b54a9ab3f25c1/docs/log_contexts.md#L71-L81
### Testing strategy
1. Adjust your logging config to include `%(server_name)s` in the format
```yaml
formatters:
precise:
format: '%(asctime)s - %(server_name)s - %(name)s - %(lineno)d -
%(levelname)s - %(request)s - %(message)s'
```
1. Start Synapse: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Make some requests (`curl
http://localhost:8008/_matrix/client/versions`, etc)
1. Open the homeserver logs and notice the `server_name` in the logs as
expected. `unknown_server_from_sentinel_context` is expected for the
`sentinel` logcontext (things outside of Synapse).
Introduce `Clock.call_when_running(...)` to wrap startup code in a
logcontext, ensuring we can identify which server generated the logs.
Background:
> Ideally, nothing from the Synapse homeserver would be logged against the `sentinel`
> logcontext as we want to know which server the logs came from. In practice, this is not
> always the case yet especially outside of request handling.
>
> Global things outside of Synapse (e.g. Twisted reactor code) should run in the
> `sentinel` logcontext. It's only when it calls into application code that a logcontext
> gets activated. This means the reactor should be started in the `sentinel` logcontext,
> and any time an awaitable yields control back to the reactor, it should reset the
> logcontext to be the `sentinel` logcontext. This is important to avoid leaking the
> current logcontext to the reactor (which would then get picked up and associated with
> the next thing the reactor does).
>
> *-- `docs/log_contexts.md`
Also adds a lint to prefer `Clock.call_when_running(...)` over
`reactor.callWhenRunning(...)`
Part of https://github.com/element-hq/synapse/issues/18905