This is a simplification so that `unsigned` only includes "simple"
values, to make it easier to port to Rust.
Reviewable commit-by-commit
Summary:
1. **Add `recheck` column to `redactions` table**
A new boolean `recheck` column (default true) is added to the
`redactions` table. This captures whether a redaction needs its sender
domain checked at read time — required for room v3+ where redactions are
accepted speculatively and later validated. When persisting a new
redaction, `recheck` is set directly from
`event.internal_metadata.need_to_check_redaction()`.
It's fine if initially we recheck all redactions, as it only results in
a little more CPU overhead (as we always pull out the redaction event
regardless).
2. **Backfill `recheck` via background update**
A background update (`redactions_recheck`) backfills the new column for
existing rows by reading `recheck_redaction` from each event's
`internal_metadata` JSON. This avoids loading full event objects by
reading `event_json` directly via a SQL JOIN.
3. **Don't fetch confirmed redaction events from the DB**
Previously, when loading events, Synapse recursively fetched all
redaction events regardless of whether they needed domain rechecking.
Now `_fetch_event_rows` reads the `recheck` column and splits redactions
into two lists:
- `unconfirmed_redactions` — need fetching and domain validation
- `confirmed_redactions` — already validated, applied directly without
fetching the event
This avoids unnecessary DB reads for the common case of
already-confirmed redactions.
4. **Move `redacted_because` population to `EventClientSerializer`**
Previously, `redacted_because` (the full redaction event object) was
stored in `event.unsigned` at DB fetch time, coupling storage-layer code
to client serialization concerns. This is removed from
`_maybe_redact_event_row` and moved into
`EventClientSerializer.serialize_event`, which fetches the redaction
event on demand. The storage layer now only sets
`unsigned["redacted_by"]` (the redaction event ID).
5. **Always use `EventClientSerializer`**
The standalone `serialize_event` function was made private
(`_serialize_event`). All external callers — `rest/client/room.py`,
`rest/admin/events.py, appservice/api.py`, and `tests` — were updated to
use `EventClientSerializer.serialize_event` / `serialize_events`,
ensuring
`redacted_because` is always populated correctly via the serializer.
6. **Batch-fetch redaction events in `serialize_events`**
`serialize_events` now collects all `redacted_by` IDs from the event
batch upfront and fetches them in a single `get_events` call, passing
the result as a `redaction_map` to each `serialize_event` call. This
reduces N individual DB round-trips to one when serializing a batch of
events that includes redacted events.
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Principally so that we can share the same room version configuration
between Python and Rust.
For the most part, this is a direct port. Some special handling has had
to go into `KNOWN_ROOM_VERSIONS` so that it can be sensibly shared
between Python and Rust, since we do update it during config parsing.
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This allows the Rust HTTP client to be configured to force HTTP/2 even
on plaintext connections. This is useful in contexts where the remote
server is known to server HTTP/2 over plain text.
Added because we use the Synapse Rust HTTP client with the Synapse Pro
`event-cache` module. We use this because it's independent from the
Python reactor which makes things slower than expected.
Currently, the Synapse Rust HTTP client uses HTTP/1 which means a new
connection for every request. With HTTP/2, we can share the connection
across requests.
We want to see if this will make a performance difference and less
stress on the database connection situation, see
https://github.com/element-hq/synapse-rust-apps/issues/452#issuecomment-3897717599
Here is the sibling PR for using HTTP/2 on the Synapse Pro `event-cache`
module side: https://github.com/element-hq/synapse-pro-modules/pull/35
Hello,
I'm writing on behalf of the Citadel product developed by ERCOM.
This PR bumps `pyo3` from 0.26.0 to 0.27.2 and `pythonize` from 0.26.0
to 0.27.0.
For the code migration I followed the guide found here:
[link](https://pyo3.rs/v0.27.0/migration.html).
This changes the arguments in clock functions to be `Duration` and
converts call sites and constants into `Duration`. There are still some
more functions around that should be converted (e.g.
`timeout_deferred`), but we leave that to another PR.
We also changes `.as_secs()` to return a float, as the rounding broke
things subtly. The only reason to keep it (its the same as
`timedelta.total_seconds()`) is for symmetry with `as_millis()`.
Follows on from https://github.com/element-hq/synapse/pull/19223
Fixes https://github.com/element-hq/synapse/issues/18659
This changes the Tokio runtime to be attached to the Twisted reactor.
This way, the Tokio runtime starts when the Twisted reactor starts, and
*not* when the module gets loaded.
This is important as starting the runtime on module load meant that it
broke when Synapse was started with `daemonize`/`synctl`, as forks only
retain the calling threads, breaking the Tokio runtime.
This also changes so that the HttpClient gets the Twisted reactor
explicitly as parameter instead of loading it from
`twisted.internet.reactor`
The background updates are being registered on an object that is for the
_state_ database, but the actual tables are on the _main_ database. This
just moves them to a different store that can access the right stuff.
I noticed this when trying to do a full schema dump cause I was curious
what has changed since the last one.
Fixes#16054
### Pull Request Checklist
<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->
* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
We do this by shoving it into Rust. We believe our python http client is
a bit slow.
Also bumps minimum rust version to 1.81.0, released last September (over
six months ago)
To allow for async Rust, includes some adapters between Tokio in Rust
and the Twisted reactor in Python.
MSC4108 relies on ETag to determine if something has changed on the
rendez-vous channel.
Strong and correct ETag comparison works if the response body is
bit-for-bit identical, which isn't the case if a proxy in the middle
compresses the response on the fly.
This adds a `no-transform` directive to the `Cache-Control` header,
which tells proxies not to transform the response body.
Additionally, some proxies (nginx) will switch to `Transfer-Encoding:
chunked` if it doesn't know the Content-Length of the response, and
'weakening' the ETag if that's the case. I've added `Content-Length`
headers to all responses, to hopefully solve that.
This basically fixes QR-code login when nginx or cloudflare is involved,
with gzip/zstd/deflate compression enabled.
This is a workaround for some proxy setup, where the ETag header gets
stripped from the response headers unless there is a Content-Type header
set.
In particular, we saw this bug when putting Cloudflare in front of
Synapse.
I'm pretty sure this is a Cloudflare bug, as this behaviour isn't
documented anywhere, and doesn't make sense whatsoever.
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Add `event.internal_metadata.instance_name` (the worker instance that persisted the event) to go alongside the existing `event.internal_metadata.stream_ordering`.
`instance_name` is useful to properly compare and query for events with a token since you need to compare both the `stream_ordering` and `instance_name` against the vector clock/`instance_map` in the `RoomStreamToken`.
This is pre-requisite work and may be used in https://github.com/element-hq/synapse/pull/17293
Adding `event.internal_metadata.instance_name` was first mentioned in the initial Sliding Sync PR while pairing with @erikjohnston, see 09609cb0db (diff-5cd773fb307aa754bd3948871ba118b1ef0303f4d72d42a2d21e38242bf4e096R405-R410)
This version change requires a migration to a new API. See
https://pyo3.rs/v0.21.2/migration#from-020-to-021
This will fix the annoying warnings added when using the recent rust
nightly:
> warning: non-local `impl` definition, they should be avoided as they
go against expectation
This adds functions to transform a Twisted request to the
`http::Request`, and then to send back an `http::Response` through it.
It also imports the SynapseError exception so that we can throw that
from Rust code directly
Example usage of this would be:
```rust
use crate::http::{http_request_from_twisted, http_response_to_twisted, HeaderMapPyExt};
fn handler(twisted_request: &PyAny) -> PyResult<()> {
let request = http_request_from_twisted(twisted_request)?;
let ua: headers::UserAgent = request.headers().typed_get_required()?;
if whatever {
return Err((crate::errors::SynapseError::new(
StatusCode::UNAUTHORIZED,
"Whatever".to_owned
"M_UNAUTHORIZED",
None,
None,
)));
}
let response = Response::new("hello".as_bytes());
http_response_to_twisted(twisted_request, response)?;
Ok(())
}
```
During the migration the automated script to update the copyright
headers accidentally got rid of some of the existing copyright lines.
Reinstate them.
There are a couple of things we need to be careful of here:
1. The current python code does no validation when loading from the DB,
so we need to be careful to ignore such errors (at least on jki.re there
are some old events with internal metadata fields of the wrong type).
2. We want to be memory efficient, as we often have many hundreds of
thousands of events in the cache at a time.
---------
Co-authored-by: Quentin Gliech <quenting@element.io>
* Updates the rule ID.
* Use `event_property_is` instead of `event_match`.
This updates the implementation of MSC3958 to match the latest
text from the MSC.
A dont_notify action is a no-op (and coalesce is undefined). These are
both considered no-ops by the spec, per MSC3987 and the predefined
push rules were updated to remove dont_notify from the list of actions.
This removes the experimental configuration option and
always escapes the push rule condition keys.
Also escapes any (experimental) push rule condition keys
in the base rules which contain dot in a field name.