Spawning from adding some logcontext debug logs in
https://github.com/element-hq/synapse/pull/18966 and since we're not
logging at the `set_current_context(...)` level (see reasoning there),
this removes some usage of `set_current_context(...)`.
Specifically, `MockClock.call_later(...)` doesn't handle logcontexts
correctly. It uses the calling logcontext as the callback context
(wrong, as the logcontext could finish before the callback finishes) and
it didn't reset back to the sentinel context before handing back to the
reactor. It was like this since it was [introduced 10+ years
ago](https://github.com/element-hq/synapse/commit/38da9884e70e8e44bde14c67a7a8a9d49a8b87ac).
Instead of fixing the implementation which would just be a copy of our
normal `Clock`, we can just remove `MockClock`
Introduce `Clock.call_when_running(...)` to wrap startup code in a
logcontext, ensuring we can identify which server generated the logs.
Background:
> Ideally, nothing from the Synapse homeserver would be logged against the `sentinel`
> logcontext as we want to know which server the logs came from. In practice, this is not
> always the case yet especially outside of request handling.
>
> Global things outside of Synapse (e.g. Twisted reactor code) should run in the
> `sentinel` logcontext. It's only when it calls into application code that a logcontext
> gets activated. This means the reactor should be started in the `sentinel` logcontext,
> and any time an awaitable yields control back to the reactor, it should reset the
> logcontext to be the `sentinel` logcontext. This is important to avoid leaking the
> current logcontext to the reactor (which would then get picked up and associated with
> the next thing the reactor does).
>
> *-- `docs/log_contexts.md`
Also adds a lint to prefer `Clock.call_when_running(...)` over
`reactor.callWhenRunning(...)`
Part of https://github.com/element-hq/synapse/issues/18905
You can now configure how much media can be uploaded by a user in a
given time period.
Note the first commit here is a refactor of create/upload content
function
This PR makes a few radical changes to media. This now stores the SHA256
hash of each file stored in the database (excluding thumbnails, more on
that later). If a set of media is quarantined, any additional uploads of
the same file contents or any other files with the same hash will be
quarantined at the same time.
Currently this does NOT:
- De-duplicate media, although a future extension could be to do that.
- Run any background jobs to identify the hashes of older files. This
could also be a future extension, though the value of doing so is
limited to combat the abuse of recent media.
- Hash thumbnails. It's assumed that thumbnails are parented to some
form of media, so you'd likely be wanting to quarantine the media and
the thumbnail at the same time.
I thought ruff check would also format, but it doesn't.
This runs ruff format in CI and dev scripts. The first commit is just a
run of `ruff format .` in the root directory.
Prior to this PR, remote downloads which did not provide a
`content-length` were decremented from the remote download ratelimiter
at the max allowable size, leading to excessive ratelimiting - see
https://github.com/element-hq/synapse/issues/17394.
This PR adds a linearizer to limit concurrent remote downloads to 6 per
IP address, and decrements remote downloads without a `content-length`
from the ratelimiter *after* the download is complete and the response
length is known.
Also adds logic to ensure that responses with a known length respect the
`max_download_size`.
During the migration the automated script to update the copyright
headers accidentally got rid of some of the existing copyright lines.
Reinstate them.
Implement MSC3860 to follow redirects for federated media downloads.
Note that the Client-Server API doesn't support this (yet) since the media
repository in Synapse doesn't have a way of supporting redirects.
This converts the media servlet URLs in the same way as
(most) of the rest of Synapse. This will give more flexibility
in the versions each endpoint exists under.
The stubs have some issues so this has some generous cast
and ignores in it, but it is better than not having stubs.
Note that confusing that Element is a function which creates
_Element instances (and similarly for Comment).
There are two situations which were previously not properly checked:
1. If the requested URL was replaced with an oEmbed URL, then the
oEmbed URL was not checked against url_preview_url_blacklist.
2. Follow-up URLs (either via autodiscovery of oEmbed or to pre-cache
images) were not checked against url_preview_url_blacklist.
* Removes the `v1` directory from `test.rest.media.v1`.
* Moves the non-REST code from `synapse.rest.media.v1` to `synapse.media`.
* Flatten the `v1` directory from `synapse.rest.media`, but leave compatiblity
with 3rd party media repositories and spam checkers.