# Querying media

These APIs allow extracting media information from the homeserver.

Details about the format of the `media_id` and storage of the media in the file system
are documented under [media repository](../media_repository.md).

To use them, you will need to authenticate by providing an `access_token`
for a server admin: see [Admin API](../usage/administration/admin_api/).

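As a minimal sketch of the authentication step, the following builds (without sending) an Admin API request that carries the token in an `Authorization: Bearer` header. The base URL, the placeholder token, and the helper name `admin_request` are assumptions for illustration only.

```python
import urllib.request

# Hypothetical values for illustration; substitute your own homeserver and token.
BASE_URL = "http://localhost:8008"
ACCESS_TOKEN = "<admin_access_token>"


def admin_request(path, method="GET", token=ACCESS_TOKEN, base_url=BASE_URL):
    """Build (but do not send) an authenticated Admin API request."""
    # The non-GET endpoints in this document are shown with an empty JSON body.
    data = b"{}" if method != "GET" else None
    request = urllib.request.Request(base_url + path, data=data, method=method)
    request.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


request = admin_request("/_synapse/admin/v1/media/example.org/abcdefg12345")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```
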
## List all media in a room

This API gets a list of known media in a room.
However, it only shows media from unencrypted events or rooms.

The API is:
```
GET /_synapse/admin/v1/room/<room_id>/media
```

The API returns a JSON body like the following:
```json
{
  "local": [
    "mxc://localhost/xwvutsrqponmlkjihgfedcba",
    "mxc://localhost/abcdefghijklmnopqrstuvwx"
  ],
  "remote": [
    "mxc://matrix.org/xwvutsrqponmlkjihgfedcba",
    "mxc://matrix.org/abcdefghijklmnopqrstuvwx"
  ]
}
```

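Room IDs contain characters such as `!` and `:` that are usually percent-encoded when placed in a URL path, and the `mxc://` URIs in the response need to be split into a server name and a `media_id` before they can be passed to the other APIs in this document. The sketch below shows one way to do both; the helper names `room_media_path` and `split_mxc` are hypothetical.

```python
from urllib.parse import quote


def room_media_path(room_id):
    """Return the request path for listing media in a room."""
    # safe="" makes quote() percent-encode "!" and ":" as well.
    return f"/_synapse/admin/v1/room/{quote(room_id, safe='')}/media"


def split_mxc(mxc_uri):
    """Split 'mxc://<server_name>/<media_id>' into (server_name, media_id)."""
    prefix = "mxc://"
    if not mxc_uri.startswith(prefix):
        raise ValueError(f"not an mxc URI: {mxc_uri!r}")
    server_name, _, media_id = mxc_uri[len(prefix):].partition("/")
    if not server_name or not media_id:
        raise ValueError(f"malformed mxc URI: {mxc_uri!r}")
    return server_name, media_id


print(room_media_path("!roomid12345:example.org"))
print(split_mxc("mxc://matrix.org/abcdefghijklmnopqrstuvwx"))
```
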
## List all media uploaded by a user

Listing all media that has been uploaded by a local user can be achieved through
the use of the
[List media uploaded by a user](user_admin_api.md#list-media-uploaded-by-a-user)
Admin API.

## Query a piece of media by ID

This API returns information about a piece of local or cached remote media given the origin server name and `media_id`. If
information is requested for remote media which is not cached, the endpoint will return 404.

Request:
```http
GET /_synapse/admin/v1/media/<origin>/<media_id>
```

The API returns a JSON body with media info like the following:

Response:
```json
{
  "media_info": {
    "media_origin": "remote.com",
    "user_id": null,
    "media_id": "sdginwegWEG",
    "media_type": "image/png",
    "media_length": 67,
    "upload_name": "test.png",
    "created_ts": 300,
    "filesystem_id": "wgeweg",
    "url_cache": null,
    "last_access_ts": 400,
    "quarantined_by": null,
    "authenticated": false,
    "safe_from_quarantine": null,
    "sha256": "ebf4f635a17d10d6eb46ba680b70142419aa3220f228001a036d311a22ee9d2a"
  }
}
```

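The response can be reduced to the few fields relevant for moderation, as in the rough sketch below. It assumes that a non-null `quarantined_by` means the media is currently quarantined and that a truthy `safe_from_quarantine` means it is protected; neither interpretation is spelled out above, and the helper name `summarise_media_info` is hypothetical.

```python
import json


def summarise_media_info(body):
    """Extract a small summary from the JSON response body (a string)."""
    info = json.loads(body)["media_info"]
    return {
        "mxc": f"mxc://{info['media_origin']}/{info['media_id']}",
        # Assumption: null means "not quarantined".
        "is_quarantined": info["quarantined_by"] is not None,
        # Assumption: null or false means "not protected".
        "is_protected": bool(info["safe_from_quarantine"]),
    }


example_body = """
{
  "media_info": {
    "media_origin": "remote.com",
    "media_id": "sdginwegWEG",
    "quarantined_by": null,
    "safe_from_quarantine": null
  }
}
"""
summary = summarise_media_info(example_body)
print(summary)
```
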
# Quarantine media

Quarantining media means that it is marked as inaccessible by users. It applies
to any local media, and any locally-cached copies of remote media.

The media file itself (and any thumbnails) is not deleted from the server.

Since Synapse 1.128.0, hashes of uploaded media are tracked. If media with a
tracked hash is quarantined, Synapse will:

- Quarantine any media with a matching hash that has already been uploaded.
- Quarantine any future uploads with a matching hash.
- Quarantine any existing cached remote media with a matching hash.
- Quarantine any future remote media with a matching hash.

## Downloading quarantined media

Normally, attempts to download quarantined media return a 404 error.
Admins can bypass this by adding `?admin_unsafely_bypass_quarantine=true`
to the [normal download URL](https://spec.matrix.org/v1.16/client-server-api/#get_matrixclientv1mediadownloadservernamemediaid).

Bypassing the quarantine check is not recommended. Media is typically quarantined
to prevent harmful content from being served to users, which includes admins. Only
set the bypass parameter if you intentionally want to access potentially harmful
content.

Non-admin users cannot bypass quarantine checks, even when specifying the above
query parameter.

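To make the bypass explicit in tooling, the query parameter can be added only on request, as in this sketch. The path follows the download endpoint linked above; the helper name `download_path` and its `bypass_quarantine` argument are hypothetical.

```python
from urllib.parse import quote, urlencode


def download_path(server_name, media_id, bypass_quarantine=False):
    """Return the download path, optionally with the admin bypass parameter."""
    path = (
        "/_matrix/client/v1/media/download/"
        f"{quote(server_name, safe='')}/{quote(media_id, safe='')}"
    )
    if bypass_quarantine:
        # Only honoured for server admins; use with care.
        path += "?" + urlencode({"admin_unsafely_bypass_quarantine": "true"})
    return path


print(download_path("example.org", "abcdefg12345"))
print(download_path("example.org", "abcdefg12345", bypass_quarantine=True))
```

Defaulting `bypass_quarantine` to `False` means a caller has to opt in deliberately, in line with the recommendation above.
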
## Quarantining media by ID

This API quarantines a single piece of local or remote media.

Request:

```
POST /_synapse/admin/v1/media/quarantine/<server_name>/<media_id>

{}
```

Where `server_name` is in the form of `example.org`, and `media_id` is in the
form of `abcdefg12345...`.

Response:

```json
{}
```

## Remove media from quarantine by ID

This API removes a single piece of local or remote media from quarantine.

Request:

```
POST /_synapse/admin/v1/media/unquarantine/<server_name>/<media_id>

{}
```

Where `server_name` is in the form of `example.org`, and `media_id` is in the
form of `abcdefg12345...`.

Response:

```json
{}
```

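The quarantine and unquarantine endpoints differ only in one path segment, so a client can derive both from a single flag. The following is a small sketch of that idea; the helper name `quarantine_call` and the `(method, path, body)` return shape are assumptions, not part of Synapse.

```python
def quarantine_call(server_name, media_id, quarantine=True):
    """Return (method, path, body) for (un)quarantining one piece of media."""
    action = "quarantine" if quarantine else "unquarantine"
    path = f"/_synapse/admin/v1/media/{action}/{server_name}/{media_id}"
    # Both endpoints are shown above with an empty JSON object as the body.
    return "POST", path, b"{}"


print(quarantine_call("example.org", "abcdefg12345"))
print(quarantine_call("example.org", "abcdefg12345", quarantine=False))
```
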
## Quarantining media in a room

This API quarantines all local and remote media in a room.

Request:

```
POST /_synapse/admin/v1/room/<room_id>/media/quarantine

{}
```

Where `room_id` is in the form of `!roomid12345:example.org`.

Response:

```json
{
  "num_quarantined": 10
}
```

The following fields are returned in the JSON response body:

* `num_quarantined`: integer - The number of media items successfully quarantined

Note that there is a legacy endpoint, `POST
/_synapse/admin/v1/quarantine_media/<room_id>`, that behaves identically.
However, it is deprecated and may be removed in a future release.

## Quarantining all media of a user

This API quarantines all *local* media that a *local* user has uploaded. That is to say, if
you would like to quarantine media uploaded by a user on a remote homeserver, you should
instead use one of the other APIs.

Request:

```
POST /_synapse/admin/v1/user/<user_id>/media/quarantine

{}
```

URL Parameters

* `user_id`: string - User ID in the form of `@bob:example.org`

Response:

```json
{
  "num_quarantined": 10
}
```

The following fields are returned in the JSON response body:

* `num_quarantined`: integer - The number of media items successfully quarantined

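The room and user bulk-quarantine endpoints have the same shape, so a tool could choose between them from the identifier's leading sigil (`!` for rooms, `@` for users). The sketch below illustrates that; the helper name `bulk_quarantine_path` is hypothetical, and percent-encoding the identifier is a general URL precaution rather than something stated above.

```python
from urllib.parse import quote


def bulk_quarantine_path(target):
    """Return the bulk quarantine path for a room ID or a local user ID."""
    encoded = quote(target, safe="")
    if target.startswith("!"):
        return f"/_synapse/admin/v1/room/{encoded}/media/quarantine"
    if target.startswith("@"):
        return f"/_synapse/admin/v1/user/{encoded}/media/quarantine"
    raise ValueError(f"expected a room ID or user ID, got {target!r}")


print(bulk_quarantine_path("!roomid12345:example.org"))
print(bulk_quarantine_path("@bob:example.org"))
```
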
## Protecting media from being quarantined

This API protects a single piece of local media from being quarantined using the
above APIs. This is useful for sticker packs and other shared media which you do
not want to get quarantined, especially when
[quarantining media in a room](#quarantining-media-in-a-room).

Request:

```
POST /_synapse/admin/v1/media/protect/<media_id>

{}
```

Where `media_id` is in the form of `abcdefg12345...`.

Response:

```json
{}
```

## Unprotecting media from being quarantined

This API removes the protection from a piece of media, so that it can be
quarantined again.

Request:

```
POST /_synapse/admin/v1/media/unprotect/<media_id>

{}
```

Where `media_id` is in the form of `abcdefg12345...`.

Response:

```json
{}
```

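Because the protect and unprotect endpoints take only a `media_id`, and protection is described above as applying to local media, a client starting from an `mxc://` URI may want to check the server name first. This is a sketch under that assumption; `protect_path` and its `local_server_name` argument are hypothetical names.

```python
def protect_path(mxc_uri, local_server_name, protect=True):
    """Return the (un)protect path for a local mxc URI, rejecting remote media."""
    prefix = "mxc://"
    if not mxc_uri.startswith(prefix):
        raise ValueError(f"not an mxc URI: {mxc_uri!r}")
    server_name, _, media_id = mxc_uri[len(prefix):].partition("/")
    if not media_id:
        raise ValueError(f"malformed mxc URI: {mxc_uri!r}")
    if server_name != local_server_name:
        # The endpoints carry no server name, so only local media can be addressed.
        raise ValueError(f"{mxc_uri!r} is not local to {local_server_name!r}")
    action = "protect" if protect else "unprotect"
    return f"/_synapse/admin/v1/media/{action}/{media_id}"


print(protect_path("mxc://example.org/abcdefg12345", "example.org"))
print(protect_path("mxc://example.org/abcdefg12345", "example.org", protect=False))
```
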
## Listing quarantined media changes

When media is quarantined or unquarantined, a change record is created in the
database. This API returns those change records in the order they were created.

**Note**: This API should be considered *best-effort*: records may be missing or
duplicated. Currently, it only captures media explicitly (un)quarantined through
the media quarantine admin API; the other cases are tracked by
https://github.com/element-hq/synapse/issues/19672. Historical media uploaded before
Synapse 1.152.0 is backfilled in a background update on a best-effort basis.

Each page has a maximum of 100 records. The first page has the oldest records,
paginating forwards with each `next_batch` value.

Request:

```
GET /_synapse/admin/v1/media/quarantine_changes?from=2
```

Where `from` is the `next_batch` value from a previous request. It is optional
and defaults to the first page (the value `0`).

Response:

```json
{
  "next_batch": 4,
  "changes": [
    { "origin": "example.org", "media_id": "abcdefg12345...", "quarantined": true },
    { "origin": "example.org", "media_id": "abcdefg12345...", "quarantined": false },
    { "origin": "another.example.org", "media_id": "abcdefg12345...", "quarantined": true }
  ]
}
```

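A consumer would typically page through the change records and fold them into the latest known state per piece of media, which also makes duplicate records harmless. The sketch below does this against canned pages rather than a live server. It assumes that an empty `changes` list (or a `next_batch` that does not advance) marks the end of the available records, which is not stated above; `fetch_page`, `iter_quarantine_changes` and `latest_state` are hypothetical names.

```python
def iter_quarantine_changes(fetch_page, start=0):
    """Yield change records page by page.

    fetch_page(token) must return the decoded JSON body for `?from=<token>`.
    """
    token = start
    while True:
        page = fetch_page(token)
        changes = page.get("changes", [])
        if not changes:
            return  # Assumption: an empty page means we have caught up.
        yield from changes
        next_token = page["next_batch"]
        if next_token == token:
            return  # Guard against a token that does not advance.
        token = next_token


def latest_state(changes):
    """Map (origin, media_id) to its most recent `quarantined` value."""
    state = {}
    for change in changes:
        state[(change["origin"], change["media_id"])] = change["quarantined"]
    return state


# Canned pages standing in for real HTTP responses.
PAGES = {
    0: {"next_batch": 2, "changes": [
        {"origin": "example.org", "media_id": "abcdefg12345", "quarantined": True},
        {"origin": "example.org", "media_id": "abcdefg12345", "quarantined": False},
    ]},
    2: {"next_batch": 3, "changes": [
        {"origin": "another.example.org", "media_id": "abcdefg12345", "quarantined": True},
    ]},
    3: {"next_batch": 3, "changes": []},
}

state = latest_state(iter_quarantine_changes(PAGES.__getitem__))
print(state)
```

In real use, `fetch_page` would issue an authenticated `GET /_synapse/admin/v1/media/quarantine_changes?from=<token>` and decode the JSON body; persisting the last `next_batch` value lets a later run resume from where the previous one stopped.
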
# Delete local media

This API deletes the *local* media from the disk of your own server.
This includes any local thumbnails and copies of media downloaded from
remote homeservers.
This API will not affect media that has been uploaded to external
media repositories (e.g. https://github.com/turt2live/matrix-media-repo/).
See also [Purge Remote Media API](#purge-remote-media-api).

## Delete a specific local media

Delete a specific `media_id`.

Request:

```
DELETE /_synapse/admin/v1/media/<server_name>/<media_id>

{}
```

URL Parameters

* `server_name`: string - The name of your local server (e.g. `matrix.org`)
* `media_id`: string - The ID of the media (e.g. `abcdefghijklmnopqrstuvwx`)

Response:

```json
{
  "deleted_media": [
    "abcdefghijklmnopqrstuvwx"
  ],
  "total": 1
}
```

The following fields are returned in the JSON response body:

* `deleted_media`: an array of strings - List of deleted `media_id`
* `total`: integer - Total number of deleted `media_id`

## Delete local media by date or size

Request:

```
POST /_synapse/admin/v1/media/delete?before_ts=<before_ts>

{}
```

*Deprecated in Synapse v1.78.0:* This API is also available at the deprecated endpoint:

```
POST /_synapse/admin/v1/media/<server_name>/delete?before_ts=<before_ts>

{}
```

URL Parameters

* `server_name`: string - The name of your local server (e.g. `matrix.org`). *Deprecated in Synapse v1.78.0.*
* `before_ts`: string representing a positive integer - Unix timestamp in milliseconds.
  Files that were last used before this timestamp will be deleted. It is the timestamp of
  last access, not the timestamp when the file was created.
* `size_gt`: Optional - string representing a positive integer - Size of the media in bytes.
  Files that are larger will be deleted. Defaults to `0`.
* `keep_profiles`: Optional - string representing a boolean - Switch to also delete files
  that are still used in image data (e.g. user profile, room avatar).
  If `false`, these files will be deleted. Defaults to `true`.

Response:

```json
{
  "deleted_media": [
    "abcdefghijklmnopqrstuvwx",
    "abcdefghijklmnopqrstuvwz"
  ],
  "total": 2
}
```

The following fields are returned in the JSON response body:

* `deleted_media`: an array of strings - List of deleted `media_id`
* `total`: integer - Total number of deleted `media_id`

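Since `before_ts` is in milliseconds and the other parameters are passed as strings in the query, building the URL is a common place for mistakes (for example passing seconds instead of milliseconds). The following sketch shows one way to assemble the non-deprecated path; the helper names `to_ms` and `delete_media_path` are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_ms(moment):
    """Convert a timezone-aware datetime to a Unix timestamp in milliseconds."""
    return int(moment.timestamp() * 1000)


def delete_media_path(before, size_gt=0, keep_profiles=True):
    """Return the request path for deleting local media by date or size."""
    query = urlencode({
        "before_ts": to_ms(before),
        "size_gt": size_gt,
        # Booleans are sent as the strings "true" / "false".
        "keep_profiles": "true" if keep_profiles else "false",
    })
    return "/_synapse/admin/v1/media/delete?" + query


cutoff = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(delete_media_path(cutoff, size_gt=1024))
```
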
## Delete media uploaded by a user

You can find details of how to delete multiple media uploaded by a user in
[User Admin API](user_admin_api.md#delete-media-uploaded-by-a-user).

# Purge Remote Media API

The purge remote media API allows server admins to purge old cached remote media.

The API is:

```
POST /_synapse/admin/v1/purge_media_cache?before_ts=<unix_timestamp_in_ms>

{}
```

URL Parameters

* `before_ts`: string representing a positive integer - Unix timestamp in milliseconds.
  All cached media that was last accessed before this timestamp will be removed.

Response:

```json
{
  "deleted": 10
}
```

The following fields are returned in the JSON response body:

* `deleted`: integer - The number of media items successfully deleted

If the user re-requests purged remote media, Synapse will re-request the media
from the originating server.
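A periodic clean-up job would usually express the cutoff as a retention window ("keep the last N days") rather than a fixed date. The sketch below derives `before_ts` that way; the helper name `purge_path` and the `keep_days` parameter are hypothetical, and the current time is passed in explicitly so the calculation is easy to check.

```python
from datetime import datetime, timedelta, timezone


def purge_path(now, keep_days):
    """Return the purge path for remote media last accessed over keep_days ago."""
    cutoff = now - timedelta(days=keep_days)
    before_ts = int(cutoff.timestamp() * 1000)  # milliseconds, as required
    return f"/_synapse/admin/v1/purge_media_cache?before_ts={before_ts}"


# Fixed "now" for a reproducible example; a real job would use
# datetime.now(timezone.utc).
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
print(purge_path(now, keep_days=30))
```
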