We introduce a new interface `ManagementRoomDetail`, and our implementation of this has immediate access to the room members and room state.
Immediately, this allows us to warn when the management room is public.
In the future, it gives us a natural place to decide things such as whether membership of the management room is enough to be considered a moderator, to introduce more redundancy in access control, and to give capabilities a way to determine who is a moderator (and avoid enacting consequences against them).
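As a rough illustration of the idea, the sketch below shows what such an interface and the public-room warning could look like. The method names (`isRoomPublic`, `isModerator`), the `StaticManagementRoomDetail` class, and the warning text are assumptions made for this example; the real `ManagementRoomDetail` may be shaped differently.

```typescript
// Join rules defined by the Matrix spec for m.room.join_rules.
type JoinRule =
  | "public"
  | "invite"
  | "knock"
  | "restricted"
  | "knock_restricted"
  | "private";

// Hypothetical shape: the real `ManagementRoomDetail` interface may differ.
interface ManagementRoomDetail {
  readonly managementRoomID: string;
  // True when the join rule lets anyone join the room.
  isRoomPublic(): boolean;
  // True when the user is currently treated as a moderator.
  isModerator(userID: string): boolean;
}

// Hypothetical implementation backed by a snapshot of room state and members.
class StaticManagementRoomDetail implements ManagementRoomDetail {
  readonly managementRoomID: string;
  private readonly joinRule: JoinRule;
  private readonly joinedMembers: ReadonlySet<string>;

  constructor(
    managementRoomID: string,
    joinRule: JoinRule,
    joinedMembers: ReadonlySet<string>
  ) {
    this.managementRoomID = managementRoomID;
    this.joinRule = joinRule;
    this.joinedMembers = joinedMembers;
  }

  isRoomPublic(): boolean {
    return this.joinRule === "public";
  }

  // Assumption: membership of the management room implies moderator.
  isModerator(userID: string): boolean {
    return this.joinedMembers.has(userID);
  }
}

// Returns a warning for the management room, or undefined if none is needed.
function publicRoomWarning(detail: ManagementRoomDetail): string | undefined {
  return detail.isRoomPublic()
    ? `The management room ${detail.managementRoomID} is public; anyone who joins could be treated as a moderator.`
    : undefined;
}
```

Keeping the membership check behind `isModerator` means the policy for who counts as a moderator can change later without touching callers.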
* Move management room to its own folder so we can start introspecting on it.
* Add ManagementRoomDetail.ts
This is just used to track who is a moderator and whether the
management room is public.
* Update ManagementRoomOutput to depend on ManagementRoomDetail.
This should allow us to implement the feature that warns when
the management room is public.
* Send a warning if the management room is public.
Fixes https://github.com/the-draupnir-project/Draupnir/issues/413.
* Update to MPS 1.7.0 so we can get the join rule event.
* Update matrix-appservice-bridge and use our own alias for matrix-bot-sdk
* Bump node version to support matrix-appservice-bridge
* Bump node version in CI
* Fix comments
* Add changelog entry
We have a lot of verbose headers, and I think now is the best opportunity we have to become REUSE compliant, given that we just did two other similar maintenance changes (prettier, and typescript 5 & eslint 9 & typescript-eslint).
* synapse_antispam reuse headers.
* delete old unused tslint.json.
* Add REUSE to pre-commit config.
* reuse info for config directory.
* Migrate to eslint-9 strictTypeChecked & typescript 5.
* Update to MPS 0.23.0.
Required for strict type checks.
* Looks like we found a test that was complete garbage, amazing really.
* FIXUP
* Well, the command handler was bugged previously...
The command handler used to always only return the command
without the prefix due to an operator precedence bug.
This meant that when we made the order of operations explicit,
we were now including the prefix of the command in the copy.
So when we parsed arguments the code wasn't expecting the prefix
to be there.
* update to MPS 0.23.1.
MPS 0.23.0 was bugged because we didn't enable
`noUncheckedIndexedAccess` while upgrading to typescript 5.
* Make sure eslint runs on all ts files.
* eslint fixes.
* enable `noUncheckedIndexedAccess` & `exactOptionalPropertyTypes`.
* eslint ignores is clearly not understood by me.
* Update SuperCoolStream for eslint and ts5.
* Stricter eslint done, I think.
* Whoops, added an `.only` somewhere.
* Update MPS.
* Fix broken test related things.
* Well I guess that part of getMessagesByUserIn was part of the interface.
* Fix redactionCommandTest.
* Account for escapeHTML in tests.
* Fix tests.
* stuff not matching with .editorconfig fixes.
* Fix appservice webAPI test.
* Update for MPS 0.23.3.
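The operator precedence bug in the command handler mentioned above is easy to reproduce in isolation. The sketch below is a hypothetical reconstruction of this class of bug, not the actual handler code: the `!draupnir` prefix and all function names are assumptions made for the example.

```typescript
// Hypothetical prefix, used only for this illustration.
const COMMAND_PREFIX = "!draupnir";

// Buggy: `+` binds tighter than `?:`, so the condition is the always-truthy
// string `COMMAND_PREFIX + <boolean>` and the result is always `rest` alone.
function normalizeBuggy(rest: string): string {
  return COMMAND_PREFIX + rest.startsWith(" ") ? rest : " " + rest;
}

// Fixed: parentheses make the intended grouping explicit, so the prefix is kept.
function normalizeFixed(rest: string): string {
  return COMMAND_PREFIX + (rest.startsWith(" ") ? rest : " " + rest);
}

// Argument parsing that relied on the prefix being absent must now strip it.
function parseArguments(body: string): string[] {
  const withoutPrefix = body.startsWith(COMMAND_PREFIX)
    ? body.slice(COMMAND_PREFIX.length)
    : body;
  return withoutPrefix
    .trim()
    .split(/\s+/)
    .filter((part) => part.length > 0);
}
```

Once the grouping is made explicit, the normalized body starts with the prefix, so any code that consumed the old output has to account for it, as `parseArguments` does here.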
This adds a `/healthz` endpoint to the appservice which allows this to work more nicely in kubernetes.
It also adds some metrics for tracking the provisioning state.
Grafana result: (screenshot not included)

Note: The `ts-ignore` comments are sadly required since the `_getValue` method is not public :/ I didn't find another solution apart from maybe tracking the value elsewhere.
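To make the moving parts concrete, here is a minimal sketch of the pattern: gauges looked up by name so that a second registration reuses the first, a single helper that owns the gauge arithmetic, and a trivial health probe. `SketchGauge` and `SketchRegistry` are hypothetical stand-ins rather than the Prometheus client's API, and the metric names and the `200`/`ok` response are assumptions.

```typescript
// Hypothetical stand-in for a Prometheus gauge; not the prom-client API.
class SketchGauge {
  private current = 0;
  inc(by: number = 1): void {
    this.current += by;
  }
  dec(by: number = 1): void {
    this.current -= by;
  }
  // The value is tracked here, so no private accessor has to be reached into.
  read(): number {
    return this.current;
  }
}

class SketchRegistry {
  private readonly gauges = new Map<string, SketchGauge>();

  // Reuse an existing gauge instead of registering a duplicate.
  getOrCreateGauge(metricName: string): SketchGauge {
    let gauge = this.gauges.get(metricName);
    if (gauge === undefined) {
      gauge = new SketchGauge();
      this.gauges.set(metricName, gauge);
    }
    return gauge;
  }
}

// Utility that moves one bot from one provisioning state to another.
function moveProvisioningState(
  registry: SketchRegistry,
  from: string | undefined,
  to: string
): void {
  if (from !== undefined) {
    registry.getOrCreateGauge(from).dec();
  }
  registry.getOrCreateGauge(to).inc();
}

// Assumption: the probe only needs a fast success response.
function healthz(): { status: number; body: string } {
  return { status: 200, body: "ok" };
}
```

Looking gauges up by name is one way to stay safe when the appservice is constructed more than once in the same process, for example in tests.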
* Add health endpoint to appservice and add metrics via prometheus
* Ensure that we don't have duplicate metrics when the appservice is registered multiple times
* Move gauge modifications to utils function
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix typo
* Do not interrupt redact sequences due to error when backfilling
... Mainly timeouts.
* Change caught redaction error LogLevel from DEBUG to ERROR.
From matrix-org/mjolnir#479
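A minimal sketch of the pattern behind the last two items: each step of the sequence is wrapped in its own `try`/`catch`, so a timeout on one event is logged as an error and the remaining events are still processed. The `redactAll` function and its callback types are hypothetical and only illustrate the control flow.

```typescript
// Hypothetical callback types for this illustration.
type Redactor = (roomID: string, eventID: string) => Promise<void>;
type ErrorLogger = (message: string, error: unknown) => void;

// Redacts every event in order and returns the IDs that could not be redacted.
async function redactAll(
  roomID: string,
  eventIDs: string[],
  redact: Redactor,
  logError: ErrorLogger
): Promise<string[]> {
  const failed: string[] = [];
  for (const eventID of eventIDs) {
    try {
      await redact(roomID, eventID);
    } catch (error) {
      // Report as an error rather than hiding it at debug level,
      // then carry on with the rest of the sequence.
      logError(`Unable to redact ${eventID} in ${roomID}`, error);
      failed.push(eventID);
    }
  }
  return failed;
}
```

Returning the failed IDs leaves it to the caller to decide whether to retry them or report them.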
---------
Co-authored-by: Marco Cirillo <marco.cirillo@aria-net.org>
So as a history lesson.
The Matrix Bot SDK uses the npm library "request".
When there was a http error, matrix-bot-sdk
would literally throw the response object.
This would be a horrible 1000+line long thing to accidentally
be logged to the console via node's inspect.
Though it was inevitable since you can't be sure every catch
was handling the error correctly. Regardless, the solution
developed at Element was to create an error object
that had concise details.
This was great, however, within the matrix-bot-sdk there is
[this](https://github.com/Half-Shot/matrix-bot-sdk/blob/f58d7ea6e24d1db8b9b8009dea4cd97cbff54d0c/src/http.ts#L66)
awful line of code which logs every http error as error using the
matrix-bot-sdk logger.
This wastes so much log space and causes alarm fatigue.
Rather than muting the module, the action taken instead
was to redact stack traces from http errors.
This was not a good idea.
Eventually matrix-bot-sdk was updated to have a MatrixError type
when a request was performed via the client-server api that had an
error.
matrix-appservice-bridge depends upon this and so Mjolnir now needs
to be updated to throw MatrixErrors.
We have gone a step further in this commit and also muted
the matrix-bot-sdk http module, to stop the alarm fatigue problem
https://github.com/turt2live/matrix-bot-sdk/pull/158
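To illustrate the "concise error" idea from the history above, here is a small sketch that turns a bulky thrown response into an `Error` carrying only the useful fields, while leaving the stack trace alone. `VerboseHttpResponse`, `ConciseMatrixError`, and `toConciseError` are hypothetical names for this example and are not the SDK's `MatrixError` type.

```typescript
// Hypothetical shape of the bulky response object that used to be thrown.
interface VerboseHttpResponse {
  statusCode: number;
  body: { errcode?: string; error?: string };
  // Headers, sockets, request internals and so on.
  [other: string]: unknown;
}

// Hypothetical concise error; deliberately a real Error so the stack is kept.
class ConciseMatrixError extends Error {
  readonly errcode: string;
  readonly statusCode: number;

  constructor(errcode: string, message: string, statusCode: number) {
    super(message);
    this.name = "ConciseMatrixError";
    this.errcode = errcode;
    this.statusCode = statusCode;
  }
}

// Keep only the fields worth logging; fall back to generic values if absent.
function toConciseError(response: VerboseHttpResponse): ConciseMatrixError {
  return new ConciseMatrixError(
    response.body.errcode ?? "M_UNKNOWN",
    response.body.error ?? "Unknown error",
    response.statusCode
  );
}
```

Because the result is an ordinary `Error`, logging it prints a short message plus a stack trace instead of the whole response object.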
* Refactor Matrix event listener in Mjolnir and ManagedMjolnir.
closes https://github.com/matrix-org/mjolnir/issues/411.
Issue #411 says that we have to be careful about room.join,
but this was before we figured how to make matrix-appservice-bridge
echo events sent by its own intents.
* Remove MatrixClientListener since it isn't actually needed.
* Protect which config values can be used for ManagedMjolnirs.
* Introduce MatrixSendClient
so listeners aren't accidentally added to a MatrixClient instead
of MatrixEmitter.
* doc
* Move provisioned mjolnir config to src/config.
This just aids maintenance so whenever someone goes to change the config
of the bot they will see this and update it.
* doc for matrix intent listener.
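The send-only client idea from the list above can be sketched with a mapped type that removes listener registration at the type level. The `SketchMatrixClient` class and `announce` function are hypothetical stand-ins, and the real `MatrixSendClient` may omit a different set of members.

```typescript
// Hypothetical stand-in; the real emitter interface has more members.
interface MatrixEmitter {
  on(eventName: string, listener: (roomID: string, event: unknown) => void): void;
}

// Hypothetical stand-in for a client that can both send and emit.
class SketchMatrixClient implements MatrixEmitter {
  readonly sent: string[] = [];

  on(_eventName: string, _listener: (roomID: string, event: unknown) => void): void {
    // Listener bookkeeping is irrelevant to this sketch.
  }

  sendText(roomID: string, text: string): void {
    this.sent.push(`${roomID}: ${text}`);
  }
}

// A send-only view of the client: `on` no longer exists on this type.
type MatrixSendClient = Omit<SketchMatrixClient, "on">;

function announce(client: MatrixSendClient, roomID: string): void {
  client.sendText(roomID, "hello");
  // client.on("room.message", () => {}) would be a compile-time error here.
}
```

Code that only receives a `MatrixSendClient` cannot attach listeners by accident, so listeners have to go through whatever is explicitly passed as the emitter.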
The Sentry package is very useful for monitoring runtime errors. With this PR,
we simply add the necessary mechanism to:
- log to sentry any uncaught error that reaches the toplevel, including startup errors.
* Attempt to factor out protected rooms from Mjolnir.
This is useful to the appservice because it means we don't
have to wrap a Mjolnir that is designed to sync.
It's also useful if we later on want to have specific
settings per space.
It's also just a nice separation between Mjolnir's needs while
syncing via client-server and the behaviour of syncing policy rooms.
### Things that have changed
- `ErrorCache` no longer a static class (phew), gets used by `ProtectedRooms`.
- `ManagementRoomOutput` class gets created to handle logging back to the management room.
- Responsibilities for syncing member bans and server ACL are handled by `ProtectedRooms`.
- Responsibilities for watched lists should be moved to `ProtectedRooms` if they haven't been.
- `EventRedactionQueue` is moved to `ProtectedRooms` since this needs to happen after
member bans.
- ApplyServerAcls moved to `ProtectedRooms`
- ApplyMemberBans moved to `ProtectedRooms`
- `logMessage` and `replaceRoomIdsWithPills` moved to `ManagementRoomOutput`.
- `resyncJoinedRooms` has been made a little more clear, though I am concerned about how often it does run because it does seem expensive.
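As a sketch of why an instance-based `ErrorCache` matters, the example below gives each set of protected rooms its own cache, so two bots in one process do not suppress each other's error reports. The method names, the single fixed interval, and the `SketchProtectedRooms` class are assumptions for illustration only.

```typescript
// Instance-based cache: decides whether an error should be reported again.
class ErrorCache {
  private readonly lastReported = new Map<string, number>();
  private readonly intervalMs: number;

  constructor(intervalMs: number) {
    this.intervalMs = intervalMs;
  }

  // Returns true if the error should be reported now, false if it was
  // already reported within the interval.
  triggerError(roomID: string, kind: string, now: number = Date.now()): boolean {
    const key = `${roomID}|${kind}`;
    const last = this.lastReported.get(key);
    if (last !== undefined && now - last < this.intervalMs) {
      return false;
    }
    this.lastReported.set(key, now);
    return true;
  }

  resetError(roomID: string, kind: string): void {
    this.lastReported.delete(`${roomID}|${kind}`);
  }
}

// Hypothetical owner of the cache; each instance gets its own state.
class SketchProtectedRooms {
  private readonly errorCache: ErrorCache;

  constructor(errorCache: ErrorCache) {
    this.errorCache = errorCache;
  }

  shouldReport(roomID: string, kind: string, now?: number): boolean {
    return this.errorCache.triggerError(roomID, kind, now);
  }
}
```

With static state, the second bot's report for the same room and error kind would have been swallowed; with one cache per instance it is not.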
* ProtectedRooms is not supposed to track joined rooms.
The reason is because it is supposed to represent a specific
set of rooms to protect, not do horrible logic
for working out what rooms mjolnir is supposed to protect.
* Stop the config being global (in almost all contexts).
* make sure unit test has a config
* Make failing word list more visible
* Only use Healthz from index.ts
Not really sure how useful it is anyways?
* Remove debug leftovers from a test.
This is really terrible and has meant whenever anyone has run `yarn test:integration` they have only been running this test.
💀💀💀 https://www.youtube.com/watch?v=jmX-tzSOFE0
* Set a default timeout for integration tests that is 5 minutes long.
Seriously, I don't think there is much to gain by making people guess
a reasonable time for a test to complete in all the time, especially
with how much Synapse changes in response time and all of the machines
involved in running these tests.
* Warn when giving up on being throttled
* For some reason it takes longer for events to appear in /state
no i am not going to track down why yet.
* Rate limiting got a lot more aggressive.
https://github.com/matrix-org/synapse/pull/13018
Rate limiting in Synapse used to reset the burst count and remove
the backoff when you were spamming continuously, now it doesn't.
Ideally we'd rewrite the rate limiting logic to back off for longer
than suggested so we could get burst again, but for now
let's just unblock CI by reducing the number of events we send in these
tests.
* Remove the need to call `/initialSync` in `getMessagesByUserIn`.
At the moment we call `/initialSync` to give a `from` token to `/messages`.
In this PR we instead do not provide a `from` token when calling `/messages`,
which has recently been permitted in the spec.
Technically this is still unstable in the spec:
https://spec.matrix.org/unstable/client-server-api/#get_matrixclientv3roomsroomidmessages
https://github.com/matrix-org/matrix-spec/pull/1002
Synapse has supported this for over 2 years and Element web depends on it for threads.
https://github.com/matrix-org/matrix-js-sdk/pull/2065
Given that redactions are super heavy in Mjolnir already and have been reported
as barely functional on matrix.org I believe we should also adopt this approach as
if for some reason the spec did change before the next release (1.3) (extremely unlikely) we can revert this commit.
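A small sketch of the request shape described above: the first call to `/messages` carries no `from` token and walks backwards from the most recent event, and each later call uses the `end` token of the previous response. The `MessagesQuery` type and both helper functions are hypothetical and model only the parameters relevant here.

```typescript
// Assumption: only the query parameters relevant to this change are modelled.
interface MessagesQuery {
  dir: "b" | "f";
  limit: number;
  from?: string; // omitted on the first request
}

// Builds the request path; `from` is only added when a token is available.
function messagesPath(roomID: string, query: MessagesQuery): string {
  const params = [`dir=${query.dir}`, `limit=${query.limit}`];
  if (query.from !== undefined) {
    params.push(`from=${encodeURIComponent(query.from)}`);
  }
  return `/_matrix/client/v3/rooms/${encodeURIComponent(roomID)}/messages?${params.join("&")}`;
}

// The `end` token of one response becomes the `from` token of the next
// request; a missing `end` means there is nothing further to fetch.
function nextQuery(
  previous: MessagesQuery,
  end: string | undefined
): MessagesQuery | undefined {
  return end === undefined
    ? undefined
    : { dir: previous.dir, limit: previous.limit, from: end };
}
```

This removes the extra round trip that was previously needed only to obtain a starting token.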
This is so that the context of failing callbacks is not lost.
We also await during pagination and not after so that if a call to the callback fails, we will not call it again.
Related to https://github.com/matrix-org/mjolnir/pull/132.
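The behaviour described above can be sketched as a pagination loop that awaits the callback before fetching the next page. The `MessagesPage` and `FetchPage` types and the `paginate` function are hypothetical; the point is only where the `await` sits.

```typescript
// Hypothetical page shape: a chunk of event IDs and an optional next token.
type MessagesPage = { chunk: string[]; end?: string };
type FetchPage = (from: string | undefined) => Promise<MessagesPage>;

async function paginate(
  fetchPage: FetchPage,
  callback: (events: string[]) => Promise<void> | void
): Promise<void> {
  let token: string | undefined = undefined;
  do {
    const page: MessagesPage = await fetchPage(token);
    // Awaiting inside the loop means a failing callback rejects right here,
    // with its own context, and no further page is fetched or passed to it.
    await callback(page.chunk);
    token = page.end;
  } while (token !== undefined);
}
```

If the callback promises were collected and awaited only after the loop, every page would already have been handed to a callback that had failed on an earlier one.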
The old code would call `/sync` with this filter. If a token was
provided in the response of `/sync` for earlier messages, it would
then use this same filter to call `/rooms/messages`. However, this
filter does not do anything on that endpoint when we know the id of
the sender, as it requires a RoomEventFilter and there is no warning
or error from synapse about the structure of the filter being wrong.
This was not noticed until after the related PR because `/sync` with
the filter would usually be able to provide a user's
entire history in one room. This is because in most cases a user is banned/redacted
shortly after joining a room.
In the case that `/rooms/messages` was called for more events, the method would
always paginate the timeline up until the limit or the end of the room
history, which is only the expected behavior when matching the sender
with a "glob".
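The difference between the two filter shapes can be sketched as follows. In the Matrix spec, `/sync` takes a top-level filter in which the event filter is nested under `room.timeline`, while the `filter` parameter of `/messages` takes a `RoomEventFilter` directly. The helper functions and the glob check are hypothetical, and the interfaces model only a subset of the spec's fields.

```typescript
// Subset of the spec's RoomEventFilter; /messages accepts this directly.
interface RoomEventFilter {
  senders?: string[];
  types?: string[];
  limit?: number;
}

// Subset of the spec's top-level filter for /sync; the event filter is nested.
interface SyncFilter {
  room: { timeline: RoomEventFilter };
}

// Assumption: a sender containing `*` or `?` is treated as a glob.
function isGlob(sender: string): boolean {
  return sender.includes("*") || sender.includes("?");
}

// Filter for /messages: exact user IDs can be filtered by the server, while
// globs cannot be expressed in `senders` and must be matched client-side.
function messagesFilterFor(sender: string): RoomEventFilter | undefined {
  return isGlob(sender) ? undefined : { senders: [sender] };
}

// Filter for /sync: the same event filter, wrapped in the nested structure.
function syncFilterFor(sender: string): SyncFilter {
  return { room: { timeline: messagesFilterFor(sender) ?? {} } };
}
```

Passing the nested `/sync` structure to `/messages` is silently ineffective, which is why paginating the full timeline should only be the fallback for glob matches.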