6 Commits

Author SHA1 Message Date
Eric Eastwood
a1e9abc7df Add Prometheus HTTP service discovery endpoint for easy discovery of all workers in Docker image (#19336)
Add Prometheus [HTTP service discovery](https://prometheus.io/docs/prometheus/latest/http_sd/)
endpoint for easy discovery of all workers in Docker image.

Follow-up to https://github.com/element-hq/synapse/pull/19324

Spawning from wanting to [run a load
test](https://github.com/element-hq/synapse-rust-apps/pull/397) against
the Complement Docker image of Synapse and see metrics from the
homeserver.


`GET http://<synapse_container>:9469/metrics/service_discovery`
```json5
[
  {
    "targets": [ "<host>", ... ],
    "labels": {
      "<labelname>": "<labelvalue>", ...
    }
  },
  ...
]
```

The metrics from each worker can also be accessed via
`http://<synapse_container>:9469/metrics/worker/<worker_name>` which is
what the service discovery response points to behind the scenes. This
way, you only need to expose a single port (9469) to access all metrics.

<details>
<summary>Real HTTP service discovery response</summary>

```json5
[
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "event_persister",
            "index": "1",
            "__metrics_path__": "/metrics/worker/event_persister1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "event_persister",
            "index": "2",
            "__metrics_path__": "/metrics/worker/event_persister2"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "background_worker",
            "index": "1",
            "__metrics_path__": "/metrics/worker/background_worker1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "event_creator",
            "index": "1",
            "__metrics_path__": "/metrics/worker/event_creator1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "user_dir",
            "index": "1",
            "__metrics_path__": "/metrics/worker/user_dir1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "media_repository",
            "index": "1",
            "__metrics_path__": "/metrics/worker/media_repository1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "federation_inbound",
            "index": "1",
            "__metrics_path__": "/metrics/worker/federation_inbound1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "federation_reader",
            "index": "1",
            "__metrics_path__": "/metrics/worker/federation_reader1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "federation_sender",
            "index": "1",
            "__metrics_path__": "/metrics/worker/federation_sender1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "synchrotron",
            "index": "1",
            "__metrics_path__": "/metrics/worker/synchrotron1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "client_reader",
            "index": "1",
            "__metrics_path__": "/metrics/worker/client_reader1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "appservice",
            "index": "1",
            "__metrics_path__": "/metrics/worker/appservice1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "pusher",
            "index": "1",
            "__metrics_path__": "/metrics/worker/pusher1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "device_lists",
            "index": "1",
            "__metrics_path__": "/metrics/worker/device_lists1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "device_lists",
            "index": "2",
            "__metrics_path__": "/metrics/worker/device_lists2"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "stream_writers",
            "index": "1",
            "__metrics_path__": "/metrics/worker/stream_writers1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "main",
            "index": "1",
            "__metrics_path__": "/metrics/worker/main"
        }
    }
]
```

</details>


And how it ends up as targets in Prometheus
(http://localhost:9090/targets):

(image)


### Testing strategy

1. Make sure your firewall allows the Docker containers to communicate
to the host (`host.docker.internal`) so they can access exposed ports of
other Docker containers. We want to allow Synapse to access the
Prometheus container and Grafana to access to the Prometheus container.
- `sudo ufw allow in on docker0 comment "Allow traffic from the default
Docker network to the host machine (host.docker.internal)"`
- `sudo ufw allow in on br-+ comment "(from Matrix Complement testing)
Allow traffic from custom Docker networks to the host machine
(host.docker.internal)"`
- [Complement firewall
docs](ee6acd9154/README.md (potential-conflict-with-firewall-software))
1. Build the Docker image for Synapse: `docker build -t
matrixdotorg/synapse -f docker/Dockerfile . && docker build -t
matrixdotorg/synapse-workers -f docker/Dockerfile-workers .`
([docs](7a24fafbc3/docker/README-testing.md (building-and-running-the-images-manually)))
 1. Start Synapse:
     ```
    docker run -d --name synapse \
        --mount type=volume,src=synapse-data,dst=/data \
        -e SYNAPSE_SERVER_NAME=my.docker.synapse.server \
        -e SYNAPSE_REPORT_STATS=no \
        -e SYNAPSE_ENABLE_METRICS=1 \
        -p 8008:8008 \
        -p 9469:9469 \
        matrixdotorg/synapse-workers:latest
    ```
    - Also try with workers:
       ```
      docker run -d --name synapse \
          --mount type=volume,src=synapse-data,dst=/data \
          -e SYNAPSE_SERVER_NAME=my.docker.synapse.server \
          -e SYNAPSE_REPORT_STATS=no \
          -e SYNAPSE_ENABLE_METRICS=1 \
          -e SYNAPSE_WORKER_TYPES="\
              event_persister:2, \
              background_worker, \
              event_creator, \
              user_dir, \
              media_repository, \
              federation_inbound, \
              federation_reader, \
              federation_sender, \
              synchrotron, \
              client_reader, \
              appservice, \
              pusher, \
              device_lists:2, \
stream_writers=account_data+presence+receipts+to_device+typing" \
          -p 8008:8008 \
          -p 9469:9469 \
          matrixdotorg/synapse-workers:latest
      ```
1. You should be able to see Prometheus service discovery endpoint at
http://localhost:9469/metrics/service_discovery
 1. Create a Prometheus config (`prometheus.yml`)
    ```yaml
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
      evaluation_interval: 15s
    
    scrape_configs:
      - job_name: synapse
        scrape_interval: 15s
        metrics_path: /_synapse/metrics
        scheme: http
# We set `honor_labels` so that each service can set their own `job`
label
        #
# > honor_labels controls how Prometheus handles conflicts between
labels that are
# > already present in scraped data and labels that Prometheus would
attach
# > server-side ("job" and "instance" labels, manually configured target
# > labels, and labels generated by service discovery implementations).
        # >
# > *--
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config*
        honor_labels: true
        # Use HTTP service discovery
        #
        # Reference:
        #  - https://prometheus.io/docs/prometheus/latest/http_sd/
# -
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config
        http_sd_configs:
          - url: 'http://localhost:9469/metrics/service_discovery'
    ```
1. Start Prometheus (update the volume bind mount to the config you just
saved somewhere):
    ```
    docker run \
        --detach \
        --name=prometheus \
        --add-host host.docker.internal:host-gateway \
        -p 9090:9090 \
-v
~/Documents/code/random/prometheus-config/prometheus.yml:/etc/prometheus/prometheus.yml
\
        prom/prometheus
    ```
1. Make sure you're seeing some data in Prometheus. On
http://localhost:9090/query, search for `synapse_build_info`
 1. Start [Grafana](https://hub.docker.com/r/grafana/grafana)
    ```
docker run -d --name=grafana --add-host
host.docker.internal:host-gateway -p 3000:3000 grafana/grafana
    ```
1. Visit the Grafana dashboard, http://localhost:3000/ (Credentials:
`admin`/`admin`)
1. **Connections** -> **Data Sources** -> **Add data source** ->
**Prometheus**
     - Prometheus server URL: `http://host.docker.internal:9090`
1. Import the Synapse dashboard:
https://github.com/element-hq/synapse/blob/develop/contrib/grafana/synapse.json
2026-01-14 18:02:55 -06:00
Eric Eastwood
bd9a1079bc Update reverse proxy docs with what we've learned from #17986 (#17994)
Update reverse proxy docs with what we've learned from
https://github.com/element-hq/synapse/pull/17986

Also vice versa and update our nginx config with what I learned from the
reverse proxy docs.

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
2024-12-19 14:00:50 +00:00
Eric Eastwood
b257c7ab19 Be able to test /login/sso/redirect in Complement (#17986)
Be able to test `/login/sso/redirect` in Complement

Spawning from
https://github.com/element-hq/sbg/pull/421#discussion_r1854926218 where
we have a proxy that intercepts responses to
`/_matrix/client/v3/login/sso/redirect(/{idpId})` in order to upgrade
them to use OAuth 2.0 Pushed Authorization Requests (PAR). We have some
Complement tests in that codebase that go over this flow and these
changes are required [in order for the URL's to line
up](d648c8ce3f/synapse/rest/client/login.py (L652-L673)).
2024-12-03 12:54:25 +00:00
Jason Little
224ef0b669 Unix Sockets for HTTP Replication (#15708)
Unix socket support for `federation` and `client` Listeners has existed now for a little while(since [1.81.0](https://github.com/matrix-org/synapse/pull/15353)), but there was one last hold out before it could be complete: HTTP Replication communication. This should finish it up. The Listeners would have always worked, but would have had no way to be talked to/at.

---------

Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
Co-authored-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
Co-authored-by: Eric Eastwood <erice@element.io>
2023-07-11 13:08:06 -05:00
reivilibre
4fef76ca34 Remove Caddy from the Synapse workers image used in Complement. (#12818) 2022-05-23 10:29:24 +01:00
Andrew Morgan
7e460ec2a5 Add a dockerfile for running a set of Synapse worker processes (#9162)
This PR adds a Dockerfile and some supporting files to the `docker/` directory. The Dockerfile's intention is to spin up a container with:

* A Synapse main process.
* Any desired worker processes, defined by a `SYNAPSE_WORKERS` environment variable supplied at runtime.
* A redis for worker communication.
* A nginx for routing traffic.
* A supervisord to start all worker processes and monitor them if any go down.

Note that **this is not currently intended to be used in production**. If you'd like to use Synapse workers with Docker, instead make use of the official image, with one worker per container. The purpose of this dockerfile is currently to allow testing Synapse in worker mode with the [Complement](https://github.com/matrix-org/complement/) test suite.

`configure_workers_and_start.py` is where most of the magic happens in this PR. It reads from environment variables (documented in the file) and creates all necessary config files for the processes. It is the entrypoint of the Dockerfile, and thus is run any time the docker container is spun up, recreating all config files in case you want to use a different set of workers. One can specify which workers they'd like to use by setting the `SYNAPSE_WORKERS` environment variable (as a comma-separated list of arbitrary worker names) or by setting it to `*` for all worker processes. We will be using the latter in CI.

Huge thanks to @MatMaul for helping get this all working 🎉 This PR is paired with its equivalent on the Complement side: https://github.com/matrix-org/complement/pull/62.

Note, for the purpose of testing this PR before it's merged: You'll need to (re)build the base Synapse docker image for everything to work (`matrixdotorg/synapse:latest`). Then build the worker-based docker image on top (`matrixdotorg/synapse:workers`).
2021-04-14 13:54:49 +01:00