Had some fun playing with my fork of Uncloud and making metrics work. Various things don’t work out of the box, but can be made to work. For one, Prometheus seems to insist on SRV records, and can’t do DNS service discovery with plain A or AAAA records.

… So…

First we need to work around the fact that DNS sd doesn’t work, so fake it using file sd:

  prom_sd_config:
    content: |
      #!/bin/ash
      while true; do
        nslookup m.internal | awk '/^Address:/ && $2 !~ /:/ {if(!seen){print "- targets:"; seen=1} print "  - " $2 ":51004"}' > /prometheus/uncloud.yaml
        nslookup caddy.internal | awk '/^Address:/ && $2 !~ /:/ {if(!seen){print "- targets:"; seen=1} print "  - " $2 ":9180"}' > /prometheus/caddy.yaml
        sleep 3600
      done
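
To see what that awk filter actually keeps, here is a small sketch that feeds it canned lookup output instead of a live `nslookup`. The `to_file_sd` and `sample_lookup` names and all addresses are made up for illustration, and the sample only assumes that the resolver prints `Address:` lines roughly like this.

```shell
#!/bin/sh
# Hypothetical helper: the same awk filter as in the script above,
# with the port passed in as a variable instead of hardcoded.
to_file_sd() {
  awk -v port="$1" '/^Address:/ && $2 !~ /:/ {
    if (!seen) { print "- targets:"; seen = 1 }
    print "  - " $2 ":" port
  }'
}

# Canned sample output (assumed shape, invented addresses).
sample_lookup() {
  cat <<'EOF'
Server: 127.0.0.11
Address: 127.0.0.11:53

Name: m.internal
Address: 10.210.0.1
Name: m.internal
Address: 10.210.1.1
Name: m.internal
Address: fdcc:1234::1
EOF
}

sample_lookup | to_file_sd 51004
```

The `$2 !~ /:/` test drops every address containing a colon, so both the resolver's own `127.0.0.11:53` line and the IPv6 answer are skipped, and only the two IPv4 addresses end up in the targets list.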

This config puts a shell script in place which is then used by a service:

  unprom-sd:
    image: prom/prometheus
    configs:
      - source: prom_sd_config
        target: /sd_uncloud
        mode: 0770
    volumes:
      - prom_data:/prometheus
    entrypoint: /sd_uncloud

In Prometheus we then use:

      - job_name: uncloud
        file_sd_configs:
        - files:
          - /prometheus/uncloud.yaml
      - job_name: caddy
        file_sd_configs:
        - files:
          - /prometheus/caddy.yaml

To pick it up. Problem 1 solved.
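
One thing to keep in mind with the loop as written: `>` truncates the target file first, so a failed lookup leaves an empty file and the targets disappear until the next round. A possible variant, not what I run above, is to write to a temporary file and rename it into place only when the lookup returned something. The `write_targets` name and the overridable `LOOKUP` variable are assumptions for this sketch.

```shell
#!/bin/sh
# Hypothetical variant of one loop iteration. LOOKUP is overridable so the
# function can be tried without a real resolver (an assumption of this sketch).
LOOKUP="${LOOKUP:-nslookup}"

write_targets() {
  name="$1"; port="$2"; out="$3"
  tmp="$out.tmp.$$"
  # Same filter as before: keep only Address lines without a colon in the value.
  $LOOKUP "$name" 2>/dev/null | awk -v port="$port" '/^Address:/ && $2 !~ /:/ {
    if (!seen) { print "- targets:"; seen = 1 }
    print "  - " $2 ":" port
  }' > "$tmp"
  if [ -s "$tmp" ]; then
    mv "$tmp" "$out"   # rename within one directory, replaces the file in one step
  else
    rm -f "$tmp"       # empty result: keep the previous target list
  fi
}

# Usage inside the loop would look like:
#   write_targets m.internal 51004 /prometheus/uncloud.yaml
#   write_targets caddy.internal 9180 /prometheus/caddy.yaml
```

The temporary file lives next to the real one so the `mv` stays on the same filesystem, and its name does not match the paths listed under `file_sd_configs`.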

Problem 2: Caddy metrics… In the x-caddy file we deploy for Caddy, add:

   metrics {
        per_host
   }

But this only serves metrics on localhost inside the container, so we need to expose them. We can’t just add this snippet:

      http://:9180 {
          handle {
              metrics
          }
      }

Because a site block like this does not belong in the global block.

So, we add another service that only adds this:

services:
  caddy-metrics:
    image: registry.science.ru.nl/cncz/sys/image/debug:v0.1.14
    x-caddy: |
      http://:9180 {
          handle {
              metrics
          }
      }

And deploy this do-nothing-but-update-caddy service. And voilà, we finally have metrics.
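
A quick way to check that `per_host` is doing something is to pull the metrics page and list which hosts show up. This is only a sketch: the `hosts_in_metrics` name is made up, and it assumes the per-host series are named `caddy_http_*` and carry a `host` label.

```shell
#!/bin/sh
# Hypothetical check: read Prometheus text-format metrics on stdin and print
# the distinct values of the host label on caddy_http_* series.
hosts_in_metrics() {
  grep '^caddy_http_' \
    | sed -n 's/.*[{,]host="\([^"]*\)".*/\1/p' \
    | sort -u
}

# Usage (address is a placeholder, use one from /prometheus/caddy.yaml):
#   wget -qO- http://<caddy-address>:9180/metrics | hosts_in_metrics
```

If this prints nothing while the page itself has content, the metrics are being served but not split per host.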