Pi-Hole on Docker Swarm (behind SSL proxy)

This is my simple config for running Pi-Hole on Docker Swarm. pfSense is configured as a DNS forwarder pulling from three docker swarm nodes. I only run one instance of Pi-Hole (multiple instances would fight over the SQLite DB lock), but docker swarm takes care of availability/resiliency.

As I hit Pi-Hole through an SSL-terminating proxy, I set ServerIP to 0.0.0.0. This resolves blocked domains to 0.0.0.0, with no major side effects.

docker service create --name pihole \
    --mount type=bind,src=/data/docker/pihole/pihole,dst=/etc/pihole \
    --mount type=bind,src=/data/docker/pihole/dnsmasq.d,dst=/etc/dnsmasq.d \
    --replicas=1 \
    -e ServerIP=0.0.0.0 \
    -e VIRTUAL_HOST=pihole.my.domain \
    -e WEBPASSWORD=myPassword \
    --publish published=9053,target=80,protocol=tcp \
    --publish published=53,target=53,protocol=tcp \
    --publish published=53,target=53,protocol=udp \
     diginc/pi-hole:debian_dev
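
To sanity-check the deployment you can query any swarm node directly. The node IP below is a placeholder, and the second query assumes the domain is on one of your blocklists:

$ dig +short example.com @192.168.1.10           # placeholder swarm node IP; should resolve normally
$ dig +short some.blocked.domain @192.168.1.10   # a blocklisted domain should come back as 0.0.0.0
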
Deploy Ceph Rados Gateway on Docker Swarm for Proxmox Cluster

I want to use the features exposed by the Ceph Rados Gateway (RGW). While it is possible to install this directly on the Proxmox nodes, it is not supported.

I wondered if I could run the gateway on Docker Swarm. The long story is that I want to try NFS via RGW as an alternative to CephFS (which has been a bit of a pain to manage in the past). Typically you run multiple instances of RGW, but in this case Swarm already provides HA, so perhaps I only need one.

The first official docker image (ceph/radosgw) I found for RGW was ancient – two years old! Not encouraging, but I tried it anyway. It choked with:

connect protocol feature mismatch, my ffffffffffffff < peer 4fddff8eea4fffb missing 400000000000000

Well, that's a clear-as-mud way of saying that my Ceph and RGW versions didn't match. It turns out Ceph doesn't maintain the standalone RGW image and expects people to use the all-in-one ceph/daemon image. Let's try again:

$ docker service create --name radosgw \
     --mount type=bind,src=/data/docker/ceph/etc,dst=/etc/ceph \
     --mount type=bind,src=/data/docker/ceph/lib,dst=/var/lib/ceph/ \
     -e RGW_CIVETWEB_PORT=7480 \
     -e RGW_NAME=proxceph \
     --publish 7480:7480 \
     --replicas=1 \
 ceph/daemon rgw
wzj49gor6tfffs3uv3mdyy9sd
overall progress: 0 out of 1 tasks 
1/1: preparing 
verify: Detected task failure 

This never stabilizes. The logs show why:

2018-06-25 05:16:14  /entrypoint.sh: ERROR- /var/lib/ceph/bootstrap-rgw/ceph.keyring must exist. You can extract it from your current monitor by running 'ceph auth get client.bootstrap-rgw -o /var/lib/ceph/bootstrap-rgw/ceph.keyring',

This file was auto-generated by the PVE Ceph installation, so copy it to the path exposed to the docker service.
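
With the bind mounts used above, /var/lib/ceph inside the container maps to /data/docker/ceph/lib on the host, so the copy looks roughly like this (paths come from the service definition; adjust to wherever your PVE node keeps the keyring):

mkdir -p /data/docker/ceph/lib/bootstrap-rgw
# either copy the keyring PVE generated...
cp /var/lib/ceph/bootstrap-rgw/ceph.keyring /data/docker/ceph/lib/bootstrap-rgw/
# ...or extract it from the monitor, as the error message suggests
ceph auth get client.bootstrap-rgw -o /data/docker/ceph/lib/bootstrap-rgw/ceph.keyring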

2018-06-25 05:23:37  /entrypoint.sh: SUCCESS
exec: PID 197: spawning /usr/bin/radosgw --cluster ceph --setuser ceph --setgroup ceph -d -n client.rgw.proxceph -k /var/lib/ceph/radosgw/ceph-rgw.proxceph/keyring --rgw-socket-path= --rgw-zonegroup= --rgw-zone= --rgw-frontends=civetweb port=0.0.0.0:7480
2018-06-25 05:23:37.584 7fcccee8a8c0  0 framework: civetweb
2018-06-25 05:23:37.584 7fcccee8a8c0  0 framework conf key: port, val: 0.0.0.0:7480
2018-06-25 05:23:37.588 7fcccee8a8c0  0 deferred set uid:gid to 167:167 (ceph:ceph)
2018-06-25 05:23:37.588 7fcccee8a8c0  0 ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable), process (unknown), pid 197
2018-06-25 05:23:49.100 7fcccee8a8c0  0 starting handler: civetweb
2018-06-25 05:23:49.116 7fcccee8a8c0  1 mgrc service_daemon_register rgw.proxceph metadata {arch=x86_64,ceph_release=mimic,ceph_version=ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable),ceph_version_short=13.2.0,cpu=Common KVM processor,distro=centos,distro_description=CentOS Linux 7 (Core),distro_version=7,frontend_config#0=civetweb port=0.0.0.0:7480,frontend_type#0=civetweb,hostname=e326bccb3712,kernel_description=#154-Ubuntu SMP Fri May 25 14:15:18 UTC 2018,kernel_version=4.4.0-128-generic,mem_swap_kb=4190204,mem_total_kb=4046012,num_handles=1,os=Linux,pid=197,zone_id=f2928dc9-3983-46ff-9da9-2987f3639bb6,zone_name=default,zonegroup_id=16963e86-4e7d-4152-99f0-c6e9ae4596a4,zonegroup_name=default}

It looks like RGW auto-created some pools. This was expected.

# ceph osd lspools
[..], 37 .rgw.root,38 default.rgw.control,39 default.rgw.meta,40 default.rgw.log,

Now I need to create a user that can use the RGW. Get a shell on the container:

[root@e326bccb3712 /]# radosgw-admin user create --uid="media" --display-name="media"
{
    "user_id": "media",
    "display_name": "media",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "auid": 0,
    "subusers": [],
    "keys": [
        {
            "user": "media",
            "access_key": "XXXXXXXX",
            "secret_key": "XXXXXXXXXXXXXXXXXXX"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}

The next step is a little beyond the scope of this "guide". I already have haproxy in place with a valid wildcard letsencrypt cert, and I pointed bucket-style subdomains (<bucket>.my.domain) at RGW. For this to work properly you need to set

          rgw dns name = my.domain 

in your ceph.conf.
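
For example, under the client section the gateway registered as in the logs above (putting it in [global] works too):

[client.rgw.proxceph]
    rgw dns name = my.domain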

Let's test everything works:

$ cat s3test.py
import boto.s3.connection

access_key = 'JU1P3CIATBP1IK297D3H'
secret_key = 'sV0dGfVSbClQFvbCUM22YivwcXyUmyQEOBqrDsy6'
conn = boto.connect_s3(
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        host='rgw.my.domain', port=443,
        is_secure=True, calling_format=boto.s3.connection.OrdinaryCallingFormat(),
       )

bucket = conn.create_bucket('my-test-bucket')
for bucket in conn.get_all_buckets():
    print "{name} {created}".format(
        name=bucket.name,
        created=bucket.creation_date,
    )

N.B. rgw.my.domain stands in for the real hostname, which has a valid cert and is directed to RGW on my internal network. Also note that is_secure is set to True, as SSL is terminated at haproxy.

The output:

$ python s3test.py 
my-test-bucket 2018-06-25T06:01:19.258Z

Cool!

Attaching Docker Swarm Services to an Overlay Network

When I originally configured Prometheus with a variety of exporters I had it scraping ports on a specific docker swarm host. This is fragile: if that host goes down, the underlying service will pop back up on a different host, but Prometheus will no longer be able to scrape it. I considered using haproxy to round-robin across the docker swarm nodes, but Kubernetes can resolve services by service name. Is there no way to do this in Docker Swarm?

There is, but unlike Kubernetes, services can't resolve each other by default. We must create a dedicated overlay network and attach the services to it.

Before:

/prometheus $ nslookup unifi_exporter
Server:    127.0.0.11
Address 1: 127.0.0.11

nslookup: can't resolve 'unifi_exporter'

Create overlay network:

sudo docker network create -d overlay monitoring
tb3iw12k7xaw5olz7rasdcnm0

Redeploy Prometheus on network:

docker service create --replicas 1 --name prometheus \
    --mount type=bind,source=/data/docker/prometheus/config/prometheus.yml,destination=/etc/prometheus/prometheus.yml \
    --mount type=bind,src=/data/docker/prometheus/data,dst=/prometheus \
    --publish published=9090,target=9090,protocol=tcp \
    --network monitoring \
    prom/prometheus

Redeploy our exporter, this time attached to the overlay network. Note we no longer need to publish a port.

docker service create --replicas 1 --name unifi_exporter \
    --mount type=bind,src=/data/docker/unifi-exporter/config.yml,dst=/config.yml \
    --mount type=bind,src=/etc/ssl,dst=/etc/ssl,readonly \
    --network monitoring \
    louisvernon/unifi_exporter:0.4.0-18-g85455df -config.file=/config.yml

Confirm Prometheus can resolve the exporter by service name:

/prometheus $ nslookup unifi_exporter
Server:    127.0.0.11
Address 1: 127.0.0.11

Name:      unifi_exporter
Address 1: 10.0.1.15
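
With both services attached to the monitoring network, the Prometheus scrape config can target the exporter by service name instead of a node address. A sketch of the relevant job (the full config appears in the unifi_exporter post below):

  - job_name: 'unifi_exporter'
    static_configs:
      - targets: ['unifi_exporter:9130']
        labels:
          alias: unifi_exporter
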
Netboot.xyz Docker Service

netboot.xyz offers a feature-rich, zero-configuration iPXE boot environment with all the Linux and utility images you could ask for. This is great as we don't have to maintain up-to-date bootable PXE images locally.

We are running netboot.xyz on docker swarm. We found the tftp server did not function correctly over the swarm network, so instead we bound this specific container's networking to the host.

docker service create --replicas 1 --name netbootxyz \
    --constraint node.hostname==specific_node_name \
    --network=host \
     rjocoleman/netboot.xyz
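
Before touching DHCP it is worth confirming the task landed on the intended node and that the TFTP server is listening on the host network. A quick sketch, run on specific_node_name:

$ docker service ps netbootxyz    # the task should be running on specific_node_name
$ ss -lun | grep ':69 '           # TFTP listens on UDP port 69 with host networking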

Finally you need to configure your DHCP server to return the IP for specific_node_name as the Next Server.

[Screenshots: initial menu, Linux distributions, utilities]

N.B. the constraint does mean we lose resiliency, but given the low resources this container requires (it's less than 5MB!) you could easily remove the constraint and replace --replicas=1 with --mode=global. How you expose multiple swarm nodes behind a single virtual IP is a topic for another post.

Unifi to Grafana (using Prometheus and unifi_exporter)

Documenting the process of getting this up and running. We already had Prometheus and Grafana running on our docker swarm cluster (we promise to document this all one day).

There was only one up-to-date image of unifi_exporter on Docker Hub, and it had no documentation, so we were not comfortable using it.

1) Download, build and push unifi_exporter.

$ git clone git@github.com:mdlayher/unifi_exporter.git
...
$ cd unifi_exporter
$ sudo docker build -t louisvernon/unifi_exporter:$(git describe --tags) . # yields a tag like 0.4.0-18-g85455df
$ sudo docker push louisvernon/unifi_exporter:$(git describe --tags)

2) Create a read-only admin user for the unifi_exporter service:

3) Create config.yml on storage mounted on the docker swarm nodes. In our case we have a glusterfs volume mounted across all nodes. If you are using the self-signed cert on your UniFi controller, you will need to set insecure to true.

$ cat /data/docker/unifi-exporter/config.yml
listen:
  address: :9130
  metricspath: /metrics
unifi:
  address: https://unifi.vern.space
  username: unifiexporter
  password: random_password
  site: Default 
  insecure: false
  timeout: 5s

4) Deploy to docker swarm. The docker image does not contain any trusted certs, so we mounted the host certs as readonly.

$ docker service create --replicas 1 --name unifi_exporter \
    --mount type=bind,src=/data/docker/unifi-exporter/config.yml,dst=/config.yml \
    --mount type=bind,src=/etc/ssl,dst=/etc/ssl,readonly \
    --publish 9130:9130 \
    louisvernon/unifi_exporter:0.4.0-18-g85455df -config.file=/config.yml

5) You should see something like this from the logs (we use portainer to quickly inspect our services).

2018/06/12 01:10:47 [INFO] successfully authenticated to UniFi controller
2018/06/12 01:10:47 Starting UniFi exporter on ":9130" for site(s): Default

The first time around (before we bind-mounted /etc/ssl) we had an x509 error due to the missing trusted certs.

6) Add unifi_exporter as a new target for prometheus.

$ cat /data/docker/prometheus/config/prometheus.yml
...
  - job_name: 'unifi_exporter'
    static_configs:
      - targets: ['dockerswarm:9130']
        labels:
          alias: unifi_exporter
...

7) Point your browser at http://dockerswarm:9130/metrics and make sure you see stats. In our case the payload was 267 lines.

8) Restart the prometheus service: `docker service update --force prometheus`

9) Hop on over to prometheus to make sure the new target is listed and UP: http://dockerswarm:9090/targets
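
The same check can be made from the command line via the Prometheus HTTP API (the value should be 1 once the target is up):

$ curl -s 'http://dockerswarm:9090/api/v1/query?query=up{job="unifi_exporter"}'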

10) Finally we import the dashboard into Grafana. Our options are a little sparse right now, but this dashboard gives us somewhere to start. We made some tweaks to make it multi-AP friendly and added some extra stats:
[Attachment: Unifi-1516201148080]

The result:

Setup node_exporter on Proxmox

node_exporter is one of the most useful exporters for your Prometheus/Grafana installation, providing a wealth of statistics about the state of your servers/nodes.

These are the steps we used to install node_exporter on our Proxmox nodes.

Download and extract binary:

$ wget https://github.com/prometheus/node_exporter/releases/download/v0.16.0/node_exporter-0.16.0.linux-amd64.tar.gz
...
$ tar xvf node_exporter-0.16.0.linux-amd64.tar.gz
$ cd node_exporter-0.16.0.linux-amd64/

Create a user to run node_exporter:

$ useradd --no-create-home --shell /bin/false node_exporter

Copy binary to /usr/local/bin and modify owner:

$ cp node_exporter /usr/local/bin/.
$ chown node_exporter:node_exporter  /usr/local/bin/node_exporter

Create a systemd service entry for node_exporter at /etc/systemd/system/node_exporter.service:

[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
# ExecStart=/usr/local/bin/node_exporter --collectors.enabled meminfo,loadavg,filesystem

[Install]
WantedBy=multi-user.target

Start the service and check it is running:

$ systemctl daemon-reload
$ systemctl start node_exporter
$ systemctl status node_exporter
● node_exporter.service - Node Exporter
   Loaded: loaded (/etc/systemd/system/node_exporter.service; disabled; vendor preset: enabled)
   Active: active (running) since Sun 2018-06-10 14:10:42 MDT; 7s ago
 Main PID: 3456142 (node_exporter)
    Tasks: 5 (limit: 4915)
   Memory: 2.1M
      CPU: 9ms
   CGroup: /system.slice/node_exporter.service
           └─3456142 /usr/local/bin/node_exporter

Jun 10 14:10:42 superdave node_exporter[3456142]: time="2018-06-10T14:10:42-06:00" level=info msg=" - stat" source="node_exporter.go:97"
Jun 10 14:10:42 superdave node_exporter[3456142]: time="2018-06-10T14:10:42-06:00" level=info msg=" - textfile" source="node_exporter.go:97"
Jun 10 14:10:42 superdave node_exporter[3456142]: time="2018-06-10T14:10:42-06:00" level=info msg=" - time" source="node_exporter.go:97"
Jun 10 14:10:42 superdave node_exporter[3456142]: time="2018-06-10T14:10:42-06:00" level=info msg=" - timex" source="node_exporter.go:97"
Jun 10 14:10:42 superdave node_exporter[3456142]: time="2018-06-10T14:10:42-06:00" level=info msg=" - uname" source="node_exporter.go:97"
Jun 10 14:10:42 superdave node_exporter[3456142]: time="2018-06-10T14:10:42-06:00" level=info msg=" - vmstat" source="node_exporter.go:97"
Jun 10 14:10:42 superdave node_exporter[3456142]: time="2018-06-10T14:10:42-06:00" level=info msg=" - wifi" source="node_exporter.go:97"
Jun 10 14:10:42 superdave node_exporter[3456142]: time="2018-06-10T14:10:42-06:00" level=info msg=" - xfs" source="node_exporter.go:97"
Jun 10 14:10:42 superdave node_exporter[3456142]: time="2018-06-10T14:10:42-06:00" level=info msg=" - zfs" source="node_exporter.go:97"
Jun 10 14:10:42 superdave node_exporter[3456142]: time="2018-06-10T14:10:42-06:00" level=info msg="Listening on :9100" source="node_exporter.go:111"
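
A quick check that metrics are actually being served, using the port from the "Listening on :9100" line above:

$ curl -s http://localhost:9100/metrics | head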

Configure to start at boot:

$ systemctl enable node_exporter
Created symlink /etc/systemd/system/multi-user.target.wants/node_exporter.service → /etc/systemd/system/node_exporter.service.

Then you are done. You just need to set up a target in Prometheus.
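
A job along the lines of the unifi_exporter entry shown earlier works here too. A sketch using the node from the logs above (add one target per Proxmox node):

  - job_name: 'node_exporter'
    static_configs:
      - targets: ['superdave:9100']
        labels:
          alias: superdave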


Many of these steps were re-purposed from
https://www.digitalocean.com/community/tutorials/how-to-install-prometheus-on-ubuntu-16-04

Quick GlusterFS Volume Creation Steps

Here are some quick steps to create a three-node, three-drive replicated GlusterFS volume for use by docker swarm. We are not using LVM for this quick test, so we lose features like snapshotting.

1) Create brick mount point on each node

mkdir -p /data/glusterfs/dockerswarm/brick1

2) Format the drives with xfs

 mkfs.xfs -f -i size=512 /dev/sd_

3) Add drives to fstab

/dev/disk/by-id/ata_ /data/glusterfs/dockerswarm/brick1  xfs rw,inode64,noatime,nouuid      1 2

4) Mount

mount /data/glusterfs/dockerswarm/brick1
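
Confirm the brick filesystem is really mounted before carrying on (see the note at the end of this post for why this matters):

df -h /data/glusterfs/dockerswarm/brick1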

5) Create the brick directory under the brick mount point*

mkdir -p /data/glusterfs/dockerswarm/brick1/brick

6) Create the volume

$ gluster volume create dockerswarm replica 3 transport tcp server1:/data/glusterfs/dockerswarm/brick1/brick server2:/data/glusterfs/dockerswarm/brick1/brick server3:/data/glusterfs/dockerswarm/brick1/brick

volume create: dockerswarm: success: please start the volume to access data
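
7) Start the volume (as the output above instructs) so it can be mounted on the swarm nodes; the mount target below is an assumption, pick whatever your services expect

 gluster volume start dockerswarm
 gluster volume info dockerswarm
 mount -t glusterfs server1:/dockerswarm /data/docker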

* The reason we point gluster at a directory inside the brick mount (rather than the mount point itself) is to ensure the brick filesystem has actually been mounted on the host. If it has not, the brick directory will not be present and gluster will treat the brick as unavailable.

Ceph, SolarFlare and Proxmox – slow requests are blocked

Are you seeing lots of `slow requests are blocked` errors during high throughput on your Ceph storage?

We were experiencing serious issues on two Supermicro nodes with IOMMU enabled (keywords: dmar dma pte vpfn), but even on our ASRock Rack C2750 system things weren't behaving as they should.

We were tearing our hair out trying to figure out what was going on, especially as we had been using these Solarflare dual SFP+ 10Gb NICs for non-Ceph purposes for years.

The answer in this case was to manually install the sfc driver from Solarflare's website (kudos to Solarflare for providing active driver releases covering 5+ year old hardware, btw).

Kernel: 4.15.17-2-pve

Check existing driver:

$ modinfo sfc
---
version:        4.1
---

Download the driver:
https://channel.solarflare.com/index.php/component/cognidox/?file=SF-104979-LS-37_Solarflare_NET_driver_source_DKMS.zip&task=download&format=raw&id=1945

Install alien, kernel headers and dkms:

apt-get install alien pve-headers dkms

Extract the RPM from the zip and convert it to a .deb:

alien -c sfc-dkms-4.13.1.1034-0.sf.1.noarch.rpm

Build and install:

dpkg -i sfc-dkms_4.13.1.1034-1_all.deb
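
Before rebooting you can confirm DKMS built the module for the running kernel (output format varies with the dkms version):

dkms status | grep sfc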

Reboot.

Check driver was updated correctly:

$ modinfo sfc
---
version:        4.13.1.1034
---

After this we experienced no further slow request warnings or timed out file transfers even under intense sustained IO.

Used LB4M – Finding the Management Interface

I bought an LB4M on eBay. I did not have a USB-to-RJ45 console cable.

Power port – Green
Status port – Orange

I read online that by default the management port pulls an IP via DHCP, but that wasn't working. I cycled through several ports on the device, connecting them to my network, and initially missed that one port had pulled an IP address.

I then connected my laptop to the management port and watched traffic with Wireshark. The LB4M was sending data using the CDP protocol (new to me).

The CDP packets showed the firmware is 1.1.0.8. Understanding the trade-off of features vs. stability across LB4M firmware versions seems to be a minefield.

The packet data indicated that it had picked up (or statically held) an IP on some interface. As I plugged in each port I ran:

$ sudo nmap -F 192.168.1.215
Nmap scan report for 192.168.1.215
Host is up (0.010s latency).
Not shown: 99 closed ports
PORT   STATE SERVICE
23/tcp open  telnet

I was in.
Username: admin
Password:

My first experience with this CLI, so let's poke around (type show ? for options).

(Switching) >show network

Interface Status............................... Up
IP Address..................................... 192.168.1.215
Subnet Mask.................................... 255.255.255.0
Default Gateway................................ 192.168.1.1
Burned In MAC Address.......................... XX:XX:XX:XX:XX:XX
Locally Administered MAC address............... 00:00:00:00:00:00
MAC Address Type............................... Burned In
Configured IPv4 Protocol....................... DHCP
Management VLAN ID............................. 1
(Switching) >show serviceport

Interface Status............................... Up
IP Address..................................... 0.0.0.0
Subnet Mask.................................... 0.0.0.0
Default Gateway................................ 0.0.0.0
Configured IPv4 Protocol....................... None
Burned In MAC Address.......................... XX:XX:XX:XX:XX:XX