
Deploy Ceph Rados Gateway on Docker Swarm for Proxmox Cluster

I want to use the features exposed by the Ceph Rados Gateway (RGW). While it is possible to install this directly on the Proxmox nodes, it is not supported.

I wondered if I could run the gateway on Docker Swarm. The longer story is that I want to try NFS via RGW as an alternative to CephFS (which has been a bit of a pain to manage in the past). It seems that typically you run multiple instances of RGW, but in this case Swarm already provides HA, so perhaps I only need one.

The first official docker image (ceph/radosgw) I found for RGW was ancient – two years old! Not encouraging but I tried it anyway. This choked with:

connect protocol feature mismatch, my ffffffffffffff < peer 4fddff8eea4fffb missing 400000000000000

Well, that's a clear-as-mud way of saying that my Ceph and RGW versions didn't match. It turns out that Ceph doesn't maintain the RGW image and expects people to use the all-in-one ceph/daemon image. Let's try again:

$ docker service create --name radosgw \
     --mount type=bind,src=/data/docker/ceph/etc,dst=/etc/ceph \
     --mount type=bind,src=/data/docker/ceph/lib,dst=/var/lib/ceph/ \
     -e RGW_CIVETWEB_PORT=7480 \
     -e RGW_NAME=proxceph \
     --publish 7480:7480 \
     --replicas=1 \
 ceph/daemon rgw
wzj49gor6tfffs3uv3mdyy9sd
overall progress: 0 out of 1 tasks 
1/1: preparing 
verify: Detected task failure 
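
The service never stabilizes. To see why, the task status and logs can be pulled with something like:

$ docker service ps radosgw --no-trunc
$ docker service logs radosgw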

The culprit is in the logs:

2018-06-25 05:16:14  /entrypoint.sh: ERROR- /var/lib/ceph/bootstrap-rgw/ceph.keyring must exist. You can extract it from your current monitor by running 'ceph auth get client.bootstrap-rgw -o /var/lib/ceph/bootstrap-rgw/ceph.keyring',

This file was auto-generated by the PVE Ceph installation, so copy it to the path exposed to the docker service.
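
A sketch of what that looks like, assuming the extraction is run somewhere with Ceph admin credentials and that the result lands under the bind-mounted path from the service definition above (scp it across if that isn't the same machine):

$ mkdir -p /data/docker/ceph/lib/bootstrap-rgw
$ ceph auth get client.bootstrap-rgw -o /data/docker/ceph/lib/bootstrap-rgw/ceph.keyring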

2018-06-25 05:23:37  /entrypoint.sh: SUCCESS
exec: PID 197: spawning /usr/bin/radosgw --cluster ceph --setuser ceph --setgroup ceph -d -n client.rgw.proxceph -k /var/lib/ceph/radosgw/ceph-rgw.proxceph/keyring --rgw-socket-path= --rgw-zonegroup= --rgw-zone= --rgw-frontends=civetweb port=0.0.0.0:7480
2018-06-25 05:23:37.584 7fcccee8a8c0  0 framework: civetweb
2018-06-25 05:23:37.584 7fcccee8a8c0  0 framework conf key: port, val: 0.0.0.0:7480
2018-06-25 05:23:37.588 7fcccee8a8c0  0 deferred set uid:gid to 167:167 (ceph:ceph)
2018-06-25 05:23:37.588 7fcccee8a8c0  0 ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable), process (unknown), pid 197
2018-06-25 05:23:49.100 7fcccee8a8c0  0 starting handler: civetweb
2018-06-25 05:23:49.116 7fcccee8a8c0  1 mgrc service_daemon_register rgw.proxceph metadata {arch=x86_64,ceph_release=mimic,ceph_version=ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable),ceph_version_short=13.2.0,cpu=Common KVM processor,distro=centos,distro_description=CentOS Linux 7 (Core),distro_version=7,frontend_config#0=civetweb port=0.0.0.0:7480,frontend_type#0=civetweb,hostname=e326bccb3712,kernel_description=#154-Ubuntu SMP Fri May 25 14:15:18 UTC 2018,kernel_version=4.4.0-128-generic,mem_swap_kb=4190204,mem_total_kb=4046012,num_handles=1,os=Linux,pid=197,zone_id=f2928dc9-3983-46ff-9da9-2987f3639bb6,zone_name=default,zonegroup_id=16963e86-4e7d-4152-99f0-c6e9ae4596a4,zonegroup_name=default}

It looks like RGW auto-created some pools. This was expected.

# ceph osd lspools
[..], 37 .rgw.root,38 default.rgw.control,39 default.rgw.meta,40 default.rgw.log,

Now I need to create a user that can use the RGW. Get a shell on the container:
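
One way in, run on whichever node is hosting the task (a sketch; the generated container name will differ):

$ docker exec -it $(docker ps -q -f name=radosgw) bash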

[root@e326bccb3712 /]# radosgw-admin user create --uid="media" --display-name="media"
{
    "user_id": "media",
    "display_name": "media",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "auid": 0,
    "subusers": [],
    "keys": [
        {
            "user": "media",
            "access_key": "XXXXXXXX",
            "secret_key": "XXXXXXXXXXXXXXXXXXX"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}

The next step is a little beyond the scope of this "guide". I already have haproxy in place with a valid wildcard letsencrypt cert, and I pointed all <bucket>.my.domain hostnames at RGW. For this to work properly you need to set

          rgw dns name = my.domain 

in your ceph.conf.
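
A minimal sketch of the relevant fragment, assuming the option is scoped to the RGW client name used above (putting it in [global] should also work), and remembering that the container reads the copy of ceph.conf under /data/docker/ceph/etc:

[client.rgw.proxceph]
    rgw dns name = my.domain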

Let's test everything works:

$ cat s3test.py
import boto.s3.connection

access_key = 'JU1P3CIATBP1IK297D3H'
secret_key = 'sV0dGfVSbClQFvbCUM22YivwcXyUmyQEOBqrDsy6'
conn = boto.connect_s3(
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        host='rgw.my.domain', port=443,
        is_secure=True, calling_format=boto.s3.connection.OrdinaryCallingFormat(),
       )

bucket = conn.create_bucket('my-test-bucket')
for bucket in conn.get_all_buckets():
    print "{name} {created}".format(
        name=bucket.name,
        created=bucket.creation_date,
    )

N.B. rgw.my.domain stands in for the real hostname, which carries a valid cert and resolves to RGW on my internal network. Also note that is_secure is set to True since SSL terminates at haproxy.

The output:

$ python s3test.py 
my-test-bucket 2018-06-25T06:01:19.258Z

Cool!

Attaching Docker Swarm Services to an Overlay Network

When I originally configured Prometheus with a variety of exporters, I had it scraping ports on a specific docker swarm host. This is dangerous: if that host goes down, the underlying service will pop back up on a different host, but Prometheus won't be able to scrape it. I considered using haproxy to round-robin onto the docker swarm nodes, but Kubernetes can resolve services by service name – is there no way to do this in Docker Swarm?

There is, but unlike Kubernetes, services can't resolve each other by default. We must create a dedicated overlay network and attach both services to it.

Before:

/prometheus $ nslookup unifi_exporter
Server:    127.0.0.11
Address 1: 127.0.0.11

nslookup: can't resolve 'unifi_exporter'

Create overlay network:

sudo docker network create -d overlay monitoring
tb3iw12k7xaw5olz7rasdcnm0

Redeploy Prometheus on network:

docker service create --replicas 1 --name prometheus \
    --mount type=bind,source=/data/docker/prometheus/config/prometheus.yml,destination=/etc/prometheus/prometheus.yml \
    --mount type=bind,src=/data/docker/prometheus/data,dst=/prometheus \
    --publish published=9090,target=9090,protocol=tcp \
    --network monitoring \
    prom/prometheus

Redeploy our exporter, this time attached to the overlay network. Note we no longer need to publish a port.

docker service create --replicas 1 --name unifi_exporter \
    --mount type=bind,src=/data/docker/unifi-exporter/config.yml,dst=/config.yml \
    --mount type=bind,src=/etc/ssl,dst=/etc/ssl,readonly \
    --network monitoring \
    louisvernon/unifi_exporter:0.4.0-18-g85455df -config.file=/config.yml

Confirm Prometheus can resolve the exporter by service name:

/prometheus $ nslookup unifi_exporter
Server:    127.0.0.11
Address 1: 127.0.0.11

Name:      unifi_exporter
Address 1: 10.0.1.15
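
With name resolution working, the scrape target in prometheus.yml can reference the service name directly instead of a particular swarm host. A sketch of the relevant fragment; the port is an assumption, so use whatever the exporter is actually configured to listen on:

scrape_configs:
  - job_name: 'unifi'
    static_configs:
      - targets: ['unifi_exporter:9130']

Since nothing here is pinned to a specific node, the target keeps resolving wherever Swarm reschedules the exporter.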