Messing Around With Ceph (1/N)
I decided I needed a DFS (distributed filesystem), so I did a brief survey of the current FOSS offerings. Ceph came out on top due to its feature set and robust developer base, so I figured I'd mess around with it for a bit and see if I could get it set up and doing something useful.
The good news about Ceph is that it has a tremendous amount of documentation! The bad news about Ceph is... that it has a tremendous amount of documentation. It also consists of a bunch of different services, which can be set up in a variety of configurations, so figuring out the best way to lay everything out takes a little bit of digging. Here's what I've come up with so far:
- Monitor nodes: These things maintain cluster state. In an HA setup they use Paxos, which means you need at least 3 for tinkering (5 for an n+2 setup in production).
- Manager nodes: Provide a bunch of management functionality. The manager node docs say "It is not mandatory to place mgr daemons on the same nodes as mons, but it is almost always sensible". So three of those too.
- Metadata Servers (aka "MDS"): These servers manage all the metadata for the Ceph Filesystem (the metadata itself ultimately lives in RADOS; the MDS daemons serve it to clients). There's a lot of flexibility in how these are laid out. Towards the end of the architecture doc it says "Combinations of standby and active etc are possible, for example running 3 active ceph-mds instances for scaling, and one standby instance for high availability". So let's install three of these things, make two of them active and one of them standby.
- Object Storage Daemons (aka "OSD"): These hold the data. In a normal installation these do most of the heavy lifting, and you'll have way more of them than any of the other daemons. Since we've already consumed 3 VMs, let's go ahead and install an OSD on each.
- Admin node: The preflight checklist strongly suggests a dedicated admin node.
- Client: Finally, let's set up a node to act as an FS client.
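To summarize where this is headed: node1 through node3 will each run a monitor, a manager, an MDS, and an OSD, with a separate admin box driving the deployment and a separate client mounting the filesystem. As a rough sketch (not part of any build script, and using the private addresses assigned in the Vagrantfile below), the [global] section that ceph-deploy will eventually generate for this layout should look something like this; the values here are illustrative placeholders, not output from a real deployment:

```bash
# Sketch only: a plausible ceph.conf [global] section for a three-monitor
# cluster on node1-node3. ceph-deploy generates the real file later.
cat <<EOM > /tmp/ceph.conf.sketch
[global]
mon_initial_members = node1, node2, node3
mon_host = 10.0.0.3,10.0.0.4,10.0.0.5
public_network = 10.0.0.0/24
EOM
```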
So here's a Vagrantfile:
```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"
  config.vm.provision :shell, path: "dns.sh"
  config.vm.synced_folder ".", "/vagrant", type: "rsync", rsync__exclude: "vdisks/"

  config.vm.define "cache" do |cache|
    cache.vm.hostname = "cache"
    cache.vm.network "private_network", ip: "10.0.0.254", virtualbox__intnet: true
    cache.vm.provision :shell, path: "dns.sh"
    cache.vm.provision :shell, path: "cache.sh"
  end

  ips = Hash[
    "node1" => "10.0.0.3",
    "node2" => "10.0.0.4",
    "node3" => "10.0.0.5"
  ]

  vdisk_dir = "./vdisks"
  unless Dir.exist?(vdisk_dir)
    Dir.mkdir(vdisk_dir)
  end

  (1..3).each do |i|
    config.vm.define "node#{i}" do |node|
      hostname = "node#{i}"
      node.vm.hostname = hostname
      node.vm.network "private_network", ip: ips[hostname], virtualbox__intnet: true
      node.vm.provider "virtualbox" do |vb|
        vdisk_file = "#{vdisk_dir}/#{hostname}-ceph.vdi"
        unless File.exist?(vdisk_file)
          vb.customize [
            'createhd',
            '--filename', vdisk_file,
            '--variant', 'Fixed',
            '--size', 1024
          ]
        end
        vb.customize [
          'storageattach', :id,
          '--storagectl', 'IDE',
          '--port', 1,
          '--device', 0,
          '--type', 'hdd',
          '--medium', vdisk_file
        ]
      end
      node.vm.provision :shell, path: "bootstrap.sh"
      node.vm.provision :shell, path: "ntp.sh"
    end
  end

  config.vm.define "admin" do |admin|
    admin.vm.hostname = "admin"
    admin.vm.network "private_network", ip: "10.0.0.2", virtualbox__intnet: true
    admin.vm.provision :shell, path: "bootstrap.sh"
  end

  config.vm.define "client" do |client|
    client.vm.hostname = "client"
    client.vm.network "private_network", ip: "10.0.0.6", virtualbox__intnet: true
    client.vm.provision :shell, path: "bootstrap.sh"
  end
end
```

Various parts of the above blatantly stolen from the Vagrant tips page and EverythingShouldBeVirtual.
There are a handful of marginally interesting things going on in the Vagrantfile:
- Note that there's a VM called cache; it doesn't have anything to do with Ceph. As I was building (and rebuilding) the other nodes it quickly became apparent that downloading RPMs consumes the vast majority of the setup time for each machine. So cache is just going to run a caching Squid proxy, which will speed up the build time for the remaining nodes considerably.
- There's some jiggery-pokery which adds an additional disk to each of the Ceph nodes; Ceph really likes to have a raw disk device on which to store things.
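If you want to confirm that the extra disk actually showed up, a quick spot check after `vagrant up` looks something like this (the device name is an assumption; on these CentOS 7 boxes it typically lands at /dev/sdb):

```bash
# Run from the host after `vagrant up`; the 1 GB disk attached by the
# Vagrantfile should appear as an empty, unpartitioned device.
vagrant ssh node1 -c 'lsblk'
```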
Here's dns.sh:
```bash
#!/usr/bin/env bash

cat <<EOM > /etc/hosts
127.0.0.1   localhost localhost.localdomain
10.0.0.2    admin admin.localdomain
10.0.0.3    node1 node1.localdomain
10.0.0.4    node2 node2.localdomain
10.0.0.5    node3 node3.localdomain
10.0.0.6    client client.localdomain
10.0.0.254  cache cache.localdomain
EOM
```

Nothing fancy here; it just makes sure all the names resolve appropriately on every VM.
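A quick spot check from inside any of the VMs (nothing the later scripts depend on):

```bash
# Each name should resolve to the private_network address from the Vagrantfile.
getent hosts admin node1 node2 node3 client cache
```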
And cache.sh:
```bash
#!/usr/bin/env bash

yum install -y squid
sed -i '/cache_dir/ s/^#//' /etc/squid/squid.conf
service squid start

sed -i '/\[main\]/a proxy=http://cache.localdomain:3128' /etc/yum.conf
sed -i 's/enabled=1/enabled=0/' /etc/yum/pluginconf.d/fastestmirror.conf
yum update -y
```

This installs Squid, enables its on-disk cache (by uncommenting the cache_dir line in squid.conf), and sets up yum to direct requests through the proxy.
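One way to verify the proxy is actually being used while the other nodes build (assuming Squid's default log location on CentOS 7):

```bash
# Watch Squid's access log on the cache VM while another node runs yum;
# repeated RPM fetches should start showing up as cache hits.
vagrant ssh cache -c 'sudo tail -f /var/log/squid/access.log'
```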
ntp.sh:
```bash
#!/usr/bin/env bash

yum install -y ntp ntpdate ntp-doc
grep -o 'node[1-9].localdomain' /etc/hosts | grep -v `hostname` | sed -e 's/^/peer /' >> /etc/ntp.conf
ntpdate 0.centos.pool.ntp.org
service ntpd start
```

This does the NTP setup on the three main nodes: each node adds the other two as peers in /etc/ntp.conf, syncs its clock once with ntpdate, and then starts ntpd. The Ceph documentation strongly suggests that, in a multi-monitor setup, the hosts running the monitor daemons be set up as NTP peers of each other.
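Once the three nodes have been up for a few minutes, the peering can be checked from any of them (again, just a spot check, not part of the provisioning):

```bash
# The other two nodes should show up in ntpd's peer billboard alongside the
# pool servers from the stock ntp.conf.
vagrant ssh node1 -c 'ntpq -p'
```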
Here's bootstrap.sh:
```bash
#!/usr/bin/env bash

sed -i '/\[main\]/a proxy=http://cache.localdomain:3128' /etc/yum.conf
sed -i 's/enabled=1/enabled=0/' /etc/yum/pluginconf.d/fastestmirror.conf

cat << EOM > /etc/yum.repos.d/ceph-deploy.repo
[ceph-noarch]
name=Ceph noarch packages
baseurl=https://download.ceph.com/rpm-luminous/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
EOM

yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum update -y
yum install -y yum-plugin-priorities

useradd -d /home/ceph-deploy -m ceph-deploy
echo "ceph-deploy ALL = (root) NOPASSWD:ALL" > /etc/sudoers.d/ceph-deploy
chmod 0440 /etc/sudoers.d/ceph-deploy

mkdir -p -m 700 /home/ceph-deploy/.ssh
cat <<EOM > /home/ceph-deploy/.ssh/config
Host admin
  Hostname admin
  User ceph-deploy
Host node1
  Hostname node1
  User ceph-deploy
Host node2
  Hostname node2
  User ceph-deploy
Host node3
  Hostname node3
  User ceph-deploy
Host client
  Hostname client
  User ceph-deploy
EOM
mv /vagrant/id_rsa* /home/ceph-deploy/.ssh/
cp /home/ceph-deploy/.ssh/id_rsa.pub /home/ceph-deploy/.ssh/authorized_keys
chown -R ceph-deploy:ceph-deploy /home/ceph-deploy/.ssh
chmod 0600 /home/ceph-deploy/.ssh/*
```

This is based largely off of the preflight checklist mentioned above. Note the tweaks to the files under /etc/yum, which configure the machine to use cache for downloading RPMs and disable the fastestmirror plugin (which isn't needed 'cause we're using a local caching proxy). Note also the distribution of a shared SSH key; the ceph-deploy utility needs passwordless SSH access to all of the nodes. I couldn't figure out an elegant way to generate a shared keypair during vagrant up, so instead I did
```bash
ssh-keygen -C 'ceph deploy user' -f id_rsa
```

to create the keypair and put the resulting files into the Vagrant directory, where they're available during execution of bootstrap.sh.
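As a smoke test (my own addition, not something the checklist calls for), logging into the admin VM as ceph-deploy and looping over the other hosts should print "root" for each one without any password prompts, which is exactly what ceph-deploy needs:

```bash
# Run as the ceph-deploy user on the admin node.
for h in node1 node2 node3 client; do
    echo -n "$h: "
    ssh -o StrictHostKeyChecking=no "$h" sudo whoami
done
```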
This completes the preliminaries; all of the nodes are ready for installation and configuration of Ceph proper. We'll pick up there next time.