Friday, March 29, 2019

Monitoring The Bottom Turtle (2/N)

In the previous post we decided to build a self-monitoring cluster using Prometheus and Grafana. This post will focus on getting Prometheus up and running on the same cluster that houses etcd. The usual caveats apply; this is just messing around and isn't vetted for production.

Setting up Prometheus is a little involved since there don't appear to be any official RPMs. Briefly:

  1. Create an initial Prometheus config file.
  2. Create a systemd service description (or init script).
  3. Use bootstrap.sh to install Prometheus, move the above files into place, and start the system.
  4. Modify the Vagrant network config so you can get to one of the Prometheus servers.

So, first things first, let's put together a basic configuration file. The file below is a slightly-modified version of the stock configuration file from the Prometheus Getting Started Guide:

global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['turtle1:9090', 'turtle2:9090', 'turtle3:9090']
Prometheus works by contacting targets over HTTP and scraping data from their /metrics endpoints. The Prometheus server itself binds to port 9090 on each machine, so the fragment above essentially tells it to monitor itself across all 3 cluster nodes. Pretty cool, huh? Note that the servers themselves are totally independent; each one polls itself and its peers on its own schedule. As such, each server will have slightly different data samples, but that's a non-issue in the context of monitoring.
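Since targets are just HTTP endpoints, you can eyeball what a scrape sees with curl. The sketch below fakes a few lines of the text exposition format (the version label is made up) and strips the comment lines, which is the same filter you'd apply to real output:

```shell
# metrics_sample stands in for `curl -s http://turtle1:9090/metrics`,
# emitting a few lines of the Prometheus text exposition format.
metrics_sample() {
  cat <<'EOF'
# HELP prometheus_build_info A metric with build information.
# TYPE prometheus_build_info gauge
prometheus_build_info{version="2.8.1"} 1
EOF
}

# Drop the HELP/TYPE comment lines, keeping only the samples.
metrics_sample | grep -v '^#'
```

Against a live node the equivalent pipeline is `curl -s http://turtle1:9090/metrics | grep -v '^#' | head`.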

Name it prometheus.yml and stick it in the Vagrant project directory, which will automatically make it accessible to bootstrap.sh later on.
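A quick hedge against YAML typos: the Prometheus tarball (the same one bootstrap.sh installs below) ships with promtool, which can lint the config before the server ever sees it:

```shell
# Validate the scrape config; exits non-zero on syntax errors.
promtool check config prometheus.yml
```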

Another bit you have to supply is some sort of init script. Here's a systemd unit file recommended by Computing For Geeks:

[Unit]
Description=Prometheus
Documentation=https://prometheus.io/docs/introduction/overview/
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
Environment="GOMAXPROCS=2"
User=prometheus
Group=prometheus
ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.listen-address=0.0.0.0:9090 \
  --web.external-url=

SyslogIdentifier=prometheus
Restart=always

[Install]
WantedBy=multi-user.target
Name this prometheus.service and put it in the Vagrant directory as well.
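bootstrap.sh below starts the service with the older service wrapper; if you're poking at a box by hand, the native systemd commands are:

```shell
# Pick up the new unit file, start the server, and enable it at boot.
systemctl daemon-reload
systemctl start prometheus
systemctl enable prometheus
```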

Update bootstrap.sh to install Prometheus. The script below is mostly stolen from the same Computing For Geeks page:

yum install -y wget
wget -nv https://github.com/prometheus/prometheus/releases/download/v2.8.1/prometheus-2.8.1.linux-amd64.tar.gz
tar zxf prometheus-2.8.1.linux-amd64.tar.gz
groupadd --system prometheus
useradd -s /sbin/nologin --system -g prometheus prometheus
for i in rules rules.d files_sd; do
  mkdir -p -m 775 /etc/prometheus/${i};
  chown -R prometheus:prometheus /etc/prometheus/${i};
done
cp prometheus-2.8.1.linux-amd64/prometheus /usr/local/bin/
cp prometheus-2.8.1.linux-amd64/promtool /usr/local/bin/
cp -r prometheus-2.8.1.linux-amd64/consoles /etc/prometheus
cp -r prometheus-2.8.1.linux-amd64/console_libraries /etc/prometheus
mkdir -p /var/lib/prometheus
chown -R prometheus:prometheus /var/lib/prometheus/
mv /vagrant/prometheus.yml /etc/prometheus
mv /vagrant/prometheus.service /etc/systemd/system
service prometheus start
rm -rf prometheus-2.8.1.linux-amd64*
Note the mv /vagrant/... commands, which move the configuration files into place. Once bootstrap.sh completes, the Prometheus server should be up and running on the machine.
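Once a box is provisioned you can sanity-check the service from inside the VM (e.g. after vagrant ssh). Prometheus 2.x exposes a /-/healthy endpoint for exactly this purpose:

```shell
# Confirm the unit is running and the HTTP server answers.
systemctl is-active prometheus           # reports the unit's state
curl -s http://localhost:9090/-/healthy  # liveness check endpoint
```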

At this point we could go ahead and vagrant up the cluster and everything should work, but we wouldn't be able to actually get to any of the servers without some trickery. The simplest solution is just to enable port forwarding for one of the machines. For example:

  config.vm.define "turtle1" do |turtle1|
    turtle1.vm.hostname = "turtle1"
    turtle1.vm.network "private_network", ip: "10.0.0.2",
      virtualbox__intnet: true
    turtle1.vm.network "forwarded_port", guest: 9090, host: 9090
    turtle1.vm.provision :shell, path: "bootstrap.sh", args: "10.0.0.2 turtle1"
  end
This will let you navigate to localhost:9090 and interact with the Prometheus host running on turtle1. You can check to make sure that everything is working by navigating to http://localhost:9090/targets, which should show the three targets all up and healthy.
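You can also check target health from the command line via the HTTP API. The sketch below counts healthy targets with a crude grep over the /api/v1/targets JSON; the sample response here is trimmed way down for illustration:

```shell
# count_up tallies targets reporting health "up" in the JSON that
# Prometheus returns from /api/v1/targets.
count_up() {
  grep -o '"health":"up"' | wc -l
}

# A trimmed-down sample of the API response:
sample='{"data":{"activeTargets":[{"health":"up"},{"health":"up"},{"health":"up"}]}}'
echo "$sample" | count_up
```

On the live cluster, `curl -s http://localhost:9090/api/v1/targets | count_up` should agree with the three targets shown on the /targets page.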

So at this point we've got Prometheus up and running, but it isn't yet collecting data about either the etcd cluster or the underlying machines. We'll do that next time.
