Prometheus

Prometheus is a monitoring system designed to process a large number of metrics, centralize them on one or more servers, and serve them through a well-defined API. That API is queried through a domain-specific language (DSL) called PromQL, the "Prometheus Query Language". Prometheus also has basic graphing capabilities, but those are limited enough that we use a separate graphing layer on top (see Grafana).

Basic design

The Prometheus web interface is available at:

https://prometheus.torproject.org

To try a simple query, pick any metric in the list and click Execute. For example, this link will show the 5-minute load over the last two weeks for the known servers.
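
If you need the raw data instead of the web interface, the same kind of query can be run against the Prometheus HTTP API (/api/v1/query_range). Below is a minimal sketch in Python, assuming the server is reachable from where you run it; node_load5 is the node exporter's 5-minute load average metric, and the one-hour step is an arbitrary choice:

    # Run a PromQL range query over the last two weeks through the HTTP API.
    import json
    import time
    import urllib.parse
    import urllib.request

    PROMETHEUS = "https://prometheus.torproject.org"  # assumes direct access

    end = time.time()
    start = end - 14 * 24 * 3600  # two weeks ago
    params = urllib.parse.urlencode({
        "query": "node_load5",    # PromQL expression: 5-minute load average
        "start": start,
        "end": end,
        "step": "1h",             # one sample per hour (arbitrary)
    })

    with urllib.request.urlopen(f"{PROMETHEUS}/api/v1/query_range?{params}") as resp:
        data = json.load(resp)

    # One series per scraped host, each a list of [timestamp, value] pairs.
    for series in data["data"]["result"]:
        print(series["metric"].get("instance"), len(series["values"]), "samples")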

Here you can see, from the Prometheus overview documentation, the basic architecture of a Prometheus site:

[Diagram: Prometheus' architecture, showing the push gateway and exporters adding metrics, service discovery through file_sd and Kubernetes, alerts pushed to the Alertmanager, and the various UIs pulling from Prometheus]

As you can see, Prometheus is somewhat tailored towards Kubernetes, but it can be used without it. We deploy it with the file_sd discovery mechanism: Puppet collects all the exporters into the central server, which then scrapes each of them every scrape_interval (by default 15 seconds). The architecture diagram also shows the Alertmanager, which could eventually be used to replace our Nagios deployment.
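
The target files read by file_sd are plain YAML or JSON lists of hosts and labels. In our case Puppet writes those files; the sketch below only illustrates the JSON format Prometheus expects (hostnames, labels and the path are made-up examples):

    # Illustration of the file_sd target-list format; in production these
    # files are generated by Puppet, not by a script like this.
    import json

    targets = [
        {
            "targets": ["node1.example.org:9100", "node2.example.org:9100"],
            "labels": {"job": "node"},
        },
    ]

    with open("/etc/prometheus/targets.d/node.json", "w") as f:
        json.dump(targets, f, indent=2)

Prometheus watches those files for changes and reloads the target list automatically, so adding a node in Puppet is enough to get it scraped.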

The diagram also does not show that Prometheus can federate across multiple instances and that the Alertmanager can be configured for high availability.
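
Federation works by having one Prometheus server scrape another's /federate endpoint, which returns whatever series match the match[] selectors in the usual text exposition format. Here is a quick sketch of what that looks like, with a placeholder server name and selector (a real setup would do this through a scrape job with honor_labels: true rather than a script):

    # Fetch a subset of series from another Prometheus server's /federate
    # endpoint; server name and match[] selector are placeholders.
    import urllib.parse
    import urllib.request

    upstream = "http://prometheus-shard.example.org:9090"
    params = urllib.parse.urlencode({"match[]": '{job="node"}'})

    with urllib.request.urlopen(f"{upstream}/federate?{params}") as resp:
        print(resp.read().decode()[:500])  # text exposition format, ready to re-scrape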

Munin expatriates

Here's a quick cheat sheet for people used to Munin who are switching to Prometheus:

What                Munin            Prometheus
Scraper             munin-update     prometheus
Agent               munin-node       prometheus-node-exporter and others
Graphing            munin-graph      prometheus or grafana
Alerting            munin-limits     prometheus alertmanager
Network port        4949             9100 and others
Protocol            TCP, text-based  HTTP, text-based
Storage format      RRD              custom TSDB
Downsampling        yes              no
Default interval    5 minutes        15 seconds
Authentication      no               no
Federation          no               yes (can fetch from other servers)
High availability   no               yes (alertmanager gossip protocol)

Basically, Prometheus is similar to Munin in many ways:

  • it "pulls" metrics from the nodes, although it does it over HTTP (to http://host:9100/metrics) instead of a custom TCP protocol like Munin

  • the agent running on the nodes is called prometheus-node-exporter instead of munin-node. It only exposes a set of built-in metrics like CPU, disk space and so on; different exporters are necessary for different applications (like prometheus-apache-exporter), and any application can easily implement an exporter by exposing a Prometheus-compatible /metrics endpoint (see the sketch after this list)

  • like Munin, the node exporter doesn't have any form of authentication built in; we rely on IP-level firewalls to avoid leakage

  • the central server is simply called prometheus and runs as a daemon that wakes up on its own, instead of munin-update, which is called from munin-cron, itself called from cron

  • graphs are generated on the fly through the crude Prometheus web interface or by frontends like Grafana, instead of being constantly regenerated by munin-graph

  • samples are stored in Prometheus' custom "time series database" (TSDB) instead of the (ad hoc) RRD standard

  • Prometheus performs no downsampling, unlike RRD; it relies on efficient compression to save disk space, but still uses more of it than Munin

  • Prometheus scrapes samples much more frequently than Munin by default (every 15 seconds instead of every 5 minutes), but that interval is configurable

  • Prometheus can natively scale horizontally (by sharding different services to different servers) and vertically (by aggregating different servers into a central one with a different sampling frequency); munin-update and munin-graph can only run on a single (and the same) server

  • Prometheus can act as a high-availability alerting system thanks to its Alertmanager, which can run multiple copies in parallel without sending duplicate alerts; munin-limits can only run on a single server
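
To make the exporter point above more concrete, here is a minimal sketch of a custom exporter using the Python prometheus_client library (packaged in Debian as python3-prometheus-client); the metric name and port are arbitrary examples, not something we actually deploy:

    # Minimal custom exporter: serves a Prometheus-compatible /metrics
    # endpoint over HTTP. Metric name and port are arbitrary examples.
    import random
    import time

    from prometheus_client import Gauge, start_http_server

    QUEUE_SIZE = Gauge("myapp_queue_size", "Number of jobs waiting in the queue")

    if __name__ == "__main__":
        start_http_server(9999)  # exposes http://localhost:9999/metrics
        while True:
            QUEUE_SIZE.set(random.randint(0, 100))  # stand-in for a real measurement
            time.sleep(15)

The central server then scrapes that endpoint just like it scrapes prometheus-node-exporter on port 9100; fetching http://localhost:9999/metrics with curl shows the same text format the scraper sees.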

Puppet implementation

Every node is configured as a node exporter through the roles::monitored role, which is included everywhere. The role might eventually be expanded to cover alerting and other monitoring resources as well. This role, in turn, includes profile::prometheus::client, which configures each client with the right firewall rules.

The firewall rules are exported from the server, defined in profile::prometheus::server. We hacked around limitations of the upstream Puppet module to install Prometheus using backported Debian packages. The monitoring server itself is defined in roles::monitoring.

The Prometheus Puppet module was patched to allow scrape job collection and the use of Debian packages for installation. Much of the initial Prometheus configuration was also documented in ticket #29681 and especially ticket #29388, which investigates storage requirements and possible alternatives for data retention policies.