Prometheus
What Prometheus is
Prometheus is an open-source systems monitoring and alerting toolkit. It pulls (scrapes) metrics from targets over HTTP and stores them as time series.
Architecture
Alertmanager
Prometheus evaluates the alerting rules in its configuration and sends the resulting alerts to Alertmanager. Alertmanager then routes notifications to Opsgenie, Slack, or other notification systems.
Exporter
An exporter exposes OS or database metrics over HTTP so that Prometheus can scrape them.
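As a rough sketch of what an exporter does, the Python below renders a few values in the Prometheus text exposition format and serves them on /metrics. The metric names, the collect function, and port 9200 are made up for illustration; real exporters normally rely on an official client library instead of hand-written formatting.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def collect():
    """Return hypothetical values; a real exporter would query the OS or a database."""
    return {"demo_open_connections": 12.0, "demo_disk_free_bytes": 5.0e9}


def render(metrics):
    """Render name/value pairs in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9200):
    """Block and serve /metrics on the given (arbitrary) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint; Prometheus would then be configured to scrape that host and port.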
Pushgateway
Prometheus itself always pulls. If a system has to push its metrics, you can run a Pushgateway: your system pushes to the Pushgateway, and Prometheus scrapes the Pushgateway.
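The push side can be sketched as a plain HTTP request to the Pushgateway's /metrics/job/<job> endpoint. The gateway address, job name, and metric below are hypothetical, and the request is only built, not sent, so the example runs without a Pushgateway. Job names with special characters would need URL encoding, which this sketch skips.

```python
import urllib.request


def build_push_request(gateway, job, metrics):
    """Build (but do not send) a request that pushes metrics for one job."""
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    return urllib.request.Request(
        url=f"{gateway}/metrics/job/{job}",
        data=body.encode("utf-8"),
        method="PUT",  # PUT replaces everything previously pushed for this job
        headers={"Content-Type": "text/plain"},
    )


# Hypothetical gateway address, job name, and metric.
req = build_push_request(
    "http://localhost:9091", "nightly_backup", {"backup_duration_seconds": 42.5}
)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```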
Data model
A time series in Prometheus consists of multiple samples. Each sample is a pair of a millisecond-precision timestamp and a float64 value, for example:
123.45@1702749856.001
Each time series is identified by a set of labels (name/value pairs). The metric name is stored as the value of the special __name__ label.
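To make the data model concrete, here is a small Python sketch of one series. The Series class and the job and instance label values are illustrative only and do not reflect how Prometheus stores data internally.

```python
from dataclasses import dataclass, field


@dataclass
class Series:
    labels: dict  # label name -> value, including the special __name__ label
    samples: list = field(default_factory=list)  # (timestamp_ms, value) pairs

    def identity(self):
        """A series is identified by its full label set, regardless of order."""
        return frozenset(self.labels.items())

    def append(self, timestamp_ms, value):
        """Store one sample as (integer milliseconds, float value)."""
        self.samples.append((int(timestamp_ms), float(value)))


cpu = Series({"__name__": "process_cpu_seconds_total", "job": "api", "instance": "host1:9100"})
cpu.append(1702749856001, 123.45)  # the sample 123.45@1702749856.001
```

Two Series objects with the same labels in a different order produce the same identity(), which mirrors the idea that the label set, not its ordering, names the series.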
Counter metric type
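Use a counter for a value that only increases, such as a total number of requests. A counter starts again from zero when it is reset, for example after a restart.
The increase of a counter between two scrapes at t1 and t2 is computed as follows:
- if counter(t2) >= counter(t1), then increase = counter(t2) - counter(t1)
- if counter(t2) < counter(t1), the counter was reset, so increase = counter(t2)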
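The two rules above translate directly into a few lines of Python. The function names are hypothetical, and the sketch assumes at most one reset between two consecutive scrapes.

```python
def counter_increase(prev, curr):
    """Increase between two consecutive scrapes, treating a drop as a reset."""
    if curr >= prev:
        return curr - prev
    # The counter went down, so it restarted from zero: everything it now
    # shows was accumulated after the reset.
    return curr


def total_increase(values):
    """Sum the per-scrape increases over a sequence of scraped counter values."""
    return sum(counter_increase(a, b) for a, b in zip(values, values[1:]))
```

For the scraped values 10, 15, 3, 7 the per-scrape increases are 5, 3 (reset), and 4, so total_increase returns 12.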
Gauge metric type
Use a gauge for a value that can go both up and down, such as memory usage.
Histogram metric type
Use a histogram when you need to know what percentage of observed values falls above or below a threshold, for example whether 90% of request latencies are below 100 ms.
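A minimal sketch of the idea, assuming fixed bucket upper bounds and made-up latency values: each observation is counted in the first bucket whose upper bound it does not exceed, and cumulative counts answer "what fraction was at most X". The Histogram class here is illustrative, not a client-library API.

```python
import bisect
from itertools import accumulate


class Histogram:
    def __init__(self, upper_bounds):
        # A final +Inf bucket catches every observation above the largest bound.
        self.upper_bounds = sorted(upper_bounds) + [float("inf")]
        self.counts = [0] * len(self.upper_bounds)  # per-bucket, not cumulative

    def observe(self, value):
        # Index of the first bucket whose upper bound is >= value.
        self.counts[bisect.bisect_left(self.upper_bounds, value)] += 1

    def cumulative(self):
        """Cumulative counts: observations <= each upper bound."""
        return list(accumulate(self.counts))

    def fraction_le(self, bound):
        """Fraction of observations <= bound; bound must be a bucket bound."""
        idx = self.upper_bounds.index(bound)
        cumulative = self.cumulative()
        return cumulative[idx] / cumulative[-1]


latency = Histogram([0.05, 0.1, 0.5])  # bucket bounds in seconds
for seconds in [0.02, 0.04, 0.08, 0.09, 0.03, 0.07, 0.06, 0.01, 0.05, 0.3]:
    latency.observe(seconds)
```

With these ten sample values, latency.fraction_le(0.1) is 0.9, meaning 90% of the observations took at most 100 ms.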
Instant vector
An instant vector selector returns an instant vector: for each matching time series, the most recent sample at or before the query evaluation time.
process_cpu_seconds_total
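The selection rule can be sketched in Python as "latest sample per series, not older than a lookback window". The five-minute lookback matches the Prometheus default, but the exact boundary handling and the data layout here are simplifying assumptions, and staleness markers are ignored.

```python
def instant_vector(series, eval_time, lookback=300.0):
    """Pick, per series, the latest sample at or before eval_time.

    series maps a series key to a list of (timestamp_seconds, value) pairs
    sorted by timestamp. Series with no sample inside the lookback window
    are left out of the result.
    """
    result = {}
    for key, samples in series.items():
        candidates = [
            (t, v) for t, v in samples if eval_time - lookback < t <= eval_time
        ]
        if candidates:
            result[key] = candidates[-1]
    return result


# Hypothetical data: two series scraped at different times.
data = {
    "a": [(100, 1.0), (160, 2.0), (220, 3.0)],
    "b": [(100, 5.0)],
}
```

Evaluating at time 200 returns the sample at 160 for series "a" and the sample at 100 for series "b"; evaluating at time 500 drops "b" because its only sample is older than the lookback window.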
Range vector
A range vector selector returns, for each time series, all samples within a time range. Range vectors are normally passed to increase, rate, or a similar function:
rate(process_cpu_seconds_total[1m])
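As a rough illustration of what rate does with a range vector, the Python below sums the reset-aware increases of the samples in a window and divides by the time between the first and last sample. This is a simplification: the real rate function also extrapolates to the window boundaries, so its results can differ slightly. The function name and data layout are assumptions.

```python
def simple_rate(samples, start, end):
    """Approximate per-second rate of a counter over [start, end].

    samples is a list of (timestamp_seconds, value) pairs sorted by timestamp.
    Returns None when fewer than two samples fall inside the window.
    """
    window = [(t, v) for t, v in samples if start <= t <= end]
    if len(window) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, curr) in zip(window, window[1:]):
        # A drop means the counter was reset and restarted from zero.
        increase += (curr - prev) if curr >= prev else curr
    return increase / (window[-1][0] - window[0][0])
```

For example, samples (0, 0.0), (30, 3.0), (60, 6.0) over the window 0 to 60 give an increase of 6 over 60 seconds, so a rate of 0.1 per second.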
Offset
The offset modifier shifts the evaluation time of a selector into the past. It must directly follow the selector, inside the function call:
rate(process_cpu_seconds_total[1m] offset 1h)
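Conceptually, an offset just moves the point in time at which samples are looked up. The Python sketch below shows this for a single series; the function name and data are hypothetical, and the lookback window is ignored for brevity.

```python
def latest_at(samples, eval_time, offset=0.0):
    """Latest sample at or before (eval_time - offset), or None if there is none.

    samples is a list of (timestamp_seconds, value) pairs sorted by timestamp.
    """
    shifted = eval_time - offset
    older = [(t, v) for t, v in samples if t <= shifted]
    return older[-1] if older else None


# Hypothetical series with one sample per hour.
hourly = [(0, 1.0), (3600, 2.0), (7200, 3.0)]
```

Evaluated at time 7200, latest_at(hourly, 7200) returns the newest sample, while latest_at(hourly, 7200, offset=3600) returns the sample from one hour earlier.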