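Quick Start with Prometheus on CentOS 7

Note

Because the production environment still runs RHEL 7 (stability comes first), this page builds on Prometheus Quick Start (Ubuntu Linux 22.04 LTS) and deploys Prometheus monitoring again, this time on the aging enterprise operating system CentOS 7.

Installation

The official Prometheus website provides downloads for several platforms (macOS, Linux, Windows):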

  • prometheus

  • alertmanager

  • various exporters

Installing Prometheus on CentOS 7

Add the prometheus user to the operating system
sudo groupadd --system prometheus
sudo useradd -s /sbin/nologin --system -g prometheus prometheus

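useradd --system allocates IDs downward from the SYS_UID_MAX value in /etc/login.defs. On this CentOS 7 host the system account IDs are allocated below 500, so the prometheus account received uid/gid 499.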
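To see which range useradd --system draws from on a particular host, something like the following can help. It is a minimal sketch that assumes the standard shadow-utils keys SYS_UID_MIN and SYS_UID_MAX in /etc/login.defs; the function name sys_uid_range is only illustrative.

```shell
#!/usr/bin/env bash
# Print "<SYS_UID_MIN> <SYS_UID_MAX>" as configured in a login.defs-style file.
# sys_uid_range is a hypothetical helper name; the path defaults to /etc/login.defs.
sys_uid_range() {
    local defs="${1:-/etc/login.defs}"
    # Commented-out keys are ignored because their first field starts with "#".
    awk '
        $1 == "SYS_UID_MIN" { min = $2 }
        $1 == "SYS_UID_MAX" { max = $2 }
        END { print min, max }
    ' "$defs"
}
```

Running sys_uid_range and then id prometheus shows whether the assigned uid falls inside the configured range.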

  • Create the configuration and data directories:

Create the prometheus directories (under /home)
sudo mkdir /home/prometheus
for i in rules rules.d files_sd; do sudo mkdir -p /etc/prometheus/${i}; done
  • Download the latest Prometheus binaries:

Install Prometheus on CentOS 7
mkdir -p /tmp/prometheus && cd /tmp/prometheus
curl -s https://api.github.com/repos/prometheus/prometheus/releases/latest | grep browser_download_url | grep linux-amd64 | cut -d '"' -f 4 | wget -qi -
tar xvf prometheus*.tar.gz
cd prometheus*/
sudo mv prometheus promtool /usr/local/bin/
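Before moving the binaries into place, it may be worth checking the tarball against a checksum list. The sketch below assumes a sha256sums-style file (for example the sha256sums.txt that accompanies the release, downloaded separately) sits next to the tarball; verify_sha256 is a hypothetical helper, not part of Prometheus.

```shell
#!/usr/bin/env bash
# Verify one downloaded file against a sha256sums-style list.
# verify_sha256 is a hypothetical helper: $1 is the tarball, $2 the checksum list.
# Returns 0 on match, 1 on mismatch, 2 when the file is not listed.
verify_sha256() {
    local file="$1" sums="$2"
    local name expected actual
    name="$(basename "$file")"
    # Take the recorded hash for this file name (first column of the matching line).
    expected="$(awk -v n="$name" '$2 == n || $2 == "*" n { print $1; exit }' "$sums")"
    [ -n "$expected" ] || { echo "no checksum listed for $name" >&2; return 2; }
    actual="$(sha256sum "$file" | awk '{ print $1 }')"
    [ "$expected" = "$actual" ]
}
```

A call such as verify_sha256 prometheus-*.linux-amd64.tar.gz sha256sums.txt && echo OK would then gate the tar and mv steps.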

Configuring Prometheus and running it under systemd

  • The unpacked Prometheus package directory contains a sample configuration and the console libraries:

Basic configuration
sudo mkdir -p /etc/prometheus
sudo mv consoles/ console_libraries/ /etc/prometheus/
sudo mv prometheus.yml /etc/prometheus/prometheus.yml
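The mv commands above stop with an error if any listed directory is absent from the unpacked tarball. As a hedged alternative, assuming some release tarballs may not ship consoles/ or console_libraries/ (check the release you actually downloaded), the following sketch moves only what exists; move_if_present is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Move each listed entry into a destination directory, skipping entries that do not exist.
# move_if_present is a hypothetical helper: $1 is the destination, the rest are entries.
# Run it as root when the destination is /etc/prometheus.
move_if_present() {
    local dest="$1"
    shift
    local entry
    mkdir -p "$dest"
    for entry in "$@"; do
        if [ -e "$entry" ]; then
            mv "$entry" "$dest"/
        else
            echo "skipping missing entry: $entry" >&2
        fi
    done
}
```

From inside the unpacked directory, move_if_present /etc/prometheus consoles console_libraries prometheus.yml covers the same three entries as the commands above.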
  • Create the systemd service unit file for Prometheus, /etc/systemd/system/prometheus.service:

systemd service unit file for Prometheus: /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
# systemd 219 on CentOS 7 reads the start rate limit from [Service], spelled StartLimitInterval=
StartLimitInterval=500
StartLimitBurst=5
User=prometheus
Group=prometheus
Type=simple
Restart=on-failure
RestartSec=5s
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/home/data/prometheus \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.listen-address=0.0.0.0:9090 \
  --web.enable-lifecycle

[Install]
WantedBy=multi-user.target

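Note

This deployment stores Prometheus data in /home/data/prometheus, so that directory must exist and belong to the prometheus user before the service can run:

sudo mkdir -p /home/data/prometheus
sudo chown prometheus:prometheus /home/data/prometheus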
  • Start the service:

Start Prometheus
sudo systemctl daemon-reload
sudo systemctl enable --now prometheus
sudo systemctl status prometheus
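systemctl status only shows that the process is running. To confirm that Prometheus is actually answering HTTP requests, a small polling loop against its readiness endpoint can be used. This is a sketch: wait_ready is a hypothetical helper, and the default of 10 attempts with a 2 second delay is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Poll a URL until it answers successfully or the attempts run out.
# wait_ready is a hypothetical helper: $1 URL, $2 attempts (default 10), $3 delay in seconds (default 2).
wait_ready() {
    local url="$1" tries="${2:-10}" delay="${3:-2}"
    local i=0
    while [ "$i" -lt "$tries" ]; do
        # -f makes curl return non-zero on HTTP errors; -s silences progress output.
        if curl -fs -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

With the unit file above, wait_ready http://localhost:9090/-/ready && echo "Prometheus is ready" is a reasonable first check after starting the service.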

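Warning

If Cockpit (the unified server management platform) is enabled on the system, Prometheus fails to start because of a port conflict: Cockpit also listens on port 9090 by default. Change Cockpit's listening port and address first (I set the port to 9091).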
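A quick way to tell whether something already holds the port is to inspect the listening sockets. The sketch below parses ss -ltn output from standard input; port_listed is a hypothetical helper, and it assumes the usual column layout in which the fourth column is the local address and port.

```shell
#!/usr/bin/env bash
# Read `ss -ltn` output on stdin and succeed if some socket listens on the given port.
# port_listed is a hypothetical helper: $1 is the port number.
port_listed() {
    local port="$1"
    # Column 4 of `ss -ltn` is "Local Address:Port"; match entries ending in ":<port>".
    awk -v p="$port" 'NR > 1 && $4 ~ (":" p "$") { found = 1 } END { exit !found }'
}
```

For example, ss -ltn | port_listed 9090 && echo "port 9090 is already in use". Running sudo ss -ltnp afterwards shows which process owns the socket.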

Reverse proxy and URL

When Prometheus runs behind a reverse proxy and is served under a sub-path, /etc/systemd/system/prometheus.service must also be modified to add the --web.external-url flag; otherwise the reverse proxy reports that the page does not exist.

  • Configure the reverse proxy in /etc/nginx/conf.d/onesre-core.conf:

nginx reverse proxy with Prometheus in sub-path mode: /etc/nginx/conf.d/onesre-core.conf
upstream prometheus {
    server 192.168.8.151:9090;
}

server {
    listen 80;

    server_name onesre onesre.cloud-atlas.io;

    location / {
        include proxy_params;
        proxy_pass http://prometheus;
    }
}
  • Modify /etc/systemd/system/prometheus.service to add the --web.external-url flag:

/etc/systemd/system/prometheus.service with the --web.external-url flag added
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
# systemd 219 on CentOS 7 reads the start rate limit from [Service], spelled StartLimitInterval=
StartLimitInterval=500
StartLimitBurst=5
User=prometheus
Group=prometheus
Type=simple
Restart=on-failure
RestartSec=5s
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/home/data/prometheus \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.listen-address=0.0.0.0:9090 \
  --web.external-url=/prometheus/ \
  --web.enable-lifecycle

[Install]
WantedBy=multi-user.target
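When --web.external-url carries a path and no separate route prefix is configured, Prometheus serves its own endpoints under that path as well. The sketch below only builds the resulting URLs so they can be handed to curl; prefixed_url is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Join a base URL, the --web.external-url path prefix, and an endpoint with single slashes.
# prefixed_url is a hypothetical helper: $1 base URL, $2 path prefix, $3 endpoint.
prefixed_url() {
    local base="${1%/}" prefix="$2" endpoint="${3#/}"
    # Strip leading and trailing slashes so "/prometheus/" and "prometheus" behave alike.
    prefix="${prefix#/}"
    prefix="${prefix%/}"
    if [ -n "$prefix" ]; then
        printf '%s/%s/%s\n' "$base" "$prefix" "$endpoint"
    else
        printf '%s/%s\n' "$base" "$endpoint"
    fi
}
```

After restarting the service, curl -fs "$(prefixed_url http://localhost:9090 /prometheus/ /-/healthy)" should answer, while the same endpoint without the prefix is expected to return a 404.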

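Note that the built-in prometheus job must now also include the sub-path in its metrics path; otherwise Prometheus cannot scrape its own metrics:

  • Modify /etc/prometheus/prometheus.yml:

Scrape path adjusted to match the --web.external-url flag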
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    metrics_path: /prometheus/metrics

    static_configs:
      - targets: ["localhost:9090"]
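Since the service runs with --web.enable-lifecycle, an edited prometheus.yml can be validated and then reloaded over HTTP without restarting the process. The following is a sketch under that assumption; check_and_reload is a hypothetical helper, while promtool check config and the /-/reload endpoint come from Prometheus itself.

```shell
#!/usr/bin/env bash
# Validate a config with promtool, then ask a running Prometheus to reload it.
# check_and_reload is a hypothetical helper: $1 config path, $2 base URL including any sub-path.
check_and_reload() {
    local cfg="$1" base="${2%/}"
    # Stop here if the configuration does not pass validation.
    promtool check config "$cfg" || return 1
    # POST /-/reload is only served when Prometheus runs with --web.enable-lifecycle.
    curl -fs -X POST "$base/-/reload"
}
```

For the sub-path setup above this would be check_and_reload /etc/prometheus/prometheus.yml http://localhost:9090/prometheus. Changes to the unit file itself, such as adding --web.external-url, still need sudo systemctl daemon-reload followed by sudo systemctl restart prometheus, because the reload endpoint only re-reads the configuration file.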

Installing companion exporters

My main goal is HPE server monitoring, so the next step is to install the following components:

References