Prometheus Configuration (Files)
Prometheus uses two configuration files:

prometheus.yaml
: the main configuration file, containing all scrape configs, service discovery details, storage locations, data retention settings, and so on

prometheus.rules
: contains all alerting rules
By externalizing the Prometheus configuration into a Kubernetes ConfigMap, there is no need to rebuild the Prometheus image when adding or removing configuration; just update the ConfigMap and restart the Prometheus pods to pick up the new configuration.
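A minimal sketch of that workflow, assuming the configuration lives in a ConfigMap named prometheus-server-conf in a monitoring namespace and the pods are managed by a Deployment named prometheus-server (all three names are hypothetical; adjust to your deployment):

# Re-create the ConfigMap from the edited files and apply it idempotently
kubectl -n monitoring create configmap prometheus-server-conf \
    --from-file=prometheus.yaml --from-file=prometheus.rules \
    --dry-run=client -o yaml | kubectl apply -f -

# Restart the Prometheus pods so they load the new configuration
kubectl -n monitoring rollout restart deployment prometheus-server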
Prometheus scrape configs follow the standard upstream format.
In Deploying Prometheus and Grafana with integrated GPU monitoring on the Kubernetes cluster (z-k8s) and Integrating GPU observability into Kubernetes (intergrate_gpu_telemetry_into_k8s), a scrape job was added through Helm. So where does that config map live?
# AdditionalScrapeConfigs allows specifying additional Prometheus scrape configurations. Scrape configurations
# are appended to the configurations generated by the Prometheus Operator. Job configurations must have the form
# as specified in the official Prometheus documentation:
# https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config. As scrape configs are
# appended, the user is responsible to make sure it is valid. Note that using this feature may expose the possibility
# to break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible
# scrape configs are going to break Prometheus after the upgrade.
#
# The scrape configuration example below will find master nodes, provided they have the name .*mst.*, relabel the
# port to 2379 and allow etcd scraping provided it is running on all Kubernetes master nodes
#
additionalScrapeConfigs:
- job_name: gpu-metrics
  scrape_interval: 1s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - gpu-operator
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_node_name]
    action: replace
    target_label: kubernetes_node
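To apply such a block, the usual route is to put it in a Helm values file and upgrade the release. A sketch, assuming the release name kube-prometheus-stack-1680 (inferred from the StatefulSet naming later on this page) and the prometheus-community chart:

# values-gpu.yaml contains the additionalScrapeConfigs block above
helm upgrade kube-prometheus-stack-1680 prometheus-community/kube-prometheus-stack \
    -n prometheus -f values-gpu.yaml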
In the scrape config above, the namespace is gpu-operator, so check which ConfigMaps exist there:
kubectl get cm -A | grep gpu-operator
Output:
gpu-operator 53822513.nvidia.com 0 88d
gpu-operator default-gpu-clients 1 88d
gpu-operator default-mig-parted-config 1 88d
gpu-operator gpu-operator-1673526262-node-feature-discovery-worker-conf 1 88d
gpu-operator istio-ca-root-cert 1 88d
gpu-operator kube-root-ca.crt 1 88d
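None of these ConfigMaps contains the scrape configuration itself. The prometheus-operator normally renders the final configuration into a Secret rather than a ConfigMap, so it is also worth checking Secrets (a sketch; names vary by release):

kubectl -n prometheus get secret | grep prometheus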
From the Prometheus web UI, select the menu Status >> Configuration to see the section added in Deploying Prometheus and Grafana with integrated GPU monitoring on the Kubernetes cluster (z-k8s) and Integrating GPU observability into Kubernetes:
- job_name: gpu-metrics
  honor_timestamps: true
  scrape_interval: 1s
  scrape_timeout: 1s
  metrics_path: /metrics
  scheme: http
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_node_name]
    separator: ;
    regex: (.*)
    target_label: kubernetes_node
    replacement: $1
    action: replace
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - default
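To confirm the job is actually scraping, one can port-forward to the Prometheus service and list the active scrape pools via the HTTP API. A sketch, assuming the service is named kube-prometheus-stack-1680-prometheus:

kubectl -n prometheus port-forward svc/kube-prometheus-stack-1680-prometheus 9090:9090 &
curl -s http://localhost:9090/api/v1/targets | grep -o '"scrapePool":"[^"]*"' | sort -u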
Note that the rendered config above discovers endpoints in the default namespace rather than gpu-operator. So how exactly should this configuration be revised?
I noticed that running the following in the prometheus namespace:

kubectl -n prometheus get all

shows:
...
NAME READY AGE
statefulset.apps/alertmanager-kube-prometheus-stack-1680-alertmanager 1/1 3d3h
statefulset.apps/prometheus-kube-prometheus-stack-1680-prometheus 1/1 3d3h
In other words, the prometheus-kube-prometheus-stack-1680-prometheus pod is actually managed by a StatefulSet, so inspect it:
kubectl -n prometheus get statefulset prometheus-kube-prometheus-stack-1680-prometheus -o yaml
The output includes:
...
    volumeMounts:
    - mountPath: /etc/prometheus/config
      name: config
    - mountPath: /etc/prometheus/config_out
      name: config-out
    - mountPath: /etc/prometheus/rules/prometheus-kube-prometheus-stack-1680-prometheus-rulefiles-0
      name: prometheus-kube-prometheus-stack-1680-prometheus-rulefiles-0
  dnsPolicy: ClusterFirst
  initContainers:
  - args:
    - --watch-interval=0
    - --listen-address=:8080
    - --config-file=/etc/prometheus/config/prometheus.yaml.gz
    - --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
    - --watched-dir=/etc/prometheus/rules/prometheus-kube-prometheus-stack-1680-prometheus-rulefiles-0
    command:
    - /bin/prometheus-config-reloader
...
    volumeMounts:
    - mountPath: /etc/prometheus/config
      name: config
    - mountPath: /etc/prometheus/config_out
      name: config-out
    - mountPath: /etc/prometheus/rules/prometheus-kube-prometheus-stack-1680-prometheus-rulefiles-0
      name: prometheus-kube-prometheus-stack-1680-prometheus-rulefiles-0
  restartPolicy: Always
...
  volumes:
...
  - emptyDir:
      medium: Memory
    name: config-out
...
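In a standard prometheus-operator deployment, the config volume that supplies prometheus.yaml.gz is backed by a Secret with the same name as the StatefulSet. Assuming that holds here, the full generated configuration can be read without entering the pod:

kubectl -n prometheus get secret prometheus-kube-prometheus-stack-1680-prometheus \
    -o jsonpath='{.data.prometheus\.yaml\.gz}' | base64 -d | gunzip | less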
Alternatively, exec into the container to inspect:
kubectl exec -it prometheus-kube-prometheus-stack-1680-prometheus-0 -n prometheus -- /bin/sh
Inside the container, under the /etc/prometheus/config_out directory you can find the generated configuration file prometheus.env.yaml, which contains the gpu-metrics job:
- job_name: gpu-metrics
  kubernetes_sd_configs:
  - namespaces:
      names:
      - gpu-operator
    role: endpoints
  metrics_path: /metrics
  relabel_configs:
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_node_name
    target_label: kubernetes_node
  scheme: http
  scrape_interval: 1s
So, for a cluster where DCGM-Exporter is already deployed, how do we get this snippet into prometheus.env.yaml?
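Since prometheus.env.yaml is regenerated by the config-reloader, editing it in place will not stick. The prometheus-operator's supported route is the additionalScrapeConfigs field of the Prometheus custom resource, which references a Secret. A sketch, assuming the Prometheus CR is named kube-prometheus-stack-1680-prometheus and the gpu-metrics job above is saved as prometheus-additional.yaml:

# Create (or update) the Secret holding the extra scrape config
kubectl -n prometheus create secret generic additional-scrape-configs \
    --from-file=prometheus-additional.yaml \
    --dry-run=client -o yaml | kubectl apply -f -

# Point the Prometheus CR at that Secret; the operator re-renders the config
kubectl -n prometheus patch prometheus kube-prometheus-stack-1680-prometheus \
    --type merge \
    -p '{"spec":{"additionalScrapeConfigs":{"name":"additional-scrape-configs","key":"prometheus-additional.yaml"}}}'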
Looking again at the prometheus-kube-prometheus-stack-1680-prometheus StatefulSet YAML, note the volume mount:
- mountPath: /etc/prometheus/config_out
  name: config-out
...
volumes:
...
- emptyDir:
    medium: Memory
  name: config-out
Strangely, why use a tmpfs in-memory filesystem for this volume? Presumably because prometheus.env.yaml is produced by environment-variable substitution and may therefore contain credentials, so keeping it on tmpfs avoids persisting secrets to the node's disk.
Logging in to z-k8s-n-2, the node where prometheus-kube-prometheus-stack-1680-prometheus is scheduled, and running:
df -h | grep config-out
indeed shows a tmpfs mount:
tmpfs 7.6G 36K 7.6G 1% /var/lib/kubelet/pods/74ff8d0b-baa1-4cf3-b2f1-dcc2e47b6925/volumes/kubernetes.io~empty-dir/config-out
Inside this /var/lib/kubelet/pods/74ff8d0b-baa1-4cf3-b2f1-dcc2e47b6925/volumes/kubernetes.io~empty-dir/config-out directory there is a prometheus.env.yaml file.
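For example (on z-k8s-n-2, with root privileges):

sudo head -n 20 /var/lib/kubelet/pods/74ff8d0b-baa1-4cf3-b2f1-dcc2e47b6925/volumes/kubernetes.io~empty-dir/config-out/prometheus.env.yaml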