Deploying Prometheus and Grafana with GPU monitoring on a Kubernetes cluster (z-k8s)

The Prometheus community provides the kube-prometheus-stack helm chart, a complete set of Kubernetes manifests that bundles Grafana (a general-purpose visualization and analytics platform) dashboards and Prometheus rules together with documentation and scripts, making deployment via the Prometheus Operator straightforward. For GPU node monitoring, however, I recommend making a few revisions at deployment time (described in this article) so that everything can be set up in one pass. Alternatively, you can first complete "Deploying Prometheus and Grafana on a Kubernetes cluster with Helm 3" and then follow "Updating the Prometheus configuration of a Kubernetes cluster".

helm3

  • helm makes deployment easy:

Install helm using the official script
curl -LO https://git.io/get_helm.sh
chmod 700 get_helm.sh
./get_helm.sh
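
After the script finishes, a quick sanity check confirms Helm 3 is on the PATH:

Verify the helm installation
helm version --short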

Install the NVIDIA GPU Operator

This section is still to be organized.
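
Until it is, here is a sketch of the usual helm-based GPU Operator installation as documented by NVIDIA (the gpu-operator namespace matches the scrape configuration used later in this article; adjust chart options for your cluster):

Install the NVIDIA GPU Operator with helm (sketch)
# add NVIDIA's helm repository and install the operator into its own namespace
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install --wait gpu-operator nvidia/gpu-operator \
   --create-namespace --namespace gpu-operator

The operator deploys DCGM Exporter on the GPU nodes, which is what the gpu-metrics scrape job below collects from.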

Install Prometheus and Grafana

helm configuration

  • Add the Prometheus community helm chart repository:

Add the Prometheus community helm chart repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  • NVIDIA's guide adjusts a few parameters of the community chart, so first export the chart's default values (so they can be revised), as shown below:
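
A sketch of the export step (the output path /tmp/kube-prometheus-stack.values is the file passed to helm install later):

Export the default chart values
helm repo update
helm inspect values prometheus-community/kube-prometheus-stack > /tmp/kube-prometheus-stack.values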

helm inspect values output: the Prometheus Stack chart values
prometheus:

  ## Configuration for Prometheus service
  ##
  service:
    annotations: {}
    labels: {}
    clusterIP: ""

    ## Port for Prometheus Service to listen on
    ##
    port: 9090

    ## To be used with a proxy extraContainer port
    targetPort: 9090

    ## List of IP addresses at which the Prometheus server service is available
    ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
    ##
    externalIPs: []

    ## Port to expose on each node
    ## Only used if service.type is 'NodePort'
    ##
    nodePort: 30090

    ## Loadbalancer IP
    ## Only use if service.type is "LoadBalancer"
    loadBalancerIP: ""
    loadBalancerSourceRanges: []

    ## Denotes if this Service desires to route external traffic to node-local or cluster-wide endpoints
    ##
    externalTrafficPolicy: Cluster

    ## Service type
    ##
    type: NodePort

...

grafana:

  ## Passed to grafana subchart and used by servicemonitor below
  ##
  service:
    portName: http-web
    nodePort: 30080
    type: NodePort

...

alertmanager:

  ## Deploy alertmanager
  ##
  enabled: true
  ...
  ## Configuration for Alertmanager service
  ##
  service:
    annotations: {}
    labels: {}
    clusterIP: ""

    ## Port for Alertmanager Service to listen on
    ##
    port: 9093
    ## To be used with a proxy extraContainer port
    ##
    targetPort: 9093
    ## Port to expose on each node
    ## Only used if service.type is 'NodePort'
    ##
    nodePort: 30903
    ## List of IP addresses at which the Prometheus server service is available
    ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
    ##
    ...
    ## Service type
    ##
    type: NodePort
  • Revision 1: expose the metrics port 30090 as a NodePort on every node (in practice only the type: ClusterIP line needs to be changed to type: NodePort; it is recommended to do this for the svc of stable-grafana (helm install also accepts --set grafana.service.type=NodePort, and adding nodePort pins the 80/30080 mapping), alertmanager (9093/30903), and prometheus (9090/30090)):

Revise the service type to NodePort
prometheus:

  ## Configuration for Prometheus service
  ##
  service:
    annotations: {}
    labels: {}
    clusterIP: ""

    ## Port for Prometheus Service to listen on
    ##
    port: 9090

    ## To be used with a proxy extraContainer port
    targetPort: 9090

    ## List of IP addresses at which the Prometheus server service is available
    ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
    ##
    externalIPs: []

    ## Port to expose on each node
    ## Only used if service.type is 'NodePort'
    ##
    nodePort: 30090

    ## Loadbalancer IP
    ## Only use if service.type is "LoadBalancer"
    loadBalancerIP: ""
    loadBalancerSourceRanges: []

    ## Denotes if this Service desires to route external traffic to node-local or cluster-wide endpoints
    ##
    externalTrafficPolicy: Cluster

    ## Service type
    ##
    type: NodePort

...

grafana:

  ## Passed to grafana subchart and used by servicemonitor below
  ##
  service:
    portName: http-web
    nodePort: 30080
    type: NodePort

...

alertmanager:

  ## Deploy alertmanager
  ##
  enabled: true
  ...
  ## Configuration for Alertmanager service
  ##
  service:
    annotations: {}
    labels: {}
    clusterIP: ""

    ## Port for Alertmanager Service to listen on
    ##
    port: 9093
    ## To be used with a proxy extraContainer port
    ##
    targetPort: 9093
    ## Port to expose on each node
    ## Only used if service.type is 'NodePort'
    ##
    nodePort: 30903
    ## List of IP addresses at which the Prometheus server service is available
    ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
    ##
    ...
    ## Service type
    ##
    type: NodePort

Note

I initially could not find where to change grafana service.type in kube-prometheus-stack.values. I later found it can be done by passing --set grafana.service.type=NodePort; on a closer reading of the values it turns out this is simply not configured by default, so it has to be added manually.

Other revisions:

defaultDashboardsTimezone: Asia/Shanghai
  • Revision 2: set prometheusSpec.serviceMonitorSelectorNilUsesHelmValues to false:

Set prometheusSpec.serviceMonitorSelectorNilUsesHelmValues to false
# If true, a nil or {} value for prometheus.prometheusSpec.serviceMonitorSelector will cause the
# prometheus resource to be created with selectors based on values in the helm deployment,
# which will also match the servicemonitors created
#
serviceMonitorSelectorNilUsesHelmValues: false
  • Revision 3: add a gpu-metrics job to additionalScrapeConfigs in the configuration (a quick verification sketch follows the block):

Add gpu-metrics to additionalScrapeConfigs
# AdditionalScrapeConfigs allows specifying additional Prometheus scrape configurations. Scrape configurations
# are appended to the configurations generated by the Prometheus Operator. Job configurations must have the form
# as specified in the official Prometheus documentation:
# https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config. As scrape configs are
# appended, the user is responsible to make sure it is valid. Note that using this feature may expose the possibility
# to break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible
# scrape configs are going to break Prometheus after the upgrade.
#
# The scrape configuration example below will find master nodes, provided they have the name .*mst.*, relabel the
# port to 2379 and allow etcd scraping provided it is running on all Kubernetes master nodes
#
additionalScrapeConfigs:
- job_name: gpu-metrics
  scrape_interval: 1s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - gpu-operator
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_node_name]
    action: replace
    target_label: kubernetes_node
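
After kube-prometheus-stack is installed (see the Deployment section below), one way to verify that the gpu-metrics job was picked up is to query the Prometheus targets API through a temporary port-forward (a sketch; svc/prometheus-operated appears in the service listing later in this article):

Verify the gpu-metrics scrape job (sketch)
kubectl --namespace prometheus port-forward svc/prometheus-operated 9090:9090 &
curl -s http://localhost:9090/api/v1/targets | grep -c gpu-metrics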

Prepare storage

kube-prometheus-stack-pv.yaml: hostPath-backed PersistentVolumes (see "Deploying hostPath storage in Kubernetes")
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kube-prometheus-stack-pv
  labels:
    type: local
spec:
  storageClassName: prometheus-data
  capacity:
    storage: 400Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/prometheus/data"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kube-prometheus-stack-pv-alert
  labels:
    type: local
spec:
  storageClassName: prometheus-data-alert
  capacity:
    storage: 400Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/prometheus/data"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kube-prometheus-stack-pv-thanos
  labels:
    type: local
spec:
  storageClassName: prometheus-data-thanos
  capacity:
    storage: 400Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/prometheus/data"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kube-prometheus-stack-pv-grafana
  labels:
    type: local
spec:
  storageClassName: prometheus-data-grafana
  capacity:
    storage: 400Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/prometheus/data/grafana-db"

Note

You only need to create the PVs; the kube-prometheus-stack values.yaml provides the PVC configuration, and the PVCs are created automatically, as sketched below.
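
For reference, the relevant stanza in kube-prometheus-stack.values looks roughly like this (a sketch; the storageClassName must match the PV created above):

PVC configuration in kube-prometheus-stack.values (sketch)
prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: prometheus-data
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 400Gi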

  • Apply the PV manifest:

Create the kube-prometheus-stack PVs
kubectl apply -f kube-prometheus-stack-pv.yaml

Deploy

  • Run the deployment, using the customized values:

Install kube-prometheus-stack with the customized helm chart values (passing custom storage parameters via --set did not work; the correct approach is the "kube-prometheus-stack persistent volumes" method)
helm install prometheus-community/kube-prometheus-stack \
   --create-namespace --namespace prometheus \
   --generate-name \
   --values /tmp/kube-prometheus-stack.values
   #--set=alertmanager.persistentVolume.existingClaim=kube-prometheus-stack-pvc,server.persistentVolume.existingClaim=kube-prometheus-stack-pvc,grafana.persistentVolume.existingClaim=kube-prometheus-stack-pvc

Note

The persistent storage solution described in "kube-prometheus-stack persistent volumes" has been verified to work.

Output:

Output of installing kube-prometheus-stack with the customized helm chart values
NAME: kube-prometheus-stack-1680871060
LAST DEPLOYED: Fri Apr  7 20:38:00 2023
NAMESPACE: prometheus
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace prometheus get pods -l "release=kube-prometheus-stack-1680871060"

Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.

Note

When deploying on a production cluster, I hit the following error:

Error: INSTALLATION FAILED: unable to build kubernetes objects from release manifest: error validating "": error validating data: ValidationError(ServiceMonitor.spec.endpoints[0]): unknown field "enableHttp2" in com.coreos.monitoring.v1.ServiceMonitor.spec.endpoints

This is similar to the issue "prometheus-kube-stack helm install results in unknown field "enableHttp2" #2633":

Found same error upgrading from old Prometheus installation.
Solution: uninstall prometheus, delete CRDs and install again.
https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack#uninstall-helm-chart

The cause was that an earlier prometheus-stack installation had been interrupted with ctrl-c, after which I ran helm uninstall to remove the release. According to the documentation, however, the CRDs are not cleaned up automatically, which likely caused the conflict. The monitoring CRDs need to be removed manually (an equivalent one-liner follows the list):

kubectl delete crd alertmanagerconfigs.monitoring.coreos.com
kubectl delete crd alertmanagers.monitoring.coreos.com
kubectl delete crd podmonitors.monitoring.coreos.com
kubectl delete crd probes.monitoring.coreos.com
kubectl delete crd prometheuses.monitoring.coreos.com
kubectl delete crd prometheusrules.monitoring.coreos.com
kubectl delete crd servicemonitors.monitoring.coreos.com
kubectl delete crd thanosrulers.monitoring.coreos.com
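
An equivalent one-liner (a sketch) that removes all of the Prometheus Operator CRDs in one go:

Delete all monitoring.coreos.com CRDs
kubectl get crd -o name | grep monitoring.coreos.com | xargs kubectl delete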

Note

When deploying on a production cluster, I also ran into a scheduling failure:

kubectl --namespace prometheus get pods kube-prometheus-stack-1680962838-prometheus-node-exporter-5kk5q -o yaml
...
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-04-08T14:07:36Z"
    message: '0/12 nodes are available: 12 node(s) didn''t have free ports for the
      requested pod ports.'
    reason: Unschedulable

The cause was that this Kubernetes cluster runs on Alibaba Cloud with the managed Alibaba Cloud Prometheus monitoring already purchased, so a node-exporter was already running on every node and the port was already taken. The fix is to revise the custom values file kube-prometheus-stack.values from above:

...
## Deploy node exporter as a daemonset to all nodes
##
nodeExporter:
  enabled: false

Then redeploy. (In practice other problems remained, so I gave up on that approach.)

Note

If prometheus-stack is already deployed and you need to add DCGM-Exporter metrics collection, make the change via "Updating the Prometheus configuration of a Kubernetes cluster".

Note

When deploying from inside the GFW you will run into image download problems; pull the images elsewhere and import them onto the target nodes:

# pull
docker pull registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6
docker pull registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.8.2

# export
docker save -o kube-webhook-certgen.tar registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6
docker save -o kube-state-metrics.tar registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.8.2

# import
nerdctl -n k8s.io load < /tmp/kube-webhook-certgen.tar
nerdctl -n k8s.io load < /tmp/kube-state-metrics.tar
  • To pin the monitoring components to a designated monitoring node, use a node label:

    kubectl label nodes i-0jl8d8r83kkf3yt5lzh7 telemetry=prometheus
    

Then revise the deployments one by one, e.g. kubectl edit deployment stable-grafana, adding a nodeSelector under spec.template.spec:

spec:
  template:
    spec:
      nodeSelector:
        telemetry: prometheus
      containers:
      ...
  • Check the pods deployed in the prometheus namespace:

Check the kube-prometheus-stack pods
kubectl --namespace prometheus get pods -l "release=kube-prometheus-stack-1680871060"

The output looks similar to:

Output of checking the kube-prometheus-stack pods
NAME                                                              READY   STATUS    RESTARTS   AGE
kube-prometheus-stack-1680-operator-df66d5c4c-8jqzj               1/1     Running   0          3m59s
kube-prometheus-stack-1680871060-kube-state-metrics-865958g6ffz   1/1     Running   0          3m59s
kube-prometheus-stack-1680871060-prometheus-node-exporter-6nwkp   1/1     Running   0          3m59s
kube-prometheus-stack-1680871060-prometheus-node-exporter-6rk88   1/1     Running   0          3m59s
kube-prometheus-stack-1680871060-prometheus-node-exporter-7jx92   1/1     Running   0          3m59s
kube-prometheus-stack-1680871060-prometheus-node-exporter-dkqqs   1/1     Running   0          3m59s
kube-prometheus-stack-1680871060-prometheus-node-exporter-dqmfc   1/1     Running   0          3m59s
kube-prometheus-stack-1680871060-prometheus-node-exporter-h2rdq   1/1     Running   0          3m59s
kube-prometheus-stack-1680871060-prometheus-node-exporter-h44wr   1/1     Running   0          3m59s
kube-prometheus-stack-1680871060-prometheus-node-exporter-t655c   1/1     Running   0          3m59s

Checking the deployed Prometheus pods shows that node-exporter is running on every node and that Prometheus and Grafana are up (note: everything lives in the prometheus namespace).

Note

If you run into images that cannot be downloaded, see my hands-on notes in "Deploying Prometheus and Grafana on a Kubernetes cluster with Helm 3".

Exposing the services

  • Check the services (svc):

Check the deployed services with kubectl get svc
kubectl get svc -n prometheus

Output:

kubectl get svc output: the grafana service is still ClusterIP and needs to be revised
NAME                                                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
alertmanager-operated                                       ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   14m
kube-prometheus-stack-1680-alertmanager                     NodePort    10.106.70.4      <none>        9093:30903/TCP               15m
kube-prometheus-stack-1680-operator                         ClusterIP   10.107.104.10    <none>        443/TCP                      15m
kube-prometheus-stack-1680-prometheus                       NodePort    10.101.120.210   <none>        9090:30090/TCP               15m
kube-prometheus-stack-1680871060-grafana                    ClusterIP   10.99.214.112    <none>        80/TCP                       15m
kube-prometheus-stack-1680871060-kube-state-metrics         ClusterIP   10.108.43.250    <none>        8080/TCP                     15m
kube-prometheus-stack-1680871060-prometheus-node-exporter   ClusterIP   10.110.33.129    <none>        9100/TCP                     15m
prometheus-operated                                         ClusterIP   None             <none>        9090/TCP                     14m

By default the prometheus and grafana services are ClusterIP and only reachable inside the cluster, so external access requires either a load balancer / Ingress (see "Kubernetes Load Balancer vs. Ingress") or NodePort (simpler). Above I followed NVIDIA's official deployment documentation and changed alertmanager and prometheus to NodePort, but not grafana, so below I manually switch grafana to NodePort as well.

  • Edit the stable-grafana service, changing type from ClusterIP to NodePort (or LoadBalancer); a non-interactive alternative is sketched after the command:

kubectl edit svc to change the ClusterIP type to NodePort
kubectl edit svc kube-prometheus-stack-1680871060-grafana -n prometheus
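
Alternatively, the same change can be made non-interactively with kubectl patch (a sketch using the service name from the listing above):

kubectl patch svc to switch the type to NodePort
kubectl patch svc kube-prometheus-stack-1680871060-grafana -n prometheus \
  -p '{"spec": {"type": "NodePort"}}'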

The final svc listing is:

kubectl get svc output: NodePort services are exposed on the nodes running them
NAME                                                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
alertmanager-operated                                       ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   166m
kube-prometheus-stack-1680-alertmanager                     NodePort    10.106.70.4      <none>        9093:30903/TCP               166m
kube-prometheus-stack-1680-operator                         ClusterIP   10.107.104.10    <none>        443/TCP                      166m
kube-prometheus-stack-1680-prometheus                       NodePort    10.101.120.210   <none>        9090:30090/TCP               166m
kube-prometheus-stack-1680871060-grafana                    NodePort    10.99.214.112    <none>        80:32427/TCP                 166m
kube-prometheus-stack-1680871060-kube-state-metrics         ClusterIP   10.108.43.250    <none>        8080/TCP                     166m
kube-prometheus-stack-1680871060-prometheus-node-exporter   ClusterIP   10.110.33.129    <none>        9100/TCP                     166m
prometheus-operated                                         ClusterIP   None             <none>        9090/TCP                     166m

However, the externally exposed port is random, which is inconvenient. As a temporary workaround I use an Nginx reverse proxy to pin a fixed external port and forward it to the random NodePort, which at least makes the services usable for now.

Port forwarding

Note

In my earlier work "Integrating GPU observability into Kubernetes" I used an Nginx reverse proxy in front of Grafana and ran into the "running Grafana behind a reverse proxy" problem: newer Grafana versions validate the client request origin and redirect address to block cross-site attacks, so the proxy headers must be set on Nginx, roughly as sketched below.
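
A minimal Nginx sketch of that setup (the listen port 3000 is arbitrary; the upstream address is the grafana NodePort from the service listing above):

Nginx reverse proxy for Grafana (sketch)
server {
    listen 3000;

    location / {
        # Grafana validates the request origin, so pass the original Host header through
        proxy_set_header Host $http_host;
        proxy_pass http://192.168.6.114:32427;
    }
}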

Alternatively, an Apache reverse proxy can be used (I already run an Apache WebDAV server to sync Joplin data over WebDAV).

When Prometheus/Grafana/Alertmanager are exposed via NodePort, the pods can run on any node in the cluster. For external access, a better approach is Kubernetes MetalLB load balancing combined with an Ingress, giving a complete cloud-style network.

However, to get up and running quickly I currently use the simplified NodePort exposure, plus iptables port forwarding on a gateway host (or a simple web reverse proxy) to make the services reachable from outside.

  • Check the NodePorts exposed by prometheus-stack:

Check the services' NodePorts
kubectl get svc -n prometheus | grep NodePort

Output:

Output of checking the services' NodePorts
kube-prometheus-stack-1680-alertmanager                     NodePort    10.106.70.4      <none>        9093:30903/TCP               2d1h
kube-prometheus-stack-1680-prometheus                       NodePort    10.101.120.210   <none>        9090:30090/TCP               2d1h
kube-prometheus-stack-1680871060-grafana                    NodePort    10.99.214.112    <none>        80:32427/TCP                 2d1h
  • Check which nodes the prometheus-stack pods landed on:

Check which nodes the prometheus pods are running on
kubectl get pods -n prometheus -o wide

The output shows:

Output: the prometheus pods are spread across 3 nodes
NAME                                                              READY   STATUS    RESTARTS       AGE    IP              NODE        NOMINATED NODE   READINESS GATES
alertmanager-kube-prometheus-stack-1680-alertmanager-0            2/2     Running   1 (2d2h ago)   2d2h   10.0.5.28       z-k8s-n-3   <none>           <none>
kube-prometheus-stack-1680-operator-df66d5c4c-8jqzj               1/1     Running   0              2d2h   10.0.4.178      z-k8s-n-2   <none>           <none>
kube-prometheus-stack-1680871060-grafana-6f5c7cb5-k2kw9           3/3     Running   0              2d2h   10.0.7.107      z-k8s-n-4   <none>           <none>
kube-prometheus-stack-1680871060-kube-state-metrics-865958g6ffz   1/1     Running   0              2d2h   10.0.7.187      z-k8s-n-4   <none>           <none>
kube-prometheus-stack-1680871060-prometheus-node-exporter-6nwkp   1/1     Running   0              2d2h   192.168.6.112   z-k8s-n-2   <none>           <none>
...
prometheus-kube-prometheus-stack-1680-prometheus-0                2/2     Running   0              2d2h   10.0.4.242      z-k8s-n-2   <none>           <none>
  • Check the nodes' IP addresses:

Check the nodes' IPs
kubectl get nodes -o wide
Output of checking the nodes' IPs
NAME        STATUS   ROLES           AGE    VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
z-k8s-m-1   Ready    control-plane   266d   v1.25.3   192.168.6.101   <none>        Ubuntu 22.04.2 LTS   5.15.0-69-generic   containerd://1.6.6
z-k8s-m-2   Ready    control-plane   264d   v1.25.3   192.168.6.102   <none>        Ubuntu 22.04.2 LTS   5.15.0-69-generic   containerd://1.6.6
z-k8s-m-3   Ready    control-plane   264d   v1.25.3   192.168.6.103   <none>        Ubuntu 22.04.2 LTS   5.15.0-69-generic   containerd://1.6.6
z-k8s-n-1   Ready    <none>          264d   v1.25.3   192.168.6.111   <none>        Ubuntu 22.04.2 LTS   5.15.0-69-generic   containerd://1.6.6
z-k8s-n-2   Ready    <none>          264d   v1.25.3   192.168.6.112   <none>        Ubuntu 22.04.2 LTS   5.15.0-69-generic   containerd://1.6.6
z-k8s-n-3   Ready    <none>          264d   v1.25.3   192.168.6.113   <none>        Ubuntu 22.04.2 LTS   5.15.0-69-generic   containerd://1.6.6
z-k8s-n-4   Ready    <none>          264d   v1.25.3   192.168.6.114   <none>        Ubuntu 22.04.2 LTS   5.15.0-69-generic   containerd://1.6.6
z-k8s-n-5   Ready    <none>          264d   v1.25.3   192.168.6.115   <none>        Ubuntu 22.04.2 LTS   5.15.0-69-generic   containerd://1.6.6

The mapping, summarized:

prometheus-stack service NodePort mapping

Service         Gateway IP       Gateway Port   Node IP          NodePort
grafana         192.168.106.15   8080           192.168.6.114    32427
prometheus      192.168.106.15   9090           192.168.6.112    30090
alertmanager    192.168.106.15   9093           192.168.6.113    30903

  • Run the following port forwarding script on the gateway host (see the note on IP forwarding after the script):

Port forward the prometheus-stack service ports
local_host=192.168.106.15

dashboard_port=8443
grafana_port=8080
prometheus_port=9090
alertmanager_port=9093

k8s_dashboard_host=172.21.44.215
k8s_dashboard_port=32642

k8s_grafana_host=192.168.6.114
k8s_grafana_port=32427

k8s_prometheus_host=192.168.6.112
k8s_prometheus_port=30090

k8s_alertmanager_host=192.168.6.113
k8s_alertmanager_port=30903

iptables -t nat -A PREROUTING -p tcp --dport ${dashboard_port} -j DNAT --to-destination ${k8s_dashboard_host}:${k8s_dashboard_port}
iptables -t nat -A POSTROUTING -p tcp -d ${k8s_dashboard_host} --dport ${k8s_dashboard_port} -j SNAT --to-source ${local_host}

iptables -t nat -A PREROUTING -p tcp --dport ${grafana_port} -j DNAT --to-destination ${k8s_grafana_host}:${k8s_grafana_port}
iptables -t nat -A POSTROUTING -p tcp -d ${k8s_grafana_host} --dport ${k8s_grafana_port} -j SNAT --to-source ${local_host}

iptables -t nat -A PREROUTING -p tcp --dport ${prometheus_port} -j DNAT --to-destination ${k8s_prometheus_host}:${k8s_prometheus_port}
iptables -t nat -A POSTROUTING -p tcp -d ${k8s_prometheus_host} --dport ${k8s_prometheus_port} -j SNAT --to-source ${local_host}

iptables -t nat -A PREROUTING -p tcp --dport ${alertmanager_port} -j DNAT --to-destination ${k8s_alertmanager_host}:${k8s_alertmanager_port}
iptables -t nat -A POSTROUTING -p tcp -d ${k8s_alertmanager_host} --dport ${k8s_alertmanager_port} -j SNAT --to-source ${local_host}
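
Note that these DNAT/SNAT rules only work if IP forwarding is enabled on the gateway host (and plain iptables rules are not persisted across reboots):

Enable IP forwarding on the gateway host
sysctl -w net.ipv4.ip_forward=1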

Configuration updates

For configuration that needs adjusting later, use the approach from "Updating the Prometheus configuration of a Kubernetes cluster":

Apply changes with helm upgrade prometheus-community/kube-prometheus-stack
helm upgrade kube-prometheus-stack-1681228346 prometheus-community/kube-prometheus-stack \
  --namespace prometheus --values kube-prometheus-stack.values

For example, updating the scrape configuration.

Persistent storage

Default configuration:

By default prometheus stores its data on an ephemeral emptyDir volume (lost when the pod is rescheduled)
...
        volumeMounts:
...
        - mountPath: /prometheus
          name: prometheus-kube-prometheus-stack-1681-prometheus-db
...
      volumes:
      - emptyDir: {}
        name: prometheus-kube-prometheus-stack-1681-prometheus-db

I initially created the storage PV/PVC following "Deploying kube-prometheus-stack with persistent storage on Kubernetes Cluster" above, but used the helm install parameter ``

Access and usage

Access the Grafana dashboard; the initial account is admin and the password is prom-operator. Change it immediately.
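
If the password was customized via values, it can be read back from the grafana secret (a sketch; I assume the secret follows the <release>-grafana naming of this deployment):

Read the Grafana admin password from the secret (sketch)
kubectl --namespace prometheus get secret kube-prometheus-stack-1680871060-grafana \
  -o jsonpath='{.data.admin-password}' | base64 --decode ; echo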

Then we can move on to "Grafana configuration quick start".

Improvement: Kubernetes Ingress controller

To get things working quickly I initially exposed the services via NodePort and simply set up "Running Grafana behind a reverse proxy"; later I will try to improve this into the Kubernetes Ingress controller approach.

References