Scraping etcd metrics over HTTP with kube-prometheus-stack

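While working through etcd monitoring with kube-prometheus-stack, I hit a wall: I could not get metrics scraped over https (the scrape always went out over http), even with scheme: https configured. As a stopgap, I fell back to collecting the monitoring data over HTTP on port 2381.

The official etcd monitoring documentation is also fairly sparse, and it too uses the simpler http approach.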

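Enable metrics on port 2381

For etcd (a distributed key-value store) running under the systemd process manager, the systemd unit file shows that etcd parameters are set through /etc/etcd.env, so add the following line to that file to enable http metrics: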

ETCD_LISTEN_METRICS_URLS=http://192.168.6.204:2381,http://127.0.0.1:2381

Reference for the environment variable:

--listen-metrics-urls
List of additional URLs to listen on that will respond to both the /metrics and /health endpoints
default: ""
env variable: ETCD_LISTEN_METRICS_URLS

Restart etcd
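After the restart it is worth confirming that the new listener actually answers before touching Prometheus. The following is a minimal sketch, not part of the original setup: the helper names `metrics_urls` and `check_metrics` are hypothetical, the systemd unit name `etcd` in the usage comment is an assumption, and the `/health` path is the one described in the `--listen-metrics-urls` reference above.

```shell
#!/bin/sh
# Print the metrics listen URLs configured in an etcd env file, one per line.
# The default path /etc/etcd.env follows the text above.
metrics_urls() {
    env_file="${1:-/etc/etcd.env}"
    sed -n 's/^ETCD_LISTEN_METRICS_URLS=//p' "$env_file" | tr ',' '\n'
}

# Probe the /health endpoint of every configured URL.
# curl -f turns HTTP error statuses into a non-zero exit code.
check_metrics() {
    for url in $(metrics_urls "$1"); do
        if curl -fsS --max-time 5 "$url/health" >/dev/null; then
            echo "OK   $url"
        else
            echo "FAIL $url"
        fi
    done
}

# Usage on an etcd node (the unit name "etcd" is an assumption):
#   sudo systemctl restart etcd
#   check_metrics /etc/etcd.env
```

Reading the URLs back from the env file, rather than hard-coding them, keeps the check in sync with whatever addresses were actually configured.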

  • Configure kube-prometheus-stack.values :

Enable metrics scraping on port 2381, no certificates needed (http)
## Component scraping etcd
##
kubeEtcd:
  enabled: true

  ## If your etcd is not deployed as a pod, specify IPs it can be found on
  ##
  endpoints:
    - 10.0.1.167
    - 10.0.1.168
    - 10.0.1.166

  ## Etcd service. If using kubeEtcd.endpoints only the port and targetPort are used
  ##
  service:
    enabled: true
    port: 2381
    targetPort: 2381
    # selector:
    #   component: etcd

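Note

If you have already deployed once following "Deploying Prometheus and Grafana on a Kubernetes cluster with Helm 3" (practical case: "Deploying Prometheus and Grafana with integrated GPU monitoring on a Kubernetes cluster (z-k8s)"), the default kube-prometheus-stack.values already has the etcd monitoring option enabled: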

kubeEtcd:
  enabled: true

In that case the kube-system namespace will contain an Endpoints object with a name like kube-prometheus-stack-1681-kube-etcd , but its ENDPOINTS column is actually empty:

NAME                                      ENDPOINTS           AGE
kube-prometheus-stack-1681-kube-etcd      <none>              5h29m
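The listing above has the shape of `kubectl get endpoints` table output (the exact command used is not recorded here, so that is an assumption). A small filter makes it easy to spot every Endpoints object with no addresses; the function name `empty_endpoints` is hypothetical.

```shell
#!/bin/sh
# Print the names of Endpoints objects whose ENDPOINTS column is "<none>".
# Expects `kubectl get endpoints` table output on stdin; the header row is skipped.
empty_endpoints() {
    awk 'NR > 1 && $2 == "<none>" { print $1 }'
}

# Usage (requires kubectl access to the cluster):
#   kubectl -n kube-system get endpoints | empty_endpoints
```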

Running helm upgrade directly then fails:

etcd-related error reported by helm upgrade prometheus-community/kube-prometheus-stack
Error: UPGRADE FAILED: rendered manifests contain a resource that already exists.
Unable to continue with update: Endpoints "kube-prometheus-stack-1680-kube-etcd" in namespace "kube-system" exists and cannot be imported into the current release: invalid ownership metadata; 
annotation validation error: missing key "meta.helm.sh/release-name": must be set to "kube-prometheus-stack-1680871060"; 
annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "prometheus"

So first disable etcd monitoring temporarily:

Temporarily disable etcd monitoring in kube-prometheus-stack.values
    ## Component scraping etcd
    ##
    kubeEtcd:
      enabled: false

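Then apply the configuration from the "Configure kube-prometheus-stack.values" step above (enable metrics scraping on port 2381, no certificates needed (http)) and run the next step, "Update helm"; etcd monitoring then comes up normally.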
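The disable-then-re-enable sequence can be sketched as two upgrade passes. This is an illustration under assumptions, not the exact procedure used here: it assumes an upgrade is run once with etcd scraping off, it uses helm's `--set` flag to override the values file for the first pass instead of editing the file, and the `upgrade_stack` wrapper and `HELM` variable are hypothetical. The release name and namespace are the ones from the helm upgrade command in this article.

```shell
#!/bin/sh
# Release name and namespace as used in the helm upgrade command in this article.
RELEASE="kube-prometheus-stack-1681228346"
NAMESPACE="prometheus"
# Set HELM=echo to preview the commands without running them.
HELM="${HELM:-helm}"

# Run helm upgrade with the shared values file; any extra arguments
# (for example --set overrides) are appended to the command line.
upgrade_stack() {
    "$HELM" upgrade "$RELEASE" prometheus-community/kube-prometheus-stack \
        --namespace "$NAMESPACE" --values kube-prometheus-stack.values "$@"
}

# Usage:
#   upgrade_stack --set kubeEtcd.enabled=false   # pass 1: etcd scraping off
#   upgrade_stack                                # pass 2: values file as written
```

Values given with `--set` take precedence over those in `--values`, so the first pass turns etcd scraping off even though the file already holds the final kubeEtcd settings.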

  • Update helm:

Run helm upgrade prometheus-community/kube-prometheus-stack
helm upgrade kube-prometheus-stack-1681228346 prometheus-community/kube-prometheus-stack \
  --namespace prometheus --values kube-prometheus-stack.values

The Targets page of the Prometheus web UI should now show the etcd target being scraped successfully.

References