kube-prometheus-stack: extending runtime arguments (extraArgs)
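After using helm to complete "Deploying Prometheus and Grafana with integrated GPU monitoring on a Kubernetes cluster (z-k8s)", one requirement is to customize the runtime arguments of kube-state-metrics (KSM):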
...
spec:
  ...
  template:
    ...
    spec:
      containers:
      - args:
        - --port=8080
        - --resources=certificatesigningrequests,configmaps,cronjobs...
        - --metric-labels-allowlist=nodes=[infra.cloud-atlas/node-ip,machine.cloud-atlas.io/biz-name,k8s.cloud-atlas.io/arch],pods=[sync.k8s.cloud-atlas.io/resource-type,custom.cloud-atlas.io/runtime-class]
        image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.8.2
        imagePullPolicy: IfNotPresent
...
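The --metric-labels-allowlist argument can be added directly with kubectl -n prometheus edit deploy kube-prometheus-stack-1681228346-kube-state-metrics, but the change is wiped out the next time "Updating the Prometheus configuration of the Kubernetes cluster" is run, so the argument needs to be made persistent in the values file.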
A close look at kube-prometheus-stack.values shows that the prometheus-node-exporter subchart has a setting for custom runtime arguments:
## Configuration for prometheus-node-exporter subchart
##
prometheus-node-exporter:
  namespaceOverride: ""
  podLabels:
    ## Add the 'node-exporter' label to be used by serviceMonitor to match standard common usage in rules and grafana dashboards
    ##
    jobLabel: node-exporter
  releaseLabel: true
  extraArgs:
    - --collector.filesystem.mount-points-exclude=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)
    - --collector.filesystem.fs-types-exclude=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
  service:
    portName: http-metrics
...
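So in kube-prometheus-stack.values, each subchart can use a similar extraArgs setting to customize the runtime arguments of the image in its pod (open question: how does this work when a pod has multiple containers?).
Following "Using prometheus-community helm chart how can I expose custom pod labels", make the following customization: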
## Configuration for kube-state-metrics subchart
##
kube-state-metrics:
  namespaceOverride: ""
  rbac:
    create: true
  releaseLabel: true
  extraArgs:
    - --metric-labels-allowlist=nodes=[infra.cloud-atlas/node-ip,machine.cloud-atlas.io/biz-name,k8s.cloud-atlas.io/arch],pods=[sync.k8s.cloud-atlas.io/resource-type,custom.cloud-atlas.io/runtime-class]
  prometheus:
    monitor:
      enabled: true
...
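The allowlist value is one long string of the form resource=[label,label,...],resource=[...], which is easy to mistype when edited by hand. A minimal sketch of assembling it in shell before pasting it into extraArgs; allowlist_entry and the variable names are hypothetical helpers introduced here for illustration, not part of the chart or of kube-state-metrics:

```shell
# Build one "resource=[label1,label2,...]" entry of the allowlist.
# $1 is the resource name; the remaining arguments are label keys.
# (allowlist_entry is a hypothetical helper name.)
allowlist_entry() {
  resource="$1"
  shift
  labels=""
  for key in "$@"; do
    # Append with a comma separator, except before the first key.
    labels="${labels:+$labels,}$key"
  done
  printf '%s=[%s]' "$resource" "$labels"
}

nodes_entry=$(allowlist_entry nodes \
  infra.cloud-atlas/node-ip \
  machine.cloud-atlas.io/biz-name \
  k8s.cloud-atlas.io/arch)
pods_entry=$(allowlist_entry pods \
  sync.k8s.cloud-atlas.io/resource-type \
  custom.cloud-atlas.io/runtime-class)

# Entries for different resources are separated by a comma.
ksm_allowlist_arg="--metric-labels-allowlist=${nodes_entry},${pods_entry}"
printf '%s\n' "$ksm_allowlist_arg"
```

The printed line can be copied as a single item under kube-state-metrics.extraArgs in kube-prometheus-stack.values.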
Then run "Updating the Prometheus configuration of the Kubernetes cluster":
helm upgrade kube-prometheus-stack-1681228346 prometheus-community/kube-prometheus-stack \
--namespace prometheus --values kube-prometheus-stack.values
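To check that the upgrade actually carried the argument into the Deployment, the container args can be read back with kubectl. A minimal sketch: deploy_args is a hypothetical helper name, and it assumes kube-state-metrics is the first container in the pod template, as in the manifest excerpt at the top of this page.

```shell
# Print the args of the first container of a Deployment.
# $1 = namespace, $2 = Deployment name.
# (deploy_args is a hypothetical helper; it assumes the container of
# interest is containers[0].)
deploy_args() {
  kubectl -n "$1" get deploy "$2" \
    -o jsonpath='{.spec.template.spec.containers[0].args}'
}

# Example usage against the cluster (commented out so the helper can be
# sourced without cluster access):
#
#   deploy_args prometheus kube-prometheus-stack-1681228346-kube-state-metrics \
#     | grep -- '--metric-labels-allowlist='
#
# The values supplied to the release can also be inspected with:
#
#   helm get values kube-prometheus-stack-1681228346 --namespace prometheus
```

If the grep prints nothing, the Deployment does not carry the flag, and the extraArgs entry in kube-prometheus-stack.values is the first place to look.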