DaemonSet `nodeAffinity`

There are two ways to make a DaemonSet run only on specific nodes:

Set .spec.template.spec.nodeSelector: the DaemonSet controller creates Pods only on nodes whose labels match the node selector.
Set .spec.template.spec.affinity: the DaemonSet controller creates Pods only on nodes that satisfy the Kubernetes nodeAffinity rules.
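
The first option, nodeSelector, is not shown elsewhere on this page, so here is a minimal sketch of what it could look like. The label key and value are the ones used in the nodeAffinity example on this page. The DaemonSet name, the `app` label, and the container image are hypothetical placeholders, not part of the original setup.

```yaml
# Minimal sketch: a DaemonSet restricted to labeled nodes with nodeSelector.
# "gpu-node-example" and the pause image are placeholders for illustration only.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: gpu-node-example
spec:
  selector:
    matchLabels:
      app: gpu-node-example
  template:
    metadata:
      labels:
        app: gpu-node-example
    spec:
      # Pods are created only on nodes carrying exactly this label key and value.
      nodeSelector:
        custom.k8s.cloud-atlas.io/gpu-mode: phy
      containers:
      - name: placeholder
        image: registry.k8s.io/pause:3.9
```

nodeSelector only supports exact key/value matches. Operators such as NotIn require nodeAffinity.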

`.spec.template.spec.affinity`

Label the nodes that have GPU devices installed:

Label GPU nodes to mark that they have GPU devices

kubectl label node z-k8s-n-1 custom.k8s.cloud-atlas.io/gpu-mode=phy

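When deploying the DCGM-Exporter DaemonSet, make sure it is scheduled only onto nodes that have GPUs installed; otherwise the DaemonSet Pods on non-GPU nodes fail to start properly:

Control DCGM-Exporter DaemonSet placement with nodeAffinity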

spec:
  ...
  selector:
    matchLabels:
      ...
  template:
    metadata:
      ...
      labels:
        ...
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: custom.k8s.cloud-atlas.io/gpu-mode
                operator: In
                values:
                - phy
      containers:
      ...

`nodeAntiAffinity` (really still `nodeAffinity`)

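When a cluster runs both a self-managed kube-prometheus-stack and the Alibaba Cloud Prometheus monitoring product, the DCGM-Exporter DaemonSet must stay off the nodes where the Alibaba Cloud starship Agent is deployed. Kubernetes does not provide a nodeAntiAffinity field, so this is done indirectly with nodeAffinity: several label expressions must match at the same time, which narrows the set of eligible nodes: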

Label the servers where the starship Agent is deployed:

Label GPU nodes to mark that the starship DCGM function is enabled

kubectl label node i-2ze6nk43mbc7xxpcb0an starship=dcgm

Then add the constraint to the DaemonSet:

Add a NotIn expression to nodeAffinity so that the DCGM-Exporter DaemonSet is not deployed to nodes already running starship

spec:
  ...
  selector:
    matchLabels:
      ...
  template:
    metadata:
      ...
      labels:
        ...
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: custom.k8s.cloud-atlas.io/gpu-mode
                operator: In
                values:
                - phy
              - key: starship
                operator: NotIn
                values:
                - dcgm
      containers:
      ...
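
A possible variant of the manifest above, offered as a sketch rather than the setup described here: the `DoesNotExist` operator excludes every node that carries a `starship` label, whatever its value. Both expressions sit in the same `matchExpressions` list, so they are ANDed. Splitting them into separate `nodeSelectorTerms` entries would OR them, and the exclusion would no longer hold. The DaemonSet name, the `app` label, and the container image are hypothetical placeholders.

```yaml
# Sketch: require the GPU label AND the absence of any "starship" label.
# "dcgm-exporter-example" and the pause image are placeholders for illustration only.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dcgm-exporter-example
spec:
  selector:
    matchLabels:
      app: dcgm-exporter-example
  template:
    metadata:
      labels:
        app: dcgm-exporter-example
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            # One term: every expression in this list must match (logical AND).
            - matchExpressions:
              - key: custom.k8s.cloud-atlas.io/gpu-mode
                operator: In
                values:
                - phy
              # DoesNotExist takes no values; it matches nodes without the key.
              - key: starship
                operator: DoesNotExist
      containers:
      - name: placeholder
        image: registry.k8s.io/pause:3.9
```

With `NotIn`, a node that has no `starship` label at all also matches, so both forms leave unlabeled GPU nodes eligible. They differ only for nodes labeled `starship` with a value other than `dcgm`.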

References