Kubernetes control plane pods failing with "CreateContainerError"

In a test Kubernetes cluster on the ARM platform, a very strange "CreateContainerError" appeared after the control plane node's OS was upgraded and rebooted:

  • kubectl get pods -o wide -n kube-system | grep master shows:

    coredns-78fcd69978-775zr             1/1     Running                2 (3d23h ago)     17d     10.244.0.7      pi-master1   <none>           <none>
    coredns-78fcd69978-cx94g             1/1     Running                2 (3d6h ago)      17d     10.244.0.6      pi-master1   <none>           <none>
    etcd-pi-master1                      0/1     CreateContainerError   4 (2d8h ago)      9m50s   192.168.6.11    pi-master1   <none>           <none>
    kube-apiserver-pi-master1            0/1     CreateContainerError   9 (2d8h ago)      17d     192.168.6.11    pi-master1   <none>           <none>
    kube-controller-manager-pi-master1   1/1     Running                12 (2d8h ago)     17d     192.168.6.11    pi-master1   <none>           <none>
    kube-flannel-ds-jqgsm                1/1     Running                1239 (2d8h ago)   17d     192.168.6.11    pi-master1   <none>           <none>
    kube-proxy-rfgxq                     1/1     Running                5 (2d8h ago)      17d     192.168.6.11    pi-master1   <none>           <none>
    kube-scheduler-pi-master1            0/1     CreateContainerError   12 (2d8h ago)     17d     192.168.6.11    pi-master1   <none>           <none>
    

The key components etcd, apiserver, and scheduler all failed to create. When many control plane pods fail at once, etcd is a likely suspect, since etcd is the database through which all components store and exchange state.
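
As a first sanity check, you can query etcd's health directly on the node. A minimal sketch, assuming etcdctl is installed on the host and the certificate paths follow kubeadm defaults:

    # check etcd health over its client endpoint (kubeadm default cert paths)
    sudo ETCDCTL_API=3 etcdctl \
      --endpoints=https://127.0.0.1:2379 \
      --cacert=/etc/kubernetes/pki/etcd/ca.crt \
      --cert=/etc/kubernetes/pki/etcd/server.crt \
      --key=/etc/kubernetes/pki/etcd/server.key \
      endpoint health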

  • Inspect the failing pods

Start with etcd:

kubectl -n kube-system describe pods etcd-pi-master1

Strangely, the events show only:

...
Events:
  Type    Reason  Age                       From     Message
  ----    ------  ----                      ----     -------
  Normal  Pulled  2m14s (x15298 over 3d6h)  kubelet  Container image "k8s.gcr.io/etcd:3.5.0-0" already present on machine

The apiserver looks the same:

kubectl -n kube-system describe pods kube-apiserver-pi-master1

Its events show:

...
Events:
  Type    Reason  Age                     From     Message
  ----    ------  ----                    ----     -------
  Normal  Pulled  16s (x15280 over 3d6h)  kubelet  Container image "k8s.gcr.io/kube-apiserver:v1.22.0" already present on machine
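
When describe shows nothing beyond "image already present", the kubelet log and the container runtime usually hold the real error. A hedged way to dig deeper, assuming a CRI runtime with crictl installed on the node:

    # the kubelet log normally records the actual CreateContainerError cause
    journalctl -u kubelet --since "10 minutes ago" | grep -i error
    # the runtime's view shows the control plane containers stuck outside Running
    sudo crictl ps -a | grep -E 'etcd|apiserver|scheduler'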

  • Try deleting the pod directly:

    $ kubectl -n kube-system  delete pod etcd-pi-master1
    pod "etcd-pi-master1" deleted
    

But checking the etcd process shows it is still the same process started two days earlier:

root        4613  5.2  1.9 10611120 76868 ?      Ssl  Aug25 178:03 etcd --advertise-client-urls=https://192.168.6.11:2379 ...

In other words, deleting the pod did not actually restart it. In hindsight this makes sense: etcd here is a static pod managed directly by the kubelet from its manifest file, so deleting the mirror pod object through the API server does not touch the running container.
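
Because the kubelet only restarts a static pod when its manifest changes, moving the manifest out of the way and back is the usual trick to force a real restart. A sketch, assuming kubeadm's default manifest path:

    # temporarily remove the manifest so the kubelet tears the container down
    sudo mv /etc/kubernetes/manifests/etcd.yaml /tmp/
    sleep 20    # give the kubelet time to stop the old container
    # restore it; the kubelet will create a fresh container
    sudo mv /tmp/etcd.yaml /etc/kubernetes/manifests/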

  • Try rebooting the operating system; after the reboot, the same pods still fail to start:

    NAME                                 READY   STATUS                 RESTARTS          AGE
    coredns-78fcd69978-775zr             1/1     Running                2 (4d ago)        17d
    coredns-78fcd69978-cx94g             1/1     Running                2 (3d7h ago)      17d
    etcd-pi-master1                      0/1     CreateContainerError   4 (2d9h ago)      14m
    kube-apiserver-pi-master1            0/1     CreateContainerError   9 (2d9h ago)      17d
    ...
    kube-scheduler-pi-master1            0/1     CreateContainerError   12 (2d9h ago)     17d
    

Curiously, though, after a while etcd, apiserver, and scheduler did come up, but now coredns and the flannel network had problems:

$ kubectl -n kube-system get pods -o wide
NAME                                 READY   STATUS             RESTARTS         AGE   IP              NODE         NOMINATED NODE   READINESS GATES
coredns-78fcd69978-775zr             0/1     Completed          2                17d   <none>          pi-master1   <none>           <none>
coredns-78fcd69978-cx94g             0/1     Completed          2                17d   <none>          pi-master1   <none>           <none>
etcd-pi-master1                      1/1     Running            5 (2d9h ago)     19m   192.168.6.11    pi-master1   <none>           <none>
kube-apiserver-pi-master1            1/1     Running            10 (2d9h ago)    17d   192.168.6.11    pi-master1   <none>           <none>
kube-controller-manager-pi-master1   1/1     Running            13 (8m42s ago)   17d   192.168.6.11    pi-master1   <none>           <none>
kube-flannel-ds-4dxvz                1/1     Running            1 (7d7h ago)     17d   192.168.6.16    pi-worker2   <none>           <none>
kube-flannel-ds-jdwcr                1/1     Running            1931 (8d ago)    15d   192.168.6.200   zcloud       <none>           <none>
kube-flannel-ds-jqgsm                0/1     CrashLoopBackOff   1244 (66s ago)   17d   192.168.6.11    pi-master1   <none>           <none>
kube-flannel-ds-l7j6b                1/1     Running            6 (5d21h ago)    17d   30.73.165.29    jetson       <none>           <none>
kube-flannel-ds-nqg77                1/1     Running            1 (7d7h ago)     17d   192.168.6.15    pi-worker1   <none>           <none>
kube-flannel-ds-pkhch                1/1     Running            3 (2d8h ago)     15d   30.73.167.10    kali         <none>           <none>
kube-proxy-bn9q8                     1/1     Running            2 (2d8h ago)     15d   30.73.167.10    kali         <none>           <none>
kube-proxy-d9xlj                     1/1     Running            1 (7d7h ago)     17d   192.168.6.15    pi-worker1   <none>           <none>
kube-proxy-gz9bh                     1/1     Running            6 (5d21h ago)    17d   30.73.165.29    jetson       <none>           <none>
kube-proxy-nt27w                     1/1     Running            1 (7d7h ago)     17d   192.168.6.16    pi-worker2   <none>           <none>
kube-proxy-pbtcz                     1/1     Running            2 (10d ago)      15d   192.168.6.200   zcloud       <none>           <none>
kube-proxy-rfgxq                     1/1     Running            6 (8m42s ago)    17d   192.168.6.11    pi-master1   <none>           <none>
kube-scheduler-pi-master1            1/1     Running            13 (2d9h ago)    17d   192.168.6.11    pi-master1   <none>           <none>

  • Check why kube-flannel fails:

    kubectl -n kube-system logs kube-flannel-ds-tq9x5
    

It turns out kube-flannel needs the host to have a default route in order to determine the default network interface:

I0827 09:48:41.911302       1 main.go:520] Determining IP address of default interface
E0827 09:48:41.912029       1 main.go:205] Failed to find any valid interface to use: failed to get default interface: Unable to find default route
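
You can confirm the missing route on the host itself with the standard iproute2 check (the gateway below is illustrative for this 192.168.6.x network):

    # prints nothing while no default route exists
    ip route show default
    # once the uplink is back it should print something like:
    #   default via 192.168.6.1 dev wlan0 ...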

On my test server the default route goes through the wireless NIC, but on each reboot the wireless NIC was not initialized, leaving wlan0 in the DOWN state.
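
How the interface comes back up depends on how the host manages Wi-Fi. A minimal sketch for a systemd + wpa_supplicant setup (the unit name is an assumption; adjust for NetworkManager, dhcpcd, or netplan as appropriate):

    # bring the link up and restart the supplicant for wlan0
    sudo ip link set wlan0 up
    sudo systemctl restart wpa_supplicant@wlan0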

Then check the network interfaces:

ip addr

to confirm the wireless network is working again.

  • Now check the pods in kube-system again: kube-flannel starts normally, and coredns recovers at the same time:

    kubectl -n kube-system get pods
    
NAME                                 READY   STATUS    RESTARTS        AGE
coredns-78fcd69978-775zr             1/1     Running   3 (14m ago)     17d
coredns-78fcd69978-cx94g             1/1     Running   3 (14m ago)     17d
etcd-pi-master1                      1/1     Running   5 (2d9h ago)    25m
kube-apiserver-pi-master1            1/1     Running   10 (2d9h ago)   17d
kube-controller-manager-pi-master1   1/1     Running   13 (14m ago)    17d
kube-flannel-ds-4dxvz                1/1     Running   1 (7d7h ago)    17d
kube-flannel-ds-jdwcr                1/1     Running   1931 (8d ago)   15d
kube-flannel-ds-l7j6b                1/1     Running   6 (5d22h ago)   17d
kube-flannel-ds-nqg77                1/1     Running   1 (7d7h ago)    17d
kube-flannel-ds-pkhch                1/1     Running   3 (2d8h ago)    15d
kube-flannel-ds-tq9x5                1/1     Running   5 (2m40s ago)   4m30s
kube-proxy-bn9q8                     1/1     Running   2 (2d8h ago)    15d
kube-proxy-d9xlj                     1/1     Running   1 (7d7h ago)    17d
kube-proxy-gz9bh                     1/1     Running   6 (5d22h ago)   17d
kube-proxy-nt27w                     1/1     Running   1 (7d7h ago)    17d
kube-proxy-pbtcz                     1/1     Running   2 (10d ago)     15d
kube-proxy-rfgxq                     1/1     Running   6 (14m ago)     17d
kube-scheduler-pi-master1            1/1     Running   13 (2d9h ago)   17d
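
As a final end-to-end check that cluster DNS really recovered, you can resolve a service name from a throwaway pod (the busybox image tag here is arbitrary):

    kubectl run dns-test --image=busybox:1.36 --rm -it --restart=Never \
      -- nslookup kubernetes.default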

Notes

For the kube-flannel DaemonSet to start, the NIC carrying the physical host's default route must be up; if no interface provides a default route, the DaemonSet pod cannot start. This also explains an earlier observation: whenever the wireless NIC (the default-route interface) was not up, the load on the control plane master was extremely high, most likely because the network-dependent kube-flannel could not work properly.
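
If default-route detection keeps failing after reboots, one mitigation is to pin flannel to the known uplink so it skips the default-route lookup entirely. A sketch, assuming the DaemonSet name from the stock manifest (it may differ across flannel versions):

    kubectl -n kube-system edit daemonset kube-flannel-ds
    # in the flannel container spec, add --iface:
    #   args:
    #     - --ip-masq
    #     - --kube-subnet-mgr
    #     - --iface=wlan0    # pin to the wireless uplink instead of the default route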