Running Kubernetes with Cilium Fully Replacing kube-proxy¶
Cilium provides a mode of operation that completely replaces kube-proxy. The simplest approach is to skip installing kube-proxy when bootstrapping the cluster with kubeadm.
Note
Cilium's kube-proxy replacement depends on the socket-LB feature, which requires Linux kernel v4.19.57, v5.1.16, v5.2.0 or newer. Linux kernels v5.3 and v5.8 add further capabilities that allow Cilium to better optimize its kube-proxy replacement.
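As a quick sanity check (run on each node), confirm that the running kernel release is at least one of the versions listed above:
uname -r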
Quick Start¶
When initializing the cluster with kubeadm, you can skip installing kube-proxy:
kubeadm init --skip-phases=addon/kube-proxy
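To confirm kube-proxy really was skipped, the kube-proxy DaemonSet should simply not exist afterwards (a quick sanity check, not part of the official procedure; the command should report a NotFound error):
kubectl -n kube-system get ds kube-proxy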
Replacing an Already-Installed kube-proxy¶
For a Kubernetes cluster where kube-proxy is already installed as a DaemonSet, remove kube-proxy with the following commands. Note: deleting kube-proxy breaks existing service connections and stops service traffic until the Cilium replacement is fully installed.
kubectl -n kube-system delete ds kube-proxy
# Delete the configmap as well to avoid kube-proxy being reinstalled during a Kubeadm upgrade (works only for K8s 1.19 and newer)
kubectl -n kube-system delete cm kube-proxy
# Run on each node with root permissions:
iptables-save | grep -v KUBE | iptables-restore
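As an extra (unofficial) check that a node is clean, the number of remaining KUBE-* iptables rules should be 0 after the restore above; run as root on each node:
iptables-save | grep -c KUBE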
Set up the Helm repository:
helm repo add cilium https://helm.cilium.io/
Run the following command to install:
#API_SERVER_IP=192.168.6.101
API_SERVER_IP=z-k8s-api.staging.huatai.me
# Kubeadm default is 6443
API_SERVER_PORT=6443
helm install cilium cilium/cilium --version 1.11.7 \
--namespace kube-system \
--set kubeProxyReplacement=strict \
--set k8sServiceHost=${API_SERVER_IP} \
--set k8sServicePort=${API_SERVER_PORT}
Here I hit an error:
Error: INSTALLATION FAILED: cannot re-use a name that is still in use
The reason: the official documentation assumes Cilium is being installed for the first time, i.e. Cilium is installed immediately after kube-proxy has been removed. My steps were different: I had already installed Cilium, then removed kube-proxy and tried to install Cilium again, which triggers this name conflict. As discussed in Cannot install kubernetes helm chart Error: cannot re-use a name that is still in use, using Helm's upgrade command instead of install lets the installation go through:
#API_SERVER_IP=192.168.6.101
API_SERVER_IP=z-k8s-api.staging.huatai.me
# Kubeadm default is 6443
API_SERVER_PORT=6443
helm upgrade cilium cilium/cilium --version 1.11.7 \
--namespace kube-system \
--set kubeProxyReplacement=strict \
--set k8sServiceHost=${API_SERVER_IP} \
--set k8sServicePort=${API_SERVER_PORT}
The replacement now succeeds:
W0813 23:39:08.689475 1285915 warnings.go:70] spec.template.spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[1].matchExpressions[0].key: beta.kubernetes.io/os is deprecated since v1.14; use "kubernetes.io/os" instead
Release "cilium" has been upgraded. Happy Helming!
NAME: cilium
LAST DEPLOYED: Sat Aug 13 23:39:06 2022
NAMESPACE: kube-system
STATUS: deployed
REVISION: 3
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble.
Your release version is 1.11.7.
For any further help, visit https://docs.cilium.io/en/v1.11/gettinghelp
Another solution, described in Cannot re-use a name that is still in use, is to first remove the release with helm uninstall and then run helm install again (not tried; a rough sketch is shown below).
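For reference, that untried alternative would look roughly like this (a sketch only; note that helm uninstall removes the Cilium DaemonSet and disrupts pod networking until the reinstall completes):
helm uninstall cilium --namespace kube-system
helm install cilium cilium/cilium --version 1.11.7 \
--namespace kube-system \
--set kubeProxyReplacement=strict \
--set k8sServiceHost=${API_SERVER_IP} \
--set k8sServicePort=${API_SERVER_PORT}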
Now we can check whether Cilium is running properly on every node:
kubectl -n kube-system get pods -l k8s-app=cilium -o wide
The output shows:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cilium-2qcdd 1/1 Running 0 16m 192.168.6.113 z-k8s-n-3 <none> <none>
cilium-4drkm 1/1 Running 0 17m 192.168.6.102 z-k8s-m-2 <none> <none>
cilium-4xktc 1/1 Running 0 17m 192.168.6.101 z-k8s-m-1 <none> <none>
cilium-5j2xb 1/1 Running 0 16m 192.168.6.112 z-k8s-n-2 <none> <none>
cilium-d7mmq 1/1 Running 0 17m 192.168.6.114 z-k8s-n-4 <none> <none>
cilium-fw9b5 1/1 Running 0 17m 192.168.6.115 z-k8s-n-5 <none> <none>
cilium-t675t 1/1 Running 0 16m 192.168.6.103 z-k8s-m-3 <none> <none>
cilium-tsntp 1/1 Running 0 16m 192.168.6.111 z-k8s-n-1 <none> <none>
Validate the Setup¶
After completing the kube-proxy replacement, first verify that the Cilium agent on the nodes is running in the correct mode:
kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement
The output is similar to:
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), apply-sysctl-overwrites (init), clean-cilium-state (init)
KubeProxyReplacement: Strict [enp1s0 192.168.6.102 (Direct Routing)]
Check the detailed status:
kubectl -n kube-system exec ds/cilium -- cilium status --verbose
Optional Step: Verify with an Nginx Deployment¶
Prepare my-nginx.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-nginx
spec:
selector:
matchLabels:
run: my-nginx
replicas: 2
template:
metadata:
labels:
run: my-nginx
spec:
containers:
- name: my-nginx
image: nginx
ports:
- containerPort: 80
Apply the deployment:
kubectl create -f my-nginx.yaml
Check the pod creation:
kubectl get pods -o wide
An unexpected hiccup: the image did not finish downloading for a long time:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-nginx-df7bbf6f5-457mh 0/1 ContainerCreating 0 12m <none> z-k8s-n-5 <none> <none>
my-nginx-df7bbf6f5-6gndk 0/1 ContainerCreating 0 12m <none> z-k8s-n-1 <none> <none>
kubectl describe pods my-nginx-df7bbf6f5-457mh shows the pod stuck in the image-pulling state:
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned default/my-nginx-df7bbf6f5-457mh to z-k8s-n-5
Normal Pulling 10m kubelet Pulling image "nginx"
Check the cluster events:
kubectl get events --sort-by=.metadata.creationTimestamp
which shows:
LAST SEEN TYPE REASON OBJECT MESSAGE
16m Normal Scheduled pod/my-nginx-df7bbf6f5-457mh Successfully assigned default/my-nginx-df7bbf6f5-457mh to z-k8s-n-5
16m Normal Scheduled pod/my-nginx-df7bbf6f5-6gndk Successfully assigned default/my-nginx-df7bbf6f5-6gndk to z-k8s-n-1
16m Normal SuccessfulCreate replicaset/my-nginx-df7bbf6f5 Created pod: my-nginx-df7bbf6f5-6gndk
16m Normal SuccessfulCreate replicaset/my-nginx-df7bbf6f5 Created pod: my-nginx-df7bbf6f5-457mh
16m Normal ScalingReplicaSet deployment/my-nginx Scaled up replica set my-nginx-df7bbf6f5 to 2
16m Normal Pulling pod/my-nginx-df7bbf6f5-457mh Pulling image "nginx"
16m Normal Pulling pod/my-nginx-df7bbf6f5-6gndk Pulling image "nginx"
It turned out the image pull was just slow; the pods eventually started running:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-nginx-df7bbf6f5-457mh 1/1 Running 0 12h 10.0.6.22 z-k8s-n-5 <none> <none>
my-nginx-df7bbf6f5-6gndk 1/1 Running 0 12h 10.0.3.160 z-k8s-n-1 <none> <none>
Create a NodePort Kubernetes Service for the two replicas:
kubectl expose deployment my-nginx --type=NodePort --port=80
Output:
service/my-nginx exposed
Check the NodePort Service:
kubectl get svc my-nginx
The status shows:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-nginx NodePort 10.101.117.255 <none> 80:30828/TCP 110s
Now we can use the cilium service list command to verify the new NodePort services created by the Cilium eBPF kube-proxy replacement:
kubectl -n kube-system exec ds/cilium -- cilium service list
The output shows:
ID Frontend Service Type Backend
1 10.104.129.196:443 ClusterIP 1 => 192.168.6.114:4244
2 => 192.168.6.102:4244
3 => 192.168.6.115:4244
4 => 192.168.6.101:4244
5 => 192.168.6.103:4244
6 => 192.168.6.112:4244
7 => 192.168.6.113:4244
8 => 192.168.6.111:4244
2 10.108.4.221:8080 ClusterIP 1 => 10.0.5.157:8080
3 10.96.0.1:443 ClusterIP 1 => 192.168.6.101:6443
2 => 192.168.6.102:6443
3 => 192.168.6.103:6443
4 10.96.0.10:53 ClusterIP 1 => 10.0.0.141:53
2 => 10.0.0.241:53
5 10.96.0.10:9153 ClusterIP 1 => 10.0.0.141:9153
2 => 10.0.0.241:9153
6 10.100.109.59:8080 ClusterIP 1 => 10.0.7.132:8080
9 192.168.6.102:31066 NodePort 1 => 10.0.5.157:8080
10 0.0.0.0:31066 NodePort 1 => 10.0.5.157:8080
11 192.168.6.102:30798 NodePort 1 => 10.0.7.132:8080
12 0.0.0.0:30798 NodePort 1 => 10.0.7.132:8080
13 10.101.117.255:80 ClusterIP 1 => 10.0.3.160:80
2 => 10.0.6.22:80
14 192.168.6.102:30828 NodePort 1 => 10.0.3.160:80
2 => 10.0.6.22:80
15 0.0.0.0:30828 NodePort 1 => 10.0.3.160:80
2 => 10.0.6.22:80
Get the NodePort assigned to the service with the following command:
node_port=$(kubectl get svc my-nginx -o=jsonpath='{@.spec.ports[0].nodePort}')
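With the port captured in node_port, any node IP can be used, for example (reusing a node IP from this cluster):
curl 192.168.6.112:${node_port}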
In fact, there are now three ways to reach the service, as the cilium service list output above shows:
10.101.117.255:80 ClusterIP
192.168.6.102:30828 NodePort
0.0.0.0:30828 NodePort
These correspond to:
- access 10.101.117.255 on port 80 from any node in the cluster
- access z-k8s-m-2 (192.168.6.102) on port 30828
- access port 30828 on any node in the cluster
All of them return the nginx welcome page (as an example, accessing z-k8s-n-2 at 192.168.6.112):
curl 192.168.6.112:30828
The output shows:
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
Socket LoadBalancer Bypass in Pod Namespace¶
When configuring Cilium for the Cilium Istio integration quickstart, if Cilium is deployed in the kube-proxy replacement mode described in this article (kube-proxy_free), Cilium's socket load balancing must be adjusted by setting socketLB.hostNamespaceOnly=true; otherwise Istio's encryption and telemetry features will not work.
Since I already enabled kube-proxy_free above, the first step before deploying the Cilium Istio integration quickstart is the configuration update in this section, activating socketLB.hostNamespaceOnly=true:
Warning
I got the configuration wrong here and it took some troubleshooting to fix; see the diagnosis and correction below. At the end I give a correct, simplified configuration (one that leaves the defaults untouched). Many of Cilium's powerful networking features must be configured in concert and interact with the underlying (underlay) network (VXLAN, etc.), so adjust them with great care.
API_SERVER_IP=z-k8s-api.staging.huatai.me
API_SERVER_PORT=6443
helm upgrade cilium cilium/cilium --version 1.12.1 \
--namespace kube-system \
--reuse-values \
--set tunnel=disabled \
--set autoDirectNodeRoutes=true \
--set kubeProxyReplacement=strict \
--set socketLB.hostNamespaceOnly=true \
--set k8sServiceHost=${API_SERVER_IP} \
--set k8sServicePort=${API_SERVER_PORT}
However, this upgrade ran into a strange problem: the cilium pods on several nodes kept crashing:
$ kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cilium-2brxn 0/1 CrashLoopBackOff 4 (67s ago) 3m4s 192.168.6.103 z-k8s-m-3 <none> <none>
cilium-6rhms 1/1 Running 0 25h 192.168.6.115 z-k8s-n-5 <none> <none>
cilium-mzrkm 0/1 CrashLoopBackOff 4 (79s ago) 3m5s 192.168.6.113 z-k8s-n-3 <none> <none>
cilium-operator-6dfc84b7fc-m8ftr 1/1 Running 0 3m5s 192.168.6.114 z-k8s-n-4 <none> <none>
cilium-operator-6dfc84b7fc-sxjp5 1/1 Running 0 3m6s 192.168.6.113 z-k8s-n-3 <none> <none>
cilium-pmdj4 1/1 Running 0 25h 192.168.6.102 z-k8s-m-2 <none> <none>
cilium-qjxcc 0/1 CrashLoopBackOff 4 (81s ago) 3m5s 192.168.6.101 z-k8s-m-1 <none> <none>
cilium-t5n4c 1/1 Running 0 25h 192.168.6.114 z-k8s-n-4 <none> <none>
cilium-vjqlr 1/1 Running 0 25h 192.168.6.111 z-k8s-n-1 <none> <none>
cilium-vk624 0/1 CrashLoopBackOff 4 (74s ago) 3m4s 192.168.6.112 z-k8s-n-2 <none> <none>
Describe one of the pods:
kubectl -n kube-system describe pods cilium-vk624
It shows the container's health check failing:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m51s default-scheduler Successfully assigned kube-system/cilium-vk624 to z-k8s-n-2
Normal Pulled 4m50s kubelet Container image "quay.io/cilium/cilium:v1.12.1@sha256:ea2db1ee21b88127b5c18a96ad155c25485d0815a667ef77c2b7c7f31cab601b" already present on machine
Normal Created 4m50s kubelet Created container mount-cgroup
Normal Pulled 4m50s kubelet Container image "quay.io/cilium/cilium:v1.12.1@sha256:ea2db1ee21b88127b5c18a96ad155c25485d0815a667ef77c2b7c7f31cab601b" already present on machine
Normal Started 4m50s kubelet Started container mount-cgroup
Normal Started 4m49s kubelet Started container apply-sysctl-overwrites
Normal Created 4m49s kubelet Created container apply-sysctl-overwrites
Normal Pulled 4m49s kubelet Container image "quay.io/cilium/cilium:v1.12.1@sha256:ea2db1ee21b88127b5c18a96ad155c25485d0815a667ef77c2b7c7f31cab601b" already present on machine
Normal Pulled 4m48s kubelet Container image "quay.io/cilium/cilium:v1.12.1@sha256:ea2db1ee21b88127b5c18a96ad155c25485d0815a667ef77c2b7c7f31cab601b" already present on machine
Normal Created 4m48s kubelet Created container mount-bpf-fs
Normal Started 4m48s kubelet Started container mount-bpf-fs
Normal Created 4m47s kubelet Created container clean-cilium-state
Normal Started 4m47s kubelet Started container clean-cilium-state
Normal Started 4m43s (x2 over 4m46s) kubelet Started container cilium-agent
Warning Unhealthy 4m42s (x2 over 4m44s) kubelet Startup probe failed: Get "http://127.0.0.1:9879/healthz": dial tcp 127.0.0.1:9879: connect: connection refused
Warning BackOff 4m38s (x3 over 4m40s) kubelet Back-off restarting failed container
Normal Pulled 4m24s (x3 over 4m47s) kubelet Container image "quay.io/cilium/cilium:v1.12.1@sha256:ea2db1ee21b88127b5c18a96ad155c25485d0815a667ef77c2b7c7f31cab601b" already present on machine
Normal Created 4m24s (x3 over 4m46s) kubelet Created container cilium-agent
The status check also shows the problem:
kubectl -n kube-system exec ds/cilium -- cilium status --verbose
It reports unreachable nodes:
...
Encryption: Disabled
Cluster health: 4/8 reachable (2022-08-22T16:34:16Z)
Name IP Node Endpoints
z-k8s-n-4 (localhost) 192.168.6.114 reachable reachable
z-k8s-m-1 192.168.6.101 unreachable reachable
z-k8s-m-2 192.168.6.102 reachable reachable
z-k8s-m-3 192.168.6.103 unreachable reachable
z-k8s-n-1 192.168.6.111 reachable reachable
z-k8s-n-2 192.168.6.112 unreachable reachable
z-k8s-n-3 192.168.6.113 unreachable reachable
z-k8s-n-5 192.168.6.115 reachable reachable
Check the logs of a crashed pod:
kubectl -n kube-system logs cilium-vk624
The error turns out to be an invalid configuration:
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init)
level=info msg="Started gops server" address="127.0.0.1:9890" subsys=daemon
level=warning msg="If auto-direct-node-routes is enabled, then you are recommended to also configure ipv4-native-routing-cidr. If ipv4-native-routing-cidr is not configured, this may lead to pod to pod traffic being masqueraded, which can cause problems with performance, observability and policy" subsys=config
level=info msg="Memory available for map entries (0.003% of 4120702976B): 10301757B" subsys=config
level=info msg="option bpf-ct-global-tcp-max set by dynamic sizing to 131072" subsys=config
level=info msg="option bpf-ct-global-any-max set by dynamic sizing to 65536" subsys=config
level=info msg="option bpf-nat-global-max set by dynamic sizing to 131072" subsys=config
level=info msg="option bpf-neigh-global-max set by dynamic sizing to 131072" subsys=config
level=info msg="option bpf-sock-rev-map-max set by dynamic sizing to 65536" subsys=config
level=info msg=" --agent-health-port='9879'" subsys=daemon
level=info msg=" --agent-labels=''" subsys=daemon
level=info msg=" --agent-not-ready-taint-key='node.cilium.io/agent-not-ready'" subsys=daemon
level=info msg=" --allocator-list-timeout='3m0s'" subsys=daemon
level=info msg=" --allow-icmp-frag-needed='true'" subsys=daemon
level=info msg=" --allow-localhost='auto'" subsys=daemon
level=info msg=" --annotate-k8s-node='false'" subsys=daemon
level=info msg=" --api-rate-limit=''" subsys=daemon
level=info msg=" --arping-refresh-period='30s'" subsys=daemon
level=info msg=" --auto-create-cilium-node-resource='true'" subsys=daemon
level=info msg=" --auto-direct-node-routes='true'" subsys=daemon
level=info msg=" --bgp-announce-lb-ip='false'" subsys=daemon
level=info msg=" --bgp-announce-pod-cidr='false'" subsys=daemon
level=info msg=" --bgp-config-path='/var/lib/cilium/bgp/config.yaml'" subsys=daemon
level=info msg=" --bpf-ct-global-any-max='262144'" subsys=daemon
level=info msg=" --bpf-ct-global-tcp-max='524288'" subsys=daemon
level=info msg=" --bpf-ct-timeout-regular-any='1m0s'" subsys=daemon
level=info msg=" --bpf-ct-timeout-regular-tcp='6h0m0s'" subsys=daemon
level=info msg=" --bpf-ct-timeout-regular-tcp-fin='10s'" subsys=daemon
level=info msg=" --bpf-ct-timeout-regular-tcp-syn='1m0s'" subsys=daemon
level=info msg=" --bpf-ct-timeout-service-any='1m0s'" subsys=daemon
level=info msg=" --bpf-ct-timeout-service-tcp='6h0m0s'" subsys=daemon
level=info msg=" --bpf-ct-timeout-service-tcp-grace='1m0s'" subsys=daemon
level=info msg=" --bpf-filter-priority='1'" subsys=daemon
level=info msg=" --bpf-fragments-map-max='8192'" subsys=daemon
level=info msg=" --bpf-lb-acceleration='disabled'" subsys=daemon
level=info msg=" --bpf-lb-affinity-map-max='0'" subsys=daemon
level=info msg=" --bpf-lb-algorithm='random'" subsys=daemon
level=info msg=" --bpf-lb-dev-ip-addr-inherit=''" subsys=daemon
level=info msg=" --bpf-lb-dsr-dispatch='opt'" subsys=daemon
level=info msg=" --bpf-lb-dsr-l4-xlate='frontend'" subsys=daemon
level=info msg=" --bpf-lb-external-clusterip='false'" subsys=daemon
level=info msg=" --bpf-lb-maglev-hash-seed='JLfvgnHc2kaSUFaI'" subsys=daemon
level=info msg=" --bpf-lb-maglev-map-max='0'" subsys=daemon
level=info msg=" --bpf-lb-maglev-table-size='16381'" subsys=daemon
level=info msg=" --bpf-lb-map-max='65536'" subsys=daemon
level=info msg=" --bpf-lb-mode='snat'" subsys=daemon
level=info msg=" --bpf-lb-rev-nat-map-max='0'" subsys=daemon
level=info msg=" --bpf-lb-rss-ipv4-src-cidr=''" subsys=daemon
level=info msg=" --bpf-lb-rss-ipv6-src-cidr=''" subsys=daemon
level=info msg=" --bpf-lb-service-backend-map-max='0'" subsys=daemon
level=info msg=" --bpf-lb-service-map-max='0'" subsys=daemon
level=info msg=" --bpf-lb-sock='false'" subsys=daemon
level=info msg=" --bpf-lb-sock-hostns-only='true'" subsys=daemon
level=info msg=" --bpf-lb-source-range-map-max='0'" subsys=daemon
level=info msg=" --bpf-map-dynamic-size-ratio='0.0025'" subsys=daemon
level=info msg=" --bpf-nat-global-max='524288'" subsys=daemon
level=info msg=" --bpf-neigh-global-max='524288'" subsys=daemon
level=info msg=" --bpf-policy-map-max='16384'" subsys=daemon
level=info msg=" --bpf-root='/sys/fs/bpf'" subsys=daemon
level=info msg=" --bpf-sock-rev-map-max='262144'" subsys=daemon
level=info msg=" --bypass-ip-availability-upon-restore='false'" subsys=daemon
level=info msg=" --certificates-directory='/var/run/cilium/certs'" subsys=daemon
level=info msg=" --cflags=''" subsys=daemon
level=info msg=" --cgroup-root='/run/cilium/cgroupv2'" subsys=daemon
level=info msg=" --cluster-health-port='4240'" subsys=daemon
level=info msg=" --cluster-id='0'" subsys=daemon
level=info msg=" --cluster-name='default'" subsys=daemon
level=info msg=" --clustermesh-config='/var/lib/cilium/clustermesh/'" subsys=daemon
level=info msg=" --cmdref=''" subsys=daemon
level=info msg=" --config=''" subsys=daemon
level=info msg=" --config-dir='/tmp/cilium/config-map'" subsys=daemon
level=info msg=" --conntrack-gc-interval='0s'" subsys=daemon
level=info msg=" --crd-wait-timeout='5m0s'" subsys=daemon
level=info msg=" --datapath-mode='veth'" subsys=daemon
level=info msg=" --debug='false'" subsys=daemon
level=info msg=" --debug-verbose=''" subsys=daemon
level=info msg=" --derive-masquerade-ip-addr-from-device=''" subsys=daemon
level=info msg=" --devices=''" subsys=daemon
level=info msg=" --direct-routing-device=''" subsys=daemon
level=info msg=" --disable-cnp-status-updates='true'" subsys=daemon
level=info msg=" --disable-conntrack='false'" subsys=daemon
level=info msg=" --disable-endpoint-crd='false'" subsys=daemon
level=info msg=" --disable-envoy-version-check='false'" subsys=daemon
level=info msg=" --disable-iptables-feeder-rules=''" subsys=daemon
level=info msg=" --dns-max-ips-per-restored-rule='1000'" subsys=daemon
level=info msg=" --dns-policy-unload-on-shutdown='false'" subsys=daemon
level=info msg=" --dnsproxy-concurrency-limit='0'" subsys=daemon
level=info msg=" --dnsproxy-concurrency-processing-grace-period='0s'" subsys=daemon
level=info msg=" --egress-masquerade-interfaces=''" subsys=daemon
level=info msg=" --egress-multi-home-ip-rule-compat='false'" subsys=daemon
level=info msg=" --enable-auto-protect-node-port-range='true'" subsys=daemon
level=info msg=" --enable-bandwidth-manager='false'" subsys=daemon
level=info msg=" --enable-bbr='false'" subsys=daemon
level=info msg=" --enable-bgp-control-plane='false'" subsys=daemon
level=info msg=" --enable-bpf-clock-probe='true'" subsys=daemon
level=info msg=" --enable-bpf-masquerade='false'" subsys=daemon
level=info msg=" --enable-bpf-tproxy='false'" subsys=daemon
level=info msg=" --enable-cilium-endpoint-slice='false'" subsys=daemon
level=info msg=" --enable-custom-calls='false'" subsys=daemon
level=info msg=" --enable-endpoint-health-checking='true'" subsys=daemon
level=info msg=" --enable-endpoint-routes='false'" subsys=daemon
level=info msg=" --enable-envoy-config='true'" subsys=daemon
level=info msg=" --enable-external-ips='true'" subsys=daemon
level=info msg=" --enable-health-check-nodeport='true'" subsys=daemon
level=info msg=" --enable-health-checking='true'" subsys=daemon
level=info msg=" --enable-host-firewall='false'" subsys=daemon
level=info msg=" --enable-host-legacy-routing='false'" subsys=daemon
level=info msg=" --enable-host-port='true'" subsys=daemon
level=info msg=" --enable-host-reachable-services='false'" subsys=daemon
level=info msg=" --enable-hubble='true'" subsys=daemon
level=info msg=" --enable-hubble-recorder-api='true'" subsys=daemon
level=info msg=" --enable-icmp-rules='true'" subsys=daemon
level=info msg=" --enable-identity-mark='true'" subsys=daemon
level=info msg=" --enable-ip-masq-agent='false'" subsys=daemon
level=info msg=" --enable-ipsec='false'" subsys=daemon
level=info msg=" --enable-ipv4='true'" subsys=daemon
level=info msg=" --enable-ipv4-egress-gateway='false'" subsys=daemon
level=info msg=" --enable-ipv4-fragment-tracking='true'" subsys=daemon
level=info msg=" --enable-ipv4-masquerade='true'" subsys=daemon
level=info msg=" --enable-ipv6='false'" subsys=daemon
level=info msg=" --enable-ipv6-masquerade='true'" subsys=daemon
level=info msg=" --enable-ipv6-ndp='false'" subsys=daemon
level=info msg=" --enable-k8s-api-discovery='false'" subsys=daemon
level=info msg=" --enable-k8s-endpoint-slice='true'" subsys=daemon
level=info msg=" --enable-k8s-event-handover='false'" subsys=daemon
level=info msg=" --enable-k8s-terminating-endpoint='true'" subsys=daemon
level=info msg=" --enable-l2-neigh-discovery='true'" subsys=daemon
level=info msg=" --enable-l7-proxy='true'" subsys=daemon
level=info msg=" --enable-local-node-route='true'" subsys=daemon
level=info msg=" --enable-local-redirect-policy='false'" subsys=daemon
level=info msg=" --enable-mke='false'" subsys=daemon
level=info msg=" --enable-monitor='true'" subsys=daemon
level=info msg=" --enable-node-port='false'" subsys=daemon
level=info msg=" --enable-policy='default'" subsys=daemon
level=info msg=" --enable-recorder='false'" subsys=daemon
level=info msg=" --enable-remote-node-identity='true'" subsys=daemon
level=info msg=" --enable-runtime-device-detection='false'" subsys=daemon
level=info msg=" --enable-selective-regeneration='true'" subsys=daemon
level=info msg=" --enable-service-topology='false'" subsys=daemon
level=info msg=" --enable-session-affinity='false'" subsys=daemon
level=info msg=" --enable-svc-source-range-check='true'" subsys=daemon
level=info msg=" --enable-tracing='false'" subsys=daemon
level=info msg=" --enable-unreachable-routes='false'" subsys=daemon
level=info msg=" --enable-vtep='false'" subsys=daemon
level=info msg=" --enable-well-known-identities='false'" subsys=daemon
level=info msg=" --enable-wireguard='false'" subsys=daemon
level=info msg=" --enable-wireguard-userspace-fallback='false'" subsys=daemon
level=info msg=" --enable-xdp-prefilter='false'" subsys=daemon
level=info msg=" --enable-xt-socket-fallback='true'" subsys=daemon
level=info msg=" --encrypt-interface=''" subsys=daemon
level=info msg=" --encrypt-node='false'" subsys=daemon
level=info msg=" --endpoint-gc-interval='5m0s'" subsys=daemon
level=info msg=" --endpoint-interface-name-prefix=''" subsys=daemon
level=info msg=" --endpoint-queue-size='25'" subsys=daemon
level=info msg=" --endpoint-status=''" subsys=daemon
level=info msg=" --envoy-config-timeout='2m0s'" subsys=daemon
level=info msg=" --envoy-log=''" subsys=daemon
level=info msg=" --exclude-local-address=''" subsys=daemon
level=info msg=" --fixed-identity-mapping=''" subsys=daemon
level=info msg=" --force-local-policy-eval-at-source='true'" subsys=daemon
level=info msg=" --fqdn-regex-compile-lru-size='1024'" subsys=daemon
level=info msg=" --gops-port='9890'" subsys=daemon
level=info msg=" --host-reachable-services-protos='tcp,udp'" subsys=daemon
level=info msg=" --http-403-msg=''" subsys=daemon
level=info msg=" --http-idle-timeout='0'" subsys=daemon
level=info msg=" --http-max-grpc-timeout='0'" subsys=daemon
level=info msg=" --http-normalize-path='true'" subsys=daemon
level=info msg=" --http-request-timeout='3600'" subsys=daemon
level=info msg=" --http-retry-count='3'" subsys=daemon
level=info msg=" --http-retry-timeout='0'" subsys=daemon
level=info msg=" --hubble-disable-tls='false'" subsys=daemon
level=info msg=" --hubble-event-buffer-capacity='4095'" subsys=daemon
level=info msg=" --hubble-event-queue-size='0'" subsys=daemon
level=info msg=" --hubble-export-file-compress='false'" subsys=daemon
level=info msg=" --hubble-export-file-max-backups='5'" subsys=daemon
level=info msg=" --hubble-export-file-max-size-mb='10'" subsys=daemon
level=info msg=" --hubble-export-file-path=''" subsys=daemon
level=info msg=" --hubble-listen-address=':4244'" subsys=daemon
level=info msg=" --hubble-metrics='dns,drop,tcp,flow,port-distribution,icmp,http'" subsys=daemon
level=info msg=" --hubble-metrics-server=':9965'" subsys=daemon
level=info msg=" --hubble-recorder-sink-queue-size='1024'" subsys=daemon
level=info msg=" --hubble-recorder-storage-path='/var/run/cilium/pcaps'" subsys=daemon
level=info msg=" --hubble-socket-path='/var/run/cilium/hubble.sock'" subsys=daemon
level=info msg=" --hubble-tls-cert-file='/var/lib/cilium/tls/hubble/server.crt'" subsys=daemon
level=info msg=" --hubble-tls-client-ca-files='/var/lib/cilium/tls/hubble/client-ca.crt'" subsys=daemon
level=info msg=" --hubble-tls-key-file='/var/lib/cilium/tls/hubble/server.key'" subsys=daemon
level=info msg=" --identity-allocation-mode='crd'" subsys=daemon
level=info msg=" --identity-change-grace-period='5s'" subsys=daemon
level=info msg=" --identity-restore-grace-period='10m0s'" subsys=daemon
level=info msg=" --install-egress-gateway-routes='false'" subsys=daemon
level=info msg=" --install-iptables-rules='true'" subsys=daemon
level=info msg=" --install-no-conntrack-iptables-rules='false'" subsys=daemon
level=info msg=" --ip-allocation-timeout='2m0s'" subsys=daemon
level=info msg=" --ip-masq-agent-config-path='/etc/config/ip-masq-agent'" subsys=daemon
level=info msg=" --ipam='cluster-pool'" subsys=daemon
level=info msg=" --ipsec-key-file=''" subsys=daemon
level=info msg=" --iptables-lock-timeout='5s'" subsys=daemon
level=info msg=" --iptables-random-fully='false'" subsys=daemon
level=info msg=" --ipv4-native-routing-cidr=''" subsys=daemon
level=info msg=" --ipv4-node='auto'" subsys=daemon
level=info msg=" --ipv4-pod-subnets=''" subsys=daemon
level=info msg=" --ipv4-range='auto'" subsys=daemon
level=info msg=" --ipv4-service-loopback-address='169.254.42.1'" subsys=daemon
level=info msg=" --ipv4-service-range='auto'" subsys=daemon
level=info msg=" --ipv6-cluster-alloc-cidr='f00d::/64'" subsys=daemon
level=info msg=" --ipv6-mcast-device=''" subsys=daemon
level=info msg=" --ipv6-native-routing-cidr=''" subsys=daemon
level=info msg=" --ipv6-node='auto'" subsys=daemon
level=info msg=" --ipv6-pod-subnets=''" subsys=daemon
level=info msg=" --ipv6-range='auto'" subsys=daemon
level=info msg=" --ipv6-service-range='auto'" subsys=daemon
level=info msg=" --join-cluster='false'" subsys=daemon
level=info msg=" --k8s-api-server=''" subsys=daemon
level=info msg=" --k8s-heartbeat-timeout='30s'" subsys=daemon
level=info msg=" --k8s-kubeconfig-path=''" subsys=daemon
level=info msg=" --k8s-namespace='kube-system'" subsys=daemon
level=info msg=" --k8s-require-ipv4-pod-cidr='false'" subsys=daemon
level=info msg=" --k8s-require-ipv6-pod-cidr='false'" subsys=daemon
level=info msg=" --k8s-service-cache-size='128'" subsys=daemon
level=info msg=" --k8s-service-proxy-name=''" subsys=daemon
level=info msg=" --k8s-sync-timeout='3m0s'" subsys=daemon
level=info msg=" --k8s-watcher-endpoint-selector='metadata.name!=kube-scheduler,metadata.name!=kube-controller-manager,metadata.name!=etcd-operator,metadata.name!=gcp-controller-manager'" subsys=daemon
level=info msg=" --keep-config='false'" subsys=daemon
level=info msg=" --kube-proxy-replacement='strict'" subsys=daemon
level=info msg=" --kube-proxy-replacement-healthz-bind-address=''" subsys=daemon
level=info msg=" --kvstore=''" subsys=daemon
level=info msg=" --kvstore-connectivity-timeout='2m0s'" subsys=daemon
level=info msg=" --kvstore-lease-ttl='15m0s'" subsys=daemon
level=info msg=" --kvstore-max-consecutive-quorum-errors='2'" subsys=daemon
level=info msg=" --kvstore-opt=''" subsys=daemon
level=info msg=" --kvstore-periodic-sync='5m0s'" subsys=daemon
level=info msg=" --label-prefix-file=''" subsys=daemon
level=info msg=" --labels=''" subsys=daemon
level=info msg=" --lib-dir='/var/lib/cilium'" subsys=daemon
level=info msg=" --local-max-addr-scope='252'" subsys=daemon
level=info msg=" --local-router-ipv4=''" subsys=daemon
level=info msg=" --local-router-ipv6=''" subsys=daemon
level=info msg=" --log-driver=''" subsys=daemon
level=info msg=" --log-opt=''" subsys=daemon
level=info msg=" --log-system-load='false'" subsys=daemon
level=info msg=" --max-controller-interval='0'" subsys=daemon
level=info msg=" --metrics=''" subsys=daemon
level=info msg=" --mke-cgroup-mount=''" subsys=daemon
level=info msg=" --monitor-aggregation='medium'" subsys=daemon
level=info msg=" --monitor-aggregation-flags='all'" subsys=daemon
level=info msg=" --monitor-aggregation-interval='5s'" subsys=daemon
level=info msg=" --monitor-queue-size='0'" subsys=daemon
level=info msg=" --mtu='0'" subsys=daemon
level=info msg=" --node-port-acceleration='disabled'" subsys=daemon
level=info msg=" --node-port-algorithm='random'" subsys=daemon
level=info msg=" --node-port-bind-protection='true'" subsys=daemon
level=info msg=" --node-port-mode='snat'" subsys=daemon
level=info msg=" --node-port-range='30000,32767'" subsys=daemon
level=info msg=" --policy-audit-mode='false'" subsys=daemon
level=info msg=" --policy-queue-size='100'" subsys=daemon
level=info msg=" --policy-trigger-interval='1s'" subsys=daemon
level=info msg=" --pprof='false'" subsys=daemon
level=info msg=" --pprof-port='6060'" subsys=daemon
level=info msg=" --preallocate-bpf-maps='false'" subsys=daemon
level=info msg=" --prepend-iptables-chains='true'" subsys=daemon
level=info msg=" --procfs='/host/proc'" subsys=daemon
level=info msg=" --prometheus-serve-addr=':9962'" subsys=daemon
level=info msg=" --proxy-connect-timeout='1'" subsys=daemon
level=info msg=" --proxy-gid='1337'" subsys=daemon
level=info msg=" --proxy-max-connection-duration-seconds='0'" subsys=daemon
level=info msg=" --proxy-max-requests-per-connection='0'" subsys=daemon
level=info msg=" --proxy-prometheus-port='9964'" subsys=daemon
level=info msg=" --read-cni-conf=''" subsys=daemon
level=info msg=" --restore='true'" subsys=daemon
level=info msg=" --route-metric='0'" subsys=daemon
level=info msg=" --sidecar-istio-proxy-image='cilium/istio_proxy'" subsys=daemon
level=info msg=" --single-cluster-route='false'" subsys=daemon
level=info msg=" --socket-path='/var/run/cilium/cilium.sock'" subsys=daemon
level=info msg=" --sockops-enable='false'" subsys=daemon
level=info msg=" --state-dir='/var/run/cilium'" subsys=daemon
level=info msg=" --tofqdns-dns-reject-response-code='refused'" subsys=daemon
level=info msg=" --tofqdns-enable-dns-compression='true'" subsys=daemon
level=info msg=" --tofqdns-endpoint-max-ip-per-hostname='50'" subsys=daemon
level=info msg=" --tofqdns-idle-connection-grace-period='0s'" subsys=daemon
level=info msg=" --tofqdns-max-deferred-connection-deletes='10000'" subsys=daemon
level=info msg=" --tofqdns-min-ttl='3600'" subsys=daemon
level=info msg=" --tofqdns-pre-cache=''" subsys=daemon
level=info msg=" --tofqdns-proxy-port='0'" subsys=daemon
level=info msg=" --tofqdns-proxy-response-max-delay='100ms'" subsys=daemon
level=info msg=" --trace-payloadlen='128'" subsys=daemon
level=info msg=" --tunnel='disabled'" subsys=daemon
level=info msg=" --tunnel-port='0'" subsys=daemon
level=info msg=" --version='false'" subsys=daemon
level=info msg=" --vlan-bpf-bypass=''" subsys=daemon
level=info msg=" --vtep-cidr=''" subsys=daemon
level=info msg=" --vtep-endpoint=''" subsys=daemon
level=info msg=" --vtep-mac=''" subsys=daemon
level=info msg=" --vtep-mask=''" subsys=daemon
level=info msg=" --write-cni-conf-when-ready=''" subsys=daemon
level=info msg=" _ _ _" subsys=daemon
level=info msg=" ___|_| |_|_ _ _____" subsys=daemon
level=info msg="| _| | | | | | |" subsys=daemon
level=info msg="|___|_|_|_|___|_|_|_|" subsys=daemon
level=info msg="Cilium 1.12.1 4c9a630 2022-08-15T16:29:39-07:00 go version go1.18.5 linux/amd64" subsys=daemon
level=info msg="cilium-envoy version: 5739e4be8ae7134fee683d920d25c3732ac6c819/1.21.5/Distribution/RELEASE/BoringSSL" subsys=daemon
level=info msg="clang (10.0.0) and kernel (5.4.0) versions: OK!" subsys=linux-datapath
level=info msg="linking environment: OK!" subsys=linux-datapath
level=info msg="Detected mounted BPF filesystem at /sys/fs/bpf" subsys=bpf
level=info msg="Mounted cgroupv2 filesystem at /run/cilium/cgroupv2" subsys=cgroups
level=info msg="Parsing base label prefixes from default label list" subsys=labels-filter
level=info msg="Parsing additional label prefixes from user inputs: []" subsys=labels-filter
level=info msg="Final label prefixes to be used for identity evaluation:" subsys=labels-filter
level=info msg=" - reserved:.*" subsys=labels-filter
level=info msg=" - :io\\.kubernetes\\.pod\\.namespace" subsys=labels-filter
level=info msg=" - :io\\.cilium\\.k8s\\.namespace\\.labels" subsys=labels-filter
level=info msg=" - :app\\.kubernetes\\.io" subsys=labels-filter
level=info msg=" - !:io\\.kubernetes" subsys=labels-filter
level=info msg=" - !:kubernetes\\.io" subsys=labels-filter
level=info msg=" - !:.*beta\\.kubernetes\\.io" subsys=labels-filter
level=info msg=" - !:k8s\\.io" subsys=labels-filter
level=info msg=" - !:pod-template-generation" subsys=labels-filter
level=info msg=" - !:pod-template-hash" subsys=labels-filter
level=info msg=" - !:controller-revision-hash" subsys=labels-filter
level=info msg=" - !:annotation.*" subsys=labels-filter
level=info msg=" - !:etcd_node" subsys=labels-filter
level=info msg="Auto-disabling \"enable-bpf-clock-probe\" feature since KERNEL_HZ cannot be determined" error="Cannot probe CONFIG_HZ" subsys=daemon
level=info msg="Using autogenerated IPv4 allocation range" subsys=node v4Prefix=10.112.0.0/16
level=info msg="Initializing daemon" subsys=daemon
level=info msg="Establishing connection to apiserver" host="https://z-k8s-api.staging.huatai.me:6443" subsys=k8s
level=info msg="Connected to apiserver" subsys=k8s
level=fatal msg="Error while creating daemon" error="invalid daemon configuration: native routing cidr must be configured with option --ipv4-native-routing-cidr in combination with --enable-ipv4-masquerade --tunnel=disabled --ipam=cluster-pool --enable-ipv4=true" subsys=daemon
The key lines are:
...
level=warning msg="If auto-direct-node-routes is enabled, then you are recommended to also configure ipv4-native-routing-cidr. If ipv4-native-routing-cidr is not configured, this may lead to pod to pod traffic being masqueraded, which can cause problems with performance, observability and policy" subsys=config
...
level=fatal msg="Error while creating daemon" error="invalid daemon configuration: native routing cidr must be configured with option --ipv4-native-routing-cidr in combination with --enable-ipv4-masquerade --tunnel=disabled --ipam=cluster-pool --enable-ipv4=true" subsys=daemon
The cause:
- Note that the tunnel parameter accepts only three values, {vxlan, geneve, disabled}, where geneve is simply another UDP encapsulation protocol (Generic Network Virtualization Encapsulation).
- Once tunnel is disabled, you must also set ipv4-native-routing-cidr: x.x.x.x/y, the CIDR that is routed natively without encapsulation; see Cilium Concepts >> Networking >> Routing >> Native-Routing. A sketch of such a configuration follows this list.
- Cilium enables Encapsulation by default, with no configuration required, so it works with the underlying network fabric as-is. In this mode all cluster nodes form a mesh of tunnels using a UDP-based encapsulation protocol such as VXLAN or Geneve, and all traffic between Cilium nodes is encapsulated.
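For completeness, had the goal really been native routing with the tunnel disabled, the upgrade would also have had to supply the native-routing CIDR. A minimal sketch, assuming the cluster-pool pod CIDR is 10.0.0.0/8 (a placeholder, substitute the real pod CIDR) and that the 1.12 Helm chart exposes this value as ipv4NativeRoutingCIDR:
helm upgrade cilium cilium/cilium --version 1.12.1 \
--namespace kube-system \
--reuse-values \
--set tunnel=disabled \
--set autoDirectNodeRoutes=true \
--set ipv4NativeRoutingCIDR=10.0.0.0/8 \
--set kubeProxyReplacement=strict \
--set socketLB.hostNamespaceOnly=true \
--set k8sServiceHost=${API_SERVER_IP} \
--set k8sServicePort=${API_SERVER_PORT}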
So I revised the command as follows (tunnel=vxlan, autoDirectNodeRoutes=false, loadBalancer.acceleration=disabled and loadBalancer.mode=snat simply restate the chart defaults):
API_SERVER_IP=z-k8s-api.staging.huatai.me
API_SERVER_PORT=6443
helm upgrade cilium cilium/cilium --version 1.12.1 \
--namespace kube-system \
--reuse-values \
--set tunnel=vxlan \
--set autoDirectNodeRoutes=false \
--set kubeProxyReplacement=strict \
--set socketLB.hostNamespaceOnly=true \
--set loadBalancer.acceleration=disabled \
--set loadBalancer.mode=snat \
--set k8sServiceHost=${API_SERVER_IP} \
--set k8sServicePort=${API_SERVER_PORT}
In summary, I took a detour here; it is better to keep the default configuration and make only the minimal changes needed. The simplified configuration below is the one to use:
API_SERVER_IP=z-k8s-api.staging.huatai.me
API_SERVER_PORT=6443
helm upgrade cilium cilium/cilium --version 1.12.1 \
--namespace kube-system \
--reuse-values \
--set kubeProxyReplacement=strict \
--set socketLB.hostNamespaceOnly=true \
--set k8sServiceHost=${API_SERVER_IP} \
--set k8sServicePort=${API_SERVER_PORT}
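After this upgrade, a quick way to confirm that the agents rolled out healthy and that the socket-LB option took effect (the bpf-lb-sock-hostns-only key name is taken from the agent flag in the log above; whether it appears verbatim in the cilium-config ConfigMap is an assumption, so fall back to cilium status --verbose if the grep finds nothing):
kubectl -n kube-system rollout status ds/cilium
kubectl -n kube-system get cm cilium-config -o yaml | grep bpf-lb-sock-hostns-only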