Kubernetes Cluster (z-k8s)
Deployment follows Building highly available Kubernetes on DNS round-robin; the cluster will later be converted to the HAProxy load-balanced architecture described in Highly available Kubernetes cluster based on load balancing.
Prepare etcd access certificates
Because the cluster talks to an external etcd cluster, the etcd certificates must first be copied to the control plane nodes so that the control plane services (such as the apiserver) can read and write etcd once they start:
On the control plane nodes z-k8s-m-1 / z-k8s-m-2 / z-k8s-m-3, create a directory for the etcd access certificates and copy in the certificates prepared in Private cloud etcd service:
export ETCDCTL_API=3
#export ETCDCTL_ENDPOINTS='https://etcd.staging.huatai.me:2379'
export ETCDCTL_ENDPOINTS=https://192.168.6.204:2379,https://192.168.6.205:2379,https://192.168.6.206:2379
export ETCDCTL_CACERT=/etc/etcd/ca.pem
export ETCDCTL_CERT=/etc/etcd/client.pem
export ETCDCTL_KEY=/etc/etcd/client-key.pem
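With the variables above exported, a quick sanity check against the external etcd cluster confirms the certificates work before distributing them (a minimal check, assuming etcdctl is installed on this host):

# verify every endpoint answers over TLS with the client certificates
etcdctl endpoint health
# list cluster members in a readable table
etcdctl member list -w table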
The etcdctl client certificate files above correspond one-to-one to the files Kubernetes uses to access etcd:
cfssl-generated etcd client file | Corresponding file for k8s etcd access
---|---
ca.pem | ca.crt
client.pem | apiserver-etcd-client.crt
client-key.pem | apiserver-etcd-client.key
Distribute the etcd certificates used by the Kubernetes apiserver with a deploy_k8s_etcd_key.sh script:
for host in z-k8s-m-1 z-k8s-m-2 z-k8s-m-3;do
    scp /etc/etcd/ca.pem $host:/tmp/ca.crt
    scp /etc/etcd/client.pem $host:/tmp/apiserver-etcd-client.crt
    scp /etc/etcd/client-key.pem $host:/tmp/apiserver-etcd-client.key
    ssh $host 'sudo mkdir -p /etc/kubernetes/pki/etcd'
    ssh $host 'sudo mv /tmp/ca.crt /etc/kubernetes/pki/etcd/ca.crt'
    ssh $host 'sudo mv /tmp/apiserver-etcd-client.crt /etc/kubernetes/pki/apiserver-etcd-client.crt'
    ssh $host 'sudo mv /tmp/apiserver-etcd-client.key /etc/kubernetes/pki/apiserver-etcd-client.key'
done
Note
I ran the deploy_k8s_etcd_key.sh script above from the management host z-b-data-1 (which has SSH key authentication configured), logging in over ssh to z-k8s-m-1 / z-k8s-m-2 / z-k8s-m-3 to distribute the keys.
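After distribution, it is worth spot-checking that all three files landed on each node under the expected names (same hosts and paths as above):

for host in z-k8s-m-1 z-k8s-m-2 z-k8s-m-3;do
    # confirm the files exist where kubeadm expects them
    ssh $host 'sudo ls -l /etc/kubernetes/pki/etcd/ca.crt /etc/kubernetes/pki/apiserver-etcd-client.crt /etc/kubernetes/pki/apiserver-etcd-client.key'
done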
Configure the first control plane node
Create a create_kubeadm-config.sh script:
K8S_API_ENDPOINT=z-k8s-api.staging.huatai.me
K8S_API_ENDPOINT_PORT=6443
K8S_CLUSTER_NAME=z-k8s
ETCD_0_IP=192.168.6.204
ETCD_1_IP=192.168.6.205
ETCD_2_IP=192.168.6.206
cat << EOF > kubeadm-config.yaml
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: stable
clusterName: ${K8S_CLUSTER_NAME}
controlPlaneEndpoint: "${K8S_API_ENDPOINT}:${K8S_API_ENDPOINT_PORT}"
etcd:
  external:
    endpoints:
    - https://${ETCD_0_IP}:2379
    - https://${ETCD_1_IP}:2379
    - https://${ETCD_2_IP}:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
EOF
Run sh create_kubeadm-config.sh to generate the kubeadm-config.yaml configuration file, then create the first control plane node with it:
sudo kubeadm init --config kubeadm-config.yaml --upload-certs
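If you prefer to preview the changes first, kubeadm supports a dry run with the same configuration, which validates the config and prints what would be done without writing any state to the node:

# validate the configuration without changing the node
sudo kubeadm init --config kubeadm-config.yaml --dry-run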
Following the printed instructions, run the commands below to set up the admin kubeconfig for your own account:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
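A quick way to confirm the kubeconfig works:

# should print the control plane endpoint https://z-k8s-api.staging.huatai.me:6443
kubectl cluster-info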
The init output also prints the command for joining additional control plane nodes and the command for joining worker nodes (both contain secrets, so they must be kept confidential).
Note
Since the containerd runtime has replaced docker, day-to-day operations have changed; see Kubernetes cluster (z-k8s) with nerdctl for details.
Check the Kubernetes nodes and pods:
kubectl get nodes -o wide
kubectl get pods -n kube-system -o wide
Note
If you see Unable to connect to the server: Forbidden, check whether proxy environment variables are set in the operating system; I learned this the hard way.
Output:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-6d4b75cb6d-jnfmj 0/1 Pending 0 18h <none> <none> <none> <none>
coredns-6d4b75cb6d-nm5fz 0/1 Pending 0 18h <none> <none> <none> <none>
kube-apiserver-z-k8s-m-1 1/1 Running 0 18h 192.168.6.101 z-k8s-m-1 <none> <none>
kube-controller-manager-z-k8s-m-1 1/1 Running 0 18h 192.168.6.101 z-k8s-m-1 <none> <none>
kube-proxy-vwqsn 1/1 Running 0 18h 192.168.6.101 z-k8s-m-1 <none> <none>
kube-scheduler-z-k8s-m-1 1/1 Running 0 18h 192.168.6.101 z-k8s-m-1 <none> <none>
Note
Two issues remain unresolved at this point:
- the z-k8s-m-1 node status is NotReady
- the coredns control plane pods cannot start (no network is configured); I ran into this before in Creating a single control plane (single master) cluster, and installing the correct network plugin for the Kubernetes cluster is enough to get these containers running
Note that the three core Kubernetes control plane components, apiserver / scheduler / controller-manager, all use the physical host's IP address 192.168.6.101; in other words, these three components can start even before any network plugin is installed. This is also why the controlPlaneEndpoint domain z-k8s-api.staging.huatai.me configured in kubeadm-config.yaml resolves to the physical host's IP address.
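You can verify that resolution with a quick DNS query (assuming dig is available; host or nslookup works equally well):

# with DNS round-robin this should return the control plane node IPs,
# e.g. 192.168.6.101 for z-k8s-m-1
dig +short z-k8s-api.staging.huatai.me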
Install Cilium networking
Note that for Deploying a TLS-authenticated etcd cluster in the private cloud (external etcd), Cilium must be installed following Installing cilium in an external etcd environment:
First install helm on the node:
version=3.12.2
wget https://get.helm.sh/helm-v${version}-linux-amd64.tar.gz
tar -zxvf helm-v${version}-linux-amd64.tar.gz
sudo mv linux-amd64/helm /usr/local/bin/helm
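A quick check that the binary is in place:

# should report the v3.12.2 client version installed above
helm version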
Add the cilium Helm repository:
helm repo add cilium https://helm.cilium.io/
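Optionally refresh the repository index and confirm that the chart version you plan to install is available:

helm repo update
# list the most recent cilium chart versions
helm search repo cilium/cilium --versions | head -5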
Deploy Cilium via helm:
VERSION=1.11.7
ETCD_0_IP=192.168.6.204
ETCD_1_IP=192.168.6.205
ETCD_2_IP=192.168.6.206
kubectl create secret generic -n kube-system cilium-etcd-secrets \
--from-file=etcd-client-ca.crt=/etc/kubernetes/pki/etcd/ca.crt \
--from-file=etcd-client.key=/etc/kubernetes/pki/apiserver-etcd-client.key \
--from-file=etcd-client.crt=/etc/kubernetes/pki/apiserver-etcd-client.crt
helm install cilium cilium/cilium --version ${VERSION} \
--namespace kube-system \
--set etcd.enabled=true \
--set etcd.ssl=true \
--set "etcd.endpoints[0]=https://${ETCD_0_IP}:2379" \
--set "etcd.endpoints[1]=https://${ETCD_1_IP}:2379" \
--set "etcd.endpoints[2]=https://${ETCD_2_IP}:2379"
Note
Once the Cilium CNI is installed correctly, the coredns containers that failed to run earlier in the deployment are able to obtain IP addresses and start.
Install the cilium CLI client:
curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-amd64.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin
rm cilium-linux-amd64.tar.gz{,.sha256sum}
Check:
cilium status
/¯¯\
/¯¯\__/¯¯\ Cilium: OK
\__/¯¯\__/ Operator: 1 errors, 1 warnings
/¯¯\__/¯¯\ Hubble: disabled
\__/¯¯\__/ ClusterMesh: disabled
\__/
Deployment cilium-operator Desired: 2, Ready: 1/2, Available: 1/2, Unavailable: 1/2
DaemonSet cilium Desired: 1, Ready: 1/1, Available: 1/1
Containers: cilium-operator Running: 1, Pending: 1
cilium Running: 1
Cluster Pods: 2/2 managed by Cilium
Image versions cilium quay.io/cilium/cilium:v1.11.7@sha256:66a6f72a49e55e21278d07a99ff2cffa7565ed07f2578d54b5a92c1a492a6597: 1
cilium-operator quay.io/cilium/operator-generic:v1.11.7@sha256:0f8ed5d815873d20848a360df3f2ebbd4116481ff817d3f295557801e0b45900: 2
Errors: cilium-operator cilium-operator 1 pods of Deployment cilium-operator are not ready
Warnings: cilium-operator cilium-operator-68dffdc9f7-rph4w pod is pending
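The single pending cilium-operator pod is most likely benign at this stage: as I understand it, the chart defaults to two operator replicas with pod anti-affinity, so the second replica stays Pending until another node joins the cluster. The pod's events show the scheduling reason:

# show why the pending operator pod cannot be scheduled yet
kubectl -n kube-system describe pod -l name=cilium-operator | grep -A 5 Events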
Add the second control plane node
Following the kubeadm init output, run the node join on the second control plane node z-k8s-m-2:
kubeadm join z-k8s-api.staging.huatai.me:6443 --token <token> \
--discovery-token-ca-cert-hash sha256:<hash> \
--control-plane --certificate-key <hash>
Add worker nodes
Following the kubeadm init output, run on worker nodes such as z-k8s-n-1:
kubeadm join z-k8s-api.staging.huatai.me:6443 --token <token> \
--discovery-token-ca-cert-hash sha256:<hash>
Note
The certificate and token generated when kubeadm initializes the cluster both have limited validity (the token lasts 24 hours). If a long time passes between initialization and adding control plane or worker nodes, you will hit token and certificate related errors; in that case, re-upload the certificates and generate a new token. Also, with an external etcd, the etcd parameters must again be passed via kubeadm-config.yaml.
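A sketch of that recovery, using standard kubeadm subcommands (run on an existing control plane node; the config file is the one generated earlier, so the external etcd settings are included):

# re-upload the control plane certificates; prints a fresh certificate key
sudo kubeadm init phase upload-certs --upload-certs --config kubeadm-config.yaml
# create a new token and print a ready-to-use worker join command
sudo kubeadm token create --print-join-command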