Kubespray Management
Maintaining etcd with Kubespray
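kubespray runs etcd, a distributed key-value store, locally under a container runtime that is wrapped by the systemd process manager. The runtime can be either Docker or containerd. Because the service is a systemd unit, it can be managed and inspected directly with systemctl:
systemctl status etcd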
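Checking the process list with
ps aux | grep etcd
shows the run parameters of the etcd instance deployed by kubespray. The same parameters can be read from /etc/systemd/system/etcd.service, which contains this setting:
EnvironmentFile=-/etc/etcd.env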
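As a small sketch of how that setting can be read programmatically, the snippet below extracts the EnvironmentFile path from a unit file. The unit content here is a hypothetical two-line stand-in written to a temporary file, not the real kubespray unit; on a real host the same sed expression could be fed from /etc/systemd/system/etcd.service. In systemd, the leading "-" means the file is optional, so the expression strips it.

```shell
# Hypothetical stand-in for /etc/systemd/system/etcd.service (assumption, not the real unit)
unit=$(mktemp)
cat > "$unit" <<'EOF'
[Service]
EnvironmentFile=-/etc/etcd.env
EOF

# Print the path after "EnvironmentFile=", dropping the optional leading "-"
env_path=$(sed -n 's/^EnvironmentFile=-\{0,1\}//p' "$unit")
echo "$env_path"

rm -f "$unit"
```

On the node itself, replacing "$unit" with /etc/systemd/system/etcd.service should give the same kind of result, assuming the unit has a single EnvironmentFile line.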
So I tried reusing /etc/etcd.env, the file referenced by the systemd etcd service configuration, to run etcdctl:
# See the EnvironmentFile setting in /etc/systemd/system/etcd.service
. /etc/etcd.env
etcdctl member list
However, running etcdctl this way failed with a puzzling error:
{"level":"warn","ts":"2023-06-05T11:32:04.494+0800","logger":"etcd-client","caller":"v3@v3.5.6/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000394a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection closed before server preface received"}
Error: context deadline exceeded
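At first it seemed odd that etcdctl was connecting to etcd-endpoints://0xc000394a80/127.0.0.1:2379. Embarrassingly, I had overlooked that in Bash a variable must be exported with the export command before it becomes an environment variable that child processes can see. Sourcing /etc/etcd.env only set shell variables, so etcdctl never received the endpoint and TLS settings and most likely fell back to its default endpoint 127.0.0.1:2379 without certificates. Inspecting /etc/etcd.env shows which variables need to take effect:
Environment variables for accessing etcd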
# CLI settings
ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
ETCDCTL_CACERT=/etc/ssl/etcd/ssl/ca.pem
ETCDCTL_KEY=/etc/ssl/etcd/ssl/admin-y-k8s-m-1-key.pem
ETCDCTL_CERT=/etc/ssl/etcd/ssl/admin-y-k8s-m-1.pem
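The difference between a sourced shell variable and an exported environment variable can be reproduced without a cluster. The sketch below uses a temporary file as a hypothetical stand-in for /etc/etcd.env and a child sh process in place of etcdctl; both are assumptions made for illustration only.

```shell
# Start clean in case the variable is already exported in this session
unset ETCDCTL_ENDPOINTS

# Hypothetical stand-in for /etc/etcd.env
env_file=$(mktemp)
printf 'ETCDCTL_ENDPOINTS=https://127.0.0.1:2379\n' > "$env_file"

# Sourcing sets a shell variable only; a child process does not inherit it
. "$env_file"
before=$(sh -c 'echo "${ETCDCTL_ENDPOINTS:-<not set>}"')

# After export, the child process sees the value
export ETCDCTL_ENDPOINTS
after=$(sh -c 'echo "${ETCDCTL_ENDPOINTS:-<not set>}"')

echo "before export: $before"
echo "after export:  $after"

rm -f "$env_file"
```

The child process stands in for etcdctl here: anything it needs from the file has to be exported by the parent shell first.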
So I ran the following script to build a working environment, this time exporting the variables from /etc/etcd.env before using etcdctl:
# Take ETCD_INITIAL_CLUSTER from /etc/etcd.env and extract the three etcd server addresses,
# i.e. turn ETCD_INITIAL_CLUSTER=etcd1=https://192.168.8.116:2380,etcd2=https://192.168.8.117:2380,etcd3=https://192.168.8.118:2380
# into 192.168.8.116 192.168.8.117 192.168.8.118
# One option: echo $ETCD_INITIAL_CLUSTER | awk -F'[= ,]' '{print $2, $4, $6}' | sed 's/https:\/\///g' | sed 's/:2380//g'
. /etc/etcd.env
echo ". /etc/etcd.env" >> ~/.bashrc
# A sed conversion turned out to be more concise:
# drop the "etcdN=" member names and switch the peer port 2380 to the client port 2379
var="ETCDCTL_ENDPOINTS=$(echo "$ETCD_INITIAL_CLUSTER" | sed 's/etcd.=//g' | sed 's/:2380/:2379/g')"
echo "$var" >> ~/.bashrc
echo "export ETCDCTL_ENDPOINTS ETCDCTL_CACERT ETCDCTL_KEY ETCDCTL_CERT" >> ~/.bashrc
echo "alias etcd_status='etcdctl --write-out=table endpoint status'" >> ~/.bashrc
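To make the sed conversion easier to follow, here is a standalone sketch that applies it to the sample ETCD_INITIAL_CLUSTER value quoted in the comments, rather than to the live file. The second pattern, which strips any member name up to "=", is my own variant and not part of the original script; it is included only as an assumption-labeled alternative for member names that are not exactly "etcd" plus one character.

```shell
# Sample value copied from the comment in the script
ETCD_INITIAL_CLUSTER='etcd1=https://192.168.8.116:2380,etcd2=https://192.168.8.117:2380,etcd3=https://192.168.8.118:2380'

# Same substitutions as the script: remove "etcdN=" and map peer port 2380 to client port 2379
endpoints=$(printf '%s\n' "$ETCD_INITIAL_CLUSTER" | sed -e 's/etcd.=//g' -e 's/:2380/:2379/g')

# Variant (assumption): strip any member name, not only "etcd" followed by one character
general=$(printf '%s\n' "$ETCD_INITIAL_CLUSTER" | sed -e 's/[^=,]*=//g' -e 's/:2380/:2379/g')

echo "ETCDCTL_ENDPOINTS=$endpoints"
# prints: ETCDCTL_ENDPOINTS=https://192.168.8.116:2379,https://192.168.8.117:2379,https://192.168.8.118:2379
```

Both patterns give the same result for this three-member sample; the variant only matters if the cluster uses different member naming.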
Now, in a new shell or after re-sourcing ~/.bashrc, the wrapped etcd_status command correctly shows the state of the etcd cluster:
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|          ENDPOINT          |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.8.116:2379 | 3ff555f9837c69b9 |   3.5.6 |  7.8 MB |      true |      false |        16 |    2014832 |            2014832 |        |
| https://192.168.8.117:2379 | 4a784b4a93b49575 |   3.5.6 |  7.9 MB |     false |      false |        16 |    2014832 |            2014832 |        |
| https://192.168.8.118:2379 | cb79cdeb0f0fe1cb |   3.5.6 |  7.9 MB |     false |      false |        16 |    2014832 |            2014832 |        |
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
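For a one-off session where editing ~/.bashrc is not wanted, a possible alternative is the shell's allexport option (set -a), which marks every variable assigned while it is active for export. This is a sketch of that technique, not the approach used above; the temporary file is a hypothetical stand-in for /etc/etcd.env and the child sh process stands in for etcdctl.

```shell
# Start clean in case these are already exported in this session
unset ETCDCTL_ENDPOINTS ETCDCTL_CACERT

# Hypothetical stand-in for /etc/etcd.env, with two of its CLI settings
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
ETCDCTL_CACERT=/etc/ssl/etcd/ssl/ca.pem
EOF

# Everything assigned between set -a and set +a is exported automatically
set -a
. "$env_file"
set +a

# A child process now sees the values without an explicit export line
seen=$(sh -c 'echo "$ETCDCTL_ENDPOINTS $ETCDCTL_CACERT"')
echo "$seen"

rm -f "$env_file"
```

With the real file this would also export the server-side ETCD_* variables into the session, and ETCDCTL_ENDPOINTS would stay at the local 127.0.0.1:2379 value unless it is overridden as in the script above.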