Private cloud etcd service¶
Note
The steps in this article are fairly tedious, mainly because etcd certificate generation involves many steps. When I deploy a new cluster later, I will rewrite them as a script for quicker deployment.
After building 3 virtual machines in the private-cloud KVM environment and setting up the private-cloud LVM data layer, etcd can be built on top of the dedicated /var/lib/etcd
storage, providing high-performance virtualized storage for the etcd distributed key-value store.
Host IP | Hostname
---|---
192.168.6.204 | z-b-data-1
192.168.6.205 | z-b-data-2
192.168.6.206 | z-b-data-3
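An etcd cluster needs a majority of members (⌊n/2⌋+1) online to serve writes, which is why three members are used here: the layout tolerates the loss of exactly one node. The arithmetic:

```shell
# Quorum for an n-member etcd cluster is floor(n/2) + 1
n=3
quorum=$(( n / 2 + 1 ))
echo "members=$n quorum=$quorum tolerated_failures=$(( n - quorum ))"
# members=3 quorum=2 tolerated_failures=1
```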
etcd cluster certificate generation¶
Install cfssl from the distribution¶
Install Cloudflare's cfssl toolkit:
sudo apt install golang-cfssl -y
Initialize the certificate authority¶
Prepare ca-config.json (certificates valid for 10 years):
{
    "signing": {
        "default": {
            "expiry": "87600h"
        },
        "profiles": {
            "server": {
                "expiry": "87600h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            },
            "client": {
                "expiry": "87600h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "client auth"
                ]
            },
            "peer": {
                "expiry": "87600h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}
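The 87600h expiry used in every profile is simply 10 years expressed in hours:

```shell
# 10 years x 365 days x 24 hours = 87600 hours
echo "$(( 10 * 365 * 24 ))h"
# 87600h
```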
Note
In this CA configuration, the "server" profile must also include "client auth"; otherwise newer etcd versions report connection errors at startup. See the notes in deploying a TLS-authenticated etcd cluster.
Create the CSR (Certificate Signing Request) configuration file ca-csr.json:
{
    "CN": "priv k8s etcd",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "Shanghai",
            "O": "huatai.me",
            "ST": "cloud-atlas",
            "OU": "staging"
        }
    ]
}
Generate the CA from the configuration above:
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
This produces three files:
ca-key.pem
ca.csr
ca.pem
Warning
Keep ca-key.pem secure: anyone holding this file can use the CA to issue arbitrary certificates.
Generate the server certificate by creating server.json:
{
    "CN": "priv k8s etcd",
    "hosts": [
        "etcd.staging.huatai.me",
        "192.168.6.204",
        "192.168.6.205",
        "192.168.6.206",
        "127.0.0.1"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "CN",
            "L": "Shanghai",
            "ST": "cloud-atlas"
        }
    ]
}
Generate the server certificate and private key:
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server.json | cfssljson -bare server
This produces three files:
server-key.pem
server.csr
server.pem
Peer certificates (one per server, matching each hostname):
{
    "CN": "z-b-data-1",
    "hosts": [
        "z-b-data-1.staging.huatai.me",
        "z-b-data-1",
        "192.168.6.204",
        "127.0.0.1"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "CN",
            "L": "Shanghai",
            "ST": "cloud-atlas"
        }
    ]
}
{
    "CN": "z-b-data-2",
    "hosts": [
        "z-b-data-2.staging.huatai.me",
        "z-b-data-2",
        "192.168.6.205",
        "127.0.0.1"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "CN",
            "L": "Shanghai",
            "ST": "cloud-atlas"
        }
    ]
}
{
    "CN": "z-b-data-3",
    "hosts": [
        "z-b-data-3.staging.huatai.me",
        "z-b-data-3",
        "192.168.6.206",
        "127.0.0.1"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "CN",
            "L": "Shanghai",
            "ST": "cloud-atlas"
        }
    ]
}
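The three per-member JSONs differ only in the member name and IP address, so they can be generated from one template rather than edited by hand; a sketch, with the host/IP pairs taken from the cluster table above (adjust to your environment):

```shell
#!/bin/sh
# Generate the three per-member peer CSR JSONs from a single template.
sn=0
for ip in 192.168.6.204 192.168.6.205 192.168.6.206; do
    sn=$(( sn + 1 ))
    host="z-b-data-${sn}"
    # Unquoted EOF so ${host} and ${ip} expand into the JSON body
    cat << EOF > "${host}.json"
{
    "CN": "${host}",
    "hosts": [
        "${host}.staging.huatai.me",
        "${host}",
        "${ip}",
        "127.0.0.1"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "CN",
            "L": "Shanghai",
            "ST": "cloud-atlas"
        }
    ]
}
EOF
done
```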
Generate the peer certificates for the three hosts:
for sn in $(seq 3); do
    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer z-b-data-${sn}.json | cfssljson -bare z-b-data-${sn}
done
This produces, for each host, the files:
z-b-data-1-key.pem
z-b-data-1.csr
z-b-data-1.pem
...
Client certificate client.json (note that the hosts list is deliberately left empty):
{
    "CN": "private k8s etcd client",
    "hosts": [""],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "CN",
            "L": "Shanghai",
            "ST": "cloud-atlas"
        }
    ]
}
Now generate the client certificate:
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client.json | cfssljson -bare client
This produces the following files:
client-key.pem
client.csr
client.pem
Install the etcd software package¶
Use the installation script from installing and running a local etcd to download the latest release package (current version 3.5.4):
ETCD_VER=v3.5.4
KERNEL=$(uname -s)  # Linux / Darwin
ARCH=$(uname -m)    # x86_64 / aarch64
if [ "${KERNEL}" = "Linux" ];then
    KERNEL="linux"
elif [ "${KERNEL}" = "Darwin" ];then
    KERNEL="darwin"
else
    echo "Not Linux or macOS, exit!"
    exit 1
fi
if [ "${ARCH}" = "x86_64" ];then
    ARCH="amd64"
elif [ "${ARCH}" = "aarch64" ];then
    ARCH="arm64"
else
    echo "Not x86_64 or aarch64, exit!"
    exit 1
fi
# choose either URL
GOOGLE_URL=https://storage.googleapis.com/etcd
GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GOOGLE_URL}
rm -f /tmp/etcd-${ETCD_VER}-${KERNEL}-${ARCH}.tar.gz
rm -rf /tmp/etcd-download-test && mkdir -p /tmp/etcd-download-test
curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-${KERNEL}-${ARCH}.tar.gz -o /tmp/etcd-${ETCD_VER}-${KERNEL}-${ARCH}.tar.gz
tar xzvf /tmp/etcd-${ETCD_VER}-${KERNEL}-${ARCH}.tar.gz -C /tmp/etcd-download-test --strip-components=1
rm -f /tmp/etcd-${ETCD_VER}-${KERNEL}-${ARCH}.tar.gz
/tmp/etcd-download-test/etcd --version
/tmp/etcd-download-test/etcdctl version
/tmp/etcd-download-test/etcdutl version
sudo mv /tmp/etcd-download-test/etcd /usr/local/bin
sudo mv /tmp/etcd-download-test/etcdctl /usr/local/bin
sudo mv /tmp/etcd-download-test/etcdutl /usr/local/bin
On each node, create the etcd directory plus the etcd user and group (if the lv-etcd volume built in the private-cloud LVM data layer is already mounted at /var/lib/etcd, skip the directory creation):
sudo mkdir -p /etc/etcd /var/lib/etcd
sudo groupadd -f -g 1501 etcd
sudo useradd -c "etcd user" -d /var/lib/etcd -s /bin/false -g etcd -u 1501 etcd
sudo chown -R etcd:etcd /var/lib/etcd
Certificate distribution¶
For convenient management over ssh/scp, combine SSH key authentication (with ssh-agent available across login sessions) and SSH multiplexing, so that ssh/scp to the cluster servers works without password prompts.
Distribute the certificates with the following commands (saved here as deploy_etcd_certificates.sh):
cat << EOF > etcd_hosts
z-b-data-1
z-b-data-2
z-b-data-3
EOF
cat << EOF > prepare_etcd.sh
rm -rf /tmp/etcd_tls
mkdir /tmp/etcd_tls
if [ ! -d /etc/etcd/ ];then
    sudo mkdir /etc/etcd
fi
EOF
for host in `cat etcd_hosts`;do
scp prepare_etcd.sh $host:/tmp/
ssh $host 'sh /tmp/prepare_etcd.sh'
done
for host in `cat etcd_hosts`;do
scp ${host}.pem ${host}:/tmp/etcd_tls/
scp ${host}-key.pem ${host}:/tmp/etcd_tls/
scp ca.pem ${host}:/tmp/etcd_tls/
scp server.pem ${host}:/tmp/etcd_tls/
scp server-key.pem ${host}:/tmp/etcd_tls/
scp client.csr ${host}:/tmp/etcd_tls/
scp client.pem ${host}:/tmp/etcd_tls/
scp client-key.pem ${host}:/tmp/etcd_tls/
ssh $host 'sudo cp /tmp/etcd_tls/* /etc/etcd/;sudo chown etcd:etcd /etc/etcd/*'
done
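Before running the loop, it is worth confirming that every certificate file it copies actually exists locally, since scp would otherwise fail partway through. A minimal pre-flight sketch (file names as generated in the earlier steps; check_files is a helper introduced here, not part of the original script):

```shell
#!/bin/sh
# check_files: print any missing file names; return nonzero if one is absent
check_files() {
    missing=0
    for f in "$@"; do
        [ -f "$f" ] || { echo "missing: $f"; missing=$(( missing + 1 )); }
    done
    [ "$missing" -eq 0 ]
}

# The full set of files the distribution loop expects to find locally
check_files ca.pem server.pem server-key.pem \
    client.csr client.pem client-key.pem \
    z-b-data-1.pem z-b-data-1-key.pem \
    z-b-data-2.pem z-b-data-2-key.pem \
    z-b-data-3.pem z-b-data-3-key.pem \
    && echo "all certificate files present" \
    || echo "some certificate files are missing"
```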
Run the script:
sh deploy_etcd_certificates.sh
Each etcd host now holds its own certificate files in the /etc/etcd directory.
Configure etcd¶
Run the following script generate_etcd_service (as root, on each node) to generate the /etc/etcd/conf.yml configuration file and the systemd unit file /lib/systemd/system/etcd.service that starts etcd:
# Adjust the interface name (enp1s0) to match your environment
ETCD_HOST_IP=$(ip addr show enp1s0 | grep "inet\b" | awk '{print $2}' | cut -d/ -f1)
ETCD_NAME=$(hostname -s)
ETCD_HOST_1=z-b-data-1
ETCD_HOST_2=z-b-data-2
ETCD_HOST_3=z-b-data-3
ETCD_HOST_1_IP=192.168.6.204
ETCD_HOST_2_IP=192.168.6.205
ETCD_HOST_3_IP=192.168.6.206
INIT_TOKEN=initpasswd
cat << EOF > /etc/etcd/conf.yml
# This is the configuration file for the etcd server.
# Human-readable name for this member.
name: ${ETCD_NAME}
# Path to the data directory.
data-dir: /var/lib/etcd
# Path to the dedicated wal directory.
wal-dir:
# Number of committed transactions to trigger a snapshot to disk.
snapshot-count: 10000
# Time (in milliseconds) of a heartbeat interval.
heartbeat-interval: 100
# Time (in milliseconds) for an election to timeout.
election-timeout: 1000
# Raise alarms when backend size exceeds the given quota. 0 means use the
# default quota.
quota-backend-bytes: 0
# List of comma separated URLs to listen on for peer traffic.
listen-peer-urls: https://${ETCD_HOST_IP}:2380
# List of comma separated URLs to listen on for client traffic.
listen-client-urls: https://${ETCD_HOST_IP}:2379,https://127.0.0.1:2379
# Maximum number of snapshot files to retain (0 is unlimited).
max-snapshots: 5
# Maximum number of wal files to retain (0 is unlimited).
max-wals: 5
# Comma-separated white list of origins for CORS (cross-origin resource sharing).
cors:
# List of this member's peer URLs to advertise to the rest of the cluster.
# The URLs needed to be a comma-separated list.
initial-advertise-peer-urls: https://${ETCD_HOST_IP}:2380
# List of this member's client URLs to advertise to the public.
# The URLs needed to be a comma-separated list.
advertise-client-urls: https://${ETCD_HOST_IP}:2379
# Discovery URL used to bootstrap the cluster.
discovery:
# Valid values include 'exit', 'proxy'
discovery-fallback: 'proxy'
# HTTP proxy to use for traffic to discovery service.
discovery-proxy:
# DNS domain used to bootstrap initial cluster.
discovery-srv:
# Initial cluster configuration for bootstrapping.
initial-cluster: ${ETCD_HOST_1}=https://${ETCD_HOST_1_IP}:2380,${ETCD_HOST_2}=https://${ETCD_HOST_2_IP}:2380,${ETCD_HOST_3}=https://${ETCD_HOST_3_IP}:2380
# Initial cluster token for the etcd cluster during bootstrap.
initial-cluster-token: ${INIT_TOKEN}
# Initial cluster state ('new' or 'existing').
initial-cluster-state: 'new'
# Reject reconfiguration requests that would cause quorum loss.
strict-reconfig-check: false
# Accept etcd V2 client requests
enable-v2: true
# Enable runtime profiling data via HTTP server
enable-pprof: true
# Valid values include 'on', 'readonly', 'off'
proxy: 'off'
# Time (in milliseconds) an endpoint will be held in a failed state.
proxy-failure-wait: 5000
# Time (in milliseconds) of the endpoints refresh interval.
proxy-refresh-interval: 30000
# Time (in milliseconds) for a dial to timeout.
proxy-dial-timeout: 1000
# Time (in milliseconds) for a write to timeout.
proxy-write-timeout: 5000
# Time (in milliseconds) for a read to timeout.
proxy-read-timeout: 0
client-transport-security:
# Path to the client server TLS cert file.
cert-file: /etc/etcd/server.pem
# Path to the client server TLS key file.
key-file: /etc/etcd/server-key.pem
# Enable client cert authentication.
client-cert-auth: true
# Path to the client server TLS trusted CA cert file.
trusted-ca-file: /etc/etcd/ca.pem
# Client TLS using generated certificates
auto-tls: true
peer-transport-security:
# Path to the peer server TLS cert file.
cert-file: /etc/etcd/${ETCD_NAME}.pem
# Path to the peer server TLS key file.
key-file: /etc/etcd/${ETCD_NAME}-key.pem
# Enable peer client cert authentication.
client-cert-auth: true
# Path to the peer server TLS trusted CA cert file.
trusted-ca-file: /etc/etcd/ca.pem
# Peer TLS using generated certificates.
auto-tls: true
# Enable debug-level logging for etcd.
debug: false
logger: zap
# Specify 'stdout' or 'stderr' to skip journald logging even when running under systemd.
log-outputs: [stderr]
# Force to create a new one member cluster.
force-new-cluster: false
auto-compaction-mode: periodic
auto-compaction-retention: "1"
EOF
cat << EOF > /lib/systemd/system/etcd.service
[Unit]
Description=etcd service
Documentation=https://github.com/coreos/etcd
[Service]
User=etcd
Type=notify
ExecStart=/usr/local/bin/etcd \\
--config-file=/etc/etcd/conf.yml
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
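The heredocs above are unquoted, so the ETCD_* variables are expanded at the moment the files are written. A small self-contained check of that expansion, rendering the initial-cluster line to a temp file instead of /etc/etcd/conf.yml:

```shell
#!/bin/sh
# Render the member list the same way the script does, then verify it expanded.
ETCD_HOST_1=z-b-data-1; ETCD_HOST_1_IP=192.168.6.204
ETCD_HOST_2=z-b-data-2; ETCD_HOST_2_IP=192.168.6.205
ETCD_HOST_3=z-b-data-3; ETCD_HOST_3_IP=192.168.6.206
tmpconf=$(mktemp)
cat << EOF > "$tmpconf"
initial-cluster: ${ETCD_HOST_1}=https://${ETCD_HOST_1_IP}:2380,${ETCD_HOST_2}=https://${ETCD_HOST_2_IP}:2380,${ETCD_HOST_3}=https://${ETCD_HOST_3_IP}:2380
EOF
# All three member URLs should appear, with no literal '${' left behind
grep -o 'https://192.168.6.20[0-9]:2380' "$tmpconf" | wc -l
```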
Enable the service:
sudo systemctl enable etcd.service
Start the service:
sudo systemctl start etcd.service
Checks¶
After starting etcd, check the service process:
ps aux | grep etcd
Check the logs:
journalctl -u etcd.service
Verify the etcd cluster¶
For easier maintenance, configure the etcdctl environment variables below and add them to your shell profile:
export ETCDCTL_API=3
#export ETCDCTL_ENDPOINTS='https://etcd.staging.huatai.me:2379'
export ETCDCTL_ENDPOINTS=https://192.168.6.204:2379,https://192.168.6.205:2379,https://192.168.6.206:2379
export ETCDCTL_CACERT=/etc/etcd/ca.pem
export ETCDCTL_CERT=/etc/etcd/client.pem
export ETCDCTL_KEY=/etc/etcd/client-key.pem
Then check the node status:
etcdctl --write-out=table endpoint status
Check the node health:
etcdctl endpoint health
(Important) Now that the etcd cluster is fully deployed, the cluster state previously set in /etc/etcd/conf.yml must be changed from new to existing, indicating that cluster bootstrap is complete:
# Initial cluster state ('new' or 'existing').
initial-cluster-state: 'existing'