Adding a Server to a Gluster 11 Cluster Deployed on CentOS 7
After completing Deploying Gluster 11 on CentOS 7, I need to add a server node to the cluster to expand its capacity.
Preparation
Install and start the service
The installation is the same as in Deploying Gluster 6 on CentOS 7:
Install GlusterFS on CentOS
yum install glusterfs-server
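Before joining the cluster, the new node must also be reachable by the existing peers on glusterd's management port, 24007/tcp (brick processes listen on further ports). A minimal sketch assuming firewalld is in use; on a trusted storage network the firewall may already be open:
# allow GlusterFS management traffic to the new node
firewall-cmd --permanent --add-port=24007/tcp
firewall-cmd --reload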
Start the GlusterFS management service:
Enable and start the GlusterFS management service
systemctl enable --now glusterd
Check the glusterd service status:
Check the GlusterFS management service
systemctl status glusterd
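For a quicker scripted check, both of these standard commands work as a minimal sketch:
# prints "active" (exit status 0) when glusterd is running
systemctl is-active glusterd
# confirm the GlusterFS version installed on the new node
gluster --version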
Add the new GlusterFS node
Configuring the gluster peering only needs to be done once, on a single server (the first server is fine). The server added here is the 7th server of our cluster, but since only one is being added there is no need to distinguish it, so I simply name it server:
Run gluster peer probe once on one server
# add this new server
server=192.168.1.7
gluster peer probe ${server}
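Right after the probe, gluster pool list offers a quick sanity check: it lists every node in the trusted storage pool, including the local one:
# list all nodes in the trusted storage pool
gluster pool list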
After the probe completes, check the gluster peer status:
Run on one server
gluster peer status
If the output shows every peer in the Connected state, the new node has joined the cluster successfully:
Check that the newly added node is correctly connected to the cluster
Number of Peers: 6

Hostname: 192.168.1.2
Uuid: c664761a-5973-4e2e-8506-9c142c657297
State: Peer in Cluster (Connected)

Hostname: 192.168.1.3
Uuid: 901b8027-5eab-4f6b-8cf4-aafa4463ca13
State: Peer in Cluster (Connected)

Hostname: 192.168.1.4
Uuid: 5ff667dd-5f45-4daf-900e-913e78e52297
State: Peer in Cluster (Connected)

Hostname: 192.168.1.5
Uuid: ebd1d002-0719-4704-a59d-b4e8b3b28c29
State: Peer in Cluster (Connected)

Hostname: 192.168.1.6
Uuid: 1f958e31-2d55-4904-815a-89f6ade360fe
State: Peer in Cluster (Connected)

Hostname: 192.168.1.7
Uuid: a023c435-097c-411b-9d50-1e84629b9673
State: Peer in Cluster (Connected)
The volumes are still the ones created in Deploying Gluster 11 on CentOS 7, so the expansion (that is, add-brick) uses the following simple script:
create_gluster script: pass the volume name as the argument to extend (add-brick) the existing ``replica 3`` distributed volume
volume=$1
server=192.168.1.7
gluster volume add-brick ${volume} replica 3 \
    ${server}:/data/brick0/${volume} \
    ${server}:/data/brick1/${volume} \
    ${server}:/data/brick2/${volume} \
    ${server}:/data/brick3/${volume} \
    ${server}:/data/brick4/${volume} \
    ${server}:/data/brick5/${volume} \
    ${server}:/data/brick6/${volume} \
    ${server}:/data/brick7/${volume} \
    ${server}:/data/brick8/${volume} \
    ${server}:/data/brick9/${volume} \
    ${server}:/data/brick10/${volume} \
    ${server}:/data/brick11/${volume}
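The script takes the volume name as its only argument; a usage sketch, assuming it is saved as create_gluster.sh (the filename is hypothetical):
# extend the existing replica-3 distributed volume named backup
bash create_gluster.sh backup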
This produces an error:
add-brick reports that multiple bricks of the replicate volume sit on the same server, which is not an optimal setup
volume add-brick: failed: Multiple bricks of a replicate volume are present on the same server. This setup is not optimal.
Bricks should be on different nodes to have best fault tolerant configuration.
Use 'force' at the end of the command if you want to override this behavior.
I verified that appending the force keyword to the end of the command does indeed complete the add-brick.
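The forced variant is the same command with force as the final argument; since bash performs brace expansion before parameter expansion, the twelve brick paths can be written compactly in this sketch:
# identical to the script above, with "force" appended to override the placement check
gluster volume add-brick ${volume} replica 3 \
    ${server}:/data/brick{0..11}/${volume} \
    force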
However, force left me with a problem: the newly added bricks are all placed at the end of the brick list:
Running gluster volume info checks the volume information:
gluster volume info
All the bricks of the newly added 192.168.1.7 can be seen at the end:
gluster volume info shows that all bricks on the newly added server are listed last
Volume Name: backup
Type: Distributed-Replicate
Volume ID: 9ff7cdb3-abf0-4e33-8293-aae69c28b8d9
Status: Started
Snapshot Count: 0
Number of Bricks: 28 x 3 = 84
Transport-type: tcp
Bricks:
Brick1: 192.168.1.1:/data/brick0/backup
Brick2: 192.168.1.2:/data/brick0/backup
Brick3: 192.168.1.3:/data/brick0/backup
Brick4: 192.168.1.4:/data/brick0/backup
Brick5: 192.168.1.5:/data/brick0/backup
Brick6: 192.168.1.6:/data/brick0/backup
Brick7: 192.168.1.1:/data/brick1/backup
Brick8: 192.168.1.2:/data/brick1/backup
Brick9: 192.168.1.3:/data/brick1/backup
Brick10: 192.168.1.4:/data/brick1/backup
Brick11: 192.168.1.5:/data/brick1/backup
Brick12: 192.168.1.6:/data/brick1/backup
...
Brick67: 192.168.1.1:/data/brick11/backup
Brick68: 192.168.1.2:/data/brick11/backup
Brick69: 192.168.1.3:/data/brick11/backup
Brick70: 192.168.1.4:/data/brick11/backup
Brick71: 192.168.1.5:/data/brick11/backup
Brick72: 192.168.1.6:/data/brick11/backup
Brick73: 192.168.1.7:/data/brick0/backup
Brick74: 192.168.1.7:/data/brick1/backup
Brick75: 192.168.1.7:/data/brick2/backup
Brick76: 192.168.1.7:/data/brick3/backup
Brick77: 192.168.1.7:/data/brick4/backup
Brick78: 192.168.1.7:/data/brick5/backup
Brick79: 192.168.1.7:/data/brick6/backup
Brick80: 192.168.1.7:/data/brick7/backup
Brick81: 192.168.1.7:/data/brick8/backup
Brick82: 192.168.1.7:/data/brick9/backup
Brick83: 192.168.1.7:/data/brick10/backup
Brick84: 192.168.1.7:/data/brick11/backup
Options Reconfigured:
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
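A quick filter confirms which bricks landed on the new node (a simple sketch):
# list only the bricks hosted on the newly added server
gluster volume info backup | grep 192.168.1.7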
This expansion has a serious problem: all of the new node's bricks live on a single server. Since replica sets are formed from consecutive bricks, the four new sets (Brick73-75, Brick76-78, Brick79-81, Brick82-84) each keep all three replicas on the same machine, so any data hashed to Brick73 through Brick84 lands entirely on one server with no cross-node redundancy. This is the restriction that comes with the Gluster storage underlying filesystem approach of raw-disk XFS filesystems: server nodes cannot simply be added or removed.
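For contrast, a layout that would satisfy the placement check interleaves bricks from several new servers so that each replica set spans three nodes. A hypothetical sketch, assuming two additional servers 192.168.1.8 and 192.168.1.9 (which do not exist in this cluster), showing only the first brick slot:
# hypothetical: each new replica set spans three different nodes, preserving fault tolerance
gluster volume add-brick backup replica 3 \
    192.168.1.7:/data/brick0/backup \
    192.168.1.8:/data/brick0/backup \
    192.168.1.9:/data/brick0/backup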
Note
I will explore my production approach in detail, together with lessons learned and improvements, in Gluster storage best practices.
Usually we also need to run a volume rebalance:
Run gluster volume rebalance to re-hash and redistribute files across the bricks
# rebalance the data distribution of the GlusterFS volume named backup
gluster volume rebalance backup start
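The rebalance runs in the background; progress can be checked with the standard status subcommand:
# show per-node file counts, data moved, and rebalance state
gluster volume rebalance backup status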