Online Expansion of Ceph RBD Devices with libvirt and XFS

While working through "Upgrading a Kubernetes Cluster to 1.25 with kubeadm" I needed to upgrade the VM operating systems, but the disks the VMs were originally created with were too small, so the disks had to be expanded before the systems could be upgraded.

Run rbd ls to list the rbd disks in the storage pool
rbd -p libvirt-pool ls -l

The output looks like:

NAME               SIZE     PARENT  FMT  PROT  LOCK
z-k8s-m-1          6.5 GiB            2        excl
z-k8s-m-1.docker   9.3 GiB            2        excl
z-k8s-m-2          6.5 GiB            2        excl
z-k8s-m-2.docker   9.3 GiB            2        excl
...
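
If you also want the actual allocated usage (helpful when deciding which images to grow), rbd du reports provisioned versus used size per image:

rbd du -p libvirt-pool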
  • Check detailed RBD image information:

    rbd info libvirt-pool/z-k8s-m-1
    

Output:

rbd image 'z-k8s-m-1':
        size 6.5 GiB in 1669 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 31f3344490f20
        block_name_prefix: rbd_data.31f3344490f20
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        op_features:
        flags:
        create_timestamp: Fri Dec 10 11:53:41 2021
        access_timestamp: Tue Nov  8 21:56:00 2022
        modify_timestamp: Tue Nov  8 22:47:19 2022
  • Resize the RBD image to 16GB (1024x16=16384 MB), then refresh the VM disk with virsh blockresize:

rbd resize grows the RBD block device image; virsh blockresize refreshes the VM's vda
rbd resize --size 16384 libvirt-pool/z-k8s-m-1
virsh blockresize --domain z-k8s-m-1 --path vda --size 16G
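
Both layers should now report the new size; a quick verification against the image and domain used in this step:

rbd info libvirt-pool/z-k8s-m-1 | grep size
virsh domblkinfo z-k8s-m-1 vda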
  • Log into the VM and run growpart and xfs_growfs to grow the partition and filesystem:

Grow the root filesystem inside the VM with growpart and xfs_growfs
# install growpart
apt install cloud-guest-utils
# grow partition 2
growpart /dev/vda 2
# grow the XFS root filesystem
xfs_growfs /
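
A minimal check that both the partition and the filesystem picked up the new size, assuming the root filesystem sits on /dev/vda2 as above:

lsblk /dev/vda
df -h /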

Online expansion of Ceph RBD disk vdb1

In "Deploying Stable Diffusion on Kubernetes" I hit the same problem: /dev/vdb1 inside the VM ran out of space and containers could not run. The fix is similar.

  • Check the rbd disks again

Run rbd ls to list the rbd disks in the storage pool
rbd -p libvirt-pool ls -l

The disks of VM z-k8s-n-1:

NAME               SIZE     PARENT  FMT  PROT  LOCK
z-k8s-n-1           16 GiB            2        excl
z-k8s-n-1.docker   9.3 GiB            2        excl
  • Check the details of the z-k8s-n-1.docker disk:

    rbd info libvirt-pool/z-k8s-n-1.docker
    

Output:

rbd image 'z-k8s-n-1.docker':
     size 9.3 GiB in 2385 objects
     order 22 (4 MiB objects)
     snapshot_count: 0
     id: 4f30059fb9053
     block_name_prefix: rbd_data.4f30059fb9053
     format: 2
     features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
     op_features:
     flags:
     create_timestamp: Wed Dec 29 08:21:52 2021
     access_timestamp: Fri Jan 13 10:37:56 2023
     modify_timestamp: Fri Jan 13 10:39:06 2023
  • Expand z-k8s-n-1.docker to 50G (1024x50=51200 MB), then refresh the VM disk with virsh blockresize:

rbd resize grows the RBD block device image; virsh blockresize refreshes the VM's vdb
rbd resize --size 51200 libvirt-pool/z-k8s-n-1.docker
virsh blockresize --domain z-k8s-n-1 --path vdb --size 50G
  • Log into the VM and run growpart and xfs_growfs to grow the partition and filesystem:

Grow the vdb1 filesystem (/var/lib/containerd) inside the VM with growpart and xfs_growfs
# install growpart
apt install cloud-guest-utils
# grow partition 1 of vdb
growpart /dev/vdb 1
# grow the XFS filesystem mounted at /var/lib/containerd
xfs_growfs /var/lib/containerd
  • Afterwards, checking free space shows the filesystem grew online to 50G:

    $ df -h | grep vdb1
    /dev/vdb1        50G  8.2G   42G  17% /var/lib/containerd
    

Offline expansion of Ceph RBD disk vdb1

"Installing kubeflow with a single command" ran out of disk space and triggered node-pressure eviction, so I shut the nodes down one at a time:

This walkthrough uses z-k8s-n-9 as the example
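
Before touching its disks, the node is shut down cleanly; a minimal sketch, assuming the libvirt domain name matches the RBD image name z-k8s-n-9:

virsh shutdown z-k8s-n-9
# wait until the domain is reported as "shut off"
virsh list --all | grep z-k8s-n-9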

Attach the virtual disks to a maintenance VM

RBD disk configuration of the server that needs maintenance
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <auth username='libvirt'>
        <secret type='ceph' uuid='3f203352-fcfc-4329-b870-34783e13493a'/>
      </auth>
      <source protocol='rbd' name='libvirt-pool/z-k8s-n-9'>
        <host name='192.168.6.204' port='6789'/>
        <host name='192.168.6.205' port='6789'/>
        <host name='192.168.6.206' port='6789'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </disk>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <auth username='libvirt'>
        <secret type='ceph' uuid='3f203352-fcfc-4329-b870-34783e13493a'/>
      </auth>
      <source protocol='rbd' name='libvirt-pool/z-k8s-n-9.docker'>
        <host name='192.168.6.204' port='6789'/>
        <host name='192.168.6.205' port='6789'/>
        <host name='192.168.6.206' port='6789'/>
      </source>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </disk>
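
These two disk definitions can be copied straight out of the source domain's XML, e.g. by dumping it with:

virsh dumpxml z-k8s-n-9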
  • I use the z-dev VM to attach the two disks that need maintenance. On z-dev, the original vda configuration is:

Disk vda of the maintenance VM z-dev
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/vg-libvirt/z-dev'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </disk>
  • Add the z-k8s-n-9 Ceph Block Device (RBD) configurations above to the z-dev VM, with two changes:

    • Rename the disk targets from vda / vdb to vdb / vdc

    • Delete the virtual-disk lines like <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/> and let libvirt assign addresses automatically (manually carried-over addresses tend to conflict); a sketch of attaching the edited definitions at runtime follows
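
Instead of editing the z-dev XML by hand with virsh edit, the modified definitions can also be attached at runtime with virsh attach-device; a sketch, assuming the edited disk definition (target vdb, <address> line removed) has been saved in a hypothetical file z-k8s-n-9-root.xml:

virsh attach-device z-dev z-k8s-n-9-root.xml --config --live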

  • After starting z-dev, fdisk -l shows:

The attached rbd disks vdb and vdc
Disk /dev/vdb: 32 GiB, 34359738368 bytes, 67108864 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: D9ADB788-0FE1-45C3-80C5-B412A3C4AB19

Device      Start      End  Sectors  Size Type
/dev/vdb1    2048   499711   497664  243M EFI System
/dev/vdb2  499712 67108830 66609119 31.8G Linux filesystem


Disk /dev/vdc: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 7B582F7C-AC2D-4D04-B600-35743C17BD96

Device     Start       End   Sectors Size Type
/dev/vdc1   2048 104857566 104855519  50G Linux filesystem
  • The two disks above are the ones to resize; first expand vdc to 100G (same method as before):

Run rbd ls to list the rbd disks in the storage pool
rbd -p libvirt-pool ls -l

The disks that need expansion:

rbd ls output showing the disks that need expansion
NAME               SIZE     PARENT  FMT  PROT  LOCK
...
z-k8s-n-9           32 GiB            2
z-k8s-n-9.docker    50 GiB            2
  • Resize the RBD image to 100GB (1024x100=102400 MB), then refresh the VM disk with virsh blockresize:

rbd resize grows the RBD block device image; virsh blockresize refreshes the VM disk to 100G
rbd resize --size 102400 libvirt-pool/z-k8s-n-9.docker
virsh blockresize --domain z-dev --path vdc --size 100G
  • Afterwards, fdisk -l inside the VM shows the disk expanded to 100G:

After rbd resize and virsh blockresize, the VM sees the expanded disk at 100G
Disk /dev/vdc: 100 GiB, 107374182400 bytes, 209715200 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 7B582F7C-AC2D-4D04-B600-35743C17BD96

Device     Start       End   Sectors Size Type
/dev/vdc1   2048 104857566 104855519  50G Linux filesystem
  • Repartition vdc and build an XFS filesystem on it:

Build an XFS filesystem on vdc
parted -s /dev/vdc mklabel gpt
parted -s -a optimal /dev/vdc mkpart primary 0% 100%
parted -s /dev/vdc name 1 data
mkfs.xfs -n ftype=1 /dev/vdc1 -f
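
To confirm the new partition layout and filesystem before mounting:

parted -s /dev/vdc print
blkid /dev/vdc1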
  • Mount the disks:

Mount the partitions of vdb and vdc to prepare for data migration
mkdir /vdb2 /vdc1
mount /dev/vdb2 /vdb2
mount /dev/vdc1 /vdc1

The data that now needs to migrate:

/vdb2/var/lib/containerd  => /vdc1 (this disk will later be mounted as /var/lib/containerd on the target host)
  • Migrate the data:

Data migration
# rename the original directory and create an empty one, so the running host can later mount /dev/vdb1 there
mv /vdb2/var/lib/containerd /vdb2/var/lib/containerd.old
mkdir /vdb2/var/lib/containerd

# alternative: sync the data with tar
# (cd /vdb2/var/lib/containerd.old && tar cf - .) | (cd /vdc1 && tar xf -)

# sync the data with rsync
rsync -a /vdb2/var/lib/containerd.old/ /vdc1
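
A rough consistency check after the copy, comparing used sizes of source and destination:

du -sh /vdb2/var/lib/containerd.old /vdc1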

  • Edit /vdb2/etc/fstab (the disk mount configuration on the system disk) to add:

    /dev/vdb1    /var/lib/containerd    xfs   defaults,quota,gquota,prjquota 0 1
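
Once fstab is updated, unmount everything and shut down the maintenance VM, so the two RBD disks can be moved back to the original domain (with their vda / vdb targets restored) and the node booted; a sketch:

umount /vdb2 /vdc1
virsh shutdown z-dev
# after restoring the two disk definitions on the original domain:
virsh start z-k8s-n-9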
    
