Replication in OpenStack

Replication provides a disaster recovery (DR) solution for mission-critical workloads. This guide walks you through configuring and using the Cinder replication feature in your own deployment. The feature has a Cinder side and a driver side. The Cinder side steps should be common to all drivers, but the driver side steps may vary. This guide uses RBD as the reference driver for the procedure.

Prerequisites

  • There should be 2 backend clusters

  • The Cinder driver should support replication

Refer to the Cinder driver support matrix to know which backends support replication.

Enabling replication

CEPH

Reference: https://docs.ceph.net.cn/en/2025.2/rbd/rbd-mirroring

NOTE: These steps are specific to Ceph and were tested against the Pacific release of Ceph. Make sure that

  • Pools with the same name exist on both storage clusters.

  • The pools contain journaling-enabled images that you want to mirror.
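
If the pools do not already exist, they can be created with identical names on both clusters. The following is a minimal sketch, assuming the pool name volumes used throughout this guide; for a pre-existing image, journaling can be enabled with rbd feature enable (journaling requires the exclusive-lock feature, which is enabled by default on recent Ceph releases).

site-a # ceph osd pool create volumes
site-a # rbd pool init volumes
site-b # ceph osd pool create volumes
site-b # rbd pool init volumes
site-a # rbd feature enable volumes/<image-name> journaling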

Steps

  • Get shell access to the primary and secondary Ceph clusters

site-a # sudo cephadm shell --fsid <PRIMARY_FSID> -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
site-b # sudo cephadm shell --fsid <SECONDARY_FSID> -c /etc/ceph2/ceph.conf -k /etc/ceph2/ceph.client.admin.keyring
  • Enable RBD mirroring on both hosts

site-a # ceph orch apply rbd-mirror --placement=<Primary Host>
site-b # ceph orch apply rbd-mirror --placement=<Secondary Host>
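
Before moving on, you may want to confirm that the rbd-mirror daemons were deployed; a quick check, assuming the cephadm orchestrator used above:

site-a # ceph orch ls rbd-mirror
site-b # ceph orch ls rbd-mirror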
  • Enable image level mirroring on the pool

site-a # rbd mirror pool enable volumes image
site-b # rbd mirror pool enable volumes image
  • Bootstrap the peers

NOTE: These commands need to be executed outside of the cephadm shell.

site-a # sudo cephadm shell --fsid <PRIMARY_FSID> -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring -- rbd mirror pool peer bootstrap create --site-name <FSID of site-a> <pool_name> | awk 'END{print}' > "$HOME/token_file"
site-b # sudo cephadm shell --fsid <SECONDARY_FSID> -c /etc/ceph2/ceph.conf -k /etc/ceph2/ceph.client.admin.keyring -- rbd mirror pool peer bootstrap import --site-name <FSID of site-b> <pool_name> - < "$HOME/token_file"

Verification

Verify that Mode: image and Direction: rx-tx are set in the output below.

site-a # rbd mirror pool info volumes

Mode: image
Site Name: 55b6325e-e6b3-4b7c-91fd-64b5720c1685

Peer Sites:

UUID: 544777e2-4418-4dba-8f10-03238f63990d
Name: 69cc3310-8dd4-4656-a75b-64d4890b0ca6
Mirror UUID:
Direction: rx-tx
Client: client.rbd-mirror-peer
site-b # rbd mirror pool info volumes

Mode: image
Site Name: 69cc3310-8dd4-4656-a75b-64d4890b0ca6

Peer Sites:

UUID: a102dd15-cc37-4df6-acf1-266ec0248a37
Name: 55b6325e-e6b3-4b7c-91fd-64b5720c1685
Mirror UUID:
Direction: rx-tx
Client: client.rbd-mirror-peer

CINDER

Steps

  • Set the replication_device value in the cinder.conf file, as illustrated below.

replication_device = backend_id:<unique_identifier>,conf:<ceph.conf path for site-b>,user:<user for site-b>,secret_uuid:<libvirt secret UUID>
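
For illustration only, a hypothetical [ceph] backend section in cinder.conf with replication_device filled in might look like the following; the backend_id ceph2, the cinder2 user, the /etc/ceph2/ceph.conf path and the secret UUIDs are placeholders that must match your own secondary cluster:

[ceph]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = ceph
rbd_pool = volumes
rbd_user = cinder
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_secret_uuid = <libvirt secret UUID for site-a>
replication_device = backend_id:ceph2,conf:/etc/ceph2/ceph.conf,user:cinder2,secret_uuid:<libvirt secret UUID for site-b>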
  • Create a replicated volume type. Note that we are using volume_backend_name=ceph here, which may differ for your deployment.

openstack volume type create --property replication_enabled='<is> True' --property volume_backend_name='ceph' ceph
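
Optionally, confirm that the properties were applied to the volume type:

openstack volume type show ceph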

Verification

  • Create a volume with the replicated volume type

openstack volume create --type ceph --size 1 replicated-volume
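
The volume should become available; if your client displays the field, its replication_status should read enabled. A quick check:

openstack volume show replicated-volume -c status -c replication_status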
  • Confirm that the replica is created on the RBD side

On site-a, you will see mirroring primary: true

site-a # rbd info volumes/volume-d217e292-0a98-4572-ae68-a4c40b73a278

rbd image 'volume-d217e292-0a98-4572-ae68-a4c40b73a278':
        size 1 GiB in 256 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: a9ebeef62570
        block_name_prefix: rbd_data.a9ebeef62570
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, journaling
        op_features:
        flags:
        create_timestamp: Thu May 15 14:15:04 2025
        access_timestamp: Thu May 15 14:15:04 2025
        modify_timestamp: Thu May 15 14:15:04 2025
        journal: a9ebeef62570
        mirroring state: enabled
        mirroring mode: journal
        mirroring global id: e8f583ed-abab-489c-b9d5-ef68c0a1b56f
        mirroring primary: true

On site-b, you will see mirroring primary: false

rbd ls volumes
volume-d217e292-0a98-4572-ae68-a4c40b73a278

rbd info volumes/volume-d217e292-0a98-4572-ae68-a4c40b73a278
rbd image 'volume-d217e292-0a98-4572-ae68-a4c40b73a278':
        size 1 GiB in 256 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 6a993924cde
        block_name_prefix: rbd_data.6a993924cde
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, journaling
        op_features:
        flags:
        create_timestamp: Thu May 15 14:15:06 2025
        access_timestamp: Thu May 15 14:15:06 2025
        modify_timestamp: Thu May 15 14:15:06 2025
        journal: 6a993924cde
        mirroring state: enabled
        mirroring mode: journal
        mirroring global id: e8f583ed-abab-489c-b9d5-ef68c0a1b56f
        mirroring primary: false

Failover of a server booted from a volume

  • Create a bootable replicated volume

openstack volume create --type ceph --image <Image-UUID> --size 1 test-bootable-replicated
  • Boot a server from the volume

openstack server create --flavor c1 --nic=none --volume <Volume-UUID> test-bfv-server
  • Create a file to write data to the VM disk

$ cat > failover-dr <<EOF
> # Before failover
> this should be consistent before/after failover
> EOF
  • Fail over the replicated cinder backend

cinder failover-host <host>@<backend>
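
The result of the failover can also be checked from the Cinder side; with python-cinderclient, the service listing can include the replication fields (the backend's replication_status should change to failed-over and active_backend_id should point to the backend_id configured in replication_device):

cinder service-list --withreplication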
  • Shelve/unshelve the server. (This is required to detach the volume from the primary backend and create a new attachment to the volume replica on the secondary backend.)

openstack server shelve <server-UUID>
openstack server unshelve <server-UUID>

Verification

  • Verify that the connection is now established from the secondary cluster

# In cinder-volume logs, we can see the ``hosts``, ``cluster_name`` and ``auth_username`` fields will point to secondary cluster
Connection info returned from driver {'name': 'volumes/volume-e310359c-6587-4454-9a9c-a590b50dd4a5', 'hosts': ['127.0.0.1'], 'ports': ['6789'], 'cluster_name': 'ceph2', 'auth_enabled': True, 'auth_username': 'cinder2', 'secret_type': '***', 'secret_uuid': '***', 'volume_id': 'e310359c-6587-4454-9a9c-a590b50dd4a5', 'discard': True, 'qos_specs': None, 'access_mode': 'rw', 'encrypted': False, 'cacheable': False, 'driver_volume_type': 'rbd', 'attachment_id': 'b691cd50-83a1-4484-8081-7120a5cad054', 'enforce_multipath': True}
  • Confirm that the data written before the failover still exists.

$ cat failover-dr
# Before failover
this should be consistent before/after failover

Failback of a server booted from a volume

  • Create a file and write data to the VM disk. (Note that the volume backend is in failed over mode and we are writing to the replica disk in the secondary backend.)

$ cat > failback-dr <<EOF
> # Before Failback
> this should be consistent before/after failback
> EOF
  • Fail back to the primary backend

cinder failover-host <host>@<backend> --backend_id default
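
As before, the backend state can be checked from the Cinder side; after the failback completes, the replication_status column should no longer read failed-over and active_backend_id should no longer point to the secondary backend:

cinder service-list --withreplication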
  • Shelve/unshelve the server. (This is required to detach from the replica volume in the secondary backend and create a new attachment to the original volume in the primary backend.)

openstack server shelve <server UUID>
openstack server unshelve <server UUID>

Verification

  • Verify that the connection is now established from the primary cluster

# In cinder-volume logs, we can see the ``hosts``, ``cluster_name`` and ``auth_username`` fields will point to primary cluster
Connection info returned from driver {'name': 'volumes/volume-e310359c-6587-4454-9a9c-a590b50dd4a5', 'hosts': ['10.0.79.218'], 'ports': ['6789'], 'cluster_name': 'ceph', 'auth_enabled': True, 'auth_username': 'cinder', 'secret_type': '***', 'secret_uuid': '***', 'volume_id': 'e310359c-6587-4454-9a9c-a590b50dd4a5', 'discard': True, 'qos_specs': None, 'access_mode': 'rw', 'encrypted': False, 'cacheable': False, 'driver_volume_type': 'rbd', 'attachment_id': '2c8bb96b-5d5c-444c-aba5-13272b673b34', 'enforce_multipath': True}
  • Confirm that the data written before the failback still exists.

$ cat failback-dr
# Before Failback
this should be consistent before/after failback

Failover of an external data volume

  • Create a test server

openstack server create --flavor c1 --nic=none --image <Image UUID> test-server
  • Create a data volume and attach it to the server

openstack volume create --type ceph --size 1 replicated-vol
openstack server add volume <Server UUID> <Volume UUID>
  • Write data to the volume. Note that creating a filesystem and mounting the device are implied here (see the sketch after the write below).

$ cat > failover-dr <<EOF
> # Before failover
> this should be consistent before/after failover
> EOF
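
The write above assumes a filesystem was created on the attached volume and that it was mounted inside the guest. A minimal sketch, assuming the volume shows up as /dev/vdb in the VM:

$ sudo mkfs.ext4 /dev/vdb
$ sudo mkdir -p /mnt/data
$ sudo mount /dev/vdb /mnt/data
$ cd /mnt/data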
  • Fail over the replicated cinder backend

cinder failover-host <host>@<backend>
  • Detach and reattach the external data volume

openstack server remove volume <Server UUID> <Volume UUID>
openstack server add volume <Server UUID> <Volume UUID>

Verification

  • Verify that the connection is now established from the secondary cluster

# In cinder-volume logs, we can see the ``hosts``, ``cluster_name`` and ``auth_username`` fields will point to secondary cluster
Connection info returned from driver {'name': 'volumes/volume-437573fd-08e2-42c9-b658-2f982bc0cdd2', 'hosts': ['127.0.0.1'], 'ports': ['6789'], 'cluster_name': 'ceph2', 'auth_enabled': True, 'auth_username': 'cinder2', 'secret_type': '***', 'secret_uuid': '***', 'volume_id': '437573fd-08e2-42c9-b658-2f982bc0cdd2', 'discard': True, 'qos_specs': None, 'access_mode': 'rw', 'encrypted': False, 'cacheable': False, 'driver_volume_type': 'rbd', 'attachment_id': '595bd265-4212-4d9a-8d48-ba6fb59d19fe', 'enforce_multipath': True}
  • Verify that the data exists after the failover. Note that in some cases the data may or may not persist, depending on the replication type, i.e. asynchronous or synchronous.

$ cat failover-dr
# Before failover
this should be consistent before/after failover

Failback of an external data volume

  • Create a file and write data to the external data volume. (Note that the volume backend is in failed over mode and we are writing to the replica disk in the secondary backend.)

$ cat > failback-dr <<EOF
> # Before Failback
> this should be consistent before/after failback
> EOF
  • Fail back to the primary backend

cinder failover-host <host>@<backend> --backend_id default
  • Detach and reattach the external data volume

openstack server remove volume <Server UUID> <Volume UUID>
openstack server add volume <Server UUID> <Volume UUID>

Verification

  • Verify that the connection is now established from the primary cluster

# In cinder-volume logs, we can see the ``hosts``, ``cluster_name`` and ``auth_username`` fields will point to primary cluster
Connection info returned from driver {'name': 'volumes/volume-437573fd-08e2-42c9-b658-2f982bc0cdd2', 'hosts': ['10.0.79.218'], 'ports': ['6789'], 'cluster_name': 'ceph', 'auth_enabled': True, 'auth_username': 'cinder', 'secret_type': '***', 'secret_uuid': '***', 'volume_id': '437573fd-08e2-42c9-b658-2f982bc0cdd2', 'discard': True, 'qos_specs': None, 'access_mode': 'rw', 'encrypted': False, 'cacheable': False, 'driver_volume_type': 'rbd', 'attachment_id': 'b4e0c0a6-50b6-4ff3-83a5-a3da7be0e18c', 'enforce_multipath': True}
  • Confirm that the data written before the failback still exists. Note that in some cases the data may or may not persist, depending on the replication type, i.e. asynchronous or synchronous.

$ cat failback-dr
# Before Failback
this should be consistent before/after failback