Replication in OpenStack
Replication provides a disaster recovery (DR) solution for mission-critical workloads. This guide walks you step by step through configuring and using the Cinder replication feature in your own deployment. The feature has two parts, a Cinder side and a driver side. The Cinder-side steps should be generic, but the driver-side steps may vary. This guide uses RBD as the reference driver for the procedure.
Prerequisites
You should have 2 backend clusters
The Cinder driver should support replication
Refer to the Cinder driver support matrix to find out which backends support replication.
Enable replication
CEPH
Reference: https://docs.ceph.net.cn/en/2025.2/rbd/rbd-mirroring
Note: These steps are specific to Ceph and were tested against the Pacific release of Ceph. Make sure that:
Pools with the same name exist on both storage clusters.
The pools contain the journaling-enabled images that you want to mirror.
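If you need to check these prerequisites, the commands below are one possible way to do so; they are an addition to the original steps and the image name is a placeholder for one of your existing images:
site-a # ceph osd pool ls
site-a # rbd info volumes/<image_name> | grep features
site-a # rbd feature enable volumes/<image_name> journaling
The last command enables journaling on an image where it is not yet enabled.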
Steps
Get shell access to both the primary and secondary Ceph clusters
site-a # sudo cephadm shell --fsid <PRIMARY_FSID> -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
site-b # sudo cephadm shell --fsid <SECONDARY_FSID> -c /etc/ceph2/ceph.conf -k /etc/ceph2/ceph.client.admin.keyring
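If you do not know the cluster FSIDs, one way to look them up on each host is the following (an optional helper, not part of the original procedure):
site-a # sudo cephadm ls | grep fsid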
Enable RBD mirroring on both hosts
site-a # ceph orch apply rbd-mirror --placement=<Primary Host>
site-b # ceph orch apply rbd-mirror --placement=<Secondary Host>
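Optionally, confirm that an rbd-mirror daemon is now running on each cluster before continuing (run inside the respective cephadm shell; this is an extra sanity check, not part of the original steps):
site-a # ceph orch ps | grep rbd-mirror
site-b # ceph orch ps | grep rbd-mirror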
Enable image-level mirroring on the pool
site-a # rbd mirror pool enable volumes image
site-b # rbd mirror pool enable volumes image
Bootstrap the peers
Note: These commands need to be executed outside the cephadm shell.
site-a # sudo cephadm shell --fsid <PRIMARY_FSID> -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring -- rbd mirror pool peer bootstrap create --site-name <FSID of site-a> <pool_name> | awk 'END{print}' > "$HOME/token_file"
site-b # sudo cephadm shell --fsid <SECONDARY_FSID> -c /etc/ceph2/ceph.conf -k /etc/ceph2/ceph.client.admin.keyring -- rbd mirror pool peer bootstrap import --site-name <FSID of site-b> <pool_name> - < "$HOME/token_file"
Verification
Verify that Mode: image and Direction: rx-tx are set in the output below.
site-a # rbd mirror pool info volumes
Mode: image
Site Name: 55b6325e-e6b3-4b7c-91fd-64b5720c1685
Peer Sites:
UUID: 544777e2-4418-4dba-8f10-03238f63990d
Name: 69cc3310-8dd4-4656-a75b-64d4890b0ca6
Mirror UUID:
Direction: rx-tx
Client: client.rbd-mirror-peer
site-b # rbd mirror pool info volumes
Mode: image
Site Name: 69cc3310-8dd4-4656-a75b-64d4890b0ca6
Peer Sites:
UUID: a102dd15-cc37-4df6-acf1-266ec0248a37
Name: 55b6325e-e6b3-4b7c-91fd-64b5720c1685
Mirror UUID:
Direction: rx-tx
Client: client.rbd-mirror-peer
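As an additional optional check, the overall mirroring health of the pool can be queried on either site; once the peers are connected it should report health: OK (exact output varies by Ceph release):
site-a # rbd mirror pool status volumes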
CINDER
Steps
Set the replication_device value in the cinder.conf file.
replication_device = backend_id:<unique_identifier>,conf:<ceph.conf path for site-b>,user:<user for site-b>,secret_uuid:<libvirt secret UUID>
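For context, this is a rough sketch of how the full [ceph] backend section in cinder.conf could look with replication configured; every value below is an illustrative placeholder and the exact options depend on your deployment:
[ceph]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = ceph
rbd_pool = volumes
rbd_user = cinder
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_secret_uuid = <libvirt secret UUID for site-a>
replication_device = backend_id:<unique_identifier>,conf:<ceph.conf path for site-b>,user:<user for site-b>,secret_uuid:<libvirt secret UUID for site-b>
Remember to restart the cinder-volume service after editing cinder.conf so the new replication_device takes effect.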
Create a replicated volume type. Note that we used volume_backend_name=ceph here, which may be different for your deployment.
openstack volume type create --property replication_enabled='<is> True' --property volume_backend_name='ceph' ceph
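Optionally, verify that the properties were applied to the new type (an extra check, not part of the original steps):
openstack volume type show ceph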
Verification
Create a volume with the replicated volume type
openstack volume create --type ceph --size 1 replicated-volume
Confirm that the replica was created on the RBD side
On site-a, you will see mirroring primary: true
site-a # rbd info volumes/volume-d217e292-0a98-4572-ae68-a4c40b73a278
rbd image 'volume-d217e292-0a98-4572-ae68-a4c40b73a278':
size 1 GiB in 256 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: a9ebeef62570
block_name_prefix: rbd_data.a9ebeef62570
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, journaling
op_features:
flags:
create_timestamp: Thu May 15 14:15:04 2025
access_timestamp: Thu May 15 14:15:04 2025
modify_timestamp: Thu May 15 14:15:04 2025
journal: a9ebeef62570
mirroring state: enabled
mirroring mode: journal
mirroring global id: e8f583ed-abab-489c-b9d5-ef68c0a1b56f
mirroring primary: true
On site-b, you will see mirroring primary: false
site-b # rbd ls volumes
volume-d217e292-0a98-4572-ae68-a4c40b73a278
site-b # rbd info volumes/volume-d217e292-0a98-4572-ae68-a4c40b73a278
rbd image 'volume-d217e292-0a98-4572-ae68-a4c40b73a278':
size 1 GiB in 256 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 6a993924cde
block_name_prefix: rbd_data.6a993924cde
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, journaling
op_features:
flags:
create_timestamp: Thu May 15 14:15:06 2025
access_timestamp: Thu May 15 14:15:06 2025
modify_timestamp: Thu May 15 14:15:06 2025
journal: 6a993924cde
mirroring state: enabled
mirroring mode: journal
mirroring global id: e8f583ed-abab-489c-b9d5-ef68c0a1b56f
mirroring primary: false
Failover of a server booted from volume
Create a bootable replicated volume
openstack volume create --type ceph --image <Image-UUID> --size 1 test-bootable-replicated
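Optionally, confirm that the volume is available and marked bootable before booting from it (an extra check, not part of the original steps):
openstack volume show test-bootable-replicated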
Boot a server from that volume
openstack server create --flavor c1 --nic=none --volume <Volume-UUID> test-bfv-server
Create a file to write data to the VM disk (run from inside the guest)
$ cat > failover-dr <<EOF
> # Before failover
> this should be consistent before/after failover
> EOF
Fail over the replicated cinder backend
cinder failover-host <host>@<backend>
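One way to confirm that the backend is now in a failed-over state is to check the replication columns of the volume service listing; with the cinder client this can look like the following (an optional check, flag from python-cinderclient):
cinder service-list --withreplication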
Shelve/unshelve the server. (This is required to disconnect the volume from the primary backend and create a new connection to the volume replica on the secondary backend.)
openstack server shelve <server-UUID>
openstack server unshelve <server-UUID>
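Optionally, confirm that the server is back to ACTIVE after the unshelve before checking the data (an extra check):
openstack server show <server-UUID>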
Verification
Verify that the connection is now made from the secondary cluster
# In the cinder-volume logs, we can see that the ``hosts``, ``cluster_name`` and ``auth_username`` fields now point to the secondary cluster
Connection info returned from driver {'name': 'volumes/volume-e310359c-6587-4454-9a9c-a590b50dd4a5', 'hosts': ['127.0.0.1'], 'ports': ['6789'], 'cluster_name': 'ceph2', 'auth_enabled': True, 'auth_username': 'cinder2', 'secret_type': '***', 'secret_uuid': '***', 'volume_id': 'e310359c-6587-4454-9a9c-a590b50dd4a5', 'discard': True, 'qos_specs': None, 'access_mode': 'rw', 'encrypted': False, 'cacheable': False, 'driver_volume_type': 'rbd', 'attachment_id': 'b691cd50-83a1-4484-8081-7120a5cad054', 'enforce_multipath': True}
Confirm that the data written before the failover is still present.
$ cat failover-dr
# Before failover
this should be consistent before/after failover
Failback of a server booted from volume
Create a file and write data to the VM disk. (Note that the volume backend is in failed-over mode and we are writing to the replica disk on the secondary backend.)
$ cat > failback-dr <<EOF
> # Before Failback
> this should be consistent before/after failback
> EOF
Fail back to the primary backend
cinder failover-host <host>@<backend> --backend_id default
Shelve/unshelve the server. (This is required to disconnect from the replica volume on the secondary backend and create a new connection to the original volume on the primary backend.)
openstack server shelve <server UUID>
openstack server unshelve <server UUID>
Verification
Verify that the connection is now made from the primary cluster
# In the cinder-volume logs, we can see that the ``hosts``, ``cluster_name`` and ``auth_username`` fields now point to the primary cluster
Connection info returned from driver {'name': 'volumes/volume-e310359c-6587-4454-9a9c-a590b50dd4a5', 'hosts': ['10.0.79.218'], 'ports': ['6789'], 'cluster_name': 'ceph', 'auth_enabled': True, 'auth_username': 'cinder', 'secret_type': '***', 'secret_uuid': '***', 'volume_id': 'e310359c-6587-4454-9a9c-a590b50dd4a5', 'discard': True, 'qos_specs': None, 'access_mode': 'rw', 'encrypted': False, 'cacheable': False, 'driver_volume_type': 'rbd', 'attachment_id': '2c8bb96b-5d5c-444c-aba5-13272b673b34', 'enforce_multipath': True}
Confirm that the data written before the failback is still present.
$ cat failback-dr
# Before Failback
this should be consistent before/after failback
Failover of an external data volume
Create a test server
openstack server create --flavor c1 --nic=none --image <Image UUID> test-server
Create a data volume and attach it to the server
openstack volume create --type ceph --size 1 replicated-vol
openstack server add volume <Server UUID> <Volume UUID>
Write data to the volume. Note that creating a filesystem on the device and mounting it are implied here; a minimal sketch of those steps is shown below.
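A minimal sketch of those implied steps from inside the guest, assuming the new volume shows up as /dev/vdb (the device name is an assumption and depends on your environment):
$ sudo mkfs.ext4 /dev/vdb
$ sudo mount /dev/vdb /mnt
$ cd /mnt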
$ cat > failover-dr <<EOF
> # Before failover
> this should be consistent before/after failover
> EOF
Detach and re-attach the external data volume. (Note that the cinder backend must already be in failed-over mode at this point, i.e. cinder failover-host has been run as in the earlier failover section; otherwise the re-attach will still connect through the primary cluster.)
openstack server remove volume <Server UUID> <Volume UUID>
openstack server add volume <Server UUID> <Volume UUID>
Verification
Verify that the connection is now made from the secondary cluster
# In the cinder-volume logs, we can see that the ``hosts``, ``cluster_name`` and ``auth_username`` fields now point to the secondary cluster
Connection info returned from driver {'name': 'volumes/volume-437573fd-08e2-42c9-b658-2f982bc0cdd2', 'hosts': ['127.0.0.1'], 'ports': ['6789'], 'cluster_name': 'ceph2', 'auth_enabled': True, 'auth_username': 'cinder2', 'secret_type': '***', 'secret_uuid': '***', 'volume_id': '437573fd-08e2-42c9-b658-2f982bc0cdd2', 'discard': True, 'qos_specs': None, 'access_mode': 'rw', 'encrypted': False, 'cacheable': False, 'driver_volume_type': 'rbd', 'attachment_id': '595bd265-4212-4d9a-8d48-ba6fb59d19fe', 'enforce_multipath': True}
Verify that the data is present after the failover. Note that in certain cases the data may or may not persist, depending on the replication type, i.e. asynchronous or synchronous.
$ cat failover-dr
# Before failover
this should be consistent before/after failover
Failback of an external data volume
Create a file and write data to the external data volume. (Note that the volume backend is in failed-over mode and we are writing to the replica disk on the secondary backend.)
$ cat > failback-dr <<EOF
> # Before Failback
> this should be consistent before/after failback
> EOF
Fail back to the primary backend
cinder failover-host <host>@<backend> --backend_id default
Detach and re-attach the external data volume
openstack server remove volume <Server UUID> <Volume UUID>
openstack server add volume <Server UUID> <Volume UUID>
Verification
Verify that the connection is now made from the primary cluster
# In the cinder-volume logs, we can see that the ``hosts``, ``cluster_name`` and ``auth_username`` fields now point to the primary cluster
Connection info returned from driver {'name': 'volumes/volume-437573fd-08e2-42c9-b658-2f982bc0cdd2', 'hosts': ['10.0.79.218'], 'ports': ['6789'], 'cluster_name': 'ceph', 'auth_enabled': True, 'auth_username': 'cinder', 'secret_type': '***', 'secret_uuid': '***', 'volume_id': '437573fd-08e2-42c9-b658-2f982bc0cdd2', 'discard': True, 'qos_specs': None, 'access_mode': 'rw', 'encrypted': False, 'cacheable': False, 'driver_volume_type': 'rbd', 'attachment_id': 'b4e0c0a6-50b6-4ff3-83a5-a3da7be0e18c', 'enforce_multipath': True}
Confirm that the data written before the failback is still present. Note that in certain cases the data may or may not persist, depending on the replication type, i.e. asynchronous or synchronous.
$ cat failback-dr
# Before Failback
this should be consistent before/after failback