diff --git a/xml/book_administration.xml b/xml/book_administration.xml
index 65c22ac4..a4718845 100644
--- a/xml/book_administration.xml
+++ b/xml/book_administration.xml
@@ -66,6 +66,7 @@
+
diff --git a/xml/ha_virtualization.xml b/xml/ha_virtualization.xml
new file mode 100644
index 00000000..30782501
--- /dev/null
+++ b/xml/ha_virtualization.xml
@@ -0,0 +1,512 @@
+
+
+
+ %entities;
+]>
+
+
+ &ha; for virtualization
+
+
+
+ This chapter explains how to configure virtual machines as highly available cluster resources.
+
+
+
+
+ yes
+
+
+
+
+ Overview
+
+ Virtual machines can take different roles in a &ha; cluster:
+
+
+
+
+ A virtual machine can be managed by the cluster as a resource, without the cluster
+ managing the services that run on the virtual machine. In this case, the VM is opaque
+ to the cluster. This is the scenario described in this document.
+
+
+
+
+ A virtual machine can be a cluster resource and run &pmremote;,
+ which allows the cluster to manage services running on the virtual machine.
+ In this case, the VM is a guest node and is transparent to the cluster.
+ For this scenario, see .
+
+
+
+
+ A virtual machine can run a full cluster stack. In this case, the VM is a regular
+ cluster node and is not managed by the cluster as a resource. For this scenario,
+ see .
+
+
+
+
+ The following procedures describe how to set up highly available virtual machines on
+ block storage, with another block device used as an &ocfs; volume to store the VM lock files
+ and XML configuration files. The virtual machines and the &ocfs; volume are configured as
+ resources managed by the cluster, with resource constraints to ensure that the
+ lock file directory is always available before a virtual machine starts on any node.
+ This prevents the virtual machines from starting on multiple nodes.
+
+
+
+
+ Requirements
+
+
+
+ A running &ha; cluster with at least two nodes and a fencing device such as SBD.
+
+
+
+
+ Passwordless &rootuser; SSH login between the cluster nodes.
+
+
+
+
+ A network bridge on each cluster node, to be used for installing and running the VMs.
+ This must be separate from the network used for cluster communication and management.
+
+
+
+
+ Two or more shared storage devices (or partitions on a single shared device),
+ so that all cluster nodes can access the files and storage required by the VMs:
+
+
+
+
+ A device to use as an &ocfs; volume, which will store the VM lock files and XML configuration files.
+ Creating and mounting the &ocfs; volume is explained in the following procedure.
+
+
+
+
+ A device containing the VM installation source (such as an ISO file or disk image).
+
+
+
+
+ Depending on the installation source, you might also need another device for the
+ VM storage disks.
+
+
+
+
+ To avoid I/O starvation, these devices must be separate from the shared device used for SBD.
+
+
+
+
+ Stable device names for all storage paths, for example,
+ /dev/disk/by-id/DEVICE_ID.
+ A shared storage device might have mismatched /dev/sdX names on
+ different nodes, which will cause VM migration to fail.
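+
+ For example, to see which stable device names are available on a node, list the
+ by-id directory. The same device ID should appear on every cluster node:
+
+&prompt.root;ls -l /dev/disk/by-id/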
+
+
+
+
+
+
+ Configuring cluster resources to manage the lock files
+
+ Use this procedure to configure the cluster to manage the virtual machine lock files.
+ The lock file directory must be available on all nodes so that the cluster is aware of the
+ lock files no matter which node the VMs are running on.
+
+
+ You only need to run the following commands on one of the cluster nodes.
+
+
+ Configuring cluster resources to manage the lock files
+
+
+ Create an &ocfs; volume on one of the shared storage devices:
+
+&prompt.root;mkfs.ocfs2 /dev/disk/by-id/DEVICE_ID
+
+
+
+ Run crm configure to start the crm interactive shell.
+
+
+
+
+ Create a primitive resource for DLM:
+
+&prompt.crm.conf;primitive dlm ocf:pacemaker:controld \
+ op monitor interval=60 timeout=60
+
+
+
+ Create a primitive resource for the &ocfs; volume:
+
+&prompt.crm.conf;primitive ocfs2 Filesystem \
+ params device="/dev/disk/by-id/DEVICE_ID" directory="/mnt/shared" fstype=ocfs2 \
+ op monitor interval=20 timeout=40
+
+
+
+ Create a group for the DLM and &ocfs; resources:
+
+&prompt.crm.conf;group g-virt-lock dlm ocfs2
+
+
+
+ Clone the group so that it runs on all nodes:
+
+&prompt.crm.conf;clone cl-virt-lock g-virt-lock \
+ meta interleave=true
+
+
+
+ Review your changes with show.
+
+
+
+
+ If everything is correct, submit your changes with commit
+ and leave the crm live configuration with quit.
+
+
+
+
+ Check the status of the group clone. It should be running on all nodes:
+
+&prompt.root;crm status
+[...]
+Full List of Resources:
+[...]
+ * Clone Set: cl-virt-lock [g-virt-lock]:
+ * Started: [ &node1; &node2; ]
+
+
+
+
+
+
+
+
+ Preparing the cluster nodes to host virtual machines
+
+ Use this procedure to install and start the required virtualization services, and to
+ configure the nodes to store the VM lock files on the shared &ocfs; volume.
+
+
+ This procedure uses crm cluster run to run commands on all
+ nodes at once. If you prefer to manage each node individually, you can omit the
+ crm cluster run portion of the commands.
+
+
+ Preparing the cluster nodes to host virtual machines
+
+
+ Install the virtualization packages on all nodes in the cluster:
+
+&prompt.root;crm cluster run "zypper install -y -t pattern kvm_server kvm_tools"
+
+
+
+ On one node, find and enable the lock_manager setting in the file
+ /etc/libvirt/qemu.conf:
+
+lock_manager = "lockd"
+
+
+
+ On the same node, find and enable the file_lockspace_dir setting in the
+ file /etc/libvirt/qemu-lockd.conf, and change the value to point to
+ a directory on the &ocfs; volume:
+
+file_lockspace_dir = "/mnt/shared/lockd"
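+
+ If the lockspace directory does not exist yet, you can create it now. This example assumes
+ the &ocfs; volume is mounted at /mnt/shared, as configured in the Filesystem
+ resource. Because the volume is shared, creating the directory on one node is enough:
+
+&prompt.root;mkdir -p /mnt/shared/lockd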
+
+
+
+ Copy these files to the other nodes in the cluster:
+
+&prompt.root;crm cluster copy /etc/libvirt/qemu.conf
+&prompt.root;crm cluster copy /etc/libvirt/qemu-lockd.conf
+
+
+
+ Enable and start the libvirtd service on all nodes in the cluster:
+
+&prompt.root;crm cluster run "systemctl enable --now libvirtd"
+
+ This also starts the virtlockd service.
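+
+ To verify that libvirtd is active on all nodes, you can run a quick
+ check with crm cluster run:
+
+&prompt.root;crm cluster run "systemctl is-active libvirtd"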
+
+
+
+
+
+
+ Adding virtual machines as cluster resources
+
+ Use this procedure to add virtual machines to the cluster as cluster resources, with
+ resource constraints to ensure the VMs can always access the lock files. The lock files are
+ managed by the resources in the group g-virt-lock, which is available on
+ all nodes via the clone cl-virt-lock.
+
+
+ Adding virtual machines as cluster resources
+
+
+ Install your virtual machines on one of the cluster nodes, with the following restrictions:
+
+
+
+
+ The installation source and storage must be on shared devices.
+
+
+
+
+ Do not configure the VMs to start on host boot.
+
+
+
+
+ For more information, see
+
+ &virtual; for &sles;.
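+
+ For example, a minimal virt-install sketch that respects these restrictions might
+ look like the following. All values are placeholders: adjust the name, memory, vCPUs,
+ disk device, installation source, bridge name and OS variant for your environment:
+
+&prompt.root;virt-install --name VM1 --memory 4096 --vcpus 2 \
+ --disk /dev/disk/by-id/DEVICE_ID \
+ --cdrom PATH_TO_ISO \
+ --network bridge=br0 --graphics vnc --os-variant sle15sp5
+
+ By default, virt-install does not configure the VM to start on host boot,
+ which matches the restriction above.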
+
+
+
+
+ If the virtual machines are running, shut them down. The cluster will start the VMs
+ after you add them as resources.
+
+
+
+
+ Dump the XML configuration to the &ocfs; volume. Repeat this step for each VM:
+
+&prompt.root;virsh dumpxml VM1 > /mnt/shared/VM1.xml
+
+
+ Make sure the XML files do not contain any references to unshared local paths.
+
+
+
+
+
+ Run crm configure to start the crm interactive shell.
+
+
+
+
+ Create primitive resources to manage the virtual machines. Repeat this step for each VM:
+
+&prompt.crm.conf;primitive VM1 VirtualDomain \
+ params config="/mnt/shared/VM1.xml" remoteuri="qemu+ssh://%n/system" \
+ meta allow-migrate=true \
+ op monitor timeout=30s interval=10s
+
+ The option allow-migrate=true enables live migration. If the value is
+ set to false, the cluster migrates the VM by shutting it down on
+ one node and restarting it on another node.
+
+
+ If you need to set utilization attributes to help place VMs based on their load impact,
+ see .
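+
+ For example, a hypothetical sketch of the same primitive with utilization attributes
+ added. The attribute names and values are placeholders and must match the utilization
+ attributes defined for your cluster nodes:
+
+&prompt.crm.conf;primitive VM1 VirtualDomain \
+ params config="/mnt/shared/VM1.xml" remoteuri="qemu+ssh://%n/system" \
+ meta allow-migrate=true \
+ utilization cpu=2 memory=4096 \
+ op monitor timeout=30s interval=10s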
+
+
+
+
+ Create a colocation constraint so that the virtual machines can only start on nodes where
+ cl-virt-lock is running:
+
+&prompt.crm.conf;colocation col-fs-virt inf: ( VM1 VM2 VMX ) cl-virt-lock
+
+
+
+ Create an ordering constraint so that cl-virt-lock always starts before
+ the virtual machines:
+
+&prompt.crm.conf;order o-fs-virt Mandatory: cl-virt-lock ( VM1 VM2 VMX )
+
+
+
+ Review your changes with show.
+
+
+
+
+ If everything is correct, submit your changes with commit
+ and leave the crm live configuration with quit.
+
+
+
+
+ Check the status of the virtual machines:
+
+&prompt.root;crm status
+[...]
+Full List of Resources:
+[...]
+ * Clone Set: cl-virt-lock [g-virt-lock]:
+ * Started: [ &node1; &node2; ]
+ * VM1 (ocf::heartbeat:VirtualDomain): Started &node1;
+ * VM2 (ocf::heartbeat:VirtualDomain): Started &node1;
+ * VMX (ocf::heartbeat:VirtualDomain): Started &node1;
+
+
+
+ The virtual machines are now managed by the &ha; cluster, and can migrate between the cluster nodes.
+
+
+ Do not manually start or stop cluster-managed VMs
+
+ After adding virtual machines as cluster resources, do not manage them manually.
+ Only use the cluster tools as described in .
+
+
+ To perform maintenance tasks on cluster-managed VMs, see .
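+
+ For example, one way to temporarily stop the cluster from managing a single VM is to
+ put its resource into maintenance mode, and return it to cluster control afterwards.
+ This is only a sketch; see the referenced chapter for the full maintenance workflows:
+
+&prompt.root;crm resource maintenance VM1 on
+&prompt.root;crm resource maintenance VM1 off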
+
+
+
+
+
+ Testing the setup
+
+ Use the following tests to confirm that the virtual machine &ha; setup works as expected.
+
+
+
+ Perform these tests in a test environment, not a production environment.
+
+
+
+ Verifying that the VM resource is protected across cluster nodes
+
+
+ The virtual machine VM1 is running on node &node1;.
+
+
+
+
+ On node &node2;, try to start the VM manually with
+ virsh start VM1.
+
+
+
+
+ Expected result: The virsh command
+ fails. VM1 cannot be started manually on &node2;
+ when it is running on &node1;.
+
+
+
+
+ Verifying that the VM resource can live migrate between cluster nodes
+
+
+ The virtual machine VM1 is running on node &node1;.
+
+
+
+
+ Open two terminals.
+
+
+
+
+ In the first terminal, connect to VM1 via SSH.
+
+
+
+
+ In the second terminal, try to migrate VM1 to node
+ &node2; with crm resource move VM1 &node2;.
+
+
+
+
+ Run crm_mon -r to monitor the cluster status until it
+ stabilizes. This might take a short time.
+
+
+
+
+ In the first terminal, check whether the SSH connection to VM1
+ is still active.
+
+
+
+
+ Expected result: The cluster status shows that
+ VM1 has started on &node2;. The SSH connection
+ to VM1 remains active during the whole migration.
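+
+ Note that crm resource move works by adding a location constraint that prefers
+ &node2;. After verifying the migration, you can remove this constraint so that the
+ cluster is free to place VM1 again:
+
+&prompt.root;crm resource clear VM1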
+
+
+
+
+ Verifying that the VM resource can migrate to another node when the current node reboots
+
+
+ The virtual machine VM1 is running on node &node2;.
+
+
+
+
+ Reboot &node2;.
+
+
+
+
+ On node &node1;, run crm_mon -r to
+ monitor the cluster status until it stabilizes. This might take a short time.
+
+
+
+
+ Expected result: The cluster status shows that
+ VM1 has started on &node1;.
+
+
+
+
+ Verifying that the VM resource can fail over to another node when the current node crashes
+
+
+ The virtual machine VM1 is running on node &node1;.
+
+
+
+
+ Simulate a crash on &node1; by forcing the machine off or
+ unplugging the power cable.
+
+
+
+
+ On node &node2;, run crm_mon -r to
+ monitor the cluster status until it stabilizes. VM failover after a node crashes
+ usually takes longer than VM migration after a node reboots.
+
+
+
+
+ Expected result: After a short time, the cluster status
+ shows that VM1 has started on &node2;.
+
+
+
+
+
+