Tuesday 5 March 2019

How to Configure VMware vSphere HA Orchestrated Restart

We have been using the VMware vSphere High Availability (HA) feature for a long time now: we add ESXi hosts to an HA cluster and let the election process pick one host as the Master host, while the remaining hosts behave as Slave hosts. With vSphere HA, virtual machines are better protected against unplanned downtime, whether it is caused by an application failure, a guest OS failure, an ESXi host failure, a network failure or a datastore failure.

vSphere HA does not just protect against unplanned downtime. With admission control policies it also ensures that enough resources are available within the cluster to power on the virtual machines. This means the cluster should not only meet the reservation requirements of the virtual machines during host failures, but also their overall allocation requirements.

vSphere HA now calculates failover resources as a percentage of cluster resources rather than using the old default slot size calculation. This is one of the improvements made to vSphere HA admission control in VMware vSphere 6.5: the default option for defining failover capacity is Cluster resource percentage, with the possibility to choose other options, including slot size, or to disable it completely.
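To make the percentage idea concrete, here is a minimal sketch in Python. It is a simplified model, not the actual vSphere HA implementation: the function names and the numbers are hypothetical, and only a single resource (CPU in MHz) is considered. It assumes the free share of the cluster is (total capacity - VM requirements) / total capacity, and that a power-on is admitted only while that share stays at or above the reserved percentage.

```python
def current_failover_capacity(total_capacity, total_requirements):
    """Fraction of cluster capacity still free after VM requirements (simplified model)."""
    if total_capacity <= 0:
        raise ValueError("total_capacity must be positive")
    return (total_capacity - total_requirements) / total_capacity


def admits_power_on(total_capacity, total_requirements, vm_requirement, reserved_percent):
    """Return True if powering on one more VM keeps the reserved failover share free."""
    after = current_failover_capacity(total_capacity, total_requirements + vm_requirement)
    return after * 100 >= reserved_percent


# Hypothetical cluster: 24,000 MHz of CPU, 7,000 MHz already required by
# powered-on VMs, and 25% of the cluster reserved for failover.
print(admits_power_on(24000, 7000, 2000, 25))   # True  (62.5% would remain free)
print(admits_power_on(24000, 7000, 12000, 25))  # False (about 20.8% would remain free)
```

The same check would be repeated for memory; a power-on is only acceptable if both resources pass.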

Prior to VMware vSphere 6.5, vSphere HA was only concerned with securing the resources required for the virtual machines and restarting them. We used the VM override option to define the VM restart priority (Low, Medium, High), which determined the order in which VMs secured these resources. However, this becomes a problem when we have a 3-tier application and plenty of resources are available for every VM.

In this case every VM receives its allocation quickly and starts powering on almost immediately. The database server may then take longer to come up than the application server, and the application will fail to start properly because it cannot access the database. The problem with this approach was that, even though the restart priority was configured, there was no mechanism to determine VM readiness.

These issues are addressed by vSphere HA Orchestrated Restart, which performs a number of checks to ensure that virtual machine resources are secured and also focuses on virtual machine readiness. The checks are:

1) VM has resources secured: this is the first check performed by vSphere HA, ensuring that the VM's CPU and memory reservation requirements can be met on one of the hosts in the cluster.
2) VM is powered on.
3) VMware Tools heartbeat: HA waits for the VMware Tools guest heartbeat, which confirms that the operating system has started within the virtual machine.
4) VMware Tools application heartbeat: confirms that the VM is ready and its services are now available.
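The four checks above can be modelled as an ordered sequence. The following Python sketch is only an illustration: the class and field names are made up for this example and are not part of any VMware API. It assumes a VM is considered "ready" once it has passed every check up to the one chosen as its restart condition.

```python
from dataclasses import dataclass
from enum import IntEnum


class RestartCondition(IntEnum):
    """Readiness checks in the order they are evaluated (names are illustrative)."""
    RESOURCES_ALLOCATED = 1
    POWERED_ON = 2
    GUEST_HEARTBEAT = 3
    APP_HEARTBEAT = 4


@dataclass
class VmState:
    name: str
    resources_allocated: bool = False
    powered_on: bool = False
    guest_heartbeat: bool = False
    app_heartbeat: bool = False


def reached(vm):
    """Return the highest check passed so far, stopping at the first failed check."""
    checks = [
        (RestartCondition.RESOURCES_ALLOCATED, vm.resources_allocated),
        (RestartCondition.POWERED_ON, vm.powered_on),
        (RestartCondition.GUEST_HEARTBEAT, vm.guest_heartbeat),
        (RestartCondition.APP_HEARTBEAT, vm.app_heartbeat),
    ]
    last = None
    for condition, passed in checks:
        if not passed:
            break
        last = condition
    return last


def is_ready(vm, required):
    """True once the VM has passed every check up to and including `required`."""
    level = reached(vm)
    return level is not None and level >= required


# Hypothetical database VM: guest OS is up, application heartbeat not yet seen.
db = VmState("db01", resources_allocated=True, powered_on=True, guest_heartbeat=True)
print(is_ready(db, RestartCondition.GUEST_HEARTBEAT))  # True
print(is_ready(db, RestartCondition.APP_HEARTBEAT))    # False
```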


When configuring vSphere HA Orchestrated Restart, VMs can be grouped into tiers representing their startup priority. The priority 1 VMs receive resources first and are powered on; once all the VMs in this tier have met their restart condition, vSphere HA moves on to the priority 2 VMs. The restart tier dependency is a soft rule, meaning we can configure a timeout value so that if one of the VMs in tier 1 is problematic, the other VMs can still be started.
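The effect of the soft rule can be seen in a small simulation. This is a sketch under simplifying assumptions, not how HA is actually implemented: the function name is hypothetical, each VM is described only by the number of seconds it needs to meet its restart condition (None meaning it never does), and the 600-second timeout is just an example value.

```python
def tier_start_times(tiers, timeout):
    """Return the time (in seconds) at which each tier begins restarting.

    tiers   -- list of tiers in priority order; each tier is a list of
               seconds-to-ready per VM, or None for a VM that never becomes ready
    timeout -- maximum time to wait on a tier before moving on (soft rule)
    """
    start = 0
    starts = []
    for ready_times in tiers:
        starts.append(start)
        if not ready_times:
            continue  # empty tier: nothing to wait for
        # A VM that is slow or never ready holds the tier only until the timeout.
        waits = [timeout if t is None else min(t, timeout) for t in ready_times]
        start += max(waits)
    return starts


# Tier 1 contains one problematic VM (None); tiers 2 and 3 still get started.
print(tier_start_times([[60, None], [90], [30]], timeout=600))  # [0, 600, 690]
```

Without the timeout, the problematic tier 1 VM would block tiers 2 and 3 indefinitely; with it, they are delayed but still restarted.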

In addition to the default VM restart priorities Low, Medium and High, which were available in older VMware vSphere versions, vSphere 6.5 adds two new restart priorities, Lowest and Highest. However, the restart priority has no effect on agent VMs and Fault Tolerance secondary virtual machines; agent VMs are always given the topmost restart priority.
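As a rough illustration of the resulting order, the Python sketch below sorts a hypothetical inventory so that agent VMs come first regardless of their configured priority, followed by the five priority levels. The tuple layout and names are assumptions made for this example, and Fault Tolerance secondary VMs are deliberately left out of the model.

```python
# Lower number = restarted earlier. The five levels available from vSphere 6.5.
RESTART_PRIORITY = {"highest": 0, "high": 1, "medium": 2, "low": 3, "lowest": 4}


def restart_order(vms):
    """Order VMs for restart.

    vms -- list of (name, priority, is_agent) tuples (illustrative layout)
    Agent VMs always sort first; their configured priority is ignored.
    """
    def sort_key(vm):
        _name, priority, is_agent = vm
        return (0, 0) if is_agent else (1, RESTART_PRIORITY[priority])

    # sorted() is stable, so VMs with the same priority keep their input order.
    return [name for name, _priority, _is_agent in sorted(vms, key=sort_key)]


inventory = [
    ("web01", "low", False),
    ("db01", "highest", False),
    ("agent01", "lowest", True),   # configured priority has no effect
    ("app01", "medium", False),
]
print(restart_order(inventory))  # ['agent01', 'db01', 'app01', 'web01']
```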

We can also configure dependencies between virtual machines, either within the same tier or between VMs in different tiers, so that a VM will not power on until the VM it depends on has been started. Unlike restart priorities, which are soft rules (a timeout period is allowed before proceeding further), virtual machine dependencies are hard rules: if the first VM does not meet its restart condition, HA will not start the second VM.
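The difference from the soft tier rule can be sketched as follows. This is an illustrative model only; the function and the dictionaries are hypothetical. It assumes each VM depends on at most one other VM, and that a VM may be started only if every VM earlier in its dependency chain has met its restart condition. There is no timeout to fall back on.

```python
def startable(vm, depends_on, meets_condition, _seen=None):
    """Return True if HA may start `vm` under hard dependency rules.

    depends_on      -- dict mapping a VM name to the VM it depends on (or None)
    meets_condition -- dict mapping a VM name to whether it met its restart condition
    """
    seen = set() if _seen is None else _seen
    if vm in seen:
        raise ValueError(f"circular dependency at {vm}")
    seen.add(vm)

    parent = depends_on.get(vm)
    if parent is None:
        return True  # no dependency, nothing blocks this VM
    # Hard rule: the parent must itself be startable AND have met its condition.
    return (startable(parent, depends_on, meets_condition, seen)
            and meets_condition.get(parent, False))


# Hypothetical 3-tier chain: web01 -> app01 -> db01
depends_on = {"web01": "app01", "app01": "db01", "db01": None}
meets_condition = {"db01": True, "app01": False, "web01": False}

print(startable("app01", depends_on, meets_condition))  # True: db01 is ready
print(startable("web01", depends_on, meets_condition))  # False: app01 never became ready
```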