What is VRA?
- A Virtual Replication Appliance (VRA) is a VM installed on each hypervisor that hosts virtual machines to be protected or recovered.
- It manages replication of data between the protected and recovery sites.
- A VM can be included in several VPGs so it can be recovered at several sites (for EIT it will be only one: Park Royal or Swindon).
What is VPG?
- A Virtual Protection Group (VPG) in the protected site contains one or more virtual machines to be replicated to the DR site.
- VMs defined in a VPG can come from different hypervisors/clusters, so various application servers can be protected in a single VPG.
What is Journal?
- Every write to a protected virtual machine is copied by Zerto Virtual Replication.
- The copy is sent asynchronously to the recovery site and written to a journal managed by the VRA. Each VM has its own journal. The journal is part of the VPG.
- Journal History: Determines the maximum amount of history that can be saved in a journal. Max 30 days allowed.
- Default Journal Storage: The location of the journal files.
- Journal Size hard limit: The maximum size the journal can grow to. Once it hits this threshold, changes are written to the disk.
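To get a feel for how Journal History and the Journal Size hard limit interact, a rough back-of-envelope estimate can help. The sketch below is not Zerto's own sizing method; it simply assumes journal growth is about the average write rate multiplied by the history window. The function names and the example figures are hypothetical.

```python
def estimate_journal_gb(change_rate_mb_per_s: float, history_hours: float) -> float:
    """Rough journal size in GB: average write rate x history window.

    Assumption: every write is journaled once and nothing is compressed.
    """
    seconds = history_hours * 3600
    return change_rate_mb_per_s * seconds / 1024  # MB -> GB


def exceeds_hard_limit(estimated_gb: float, hard_limit_gb: float) -> bool:
    """True when the estimated journal would cross the configured hard limit."""
    return estimated_gb > hard_limit_gb


# Hypothetical VM writing 1 MB/s on average, with 24 hours of journal history.
estimate = estimate_journal_gb(change_rate_mb_per_s=1.0, history_hours=24)
print(f"Estimated journal size: {estimate:.1f} GB")
print("Crosses a 50 GB hard limit:", exceeds_hard_limit(estimate, hard_limit_gb=50))
```

If the estimate is well above the hard limit, the configured history is unlikely to be kept in full, which ties in with the history SLA question.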
Why do I see history not meeting SLA?
- In this case the journal size limit has been crossed and Zerto is not able to push new writes from source to target.
What are RPO and RTO?
- RTO (Recovery Time Objective): a metric for how quickly critical services must be brought back after a disaster.
- RPO (Recovery Point Objective): the maximum tolerable amount of data loss, measured as the time that can elapse between your last data backup and a disaster. With Zerto it is 6 seconds.
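As a small illustration of the RPO definition, the sketch below measures the data-loss window as the time between the last replicated write and the moment of the disaster, then compares it with an objective. The function names, timestamps and the 10-second objective are made up for the example; they are not taken from Zerto.

```python
from datetime import datetime, timedelta


def achieved_rpo_seconds(last_replicated_write: datetime, disaster_time: datetime) -> float:
    """Data-loss window: time between the last replicated write and the disaster."""
    return (disaster_time - last_replicated_write).total_seconds()


def meets_objective(achieved_seconds: float, objective_seconds: float) -> bool:
    """True when the achieved value is within the agreed objective."""
    return achieved_seconds <= objective_seconds


# Hypothetical timeline: the last write reached the DR site 6 seconds before the outage.
last_write = datetime(2024, 1, 1, 12, 0, 0)
disaster = last_write + timedelta(seconds=6)

rpo = achieved_rpo_seconds(last_write, disaster)
print(f"Achieved RPO: {rpo:.0f}s, within 10s objective: {meets_objective(rpo, 10)}")
```

The same comparison works for RTO if the two timestamps are the start of the outage and the moment services are back.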
Things to consider when failing over with Zerto
- It can take around 24 hours to replicate data back to the primary site once failed over (with reverse protection this can be quicker).
- It is good practice to split VPGs, e.g. application, database, processing. One big VPG is not recommended as it can take the whole service down for some time.
- It is also recommended to fail over in steps, e.g. application first, database second, processing third.
- IP, DNS and GW changes will be required if the subnet is not available on the failover site.
- Ensure the default gateway is site-local. A remote GW will increase latency and impact the application.
- Static routes might be impacted, as some of them might go through firewalls on the primary site (as long as the default gateway is not changed this should be OK).
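The network points above can be checked before a failover with a small script. The following is a minimal sketch using Python's standard `ipaddress` module. It assumes you already have a list of the subnets available at the failover site; the addresses and function names are hypothetical and nothing here calls Zerto.

```python
import ipaddress
from typing import Iterable


def in_any_subnet(ip: str, subnets: Iterable[str]) -> bool:
    """True when the address belongs to at least one of the given subnets."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(subnet) for subnet in subnets)


def failover_network_checks(vm_ip: str, gateway_ip: str, recovery_subnets: Iterable[str]) -> dict:
    """Flag VMs that need an IP change or whose default gateway is not site-local."""
    subnets = list(recovery_subnets)
    return {
        "needs_reip": not in_any_subnet(vm_ip, subnets),
        "gateway_is_site_local": in_any_subnet(gateway_ip, subnets),
    }


# Hypothetical example: the VM and its gateway sit in a subnet that exists only on the primary site.
result = failover_network_checks(
    vm_ip="10.10.0.15",
    gateway_ip="10.10.0.1",
    recovery_subnets=["10.20.0.0/24"],
)
print(result)
```

A VM flagged with `needs_reip` will also need its DNS record updated, and a gateway that is not site-local is the case that adds latency.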