In my previous blog post I went through the design options for connecting sites to the cloud using Aviatrix, with the following requirements:
- overlapping addresses
The blog can be found at:
In this post I add a new requirement to the architecture: disaster recovery. A disaster may hit a cloud provider's region or network, and for the continuity of your business-critical applications you need a disaster recovery design.
When you interconnect the same networks over more than one connection, you introduce parallel paths between them. Parallel paths can lead to asymmetric routing.
Transit Gateway Peering connects two or more Aviatrix Transit Gateways in a partial or full-mesh manner and it is configured under the Multi-Cloud Transit menu:
Excluded Network CIDRs is an optional field. Using this filter prevents the overlapping CIDRs from being propagated to the other Transit Gateway.
The Peering over private network advanced option appears, and applies, only when the two Multi-Cloud Transit Gateways are each launched in Insane Mode and each is in a different cloud type.
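If you manage Aviatrix with Terraform, the same peering can be expressed declaratively. A minimal sketch using the Aviatrix Terraform provider's aviatrix_transit_gateway_peering resource; the gateway names and the excluded CIDR are hypothetical placeholders:

```hcl
# Hypothetical transit gateway names. The excluded-CIDR lists keep the
# overlapping ranges from being propagated to the remote transit.
resource "aviatrix_transit_gateway_peering" "east_west" {
  transit_gateway_name1           = "transit-east"
  transit_gateway_name2           = "transit-west"
  gateway1_excluded_network_cidrs = ["10.0.0.0/24"]
  gateway2_excluded_network_cidrs = ["10.0.0.0/24"]
}
```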
Manual DR Configuration
Tunnels 1 and 2 were created during the writing of the post mentioned before. In this step, we will configure tunnels 3 and 4:
Tunnels 5000 and 5100 connect on-prem to the east region (spoke30), while tunnel 8000 connects on-prem to the west region (spoke3). Spoke3 is running without HA.
Global Load Balancer
While there are several global load balancer solutions available from different vendors, I’m going to use Azure Traffic Manager for this blog.
Traffic Manager uses DNS to direct the client requests to the appropriate service endpoint based on a traffic-routing method. The following methods are currently supported:
- Priority: the highest-priority available endpoint is returned; lower-priority endpoints act as backups.
- Weighted: endpoints are returned in proportion to their assigned weights.
- Performance: the endpoint closest to the end user is returned.
- Geographic: the endpoint mapped to the geographic location derived from the DNS query's source IP is returned. If that endpoint is unavailable, Traffic Manager will not fail over to another one, since a geographic location can be mapped to only one endpoint in a profile.
- MultiValue: multiple endpoints mapped to IPv4/IPv6 addresses are returned.
- Subnet: the endpoint mapped to a set of IP address ranges is returned.
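To make the Subnet method concrete, here is a small Python sketch of how a DNS-based balancer could map a client's source IP range to an endpoint. The subnet ranges and endpoint names are purely illustrative, not Traffic Manager internals:

```python
import ipaddress

# Illustrative subnet-to-endpoint mapping (placeholder values).
SUBNET_MAP = {
    "10.1.0.0/16": "gw-east.example.com",
    "10.2.0.0/16": "gw-west.example.com",
}

def pick_endpoint(client_ip: str, default: str = "gw-east.example.com") -> str:
    """Return the endpoint whose subnet contains client_ip."""
    addr = ipaddress.ip_address(client_ip)
    for cidr, endpoint in SUBNET_MAP.items():
        if addr in ipaddress.ip_network(cidr):
            return endpoint
    return default  # fall back when no range matches
```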
There is a cost associated with Traffic Manager, based on the number of DNS queries received. For more information, see: https://azure.microsoft.com/en-us/pricing/details/traffic-manager/
As the objective of this setup is to have a single tunnel, the tunnel configuration should be the same for both gateways (active and standby):
Traffic Manager Configuration
We are going to use TCP on port 443 to check the health of the gateways:
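The probe itself is essentially a TCP connection attempt. A minimal Python sketch of an equivalent check (host, port, and timeout are placeholders):

```python
import socket

def tcp_healthy(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```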
We also have the option to fine-tune the failover timers (probing interval, tolerated number of failures, and probe timeout, which default to 30 seconds, 3, and 10 seconds respectively) and the DNS time to live (default 60 seconds).
Once the probe configuration is done, we can move on and register the gateways for this Traffic Manager profile using priority:
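Priority routing simply answers each DNS query with the highest-priority endpoint that is passing its health probe. A hypothetical Python sketch of that selection logic (endpoint names are illustrative):

```python
def select_endpoint(endpoints):
    """endpoints: list of (priority, fqdn, healthy) tuples.
    Return the fqdn with the lowest priority value among healthy
    endpoints (lower value = more preferred), or None if none is up."""
    healthy = [(prio, fqdn) for prio, fqdn, ok in endpoints if ok]
    if not healthy:
        return None  # no endpoint available to answer with
    return min(healthy)[1]
```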
Traffic Manager must be able to probe the AVX gateways. Because Azure uses several IP addresses for the health checks, it is easier to allow them with a Service Tag:
Tunnel configuration using Traffic Manager. Once the configuration is in place with an FQDN as the tunnel destination, the show run command displays the IP address instead of the name, because the resolution happens only once:
A workaround is to configure an EEM applet that re-resolves the tunnel destination every minute:
event manager applet change-tunnel-dest
 event timer cron name test-lab-aviatrix cron-entry "* * * * *"
 action 1.0 cli command "enable"
 action 1.1 cli command "configure terminal"
 action 1.2 cli command "interface tunnel5000"
 action 1.3 cli command "tunnel destination lab-test-aviatrix-vpn.trafficmanager.net"
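The applet's effect can be illustrated outside of IOS as well: periodically re-resolve the profile FQDN and detect when Traffic Manager starts answering with the standby gateway's IP. A small Python sketch (the FQDN would be your Traffic Manager profile name):

```python
import socket

def destination_changed(fqdn: str, current_ip: str) -> tuple[bool, str]:
    """Re-resolve fqdn; report whether the DNS answer differs from current_ip."""
    new_ip = socket.gethostbyname(fqdn)
    return new_ip != current_ip, new_ip
```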
I started my testing by shutting down spoke30:
After a few seconds Traffic Manager detected that the gateway was down, and the cron-based applet reconfigured the tunnel: