5 min RTO with Aviatrix and Terraform

Disaster recovery involves a set of policies, tools, and procedures to enable the recovery or continuation of vital technology infrastructure and systems following a natural or human-induced disaster.

The Recovery Time Objective (RTO) is the targeted duration of time and a service level within which a business process must be restored after a disaster (or disruption) in order to avoid unacceptable consequences associated with a break in business continuity.

I covered using Aviatrix to address the challenges of DR/BC before:

Site2Cloud Design for DR/BC using Aviatrix

In this post I address a new set of requirements:

Remote branches do not support multiple tunnels
Remote branches have overlapping IP ranges
Applications have hard-coded IPs (different app instances must run with the same IP address)

Proposed Design

The proposed solution has the following major points:

A set of resources is available only to the active region: VNet, gateways, IPsec tunnels, gateway attachments, and route propagation
A set of resources is on standby in the non-active region
Mapped NAT is used to overcome the IP overlap
Terraform is used to manually switch over the active and standby regions (thanks to Chris for the idea of using Terraform state to take care of that, and thanks to Dennis for always helping me with Terraform)
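
To make the Mapped NAT point a bit more concrete, here is a rough sketch of the inputs involved. The values are made up for illustration and are not from my lab; the variable names match the ones used in the Site2Cloud code in this post. The idea is that the overlapping real branch range is presented to the cloud side as a non-overlapping virtual range. As far as I know, the real and virtual ranges should use the same prefix length so the mapping stays one-to-one.

```hcl
# terraform.tfvars (illustrative values only, not the author's real ranges)

# Which region currently holds the active resources
region_active = "east"

# Public IP of the branch device (documentation address, hypothetical)
remote_gateway_ip = "203.0.113.10"

# Real branch range, which may overlap with other branches or the cloud
remote_subnet_cidr = "10.10.0.0/24"

# Virtual range the cloud side uses to reach the branch (same prefix length)
remote_virtual = "100.64.10.0/24"
```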

Terraform and Aviatrix Provider to the rescue

Site2Cloud Terraform:

	# Site2Cloud connection for the east region; exists only while east is active
	resource "aviatrix_site2cloud" "site2cloud_connection-east" {
	  count = var.region_active == "east" ? 1 : 0

	  depends_on = [
	    aviatrix_gateway.aviatrix_gateway_standalone-east
	  ]

	  vpc_id                     = aviatrix_gateway.aviatrix_gateway_standalone-east.vpc_id
	  connection_name            = "${aviatrix_gateway.aviatrix_gateway_standalone-east.id}-${var.region_active}-${replace(var.remote_gateway_ip, ".", "-")}"
	  connection_type            = "mapped"
	  remote_gateway_type        = "generic"
	  tunnel_type                = "route"
	  pre_shared_key             = var.pre_shared_key
	  enable_ikev2               = true
	  primary_cloud_gateway_name = aviatrix_gateway.aviatrix_gateway_standalone-east.gw_name
	  remote_gateway_ip          = var.remote_gateway_ip

	  # Mapped NAT: the real remote subnet is presented as a virtual subnet
	  custom_mapped         = false
	  remote_subnet_cidr    = var.remote_subnet_cidr
	  remote_subnet_virtual = var.remote_virtual
	  local_subnet_cidr     = aviatrix_vpc.azure_vnet_user-spoke-east-2.cidr
	  #local_subnet_virtual = var.cloud_virtual

	  # HA gateway pair sharing a single remote peer
	  enable_single_ip_ha      = true
	  backup_gateway_name      = aviatrix_gateway.aviatrix_gateway_standalone-east.peering_ha_gw_name
	  ha_enabled               = true
	  backup_remote_gateway_ip = var.remote_gateway_ip
	  backup_pre_shared_key    = var.pre_shared_key
	}

	# Site2Cloud connection for the west region; exists only while west is active
	resource "aviatrix_site2cloud" "site2cloud_connection-west" {
	  count = var.region_active == "west" ? 1 : 0

	  depends_on = [
	    aviatrix_gateway.aviatrix_gateway_standalone-west
	  ]

	  vpc_id                     = aviatrix_gateway.aviatrix_gateway_standalone-west.vpc_id
	  connection_name            = "${aviatrix_gateway.aviatrix_gateway_standalone-west.id}-${var.region_active}-${replace(var.remote_gateway_ip, ".", "-")}"
	  connection_type            = "unmapped"
	  remote_gateway_type        = "generic"
	  tunnel_type                = "route"
	  pre_shared_key             = var.pre_shared_key
	  enable_ikev2               = true
	  primary_cloud_gateway_name = aviatrix_gateway.aviatrix_gateway_standalone-west.gw_name
	  remote_gateway_ip          = var.remote_gateway_ip

	  # Unlike east, this connection is unmapped, so no virtual subnets are set
	  custom_mapped          = false
	  remote_subnet_cidr     = var.remote_subnet_cidr
	  #remote_subnet_virtual = var.remote_virtual
	  local_subnet_cidr      = aviatrix_vpc.azure_vnet_user-spoke-west-2.cidr
	  #local_subnet_virtual  = var.cloud_virtual

	  # HA gateway pair sharing a single remote peer
	  enable_single_ip_ha      = true
	  backup_gateway_name      = aviatrix_gateway.aviatrix_gateway_standalone-west.peering_ha_gw_name
	  ha_enabled               = true
	  backup_remote_gateway_ip = var.remote_gateway_ip
	  backup_pre_shared_key    = var.pre_shared_key
	}


The code above creates a Site2Cloud connection on an existing Aviatrix (AVX) gateway, but only in the active region. Each resource sets count with an expression such as count = var.region_active == "west" ? 1 : 0, so the connection exists only in the region that matches. The active region is determined by the value of the region_active variable set in terraform.tfvars.
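
The declaration of region_active is not shown in this post, so the following is only a sketch of how it could look. The validation block is my addition, not part of the original code; it assumes "east" and "west" are the only valid values and rejects anything else, so a typo cannot silently remove the resources from both regions.

```hcl
# variables.tf (hypothetical file name)
variable "region_active" {
  description = "Region that currently holds the active resources"
  type        = string

  # Assumption: only two regions exist in this design
  validation {
    condition     = contains(["east", "west"], var.region_active)
    error_message = "The region_active value must be either \"east\" or \"west\"."
  }
}
```

Without a check like this, a value such as "West" would make both count expressions evaluate to 0.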

The same principle is used to advertise the remote branch prefixes to the AVX fabric from the proper region, using the included_advertised_spoke_routes argument of the spoke module:

	module "vpn-spoke-west-2" {
	  source  = "terraform-aviatrix-modules/mc-spoke/aviatrix"
	  version = "1.3.0"

	  account = var.account
	  cloud   = var.cloud
	  region  = var.region-a
	  # Swap the trailing "23" prefix length for "16", then take /24 number 2 of that /16
	  cidr          = cidrsubnet("${trimsuffix(var.cidr-region-a-1, "23")}16", 8, 2)
	  inspection    = true
	  transit_gw    = module.corp-west-2-transit.transit_gateway.gw_name
	  ha_gw         = true
	  instance_size = var.instance_size
	  single_az_ha  = false
	  az_support    = false
	  name          = "vpn-spoke-west-2-poc"
	  gw_name       = "vpn-spoke-west-2-poc"

	  # Advertise the branch prefix only while west is the active region
	  included_advertised_spoke_routes = var.region_active == "west" ? var.remote_subnet_virtual : null
	}


Because the applications require the same IP addresses, only one VNet is attached to the transit at a time:

	resource "aviatrix_azure_spoke_native_peering" "user-spoke-west-2" {
	  count                = var.region_active == "west" ? 1 : 0
	  transit_gateway_name = module.user-west-2-transit.transit_gateway.gw_name
	  spoke_account_name   = var.account
	  spoke_region         = var.region-a
	  spoke_vpc_id         = aviatrix_vpc.azure_vnet_user-spoke-west-2.vpc_id
	}
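
Only the west peering is shown above. The east counterpart would presumably mirror it with the condition flipped. In the sketch below, aviatrix_vpc.azure_vnet_user-spoke-east-2 comes from the Site2Cloud code earlier, while the transit module name user-east-2-transit and the variable region-b are my guesses at the naming and may differ in the real code.

```hcl
resource "aviatrix_azure_spoke_native_peering" "user-spoke-east-2" {
  # Attached only while east is the active region
  count = var.region_active == "east" ? 1 : 0

  # Assumed name, mirroring module.user-west-2-transit
  transit_gateway_name = module.user-east-2-transit.transit_gateway.gw_name
  spoke_account_name   = var.account

  # Assumed variable, mirroring var.region-a
  spoke_region = var.region-b
  spoke_vpc_id = aviatrix_vpc.azure_vnet_user-spoke-east-2.vpc_id
}
```

Because the two count expressions are mutually exclusive, at most one of the two VNets with the overlapping address space is peered to the transit at any time.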


If you need to switch from one region to the other, the failover is as simple as changing the value of region_active in terraform.tfvars and running terraform apply. Terraform will destroy the Site2Cloud connection in the previously active region, detach the workload VNet from the transit, and withdraw the remote branch prefix from the VPN spoke gateway. Terraform will also create the corresponding objects in the newly active region.
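
A small optional addition that is not part of the original code: an output that reports which Site2Cloud connection currently exists, which makes it easy to confirm the result of a switchover. This is only a sketch. It assumes the two aviatrix_site2cloud resources shown earlier and Terraform 0.15 or later for the one() function; the output name is hypothetical.

```hcl
# outputs.tf (hypothetical file name)
output "active_site2cloud_connection" {
  description = "Name of the Site2Cloud connection in the active region"

  # Each resource uses count, so each splat yields a list with 0 or 1 names.
  # one() returns the single name, returns null if the list is empty, and
  # fails if both regions somehow hold a connection at the same time.
  value = one(concat(
    aviatrix_site2cloud.site2cloud_connection-east[*].connection_name,
    aviatrix_site2cloud.site2cloud_connection-west[*].connection_name
  ))
}
```

After terraform apply, running terraform output shows the name of the connection that is active, which includes the region in this naming scheme.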

References

https://en.wikipedia.org/wiki/Disaster_recovery

RTrentin's world


Published by rtrentin
