In this document, I show how to design, configure, and use Azure Route Server to provide high availability and scalability for L4-L7 NVAs that handle inter-spoke flows.
Route Server simplifies dynamic routing between network virtual appliances (NVAs), such as firewalls and load balancers, and the Azure fabric. It lets NVAs and Azure exchange routing information over Border Gateway Protocol (BGP), without the need to manually configure or maintain route tables.
Constraints
- Route Server cannot steer traffic between subnets of the same virtual network through an NVA. System routes for the virtual network, virtual network peerings, and virtual network service endpoints are preferred even when BGP routes are more specific. Because Route Server advertises routes over BGP, overriding these system routes is currently not supported by design. You must continue to use UDRs to override them, which means you can’t rely on BGP for fast failover of these routes.
- Route Server needs connectivity to the backend service that manages its configuration, so a public IP address is required.
- Route Server doesn’t support configuring a UDR on the RouteServerSubnet.
- Route Server doesn’t support NSG association to the RouteServerSubnet.
- Both public and private ASNs are supported.
- The following ASNs are reserved by Azure or IANA: 8074, 8075, 12076, 65515, 65517, 65518, 65519, 65520, 23456, 64496–64511, 65535–65551.
- VPN gateway is supported only in active-active mode.
- Route Server has the following limits (per deployment):

If your NVA advertises more routes than the limit, the BGP session is dropped.
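Before picking ASNs for the NVAs, it can help to check them against the reserved list in the constraints above. The following is a minimal sketch of such a check. The function name is my own, and the list uses 12076 (the Microsoft ASN used by ExpressRoute) as the third Azure public ASN.

```python
# ASNs listed as reserved by Azure or IANA in the constraints above.
# 12076 is the Microsoft ASN used by ExpressRoute.
RESERVED_SINGLE = {8074, 8075, 12076, 65515, 65517, 65518, 65519, 65520, 23456}
RESERVED_RANGES = [(64496, 64511), (65535, 65551)]


def is_reserved_asn(asn: int) -> bool:
    """Return True if the ASN appears in the reserved list (hypothetical helper)."""
    if asn in RESERVED_SINGLE:
        return True
    return any(low <= asn <= high for low, high in RESERVED_RANGES)


# 65501 and 65502 are the ASNs used for the firewalls later in this document.
for asn in (65501, 65502, 65515):
    status = "reserved" if is_reserved_asn(asn) else "usable"
    print(f"AS {asn}: {status}")
```

The check only covers the reserved list. It does not validate anything else about the ASN.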
Topology
- A pair of firewalls peers over eBGP with a Route Server deployed in the hub.
- The firewalls’ BGP keepalive and hold timers are lowered from the default 30/180 seconds to 20/60 seconds to speed up convergence after a failure.
- The firewalls advertise a default route to the Route Server.
- A route table is associated with the firewall’s untrusted/external subnet. Gateway route propagation is disabled, and the table contains no routes.
- A route table is associated with the firewall’s trusted/internal subnet, configured the same way: gateway route propagation is disabled, and the table contains no routes.
- These two route tables are only needed when the firewall NVAs advertise a default route to Azure Route Server. They prevent traffic loops, because through the BGP route exchange with Route Server the Azure fabric would otherwise have a default route pointing back to the firewall’s trusted/internal IP address.

The scalability of the firewalls can be further improved by running them on Virtual Machine Scale Sets (VMSS) with an autoscale policy. Jose’s post explains how to use VMSS:
https://blog.cloudtrooper.net/2021/05/31/azure-route-server-and-nvas-running-on-scale-sets/
Palo Alto Networks provides templates to help you deploy an auto-scaling tier of VM-Series firewalls using Azure services such as Virtual Machine Scale Sets, Application Insights, Azure load balancers, Azure Functions, Panorama and the Panorama plugin for Azure, and VM-Series automation capabilities, including the PAN-OS API and bootstrapping. The templates use Azure scalability features designed to manage sudden surges in demand for application workload resources, so the VM-Series firewalls can scale independently as workloads change.
I created a new environment for this document. Its configuration is detailed below.
Create Hub

Create Subnets
Using the virtual network wizard, I created a reserved subnet for the gateway, a subnet dedicated to the Route Server, a management subnet, and two subnets for the firewall's data interfaces:

Create Route Tables
I created one route table with route propagation disabled for the trusted interface subnet and another for the untrusted interface subnet, to accommodate future cases. The same was done for the management interface subnet, but it is not shown below.
Trusted:

Untrusted:

Create Spokes

Subnets

Deploy Test VMs
Test VMs were deployed in spoke1 and spoke2.
Vnet Peering Configuration
Hub to spoke peering is configured to use “this virtual network’s gateway or Route Server”:

Spokes are configured to use the hub route server:

Deploy Firewalls
We deployed a pair of Palo Alto Networks (PAN) firewalls from Azure Marketplace:

Firewall interfaces are configured to use the previously created subnets:

Configure Firewalls
We configure eth1/1 and eth1/2 with static IP addresses (I took the addresses from Azure):

A default route pointing to the gateway of the untrusted interface is configured:

Two static routes are also added to force east-west traffic to use the trusted interface.
Redistribution profile:

Redistribution Rules:

We use AS 65501 for firewall1 and AS 65502 for firewall2. A single peer group is created per firewall:

I’m using the trusted interface to peer with the route server.
Connection Options:

NAT rule for original packets:

NAT rule for translated packets:

Create Route Server
Route Server is deployed as a scale set with two instances, .4 and .5:


Configure Route Server
Peer the route server with both firewalls:
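For readers who prefer scripting this step over the portal, here is a minimal sketch that builds, but does not run, one `az network routeserver peering create` command per firewall. The ASNs come from the firewall configuration above. The resource group, Route Server name, peer names, and peer IP addresses are hypothetical placeholders, so replace them with the values from your own environment.

```python
import shlex


def peering_create_cmd(resource_group, route_server, peer_name, peer_ip, peer_asn):
    """Build the argument list for one Route Server BGP peering."""
    return [
        "az", "network", "routeserver", "peering", "create",
        "--resource-group", resource_group,
        "--routeserver", route_server,
        "--name", peer_name,
        "--peer-ip", peer_ip,
        "--peer-asn", str(peer_asn),
    ]


# Hypothetical names and trusted-interface IPs; only the ASNs are from this document.
FIREWALLS = [
    ("firewall1", "10.0.2.4", 65501),
    ("firewall2", "10.0.2.5", 65502),
]

commands = [
    peering_create_cmd("rg-hub", "rs-hub", name, ip, asn)
    for name, ip, asn in FIREWALLS
]

# Print the commands so they can be reviewed before being run by hand.
for cmd in commands:
    print(shlex.join(cmd))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere, including on a machine without the Azure CLI installed.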

Check Route Server
List:
az network routeserver list --subscription <subscription> -o table
Advertised Routes:
az network routeserver peering list-advertised-routes --resource-group <resource group name> --routeserver <route server name> --name <peer name> --subscription <subscription>
Learned Routes:
az network routeserver peering list-learned-routes --resource-group <resource group name> --routeserver <route server name> --name <peer name> --subscription <subscription>
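The JSON returned by the learned-routes command can be long, so a small script helps to confirm that each peer is advertising the expected default route. The sketch below assumes the output is a JSON object that maps each Route Server instance to a list of routes with `network`, `nextHop`, and `asPath` fields. The sample data is hypothetical and is not output captured from this environment, so check the field names against your own output before relying on it.

```python
import json

# Hypothetical sample shaped like the assumed output for the firewall1 peer.
SAMPLE_OUTPUT = """
{
  "RouteServiceRole_IN_0": [
    {"network": "0.0.0.0/0", "nextHop": "10.0.2.4", "asPath": "65501"}
  ],
  "RouteServiceRole_IN_1": [
    {"network": "0.0.0.0/0", "nextHop": "10.0.2.4", "asPath": "65501"}
  ]
}
"""


def learned_prefixes(raw_json):
    """Map each learned prefix to the set of next hops seen across instances."""
    result = {}
    for routes in json.loads(raw_json).values():
        for route in routes:
            result.setdefault(route["network"], set()).add(route["nextHop"])
    return result


prefixes = learned_prefixes(SAMPLE_OUTPUT)
for prefix, next_hops in sorted(prefixes.items()):
    print(prefix, "via", ", ".join(sorted(next_hops)))
```

In practice you would pass the command's real output to `learned_prefixes` in place of the sample string.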
Check Firewall Routes
Routing table:

Forwarding table:

Check VM routes
VM1:

VM2:

Testing
Pinging VM2 from VM1:

Failure Scenario
To test a failure, I rebooted the firewall carrying the flow while pinging VM2 from VM1. The tcpdump output below shows the moment that firewall fails and the ping switches to the other active firewall:

When SNAT is not an option
For scenarios where the original source address must be preserved, BGP can be used to implement an active-standby deployment. There are several ways to make BGP prefer one path over another (https://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/13753-25.html). Here I prepend a second AS to the AS path advertised by the second firewall:

After saving and committing the configuration (with the SNAT configuration removed), the BGP RIB Out (advertised routes) tab shows that the AS path now has two entries:

The VMs' route tables have a single entry, pointing to firewall 1:

Failover Scenario
If the active firewall fails, the default route pointing to firewall 1 is withdrawn and the route pointing to firewall 2 becomes active:
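The active-standby behaviour in this section can be illustrated with a small sketch that models only one step of BGP path selection: preferring the shortest AS path. It is a simplification under stated assumptions, not a description of how Azure implements route selection. The `Route` type and `best_route` helper are hypothetical, and I assume the second firewall prepends its own AS (65502) once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    as_path: tuple  # AS numbers, nearest AS first


def best_route(routes, prefix):
    """Pick the route with the shortest AS path for the prefix, or None."""
    candidates = [r for r in routes if r.prefix == prefix]
    if not candidates:
        return None
    return min(candidates, key=lambda r: len(r.as_path))


# Firewall 1 advertises the default route with a one-entry AS path.
fw1 = Route("0.0.0.0/0", "firewall1", (65501,))
# Firewall 2 prepends once, so its AS path has two entries (assumed to be its own AS).
fw2 = Route("0.0.0.0/0", "firewall2", (65502, 65502))

active = best_route([fw1, fw2], "0.0.0.0/0")
after_failure = best_route([fw2], "0.0.0.0/0")  # firewall 1's route withdrawn

print("normal operation:", active.next_hop)
print("after firewall 1 fails:", after_failure.next_hop)
```

Real BGP implementations evaluate several attributes before and after AS path length. The sketch isolates the one attribute this design manipulates.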

A “few” pings were lost during failover:

References
https://blog.cloudtrooper.net/2021/05/31/azure-route-server-and-nvas-running-on-scale-sets/