A little help from my friend… hacks on how to work with default routes

Most, if not all, GCP customers consume GCP PaaS/SaaS services such as GKE and Cloud SQL. These services have their compute capacity provisioned inside Google-owned VPCs, and VPC peerings are used to establish the data plane through which customers reach them.

AVX Behavior

  • Routes are created with a fixed priority of 1000
  • Egress through Spoke or Firenet creates routes with tags (avx-snat-noip) and priority 991
NAME                                  NETWORK  DEST_RANGE      NEXT_HOP                            PRIORITY
avx-0869709459044dd3ab184f9a7c18c885  vpc003   0.0.0.0/0       us-east1-b/instances/gcp-spoke-003  991
avx-132f04c21c274870a00d1717fba75421  vpc003   192.168.0.0/16  us-east1-b/instances/gcp-spoke-003  1000
avx-2250d167f6c44869b619f9338563962d  vpc003   172.16.0.0/12   us-east1-b/instances/gcp-spoke-003  1000
avx-aaee80e084aa4f9e8255b38efebc5361  vpc003   0.0.0.0/0       default-internet-gateway            1000
avx-afb0539fe13a4967be78e1e1f4625f21  vpc003   10.0.0.0/8      us-east1-b/instances/gcp-spoke-003  1000
default-route-0ac78bbc901a6640        vpc003   10.13.64.0/24   vpc003                              0
default-route-de7040c7b44e831a        vpc003   10.13.65.0/24   vpc003                              0
default-route-dfafcf4749ec29a9        vpc003   10.13.66.0/24   vpc003                              0
default-route-fd5dbac6f98fbdc5        vpc003   0.0.0.0/0       default-internet-gateway            1000
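
To read the table above, it helps to recall how GCP picks a route: the most specific destination range wins first, the lowest priority value breaks ties between equal prefixes, and a tagged route only applies to instances carrying that tag. The following is a minimal sketch of that selection logic, not the actual GCP implementation. The `Route` class, the `select_route` helper, and the shortened route names are illustrative assumptions.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    dest: str
    next_hop: str
    priority: int
    tags: frozenset = frozenset()  # empty set: route applies to every instance


# Subset of the table above with shortened (hypothetical) names. The 991 route
# carries the avx-snat-noip tag, as described in the AVX behavior list.
ROUTES = [
    Route("avx-0869", "0.0.0.0/0", "gcp-spoke-003", 991, frozenset({"avx-snat-noip"})),
    Route("avx-afb0", "10.0.0.0/8", "gcp-spoke-003", 1000),
    Route("avx-aaee", "0.0.0.0/0", "default-internet-gateway", 1000),
    Route("default-route-0ac7", "10.13.64.0/24", "vpc003", 0),
]


def select_route(routes, dst_ip, instance_tags=()):
    """Longest prefix first, then lowest priority value; tagged routes need a matching tag."""
    ip = ipaddress.ip_address(dst_ip)
    tags = set(instance_tags)
    candidates = [
        r for r in routes
        if ip in ipaddress.ip_network(r.dest) and (not r.tags or r.tags & tags)
    ]
    if not candidates:
        return None
    # Ties (same prefix and priority) are resolved by list order in this sketch.
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r.dest).prefixlen, r.priority),
    )


print(select_route(ROUTES, "8.8.8.8").next_hop)                     # default-internet-gateway
print(select_route(ROUTES, "8.8.8.8", {"avx-snat-noip"}).next_hop)  # gcp-spoke-003
print(select_route(ROUTES, "10.1.2.3").next_hop)                    # gcp-spoke-003
```

An untagged instance keeps using the default internet gateway, while an instance tagged avx-snat-noip is steered to the spoke gateway by the 991 route.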

Constraints

  • Tagged routes cannot be exported or imported across VPC peerings

Workarounds

AVX Gateway Routes

Create routes with a higher priority (that is, a lower priority number) and with the tag avx-<vpc name>-gbl, using “Default internet gateway” as the next hop. Because of the tag, these routes are used exclusively by the AVX Spoke Gateways.

gcloud compute routes create avx-gateway-0-0-0-0-1 \
    --network vpc003 \
    --destination-range 0.0.0.0/1 \
    --next-hop-gateway default-internet-gateway \
    --tags avx-vpc003-gbl \
    --priority 100
gcloud compute routes create avx-gateway-128-0-0-0-1 \
    --network vpc003 \
    --destination-range 128.0.0.0/1 \
    --next-hop-gateway default-internet-gateway \
    --tags avx-vpc003-gbl \
    --priority 100

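This step is necessary to prevent a routing loop when executing the step below: without it, the gateways' own internet-bound traffic would match a default route that points back at the gateways.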
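
The two /1 prefixes are simply the two halves of 0.0.0.0/0, so together they cover every IPv4 destination while remaining more specific than any default route. A quick check with Python's standard `ipaddress` module (a sketch for illustration only):

```python
import ipaddress

default = ipaddress.ip_network("0.0.0.0/0")
halves = list(default.subnets(prefixlen_diff=1))

print([str(n) for n in halves])  # ['0.0.0.0/1', '128.0.0.0/1']

# Each half is more specific than the default route (prefix length 1 vs 0) ...
assert all(h.prefixlen > default.prefixlen for h in halves)
# ... and together they cover exactly the same address space.
assert sum(h.num_addresses for h in halves) == default.num_addresses
```

Because route selection prefers the longest prefix before it looks at priority, the tagged /1 routes win on the gateways regardless of the priority given to any 0.0.0.0/0 route.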

0.0.0.0/0 Option 1

It is possible to use the feature Customize Spoke VPC Routing Table to trigger the creation of 0.0.0.0/1 and 128.0.0.0/1 custom routes pointing to the gateway.

Test this feature before relying on it: some controller versions block the creation of 0/0, 0/1, and 128/1 routes.

The routing table looks like the following:

NAME                                   NETWORK  DEST_RANGE        NEXT_HOP                            PRIORITY
avx-0-0-0-0-0                          vpc003   0.0.0.0/0         us-east1-b/instances/gcp-spoke-003  900
avx-41e0705ff87b4117960c65694fdab6ce   vpc003   0.0.0.0/1         us-east1-b/instances/gcp-spoke-003  1000
avx-aa2190407b7a4b5e9ff134515627d0e5   vpc003   128.0.0.0/1       us-east1-b/instances/gcp-spoke-003  1000
avx-aaee80e084aa4f9e8255b38efebc5361   vpc003   0.0.0.0/0         default-internet-gateway            1000
default-route-0ac78bbc901a6640         vpc003   10.13.64.0/24     vpc003                              0
default-route-de7040c7b44e831a         vpc003   10.13.65.0/24     vpc003                              0
default-route-dfafcf4749ec29a9         vpc003   10.13.66.0/24     vpc003                              0
default-route-fd5dbac6f98fbdc5         vpc003   0.0.0.0/0         default-internet-gateway            1000

There are corner cases where 0/1 and 128/1 are not supported by Google PaaS services.

0.0.0.0/0 Option 2

Create a 0.0.0.0/0 route pointing to an NLB that front-ends the AVX gateways, with a priority high enough to bring the traffic to the gateways:

gcloud compute routes create avx-0-0-0-0-0 \
    --network vpc003 \
    --destination-range 0.0.0.0/0 \
    --next-hop-ilb avx-nlb-vpc003-feip \
    --priority 900

This route is not monitored by the AVX Controller. After executing the command above, the route table looks like:

NAME                                   NETWORK  DEST_RANGE        NEXT_HOP                                 PRIORITY
avx-0-0-0-0-0                          vpc003   0.0.0.0/0         10.13.64.9                               900
avx-0819c846ec4e4b4395d16a31d34bba0f   vpc003   172.16.0.0/12     us-east1-b/instances/gcp-spoke-003       1000
avx-aaee80e084aa4f9e8255b38efebc5361   vpc003   0.0.0.0/0         default-internet-gateway                 1000
avx-ba648051a45e41678097dfedc04f2bff   vpc003   192.168.0.0/16    us-east1-b/instances/gcp-spoke-003       1000
avx-c088391a7e924e00bc3af30ab9df0e0c   vpc003   0.0.0.0/0         us-east1-b/instances/gcp-spoke-003       991
avx-e03f07b01ee14676ab005c0d8dc1a7cd   vpc003   10.0.0.0/8        us-east1-b/instances/gcp-spoke-003       1000
default-route-0ac78bbc901a6640         vpc003   10.13.64.0/24     vpc003                                   0
default-route-de7040c7b44e831a         vpc003   10.13.65.0/24     vpc003                                   0
default-route-dfafcf4749ec29a9         vpc003   10.13.66.0/24     vpc003                                   0
default-route-fd5dbac6f98fbdc5         vpc003   0.0.0.0/0         default-internet-gateway                 1000

0.0.0.0/0 Option 3

Create 0.0.0.0/0 routes pointing to the AVX gateways (primary and HA), with a priority high enough to bring the traffic to the gateways:

gcloud compute routes create avx-gw-0-0-0-0-0 \
    --network vpc003 \
    --destination-range 0.0.0.0/0 \
    --next-hop-instance gcp-spoke-003 \
    --priority 900
gcloud compute routes create avx-hagw-0-0-0-0-0 \
    --network vpc003 \
    --destination-range 0.0.0.0/0 \
    --next-hop-instance gcp-spoke-003-hagw \
    --priority 900

These routes are not monitored by the AVX Controller. Google checks whether the next-hop compute instance is running and marks the route active or inactive accordingly, but it has no visibility into the health of the instance.

After executing the commands above, the route table looks like:

NAME                                  NETWORK  DEST_RANGE      NEXT_HOP                            PRIORITY
avx-gw-0-0-0-0-0                      vpc003   0.0.0.0/0       us-east1-b/instances/gcp-spoke-003  900
avx-0869709459044dd3ab184f9a7c18c885  vpc003   0.0.0.0/0       us-east1-b/instances/gcp-spoke-003  991
avx-132f04c21c274870a00d1717fba75421  vpc003   192.168.0.0/16  us-east1-b/instances/gcp-spoke-003  1000
avx-2250d167f6c44869b619f9338563962d  vpc003   172.16.0.0/12   us-east1-b/instances/gcp-spoke-003  1000
avx-aaee80e084aa4f9e8255b38efebc5361  vpc003   0.0.0.0/0       default-internet-gateway            1000
avx-afb0539fe13a4967be78e1e1f4625f21  vpc003   10.0.0.0/8      us-east1-b/instances/gcp-spoke-003  1000
avx-gateway-0-0-0-0-1                 vpc003   0.0.0.0/1       default-internet-gateway            100
avx-gateway-128-0-0-0-1               vpc003   128.0.0.0/1     default-internet-gateway            100
default-route-0ac78bbc901a6640        vpc003   10.13.64.0/24   vpc003                              0
default-route-de7040c7b44e831a        vpc003   10.13.65.0/24   vpc003                              0
default-route-dfafcf4749ec29a9        vpc003   10.13.66.0/24   vpc003                              0
default-route-fd5dbac6f98fbdc5        vpc003   0.0.0.0/0       default-internet-gateway            1000

All the internet traffic is diverted to the AVX gateway, including Google API calls.

Avoiding Google API calls through the Fabric

To avoid sending Google API calls through the fabric, add the following route for private.googleapis.com:

gcloud compute routes create google-api-private-199-36-153-8-30 \
    --network vpc003 \
    --destination-range 199.36.153.8/30 \
    --next-hop-gateway default-internet-gateway

Add the equivalent route for restricted.googleapis.com:

gcloud compute routes create google-api-restricted-199-36-153-4-30 \
    --network vpc003 \
    --destination-range 199.36.153.4/30 \
    --next-hop-gateway default-internet-gateway

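Google Console SSH Access

SSH from the Google Cloud console reaches instances through Identity-Aware Proxy (IAP) TCP forwarding, which uses the source range 35.235.240.0/20. To keep the return traffic for that range off the fabric, add the following route:

gcloud compute routes create google-console-ssh-35-235-240-0-20 \
    --network vpc003 \
    --destination-range 35.235.240.0/20 \
    --next-hop-gateway default-internet-gateway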
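
Since route names must be unique and are easy to mistype, the bypass routes can be generated from their CIDR ranges. The sketch below only builds and prints the gcloud command lines and executes nothing. The helpers `route_name` and `bypass_route_cmd` and the label strings are hypothetical, chosen to follow the naming used in this post.

```python
def route_name(label: str, cidr: str) -> str:
    """Build a route name such as google-api-private-199-36-153-8-30 from a CIDR."""
    return f"{label}-" + cidr.replace(".", "-").replace("/", "-")


def bypass_route_cmd(label: str, cidr: str, network: str = "vpc003") -> list[str]:
    """Return the gcloud argument list for one bypass route (nothing is executed)."""
    return [
        "gcloud", "compute", "routes", "create", route_name(label, cidr),
        "--network", network,
        "--destination-range", cidr,
        "--next-hop-gateway", "default-internet-gateway",
    ]


BYPASS_RANGES = [
    ("google-api-private", "199.36.153.8/30"),     # private.googleapis.com
    ("google-api-restricted", "199.36.153.4/30"),  # restricted.googleapis.com
    ("google-console-ssh", "35.235.240.0/20"),     # console SSH access
]

for label, cidr in BYPASS_RANGES:
    print(" ".join(bypass_route_cmd(label, cidr)))
```

Deriving the name from the range keeps the two in sync, so a route called ...-153-4-30 can never end up carrying the 199.36.153.8/30 range.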

Failure or Maintenance Scenarios

While the routes created by the controller are managed through scenarios such as gateway failure or gateway maintenance, Google custom routes provide only limited monitoring. Google checks whether the next-hop compute instance is running and marks the route active or inactive accordingly, but it has no visibility into the health of the instance.

A route will become inactive when the compute instance is down and it will become active when the instance is up. The time it takes to mark the route as active or inactive depends on the Google API.

To avoid traffic disruption during a scheduled maintenance, the custom 0/0 route created can be deleted and then recreated once the maintenance is concluded.

For example, when upgrading the gateway image:

  • delete 0/0 pointing to hagw
  • upgrade hagw
  • create 0/0 pointing to hagw
  • delete 0/0 pointing to gw
  • upgrade gw
  • create 0/0 pointing to gw
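
The sequence above can be sketched as a small plan generator. It only prints the steps and does not call gcloud. The route and instance names are the ones from Option 3, the upgrade step is a placeholder because it is performed outside this script, and `maintenance_plan` is a hypothetical helper.

```python
def maintenance_plan(gateways):
    """Return ordered steps for a rolling upgrade.

    gateways is a list of (route_name, instance_name) pairs, handled one at a
    time so that the other gateway keeps its 0/0 route while one is upgraded.
    """
    steps = []
    for route, instance in gateways:
        steps.append(f"gcloud compute routes delete {route} --quiet")
        steps.append(f"# upgrade {instance} (performed outside this script)")
        steps.append(
            f"gcloud compute routes create {route} --network vpc003 "
            f"--destination-range 0.0.0.0/0 --next-hop-instance {instance} "
            f"--priority 900"
        )
    return steps


PLAN = maintenance_plan([
    ("avx-hagw-0-0-0-0-0", "gcp-spoke-003-hagw"),  # HA gateway first
    ("avx-gw-0-0-0-0-0", "gcp-spoke-003"),         # then the primary gateway
])
print("\n".join(PLAN))
```

The create commands mirror the ones shown for Option 3; depending on the local gcloud configuration, the instance zone may also need to be supplied.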

Failover Times for VPC/VNET Egress

  • Tests were executed by shutting down the compute instance.
  • Ping was used as the measurement tool.

Option 1: 4 seconds

Mon Jul 29 22:06:10 UTC 2024: 64 bytes from ua-in-f138.1e100.net (108.177.12.138): icmp_seq=325 ttl=114 time=0.675 ms
Mon Jul 29 22:06:14 UTC 2024: 64 bytes from ua-in-f138.1e100.net (108.177.12.138): icmp_seq=329 ttl=114 time=3.97 ms

Option 2: 18 seconds

Mon Jul 29 22:21:02 UTC 2024: 64 bytes from vl-in-f139.1e100.net (74.125.141.139): icmp_seq=311 ttl=114 time=0.670 ms
Mon Jul 29 22:21:20 UTC 2024: 64 bytes from vl-in-f139.1e100.net (74.125.141.139): icmp_seq=329 ttl=114 time=4.16 ms

Option 3: 11 seconds

Mon Jul 29 22:28:15 UTC 2024: 64 bytes from vl-in-f139.1e100.net (74.125.141.139): icmp_seq=743 ttl=114 time=0.668 ms
Mon Jul 29 22:28:26 UTC 2024: 64 bytes from vl-in-f139.1e100.net (74.125.141.139): icmp_seq=754 ttl=114 time=4.10 ms
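
The failover times above are the gap between the last ping reply before the outage and the first reply after it. A minimal sketch of that calculation, assuming log lines in exactly the timestamped format shown here (`parse_line` and `failover_gap` are hypothetical helpers):

```python
import re
from datetime import datetime

TS_FORMAT = "%a %b %d %H:%M:%S UTC %Y"


def parse_line(line):
    """Return (timestamp, icmp_seq) from one timestamped ping reply line."""
    stamp, _, rest = line.partition(": ")
    seq = int(re.search(r"icmp_seq=(\d+)", rest).group(1))
    return datetime.strptime(stamp, TS_FORMAT), seq


def failover_gap(last_before, first_after):
    """Return (seconds between the two replies, difference in icmp_seq)."""
    t0, s0 = parse_line(last_before)
    t1, s1 = parse_line(first_after)
    return (t1 - t0).total_seconds(), s1 - s0


option1 = failover_gap(
    "Mon Jul 29 22:06:10 UTC 2024: 64 bytes from ua-in-f138.1e100.net "
    "(108.177.12.138): icmp_seq=325 ttl=114 time=0.675 ms",
    "Mon Jul 29 22:06:14 UTC 2024: 64 bytes from ua-in-f138.1e100.net "
    "(108.177.12.138): icmp_seq=329 ttl=114 time=3.97 ms",
)
print(option1)  # (4.0, 4)
```

The sequence-number difference serves as a cross-check on the timestamps: with one echo request per second, both values should agree.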

Next steps

  • Failover tests with overloaded spokes
  • Firenet

References

https://cloud.google.com/vpc/docs/routes

https://cloud.google.com/vpc/docs/configure-private-google-access-hybrid

https://cloud.google.com/vpc-service-controls/docs/set-up-private-connectivity
