Home

  • VPC Peering Security Groups

    VPC Peering Security Groups

    A security group serves as a protective barrier, functioning like a firewall to manage the flow of network traffic to and from the resources within your Virtual Private Cloud (VPC). With security groups, you have the flexibility to select the specific ports and communication protocols that are permitted for both incoming (inbound) and outgoing (outbound) network traffic.

You can update the inbound or outbound rules of your VPC’s security groups to reference security groups in a peered VPC. This allows traffic to flow between instances associated with the referenced security groups across the peering connection.
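
    As a rough illustration, the rule below allows inbound HTTPS from a security group that lives in the peered VPC within the same account. The group IDs are placeholders; for a peered VPC owned by a different account, the rule must also reference the peer account ID.

    # sg-0aaa... protects the local instances, sg-0bbb... exists in the peered VPC
    aws ec2 authorize-security-group-ingress \
      --group-id sg-0aaa11111111111aa \
      --protocol tcp \
      --port 443 \
      --source-group sg-0bbb22222222222bb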

    Testing

    Testing topology:

    SG:

    Result:

Changing from cross-referenced SG to CIDR:

    Results:

    No pings were lost.

    References

    https://docs.aws.amazon.com/vpc/latest/userguide/security-groups.html

    https://docs.aws.amazon.com/vpc/latest/peering/vpc-peering-security-groups.html

  • Google Cloud Shared VPC

    Google Cloud Shared VPC

    A Shared Virtual Private Cloud (VPC) is a feature within Google Cloud that enables organizations to connect resources from multiple projects to a common network infrastructure. This shared network, hosted within a designated “host project,” allows secure and efficient communication among resources using internal IP addresses. Service projects, attached to the host project’s network, can utilize specific subnets for their instances.

    Source: https://cloud.google.com/vpc/docs/shared-vpc

    This setup offers a balance between centralized control over network resources, such as subnets and firewalls, and decentralized administration of instances within individual service projects. By segregating administrative responsibilities, organizations can enforce consistent access control policies, enhance security, and manage costs effectively.

    Configuration

    • Enable host project:

    Permission compute.organizations.enableXpnHost is required to configure a project as “host”.

• Select subnets to share:
• Attach service project(s):

The Compute Engine API must be enabled on a “service” project before it can be attached to a “host” project.
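
    A minimal gcloud sketch of these steps, using placeholder project IDs (host-project-id, service-project-id) and an assumed subnet, region, and user:

    # Designate the host project (requires compute.organizations.enableXpnHost)
    gcloud compute shared-vpc enable host-project-id

    # Attach a service project to the host project
    gcloud compute shared-vpc associated-projects add service-project-id \
      --host-project host-project-id

    # Optionally, grant a service project member access to a shared subnet
    gcloud compute networks subnets add-iam-policy-binding subnet001 \
      --region us-east1 \
      --member "user:jane@example.com" \
      --role "roles/compute.networkUser"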

    Testing

• Shared VPC view from the service project:
• When creating a compute instance, the “networks shared with me” option is available:

    References

    https://cloud.google.com/vpc/docs/shared-vpc

  • Configuring Google Cloud Workload Identity Federation (AWS)

    Configuring Google Cloud Workload Identity Federation (AWS)

    A workload identity is a special identity used for authentication and access by software applications and services. It helps them connect to other services and resources securely.

    The most direct method for external workloads to use Google Cloud APIs is by using downloaded service account keys. However, this approach comes with two significant challenges:

    • Management Complexity: The keys must be stored securely and regularly changed, which can be administratively demanding.
    • Security Vulnerability: Since keys are long-term credentials, they are susceptible to compromise, posing a security risk.

    To address these issues, workload identity federation offers an alternative. This approach allows applications outside of Google Cloud to replace persistent service account keys with short-lived access tokens. This is accomplished by establishing a trust relationship between Google Cloud and an external identity provider. The external identity provider issues time-limited credentials that applications can use to act as service accounts.

    The benefits of this approach include heightened security and reduced management overhead. By using temporary access tokens, the exposure window for potential security breaches is minimized. Additionally, the need for ongoing key management and rotation is diminished.

    Google Cloud Workload Identity Federation Primer

    https://cloud.google.com/iam/docs/workload-identity-federation

    Configuration

    To implement workload identity federation, you need to configure the external identity provider to issue trusted tokens. Then, set up Google Cloud’s Identity and Access Management (IAM) policies to permit the external provider to generate tokens on behalf of designated service accounts. Applications can then utilize these short-lived tokens to securely access resources in Google Cloud.

    This blog shows how to use workload identity federation to let AWS workloads authenticate to Google Cloud without a service account key.

• In the Google Cloud console, enable the IAM, Resource Manager, Service Account Credentials, and Security Token Service APIs:
• Create the workload identity pool:
• Create a provider:
• Configure provider attributes:
• Create a service account for the external workload. Grant the service account access to the resources that you want external identities to access:
• To allow external identities to impersonate the service account, grant them the Workload Identity User role (roles/iam.workloadIdentityUser); a gcloud sketch of these steps follows:
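
    A minimal gcloud sketch of the steps above, using placeholder names (aws-pool, aws-provider, AWS account 123456789012, role wif-demo-role, and service account wif-sa@PROJECT_ID.iam.gserviceaccount.com):

    # Create the workload identity pool
    gcloud iam workload-identity-pools create aws-pool \
      --location="global" \
      --display-name="aws-pool"

    # Create an AWS provider in the pool
    gcloud iam workload-identity-pools providers create-aws aws-provider \
      --location="global" \
      --workload-identity-pool="aws-pool" \
      --account-id="123456789012"

    # Allow identities that assume the AWS role to impersonate the service account
    gcloud iam service-accounts add-iam-policy-binding wif-sa@PROJECT_ID.iam.gserviceaccount.com \
      --role="roles/iam.workloadIdentityUser" \
      --member="principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/aws-pool/attribute.aws_role/arn:aws:sts::123456789012:assumed-role/wif-demo-role"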

    Testing

    For testing, we will deploy a VM running Linux and install gcloud-cli on it.

• Create a credential configuration:
• Authenticate using the credential configuration exported above:
• Run a few gcloud commands (see the sketch below):
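
    A sketch of the same steps with gcloud, reusing the placeholder names from the configuration section:

    # Generate the credential configuration file for the AWS provider
    gcloud iam workload-identity-pools create-cred-config \
      projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/aws-pool/providers/aws-provider \
      --service-account=wif-sa@PROJECT_ID.iam.gserviceaccount.com \
      --aws \
      --output-file=wif-credentials.json

    # Authenticate gcloud on the AWS VM with the exported configuration
    gcloud auth login --cred-file=wif-credentials.json

    # A couple of quick checks
    gcloud auth list
    gcloud compute instances list --project=PROJECT_ID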

    and it works!

    References

    https://cloud.google.com/iam/docs/workload-identity-federation

    https://cloud.google.com/iam/docs/workload-identity-federation-with-other-clouds

    https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/tree/master/blueprints/cloud-operations/workload-identity-federation

  • Google Cloud Migration Scenarios

    Google Cloud Migration Scenarios

    Lab and Configuration Staging

The lab diagram for this exercise is shown below:

• GCP VPC configuration
    • Cloud Router:
    • VPN:
    • Peer Gateway:
    • CSR1000v configuration:
    
    crypto ikev2 keyring KEYRING1
     peer 35.242.4.56
      address 35.242.4.56
      pre-shared-key q6UehAxCDgBm19Cf2Y59BiQoPxGG7AWB
     !
     peer 35.220.10.92
      address 35.220.10.92
      pre-shared-key q6UehAxCDgBm19Cf2Y59BiQoPxGG7AWB
     !
    !
    
    crypto ikev2 profile IKEV2-PROFILE-GCP
     match identity remote address 35.242.4.56 255.255.255.255 
     match identity remote address 35.220.10.92 255.255.255.255 
     authentication remote pre-share
     authentication local pre-share
     keyring local KEYRING1
     lifetime 28800
     dpd 10 10 on-demand
    !
    
    interface Tunnel100
     ip address 169.254.5.70 255.255.255.252
     ip mtu 1400
     ip tcp adjust-mss 1360
     tunnel source GigabitEthernet1
     tunnel mode ipsec ipv4
     tunnel destination 35.242.4.56
     tunnel protection ipsec profile ipsec-vpn-gcp
     ip virtual-reassembly
    !
    interface Tunnel110
     ip address 169.254.138.178 255.255.255.252
     ip mtu 1400
     ip tcp adjust-mss 1360
     tunnel source GigabitEthernet1
     tunnel mode ipsec ipv4
     tunnel destination 35.220.10.92
     tunnel protection ipsec profile ipsec-vpn-gcp
     ip virtual-reassembly
    !
    
    router bgp 36180
     bgp log-neighbor-changes
     bgp graceful-restart
 neighbor 169.254.5.69 remote-as 64514
 neighbor 169.254.138.177 remote-as 64514
     !
     address-family ipv4
      redistribute connected
      redistribute static
      neighbor 169.254.5.70 activate
      neighbor 169.254.138.178 activate
     exit-address-family
    • CSR1000v routes:
    csr1000v-3#show ip route
    Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
           D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
           N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
           E1 - OSPF external type 1, E2 - OSPF external type 2, m - OMP
           n - NAT, Ni - NAT inside, No - NAT outside, Nd - NAT DIA
           i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
           ia - IS-IS inter area, * - candidate default, U - per-user static route
           H - NHRP, G - NHRP registered, g - NHRP registration summary
           o - ODR, P - periodic downloaded static route, l - LISP
           a - application route
           + - replicated route, % - next hop override, p - overrides from PfR
           & - replicated local route overrides by connected
    
    Gateway of last resort is 172.31.0.1 to network 0.0.0.0
    
    S*    0.0.0.0/0 [1/0] via 172.31.0.1, GigabitEthernet1
          10.0.0.0/24 is subnetted, 4 subnets
    B        10.11.0.0 [20/100] via 169.254.138.177, 00:45:55
                       [20/100] via 169.254.5.69, 00:45:55
    B        10.11.1.0 [20/100] via 169.254.138.177, 00:45:55
                       [20/100] via 169.254.5.69, 00:45:55
    B        10.11.2.0 [20/100] via 169.254.138.177, 00:45:55
                       [20/100] via 169.254.5.69, 00:45:55
    B        10.11.3.0 [20/100] via 169.254.138.177, 02:16:05
                       [20/100] via 169.254.5.69, 02:16:05
          169.254.0.0/16 is variably subnetted, 4 subnets, 2 masks
    C        169.254.5.68/30 is directly connected, Tunnel100
    L        169.254.5.70/32 is directly connected, Tunnel100
    C        169.254.138.176/30 is directly connected, Tunnel110
    L        169.254.138.178/32 is directly connected, Tunnel110
          172.31.0.0/16 is variably subnetted, 3 subnets, 2 masks
    C        172.31.0.0/28 is directly connected, GigabitEthernet1
    L        172.31.0.13/32 is directly connected, GigabitEthernet1
    S        172.31.0.128/28 [1/0] via 172.31.0.1
    • VPC001 routes:
    • Custom Cloud Route configuration:
    • CSR1000v route table:
    csr1000v-3#show ip route
    Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
           D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
           N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
           E1 - OSPF external type 1, E2 - OSPF external type 2, m - OMP
           n - NAT, Ni - NAT inside, No - NAT outside, Nd - NAT DIA
           i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
           ia - IS-IS inter area, * - candidate default, U - per-user static route
           H - NHRP, G - NHRP registered, g - NHRP registration summary
           o - ODR, P - periodic downloaded static route, l - LISP
           a - application route
           + - replicated route, % - next hop override, p - overrides from PfR
           & - replicated local route overrides by connected
    
    Gateway of last resort is 172.31.0.1 to network 0.0.0.0
    
    S*    0.0.0.0/0 [1/0] via 172.31.0.1, GigabitEthernet1
          10.0.0.0/8 is variably subnetted, 6 subnets, 2 masks
    B        10.11.0.0/24 [20/100] via 169.254.138.177, 00:06:35
                          [20/100] via 169.254.5.69, 00:06:35
    B        10.11.1.0/24 [20/100] via 169.254.138.177, 00:06:35
                          [20/100] via 169.254.5.69, 00:06:35
    B        10.11.2.0/24 [20/100] via 169.254.138.177, 00:06:35
                          [20/100] via 169.254.5.69, 00:06:35
    B        10.11.3.0/24 [20/100] via 169.254.138.177, 00:06:35
                          [20/100] via 169.254.5.69, 00:06:35
    B        10.12.0.0/22 [20/100] via 169.254.138.177, 00:00:48
                          [20/100] via 169.254.5.69, 00:00:48
    B        10.13.0.0/22 [20/100] via 169.254.138.177, 00:00:48
                          [20/100] via 169.254.5.69, 00:00:48
          169.254.0.0/16 is variably subnetted, 4 subnets, 2 masks
    C        169.254.5.68/30 is directly connected, Tunnel100
    L        169.254.5.70/32 is directly connected, Tunnel100
    C        169.254.138.176/30 is directly connected, Tunnel110
    L        169.254.138.178/32 is directly connected, Tunnel110
          172.31.0.0/16 is variably subnetted, 3 subnets, 2 masks
    C        172.31.0.0/28 is directly connected, GigabitEthernet1
    L        172.31.0.13/32 is directly connected, GigabitEthernet1
    S        172.31.0.128/28 [1/0] via 172.31.0.1
    • VPC001 route table:
    • VPC002 route table:
    • VPC003 route table:

    Staging Aviatrix

    • stage controller (7.1) and copilot (3.10)
    • stage transit gateways
    • stage AVX Transit to CSR1000v IPSec and BGP
    • stage spoke gateways using a new subnetwork
    • spokes are not attached, except for gcp-vpc003-gw:

    Flows of Interest

    • flow 1: native google cloud hub to on-prem
    • flow 2: native google cloud spoke to on-prem
    • flow 3: native google cloud hub to spoke
    • flow 4: avx spoke gateway to native cloud hub

    Constraints

• Flow 1 and Flow 2 depend on the Cloud VPN IPsec connection
• Flow 3 depends on the VPC peering

    Migration Approaches

    The Slicer

• This approach leverages the gateway Customize Spoke VPC Routing Table and Customize Spoke Advertised VPC CIDRs features to attract traffic towards the fabric
• Customize Spoke VPC Routing Table: This feature allows you to customize the Spoke VPC/VNet route table entries by specifying a list of comma-separated CIDRs. When a CIDR is inserted in this field, automatic route propagation to the Spoke VPC(s)/VNet(s) is disabled, overriding propagated CIDRs from other spokes, transit gateways, and the on-prem network. One use case for this feature is a customer-facing Spoke VPC/VNet where the customer propagates routes that may conflict with your on-prem routes.
• Customize Spoke Advertised VPC CIDRs: This route policy enables you to selectively exclude some VPC/VNet CIDRs from being advertised to on-prem. When this policy is applied to an Aviatrix Spoke Gateway, the list is an “include list”, meaning only the CIDRs in the input field are advertised to on-prem

    Constraints

• The slicer requires gateways on every VPC for proper routing
• The slicer does not support /32s
• The slicer is limited by the number of custom routes supported in a single Google project (600)
• The slicer requires the VPC peering to be torn down
• The slicer, to keep flows symmetric, requires both sides of a connection to be updated

    Testing

    Flow 1 and Flow 2 Migration using The Slicer (Switch Traffic)

    Slicing it:

    CSR1000v routes:

    
    csr1000v-3#show ip route
    Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
           D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
           N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
           E1 - OSPF external type 1, E2 - OSPF external type 2, m - OMP
           n - NAT, Ni - NAT inside, No - NAT outside, Nd - NAT DIA
           i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
           ia - IS-IS inter area, * - candidate default, U - per-user static route
           H - NHRP, G - NHRP registered, g - NHRP registration summary
           o - ODR, P - periodic downloaded static route, l - LISP
           a - application route
           + - replicated route, % - next hop override, p - overrides from PfR
           & - replicated local route overrides by connected
    
    Gateway of last resort is 172.31.0.1 to network 0.0.0.0
    
    S*    0.0.0.0/0 [1/0] via 172.31.0.1, GigabitEthernet1
          10.0.0.0/8 is variably subnetted, 15 subnets, 3 masks
    B        10.11.0.0/24 [20/0] via 169.254.132.78, 00:46:41
    B        10.11.1.0/24 [20/100] via 169.254.138.177, 00:00:36
                          [20/100] via 169.254.5.69, 00:00:36
    B        10.11.2.0/24 [20/100] via 169.254.138.177, 00:00:36
                          [20/100] via 169.254.5.69, 00:00:36
    B        10.11.3.0/24 [20/100] via 169.254.138.177, 00:00:36
                          [20/100] via 169.254.5.69, 00:00:36
    B        10.12.0.0/16 [20/100] via 169.254.138.177, 00:01:59
                          [20/100] via 169.254.5.69, 00:01:59
    B        10.12.64.0/24 [20/0] via 169.254.132.78, 00:46:17
    B        10.12.65.0/25 [20/0] via 169.254.132.78, 00:04:14
    B        10.12.65.128/25 [20/0] via 169.254.132.78, 00:04:14
    B        10.12.66.0/25 [20/0] via 169.254.132.78, 00:04:14
    B        10.12.66.128/25 [20/0] via 169.254.132.78, 00:04:14
    B        10.13.0.0/16 [20/100] via 169.254.138.177, 00:01:59
                          [20/100] via 169.254.5.69, 00:01:59
    B        10.13.64.0/24 [20/0] via 169.254.132.78, 03:20:13
    B        10.13.65.0/24 [20/0] via 169.254.132.78, 00:00:11
    B        10.13.66.0/24 [20/0] via 169.254.132.78, 00:00:11
    B        10.14.64.0/24 [20/0] via 169.254.132.78, 03:20:13
          169.254.0.0/16 is variably subnetted, 6 subnets, 2 masks
    C        169.254.5.68/30 is directly connected, Tunnel100
    L        169.254.5.70/32 is directly connected, Tunnel100
    C        169.254.132.76/30 is directly connected, Tunnel300
    L        169.254.132.77/32 is directly connected, Tunnel300
    C        169.254.138.176/30 is directly connected, Tunnel110
    L        169.254.138.178/32 is directly connected, Tunnel110
          172.31.0.0/16 is variably subnetted, 3 subnets, 2 masks
    C        172.31.0.0/28 is directly connected, GigabitEthernet1
    L        172.31.0.13/32 is directly connected, GigabitEthernet1
    S        172.31.0.128/28 [1/0] via 172.31.0.1

If the Cloud Router custom advertisement performs route summarization, slicing the advertised routes on the AVX spoke gateway is not required. In that case, the custom advertisement should be restricted to the subnetwork where the AVX gateway was deployed.

Flow 3 and Flow 4 Migration using The Slicer (Switch Traffic)

This step requires that all the north-south flows were properly migrated, at least between spoke and hub, because the peering will be removed. Deleting the VPC peering after the north-south migration causes the RFC1918 routes to kick in and concludes the migration:

    • initial routes
    • peer removed

Once the gateways are reconfigured:

As an example, the vpc001 route table before the north-south migration is complete:

In this case, the spoke gateway requires customization to avoid an asymmetric flow or a black hole. VPCs, or even individual subnets, can be migrated with this approach. More details and tests can be found in the following blog:

    BGPoLAN

• Peering the AVX Transit with the Google Cloud Router using NCC (Network Connectivity Center) is a migration approach for customers using Cloud Interconnect and with demand for high throughput
• The architecture shown below is one of many possible options, and it does not require creating new Cloud Interconnect LAN interfaces
• There is a cost associated with NCC: https://cloud.google.com/network-connectivity/docs/network-connectivity-center/pricing. Pricing is based on a flat utilization fee plus data transfer.
• For this scenario to work properly, do not forget to enable site-to-site data transfer during the NCC spoke creation.

    Flow 1, Flow 2, Flow 3 and Flow 4 Migration using BGPoLAN (Switch Traffic)

Flow 1 requires no migration in this scenario, where the cloud native hub is repurposed as a connectivity VPC or AVX BGPoLAN VPC.

The Slicer could once again be used to migrate north-south flows at a different time than east-west. One advantage of this approach is that we can migrate all flows from a VPC in a single operation by breaking the VPC peering:

After a few seconds the routes are withdrawn and the custom RFC1918 routes programmed by the AVX controller are preferred:

The vpc001 route table is dynamically populated with the routes learned from the transit: traffic from workloads on vpc001 to vpc002 needs to traverse the fabric, ingressing at the AVX transit “bgpolan” interface:

vpc003 talks to vpc001 and vpc002 using the RFC1918 custom routes programmed by the AVX controller. For more information on Google Cloud Network Connectivity Center and AVX, visit the link below:

    Custom Routes

    Cleanup

    Multiple Regions

    References

    Terraform Examples

    VPC

    resource "google_compute_network" "vpc_network" {
      for_each                = var.vpcs
      project                 = var.project
      name                    = each.value.name
      auto_create_subnetworks = false
      routing_mode            = "GLOBAL"
    }

    Subnetwork

    resource "google_compute_subnetwork" "network" {
      depends_on = [
        google_compute_network.vpc_network
      ]
      for_each      = var.networks
      project       = var.project
      name          = each.value.name
      ip_cidr_range = each.value.ip_cidr_range
      region        = each.value.region
      network       = each.value.network
    }

    Firewall Rule

    resource "google_compute_firewall" "vpc001_compute_firewall" {
      project = var.project
      name    = "fw-${google_compute_network.vpc_network["vpc001"].name}"
      network = google_compute_network.vpc_network["vpc001"].name
    
      allow {
        protocol = "icmp"
      }
    
      allow {
        protocol = "tcp"
        ports    = ["22", "80", "443", "53"]
      }
    
      allow {
        protocol = "udp"
        ports    = ["53"]
      }
    
      source_ranges = ["192.168.0.0/16", "172.16.0.0/12", "10.0.0.0/8", "35.191.0.0/16", "130.211.0.0/22", "35.199.192.0/19"]
    }

    Cloud Router

    resource "google_compute_router" "google_compute_router1" {
      depends_on = [
        google_compute_network.vpc_network
      ]
      project = var.project
      name    = "cr-east-${google_compute_network.vpc_network["vpc001"].name}"
      network = google_compute_network.vpc_network["vpc001"].name
      bgp {
        asn            = 64514
        advertise_mode = "DEFAULT"
      }
      region = "us-east1"
    }

    Cloud VPN Gateway

    resource "google_compute_ha_vpn_gateway" "ha_gateway1" {
      depends_on = [
        google_compute_router.google_compute_router1
      ]
      project = var.project
      region  = google_compute_router.google_compute_router1.region
      name    = "vpn-east-${google_compute_network.vpc_network["vpc001"].name}"
      network = google_compute_network.vpc_network["vpc001"].name
    }

    External Gateway

    resource "google_compute_external_vpn_gateway" "external_gateway1" {
      project         = var.project
      name            = "peer-${replace(var.remote_ip1, ".", "-")}"
      redundancy_type = "SINGLE_IP_INTERNALLY_REDUNDANT"
      interface {
        id         = 0
        ip_address = var.remote_ip1
      }
    }

    VPN Tunnel

    resource "google_compute_vpn_tunnel" "tunnel1" {
      depends_on = [
        google_compute_router.google_compute_router1,
        google_compute_ha_vpn_gateway.ha_gateway1
      ]
      project                         = var.project
      name                            = "tunnel-1-${google_compute_external_vpn_gateway.external_gateway1.name}"
      peer_external_gateway           = google_compute_external_vpn_gateway.external_gateway1.self_link
      peer_external_gateway_interface = "0"
      router                          = google_compute_router.google_compute_router1.self_link
      shared_secret                   = "Avtx2019!"
      vpn_gateway                     = google_compute_ha_vpn_gateway.ha_gateway1.self_link
      vpn_gateway_interface           = "0"
      region                          = google_compute_router.google_compute_router1.region
    }

    BGP Router Interface

    resource "google_compute_router_interface" "router1_interface1" {
      project    = var.project
      name       = "router1-interface1"
      router     = google_compute_router.google_compute_router1.name
      region     = google_compute_router.google_compute_router1.region
      ip_range   = "169.254.0.1/30"
      vpn_tunnel = google_compute_vpn_tunnel.tunnel1.name
    }
    
    resource "google_compute_router_interface" "router1_interface2" {
      project    = var.project
      name       = "router1-interface2"
      router     = google_compute_router.google_compute_router1.name
      region     = google_compute_router.google_compute_router1.region
      ip_range   = "169.254.0.5/30"
      vpn_tunnel = google_compute_vpn_tunnel.tunnel2.name
    }

BGP Router Peering

    resource "google_compute_router_peer" "router1_peer1" {
      project                   = var.project
      name                      = "router1-peer1"
      router                    = google_compute_router.google_compute_router1.name
      region                    = google_compute_router.google_compute_router1.region
      peer_ip_address           = "169.254.0.2"
      peer_asn                  = var.peer_asn
      advertised_route_priority = var.advertised_route_priority
      interface                 = google_compute_router_interface.router1_interface1.name
    }
    
    resource "google_compute_router_peer" "router1_peer2" {
      project                   = var.project
      name                      = "router1-peer2"
      router                    = google_compute_router.google_compute_router1.name
      region                    = google_compute_router.google_compute_router1.region
      peer_ip_address           = "169.254.0.6"
      peer_asn                  = var.peer_asn
      advertised_route_priority = var.advertised_route_priority
      interface                 = google_compute_router_interface.router1_interface2.name
    }

    Discovery

    • Get the list of VPCs in the project:
    vpcs = service.networks().list(project=project_id).execute()
    • Get the list of subnetworks in the project:
    subnetworks = service.subnetworks().list(project=project_id, region=region).execute()
    • Get the list of routes in the project.
    routes = service.routes().list(project=project_id).execute()
    • Get the list of Cloud Interconnects in the project.
    interconnects = service.interconnects().list(project=project_id).execute()
• Get the list of Cloud Interconnect attachments (LAN interfaces) in the project.
lan_interfaces = service.interconnectAttachments().list(project=project_id, region=region).execute()
    • Get the list of Cloud VPN Gateways in the project.
    vpn_gateways = service.vpnGateways().list(project=project_id, region=region).execute()
    • Get the list of Cloud VPN External Gateways in the project.
    external_gateways = service.externalVpnGateways().list(project=project_id).execute()
    • Get the list of Cloud VPN Tunnels in the project.
    vpn_tunnels = service.vpnTunnels().list(project=project_id, region=region).execute()
    • Get the list of Cloud Routers in the project.
    routers = service.routers().list(project=project_id, region=region).execute()

    Where:

        service = get_authenticated_service()

    and

import google.auth
import googleapiclient.discovery

def get_authenticated_service():
    """Authenticate and create a service object for the Compute Engine API."""
    credentials, project_id = google.auth.default(scopes=['https://www.googleapis.com/auth/compute'])

    # Create a service object with the authenticated credentials
    service = googleapiclient.discovery.build('compute', 'v1', credentials=credentials)
    return service

    Saving Routes

A route consists of a destination range and a next hop. The destination range specifies the range of IP addresses that the route applies to. The next hop specifies where matching traffic is sent, for example an instance, an internal IP address, a gateway, a VPN tunnel, or an internal load balancer.

    Google Cloud supports the following types of routes:

• Default routes: These routes are created automatically when you create a VPC network. They direct traffic without a more specific route to the default internet gateway.
• Subnet routes: These routes are created when you create a subnet in a VPC network. They direct traffic destined to the subnet’s IP range to that subnet.
• Static routes: These routes are created manually. They can be used to direct traffic to specific destinations, such as the default internet gateway or an internal load balancer.
• Dynamic routes: These routes are learned automatically by Cloud Router over BGP. They direct traffic to destinations reachable through Cloud VPN tunnels or Cloud Interconnect attachments.

For backup and restore operations, we need to save the static routes to a file; in case of fallback, we read that file and re-create the static routes.
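
    A rough gcloud sketch of the backup and restore flow (route names, CIDRs, and next hops are illustrative):

    # Back up the project routes to a file
    gcloud compute routes list \
      --project=$PROJECT_ID \
      --format=json > routes-backup.json

    # During a fallback, re-create a saved static route with values taken from the backup
    gcloud compute routes create onprem-range \
      --project=$PROJECT_ID \
      --network=vpc001 \
      --destination-range=192.168.0.0/16 \
      --next-hop-ip=10.11.0.10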

    Migration Steps

    Switch Traffic

• Delete the VPC peering
• Delete the static routes (this behavior should be handled by an argument)
• If the scenario includes a new Cloud Router and Cloud LAN interfaces for the Interconnect, also remove the VPC prefix(es) from the Cloud Router custom advertised IP ranges
• Change the AVX gateway propagation to advertise all prefixes (a hedged gcloud sketch of the first three steps follows)
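
    A hedged gcloud sketch of these steps (resource names and prefixes are placeholders taken from this lab):

    # 1. Delete the VPC peering
    gcloud compute networks peerings delete peer-vpc001-vpc002 \
      --network=vpc001

    # 2. Delete a previously saved static route
    gcloud compute routes delete onprem-range \
      --project=$PROJECT_ID

    # 3. Remove the VPC prefix from the Cloud Router custom advertised ranges
    gcloud compute routers update cr-east-vpc001 \
      --region=us-east1 \
      --remove-advertisement-ranges=10.12.0.0/22

    The AVX gateway propagation change in the last step is performed on the Aviatrix controller.
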
  • Apigee not bee 🙂

    Apigee not bee :)

    Apigee is a Google SaaS platform for developing and managing APIs. Apigee provides an abstraction layer to backend service APIs and provides security, rate limiting, quotas, and analytics. Apigee consists of the following components:

    • Apigee services: the APIs that you use to create, manage, and deploy API proxies.
    • Apigee runtime: a set of containerized runtime services in a Kubernetes cluster that Google maintains. All API traffic passes through and is processed by these services.
• GCP services: provide identity management, logging, analytics, metrics, and project management functions.
    • Back-end services: back-end services are responsible for performing business logic, accessing databases, processing requests, and generating responses. These services can be hosted on the same server as the API proxy or on a separate server, and they communicate with the API proxy through a RESTful API or other protocols.
    Source: https://cloud.google.com/apigee/docs/api-platform/get-started/what-apigee

A more granular, network-friendly diagram is shown below:

    Source: https://cloud.google.com/apigee/docs/api-platform/get-started/what-apigee

A more in-depth overview is provided here: https://cloud.google.com/apigee/docs/api-platform/architecture/overview

    Setting it up

    There are at least three different ways to provision Apigee:

    https://cloud.google.com/apigee/docs/api-platform/get-started/provisioning-intro#provisioning-options

    I’m going to use a free trial wizard to get acquainted with Apigee:

    The evaluation wizard guides us through the steps:

    • Enable APIs
    • Networking
    • Organization
    • Access Routing

    Apigee runtime requires a dedicated /22 range for evaluation:

    Each Apigee instance requires a non-overlapping CIDR range of /22 and /28. The Apigee runtime plane is assigned IP addresses from within this CIDR range.

    Organization provisioning can take up to 45 minutes

    Client to Apigee traffic is also called “northbound” traffic. Northbound configuration options include the following:

    • internal with VPC peering
    • external with MIG
    • internal with PSC
    • external with PSC

    Once the access routing is configured, Apigee is ready.

    Network

The network and CIDR provided to the wizard are used to deploy the ingress internal load balancer (instance):

A VPC peering allows communication between the VPCs:

Source: https://cloud.google.com/apigee/docs/api-platform/architecture/overview

The VPC network peering is part of the private service connection configuration:

    To route traffic from client apps on the internet to Apigee, we can use a global external HTTPS load balancer. An LB can communicate across GCP projects.

    We could also provision a MIG of virtual machines as a network bridge. The MIG VMs have the capability to communicate bidirectionally across the peered networks.

    Apps on the internet talk to the XLB, the XLB talks to the bridge VM, and the bridge VM talks to the Apigee network.

    Source: https://cloud.google.com/apigee/docs/api-platform/architecture/overview

    Load Balancers

The reason we cannot simply place a load balancer in front of the Apigee ingress:

    A compute instance working as a proxy is required for routing traffic from outside the customer vpc (vpc001). More on that during the testing.

    Using Apigee

I’m going to use the classic console, as not every feature is available in the Google Cloud console:

    Create API Proxy

    Deploy

    Testing

    From a VM running in the customer vpc001

From a VM running in the customer vpc001 (the one directly attached to the Apigee network):

    Tracing the API call using the Apigee trace:

    From a VM running in the customer vpc002 (vpc peering)

VPC peering is not transitive, and the Apigee CIDR is not exported from vpc001 towards vpc002, which makes a proxy (such as a VM running in vpc001) necessary.

    From a VM running in the customer vpc002 (MIG)

    Enable Private Google Access for a subnet of your VPC network:
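
    For reference, Private Google Access can be enabled on the subnet with a command along these lines (the variables are defined in the next step):

    gcloud compute networks subnets update $VPC_SUBNET \
      --region=$REGION \
      --enable-private-ip-google-access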

    Define variables:
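
    An illustrative set of values matching this lab (adjust to your environment; APIGEE_ENDPOINT is the instance “host” IP returned by the Apigee instances API):

    export PROJECT_ID=rtrentin-01
    export REGION=us-east1
    export MIG_NAME=apigee-mig-us-east1
    export VPC_NAME=vpc001
    export VPC_SUBNET=subnet002
    export APIGEE_ENDPOINT=10.11.248.2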

Create an instance template:

Please follow the entire procedure to deploy an external or internal load balancer with a MIG for a properly supported solution. The procedure can be found at https://cloud.google.com/apigee/docs/api-platform/get-started/install-cli#externalmig

    gcloud compute instance-templates create $MIG_NAME \
      --project $PROJECT_ID \
      --region $REGION \
      --network $VPC_NAME \
      --subnet $VPC_SUBNET \
      --tags=https-server,apigee-mig-proxy,gke-apigee-proxy \
      --machine-type e2-medium --image-family debian-10 \
      --image-project debian-cloud --boot-disk-size 20GB \
      --no-address \
      --metadata ENDPOINT=$APIGEE_ENDPOINT,startup-script-url=gs://apigee-5g-saas/apigee-envoy-proxy-release/latest/conf/startup-script.sh

From an instance running in a second VPC (in my case vpc002), we can access the Apigee proxy using one of the MIG instances:

The debug shows the connection comes from 10.11.1.3, which is one of the MIG instances:

    Policies

    From a VM running in the customer vpc002 using AVX

    For this scenario, I removed the peering connection between vpc001 and vpc002. Custom advertise the Apigee CIDR range using the Customize Spoke Advertised VPC CIDRs:

    VPC001 “imports” the apigee ranges:

    VPC001 exports to apigee the RFC1918 created and controlled by the avx controller:

    VPC002 gateway routing table:

    Gateway vpc001 routing table:

From a VM running on vpc002, I can access the Apigee LB without a proxy:

    From the debug session we can see the client IP is indeed the IP from the vpc002 compute instance:

    Private Service Connection (PSC)

One of the advantages of using PSC is that there is no need to deploy a MIG. Find the Apigee service attachment:

    ricardotrentin@RicardontinsMBP workflows % curl -i -H "$AUTH" \
      "https://apigee.googleapis.com/v1/organizations/$PROJECT_ID/instances"
    HTTP/2 200 
    content-type: application/json; charset=UTF-8
    vary: X-Origin
    vary: Referer
    vary: Origin,Accept-Encoding
    date: Sun, 23 Apr 2023 18:08:59 GMT
    server: ESF
    cache-control: private
    x-xss-protection: 0
    x-frame-options: SAMEORIGIN
    x-content-type-options: nosniff
    alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
    accept-ranges: none
    
    {
      "instances": [
        {
          "name": "test-lab-apigee-us-east1",
          "location": "us-east1",
          "host": "10.11.248.2",
          "port": "443",
          "createdAt": "1682013894059",
          "lastModifiedAt": "1682015454529",
          "diskEncryptionKeyName": "projects/rtrentin-01/locations/us-east1/keyRings/lab-test-apigee-kr/cryptoKeys/lab-test-apigee-key",
          "state": "ACTIVE",
          "peeringCidrRange": "SLASH_22",
          "runtimeVersion": "1-9-0-apigee-25",
          "ipRange": "10.11.248.0/28,10.11.248.16/28",
          "consumerAcceptList": [
            "rtrentin-01"
          ],
          "serviceAttachment": "projects/u86f317c835229a5b-tp/regions/us-east1/serviceAttachments/apigee-us-east1-kt9m"
        }
      ]
    }

Create a network endpoint group:

    gcloud compute network-endpoint-groups create apigee-neg \
      --network-endpoint-type=private-service-connect \
      --psc-target-service=projects/u86f317c835229a5b-tp/regions/us-east1/serviceAttachments/apigee-us-east1-kt9m \
      --region=$RUNTIME_LOCATION \
      --network=vpc001 \
      --subnet=subnet002 \
      --project=rtrentin-01

    Resuming the load balancer creation we initiated before:

The rest of the configuration is straightforward.
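
    As a rough sketch of those remaining steps (names are placeholders; the full procedure is in the Apigee PSC documentation), the NEG is attached to a backend service of the external load balancer:

    gcloud compute backend-services create apigee-backend \
      --load-balancing-scheme=EXTERNAL_MANAGED \
      --protocol=HTTPS \
      --global \
      --project=rtrentin-01

    gcloud compute backend-services add-backend apigee-backend \
      --network-endpoint-group=apigee-neg \
      --network-endpoint-group-region=$RUNTIME_LOCATION \
      --global \
      --project=rtrentin-01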

    Next Steps

    https://cloud.google.com/apigee/docs/api-platform/get-started/go-deeper

    Costs

    https://cloud.google.com/apigee/pricing/

    References

    https://cloud.google.com/apigee/docs/api-platform/get-started/what-apigee

    https://cloud.google.com/vpc/docs/private-service-connect

    https://cloud.google.com/apigee/docs/api-platform/get-started/accessing-internal-proxies

  • Using Azure Route Server for Dynamic Routing

    Using Azure Route Server for Dynamic Routing

    Azure Route Server is a service provided by Microsoft Azure that simplifies the process of dynamic routing for network virtual appliances (NVAs). NVAs are commonly used in virtual networks to perform tasks such as load balancing, network address translation (NAT), and virtual private network (VPN) connectivity.

    In a traditional network setup, dynamic routing protocols such as Border Gateway Protocol (BGP) require manual configuration and maintenance of each individual NVA. This can become time-consuming and error-prone as the network scales. With Azure Route Server, NVAs can simply connect to the route server and exchange routing information automatically.

Azure Route Server exchanges routes with NVAs using BGP, allowing for flexible and scalable network configurations. In addition, it integrates with Azure Firewall and other Azure networking services to provide a complete solution for managing network traffic and security.

    By using Azure Route Server, you can simplify your network infrastructure and reduce the administrative overhead of managing NVAs.

    Topology

    Configuration

Fortinet provides templates for the most common cases at https://github.com/fortinet/fortigate-terraform-deploy

    BGP configuration:

    config router bgp
        set as 65500
        set ebgp-multipath enable
        set additional-path enable
        set graceful-restart enable
        config neighbor
            edit "172.1.4.4"
                set capability-graceful-restart enable
                set ebgp-enforce-multihop enable
                set interface "port3"
                set remote-as 65515
                set keep-alive-timer 1
                set holdtime-timer 3
            next
            edit "172.1.4.5"
                set ebgp-enforce-multihop enable
                set interface "port3"
                set remote-as 65515
                set keep-alive-timer 1
                set holdtime-timer 3
            next
        end
        config redistribute "connected"
        end
        config redistribute "static"
            set status enable
        end
    end

    ARS:

    ARS peers:
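
    A hedged Azure CLI sketch of the Route Server and BGP peer creation (resource names, the hosted subnet ID, and the FortiGate peer IP are placeholders; the ASNs match the configuration above):

    az network routeserver create \
      --name ars01 \
      --resource-group rg-ars \
      --hosted-subnet $ROUTESERVER_SUBNET_ID \
      --public-ip-address ars01-pip

    az network routeserver peering create \
      --name fortigate-a \
      --routeserver ars01 \
      --resource-group rg-ars \
      --peer-ip 172.1.4.10 \
      --peer-asn 65500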

    AVX configuration:

    Disable Route Propagation

Azure Route Server will learn routes from the NVAs and propagate them to the virtual instances, which can cause loops if not properly configured. When a route loop occurs, network traffic may be sent in a continuous loop between two or more network devices, leading to degraded network performance or complete network failure.

To prevent route loops when using Azure Route Server with NVAs, it’s important to properly configure the network routing rules. One way to do this is to use an empty route table with BGP route propagation disabled and attach it to the subnets of interest. This prevents the routes propagated by the NVAs from being programmed on the virtual instances and causing loops.
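
    A possible Azure CLI sketch (names are placeholders):

    # Route table with BGP route propagation disabled
    az network route-table create \
      --name rt-no-propagation \
      --resource-group rg-spoke \
      --disable-bgp-route-propagation true

    # Attach it to the subnet of interest
    az network vnet subnet update \
      --name workloads \
      --vnet-name vnet-spoke \
      --resource-group rg-spoke \
      --route-table rt-no-propagation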

    Testing

Ping from the client VM, across the FortiGates and the AVX fabric, to a VM running on the spoke VNet:

    Failover

    Failover happens extremely fast with only two pings lost:

    Troubleshooting

    Spoke VM route table:

    Spoke route table:

    Transit Gateway route table:

    Transit Gateway eth3 route table:

    FortiGate route table:

    FortiGate port3 route table:

    References

    https://learn.microsoft.com/en-us/azure/route-server/overview

    https://github.com/fortinet/fortigate-terraform-deploy

  • Hyperautomation with GCP (draft)

    Hyperautomation with GCP (draft)

    Hyperautomation

    Hyperautomation is a business-driven, disciplined approach that organizations use to rapidly identify, vet and automate as many business and IT processes as possible. Hyperautomation involves the orchestrated use of multiple technologies, tools or platforms, including: artificial intelligence (AI), machine learning, event-driven software architecture, robotic process automation (RPA), business process management (BPM) and intelligent business process management suites (iBPMS), integration platform as a service (iPaaS), low-code/no-code tools, packaged software, and other types of decision, process and task automation tools.

    Gartner

    Here are some use cases:

    1. Accounts Payable and Receivable: Hyperautomation can be used to automate the process of invoicing, payment, and reconciliation in the finance and accounting department.
    2. Customer Service: Hyperautomation can be used to automate the process of handling customer inquiries, complaints, and support tickets by using chatbots and automated email responses.
    3. Human Resources: Hyperautomation can be used to automate the process of onboarding, employee records management, and payroll processing in the HR department.
    4. Sales and Marketing: Hyperautomation can be used to automate the process of lead generation, lead nurturing, and customer relationship management in the sales and marketing department.
    5. Supply Chain and Logistics: Hyperautomation can be used to automate the process of order processing, inventory management, and shipping and logistics.
    6. Healthcare: Hyperautomation can be used to automate patient record management, appointment scheduling, and billing in the healthcare industry.
    7. Legal: Hyperautomation can be used to automate the process of contract management, document review, and legal research in the legal industry.
    8. Manufacturing: Hyperautomation can be used to automate the process of quality control, inventory management, and production planning in the manufacturing industry.
    9. IT Operations: Hyperautomation can be used to automate the process of software deployment, system monitoring, and incident management in the IT department.
    10. Research and Development: Hyperautomation can be used to automate the process of data analysis, experiment management, and knowledge sharing in the research and development department.

    Prime: AI

    Artificial intelligence (AI) is a key component of hyperautomation, as it enables organizations to automate more complex and cognitive tasks that previously required human intervention. AI technologies, such as machine learning (ML), natural language processing (NLP), and computer vision, can be used to automate tasks such as data analysis, decision-making, and customer service.

    In the hyperautomation context, AI is often used in conjunction with other automation technologies, such as robotic process automation (RPA) and low-code development platforms, to create end-to-end automated workflows. For example, an RPA bot could be used to collect and process data, which could then be fed into an ML model to make predictions or generate insights. These insights could then be used to trigger automated actions or inform human decision-making.

    Prime: RPA

    Robotic Process Automation (RPA) is a technology that enables organizations to automate repetitive, rules-based tasks using software robots or “bots”. RPA bots are designed to mimic the actions of human workers, interacting with software applications in the same way that a human worker would.

    RPA is often used in conjunction with other automation technologies, such as artificial intelligence (AI), machine learning (ML), and natural language processing (NLP), to create more sophisticated and intelligent automation solutions. For example, an RPA bot could be used to collect and process data, which could then be fed into an ML model to make predictions or generate insights. These insights could then be used to trigger automated actions or inform human decision-making.

    There are many RPA (Robotic Process Automation) platforms available in the market today, each with its own set of features and capabilities. Here are a few examples of popular RPA platforms:

    1. UiPath: UiPath is a leading RPA platform that provides a range of features for automating routine and repetitive tasks. It includes a visual interface for designing and managing bots, as well as pre-built connectors and adapters for popular applications and systems.
    2. Automation Anywhere: Automation Anywhere is an RPA platform that provides a range of features for automating routine and repetitive tasks. It includes a visual interface for designing and managing bots, as well as pre-built connectors and adapters for popular applications and systems.
    3. Blue Prism: Blue Prism is an RPA platform that provides a range of features for automating routine and repetitive tasks. It includes a visual interface for designing and managing bots, as well as pre-built connectors and adapters for popular applications and systems.
    4. WorkFusion: WorkFusion is an RPA platform that provides a range of features for automating routine and repetitive tasks. It includes a visual interface for designing and managing bots, as well as pre-built connectors and adapters for popular applications and systems.
    5. Kryon: Kryon is an RPA platform that provides a range of features for automating routine and repetitive tasks. It includes a visual interface for designing and managing bots, as well as pre-built connectors and adapters for popular applications and systems.

    Prime: iBPMS

    Intelligent Business Process Management Suites (iBPMS) is a category of business process management (BPM) software that uses artificial intelligence (AI) and other advanced technologies to automate and optimize business processes.

    An iBPMS platform provides a range of features for managing and automating business processes, including process modeling, workflow automation, rules management, analytics, and integration with other systems and services. iBPMS solutions typically include advanced analytics and AI capabilities, such as machine learning and predictive analytics, which enable organizations to analyze process data and make better decisions in real-time.

    iBPMS platforms are designed to support the entire process lifecycle, from process design and modeling to execution and monitoring. They provide a visual interface for designing and managing workflows, and allow users to set up rules and triggers for automated actions.

    There are several iBPMS (Intelligent Business Process Management Suite) solutions available in the market today, each with its own set of features and capabilities. Here are a few examples of popular iBPMS solutions:

    1. Appian: Appian is an iBPMS solution that provides a range of features for managing and automating business processes. It includes a visual interface for designing and managing workflows, as well as pre-built connectors and adapters for popular applications and systems. Appian also includes AI capabilities, such as machine learning and natural language processing, to automate and optimize business processes.
    2. IBM Business Automation Workflow: IBM Business Automation Workflow is an iBPMS solution that provides a range of features for managing and automating business processes. It includes a visual interface for designing and managing workflows, as well as pre-built connectors and adapters for popular applications and systems. IBM Business Automation Workflow also includes AI capabilities, such as machine learning and predictive analytics, to automate and optimize business processes.
    3. Pegasystems: Pegasystems is an iBPMS solution that provides a range of features for managing and automating business processes. It includes a visual interface for designing and managing workflows, as well as pre-built connectors and adapters for popular applications and systems. Pegasystems also includes AI capabilities, such as machine learning and natural language processing, to automate and optimize business processes.
    4. Kofax TotalAgility: Kofax TotalAgility is an iBPMS solution that provides a range of features for managing and automating business processes. It includes a visual interface for designing and managing workflows, as well as pre-built connectors and adapters for popular applications and systems. Kofax TotalAgility also includes AI capabilities, such as machine learning and predictive analytics, to automate and optimize business processes.

    Prime: iPaaS

    Integration Platform as a Service (iPaaS) is a cloud-based platform that provides a set of tools and services for integrating applications, data, and systems across different cloud and on-premise environments.

    iPaaS solutions typically provide a range of features for connecting and integrating systems, including pre-built connectors and adapters, data mapping and transformation, workflow automation, and data governance and security. iPaaS platforms also typically provide a range of tools and services for managing and monitoring integrations, including dashboards, alerts, and analytics.

    There are many iPaaS solutions available in the market today, each with its own set of features and capabilities. Here are a few examples of popular iPaaS solutions:

    1. Dell Boomi: Dell Boomi is a cloud-based iPaaS platform that provides a range of features for integrating applications, data, and systems across different cloud and on-premise environments. It includes pre-built connectors and adapters for popular applications and systems, as well as a visual interface for designing and managing integrations.
    2. MuleSoft Anypoint Platform: MuleSoft Anypoint Platform is a cloud-based iPaaS platform that enables organizations to connect and integrate applications, data, and systems across different cloud and on-premise environments. It includes a range of tools and services for designing, managing, and monitoring integrations, as well as pre-built connectors and adapters for popular applications and systems.
    3. Jitterbit: Jitterbit is a cloud-based iPaaS platform that provides a range of features for integrating applications, data, and systems across different cloud and on-premise environments. It includes a visual interface for designing and managing integrations, as well as pre-built connectors and adapters for popular applications and systems.
    4. SnapLogic: SnapLogic is a cloud-based iPaaS platform that enables organizations to connect and integrate applications, data, and systems across different cloud and on-premise environments. It includes pre-built connectors and adapters for popular applications and systems, as well as a visual interface for designing and managing integrations.

    Prime: Low-Code

    Low-code is a visual development approach to software development that allows users to create applications through a drag-and-drop interface with minimal coding. The idea behind low-code is to simplify the development process by abstracting away much of the complexity of traditional coding, allowing users with little to no programming experience to create functional applications.

    Low-code platforms typically provide a set of pre-built components and modules that can be pieced together to create a custom application. These platforms also offer a range of tools and features to help users design, develop, test, and deploy applications quickly and easily.

    Low-code development is being increasingly used by businesses to rapidly develop and deploy custom applications that meet their specific needs, without the need for dedicated software development teams. The low-code approach can help reduce costs, accelerate development times, and improve the agility and responsiveness of businesses to changing market conditions.

    Low-code development platforms are an important aspect of hyperautomation because they provide a way for non-technical users to create applications and workflows quickly and easily. This enables organizations to automate processes that might have previously required custom software development, reducing costs and accelerating time to value.

    There are many low-code development platforms available in the market today, each with its own set of features and capabilities. Here are a few examples of popular low-code platforms:

    1. Microsoft Power Apps: Microsoft Power Apps is a low-code platform that enables users to build custom business applications quickly and easily. It includes a visual interface for designing and building applications, as well as pre-built templates and connectors for popular data sources and services.
    2. OutSystems: OutSystems is a low-code platform that enables users to build custom business applications quickly and easily. It includes a visual interface for designing and building applications, as well as pre-built templates and connectors for popular data sources and services.
    3. Mendix: Mendix is a low-code platform that enables users to build custom business applications quickly and easily. It includes a visual interface for designing and building applications, as well as pre-built templates and connectors for popular data sources and services.
    4. Salesforce Lightning: Salesforce Lightning is a low-code platform that enables users to build custom business applications quickly and easily. It includes a visual interface for designing and building applications, as well as pre-built templates and connectors for popular data sources and services.
    5. Appian: Appian is a low-code platform that enables users to build custom business applications quickly and easily. It includes a visual interface for designing and building applications, as well as pre-built templates and connectors for popular data sources and services.

    Architecture

    The architecture of a hyperautomation system should be designed to support the different components and technologies used in the system. Here’s a high-level overview of the architecture of a hyperautomation system:

    1. Data Ingestion: This component is responsible for collecting data from various sources such as emails, documents, databases, and other systems.
    2. Intelligent Automation: This component is responsible for processing the data using AI and ML algorithms to extract relevant information and automate tasks.
    3. Process Automation: This component is responsible for automating end-to-end business processes by using RPA tools to automate repetitive tasks and human workflows.
    4. Integration and Orchestration: This component is responsible for integrating the hyperautomation system with other enterprise systems and orchestrating the automation workflows.
    5. Analytics and Reporting: This component is responsible for providing insights and analytics on the performance of the hyperautomation system.

    Google Cloud

    Google Cloud offers a variety of services and tools that can be used to implement a hyperautomation system:

    1. Data Ingestion: Google Cloud offers several services for data ingestion, including Cloud Storage for storing data, Cloud Pub/Sub for messaging, and Cloud Dataflow for data processing.
    2. Intelligent Automation: Google Cloud offers several AI and ML services, such as Cloud AutoML, Cloud AI Platform, and Cloud Vision API, which can be used for intelligent automation.
    3. Process Automation: Google Cloud offers a service called Cloud Composer, which is a fully managed workflow orchestration service that can be used for process automation. It also supports integrating with RPA tools such as UiPath.
    4. Integration and Orchestration: Google Cloud offers several services for integration and orchestration, including Cloud Functions, Cloud Run, and Cloud Workflows.
    5. Analytics and Reporting: Google Cloud offers several services for analytics and reporting, including BigQuery for data warehousing and analysis, and Data Studio for creating and sharing reports.
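
    As an illustration of step 1, the minimal sketch below stores a raw document in Cloud Storage and announces it on Pub/Sub so downstream automation can pick it up. The project, bucket, and topic names are placeholders, and the bucket and topic are assumed to already exist.

    # Minimal data-ingestion sketch; project, bucket, and topic names are placeholders.
    from google.cloud import pubsub_v1, storage

    PROJECT = "my-project"            # placeholder project ID
    BUCKET = "hyperautomation-inbox"  # placeholder, pre-created bucket
    TOPIC = "new-documents"           # placeholder, pre-created topic


    def ingest_document(name: str, content: bytes) -> None:
        """Store a raw document in Cloud Storage and announce it on Pub/Sub."""
        # Persist the raw document under a raw/ prefix.
        blob = storage.Client(project=PROJECT).bucket(BUCKET).blob(f"raw/{name}")
        blob.upload_from_string(content)

        # Notify downstream consumers (Cloud Functions, Dataflow, ...) of the new object.
        publisher = pubsub_v1.PublisherClient()
        topic_path = publisher.topic_path(PROJECT, TOPIC)
        publisher.publish(topic_path, f"gs://{BUCKET}/raw/{name}".encode("utf-8")).result()


    if __name__ == "__main__":
        ingest_document("invoice-001.txt", b"example payload")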

    Google AppSheet

    Google offers a low-code development platform called Google AppSheet. Google AppSheet allows users to create custom business applications quickly and easily using a drag-and-drop interface, without the need for extensive coding.

    Google AppSheet provides a range of features for building custom applications, including data modeling, workflow automation, and integration with other Google services, such as Google Drive, Google Sheets, and Google Cloud services. AppSheet can be used to build applications for a variety of use cases, including inventory management, project management, and field service management, among others.

    Proof of Concept Architecture

    1. VPC Flow Logs: Enable VPC Flow Logs on your network to capture all network traffic to and from instances in your VPC.
    2. BigQuery: Set up a BigQuery dataset and table to store your VPC Flow Logs.
    3. Cloud Functions: Create a Cloud Function to process new VPC Flow Log events and trigger a notification to the user to ask if the flow should be allowed.
    4. Pub/Sub: Create a Pub/Sub topic to receive messages from the Cloud Function when a new flow is detected.
    5. Dialogflow: Use Dialogflow to create a chatbot or voicebot that prompts the user to allow or deny the new flow.
    6. Cloud Functions: Create another Cloud Function to handle the response from the user. If the user allows the new flow, the Cloud Function should create a firewall rule to allow the traffic.
    7. Google AppSheet: Create a custom mobile or web application using Google AppSheet that enables users to manage firewall rules and alerts, including viewing new flow alerts, approving or denying new flows, and creating new firewall rules.
    8. Integrate with the workflow: Use AppSheet’s connectors and APIs to integrate the application with the rest of the workflow.
    9. Monitor the workflow and application: Use monitoring and analytics tools such as Stackdriver Logging and AppSheet’s built-in monitoring and analytics to track usage, identify issues, and optimize the workflow and application.

    This hyperautomation solution combines several Google Cloud Platform services to automate the process of managing firewall rules based on VPC Flow Logs. The solution enables real-time monitoring of network traffic and quickly responds to potential security threats. By automating the process of asking for permission to allow new flows and creating firewall rules, the workflow ensures that all firewall rules are reviewed and approved by authorized users, reducing the risk of unauthorized access. The use of Google AppSheet enables users to manage firewall rules and alerts from their mobile or web devices, providing quick access to relevant data. The workflow is also monitored using Stackdriver Logging and AppSheet’s built-in monitoring and analytics tools to ensure optimal performance.
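
    Sketched below, as an illustration only, are the two Cloud Functions from steps 3 and 6, assuming VPC Flow Logs are exported to a Pub/Sub topic through a log sink and that the user allow/deny decision arrives on a second topic. The project ID, topic names, network, and flow-log field names are assumptions, not taken from the actual deployment.

    # Two Pub/Sub-triggered Cloud Functions (2nd gen): one publishes an approval
    # request for each new flow, the other turns an approval into a firewall rule.
    import base64
    import json
    import os

    import functions_framework
    from google.cloud import compute_v1, pubsub_v1

    PROJECT = os.environ.get("GCP_PROJECT", "my-project")                # placeholder
    APPROVAL_TOPIC = os.environ.get("APPROVAL_TOPIC", "flow-approvals")  # placeholder

    publisher = pubsub_v1.PublisherClient()


    @functions_framework.cloud_event
    def on_new_flow(event):
        """Triggered by a VPC Flow Log entry delivered via Pub/Sub (log sink)."""
        entry = json.loads(base64.b64decode(event.data["message"]["data"]))
        conn = entry.get("jsonPayload", {}).get("connection", {})  # assumed field names
        request = {
            "src_ip": conn.get("src_ip"),
            "dest_ip": conn.get("dest_ip"),
            "dest_port": conn.get("dest_port"),
        }
        # Hand the question to the user-facing side (Dialogflow / AppSheet).
        topic = publisher.topic_path(PROJECT, APPROVAL_TOPIC)
        publisher.publish(topic, json.dumps(request).encode("utf-8")).result()


    @functions_framework.cloud_event
    def on_approval(event):
        """Triggered by the user decision; creates a firewall rule when approved."""
        decision = json.loads(base64.b64decode(event.data["message"]["data"]))
        if not decision.get("approved"):
            return
        firewall = compute_v1.Firewall(
            name=f"allow-{decision['dest_port']}-from-{decision['src_ip'].replace('.', '-')}",
            network="global/networks/default",  # placeholder network
            direction="INGRESS",
            source_ranges=[f"{decision['src_ip']}/32"],
            allowed=[compute_v1.Allowed(I_p_protocol="tcp", ports=[str(decision["dest_port"])])],
        )
        compute_v1.FirewallsClient().insert(project=PROJECT, firewall_resource=firewall)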

    VPC Flow Logs

    VPC Flow Logs and BigQuery are covered here:

    Pub/Sub

    Function

    Dialogflow

    Functions

    AppSheet

    Monitoring

    References

    https://www.gartner.com/en/information-technology/glossary/hyperautomation

    BrainStorm Area

    Implementation

    To implement a hyperautomation system, you need to follow these steps:

    1. Identify the processes to be automated: The first step is to identify the processes that can be automated using hyperautomation technologies.
    2. Design the architecture: Once the processes are identified, design the architecture that best suits your requirements.
    3. Choose the technologies: Choose the right technologies for each component of the hyperautomation system.
    4. Build the system: Build the system using the selected technologies and architecture.
    5. Test and validate the system: Test and validate the system to ensure that it meets the business requirements and objectives.
    6. Deploy the system: Deploy the system in the production environment.

    Documentation

    Documentation is an essential part of any hyperautomation system. Here are the documents that you need to prepare:

    1. Architecture document: This document should provide a detailed overview of the hyperautomation system’s architecture and design.
    2. Implementation document: This document should provide a step-by-step guide on how to implement the hyperautomation system.
    3. User manual: This document should provide instructions on how to use the hyperautomation system.
    4. Test plan: This document should provide a detailed plan for testing the hyperautomation system.
    5. Maintenance and support document: This document should provide information on how to maintain and support the hyperautomation system.

    IT Operations

    Here’s an example architecture for an IT operations hyperautomation system using Google Cloud services:

    1. Data Ingestion: Use Google Cloud Storage to store log files from various systems and applications, and Cloud Pub/Sub to receive notifications of new log files.
    2. Intelligent Automation: Use Cloud AI Platform to analyze log data and identify patterns, anomalies, and errors. You can also use Cloud AutoML to train custom models for specific use cases.
    3. Process Automation: Use Cloud Composer to create and manage workflows for incident management, such as deploying new code, restarting services, and sending notifications to stakeholders.
    4. Integration and Orchestration: Use Cloud Functions to trigger automation workflows based on events in other systems or applications. Use Cloud Run to deploy containerized applications that can be scaled automatically based on demand.
    5. Analytics and Reporting: Use BigQuery to store and analyze log data over time, and use Data Studio to create dashboards and reports that provide insights into the performance of the IT operations system.
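
    To illustrate step 5, here is a minimal analytics sketch that queries log data already exported to BigQuery; the dataset, table, and column names (ops_logs.requests, severity, service, timestamp) are placeholders.

    # Count errors per service over the last day; table and column names are placeholders.
    from google.cloud import bigquery

    client = bigquery.Client()

    QUERY = """
        SELECT service, COUNT(*) AS errors
        FROM `ops_logs.requests`
        WHERE severity = 'ERROR'
          AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
        GROUP BY service
        ORDER BY errors DESC
    """

    for row in client.query(QUERY).result():
        print(f"{row.service}: {row.errors} errors in the last 24h")
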
  • Using NATGW for Centralized Internet Outbound

    Using NATGW for Centralized Internet Outbound
    Archive: Australia Fire Scars (NASA, International Space Station, 10/07/02)
    Archive: Australia Fire Scars (NASA, International Space Station, 10/07/02) by NASA’s Marshall Space Flight Center is licensed under CC-BY-NC 2.0

    Topology

    Initial Config (No NATGW)

    • SNAT is done on the firewall interface
    • Firewall eth1/1 private and public IPs:

    Testing:

    (for testing, I’m using curl from the internal VM towards another VM running NGINX in a different cloud provider; a small Python equivalent of this check is sketched below)

    • Firewall PIP interface is used
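
    A rough Python equivalent of the curl test: poll a public IP-echo service from the internal VM and print which egress address the traffic leaves with. The echo endpoint (api.ipify.org) is an assumption for illustration; the post itself checks against its own NGINX server in another cloud.

    # Print the observed egress public IP a few times; run before and after attaching the NATGW.
    import time

    import requests


    def observed_egress_ip() -> str:
        return requests.get("https://api.ipify.org", timeout=5).text.strip()


    if __name__ == "__main__":
        for _ in range(5):
            print(observed_egress_ip())
            time.sleep(2)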

    NATGW

    Once a NATGW is attached to the firewall eth1/1 interface subnet, the NATGW takes precedence:

    Testing:

    The PIP can be disassociated in the egress-only case.

    Adding Multiple Private IPs

    • we can add multiple private IPs to the external interface (with or without associated public IPs):
    • the PAN NAT configuration requires no change, as per their documentation:

    The advantage of specifying the interface in the NAT rule is that the NAT rule will be automatically updated to use any address subsequently acquired by the interface. DIPP is sometimes referred to as interface-based NAT or network address port translation (NAPT).

    References

    https://docs.paloaltonetworks.com/pan-os/10-1/pan-os-networking-admin/nat/source-nat-and-destination-nat/source-nat

    https://docs.paloaltonetworks.com/pan-os/10-1/pan-os-networking-admin/nat/dynamic-ip-and-port-nat-oversubscription#id2a358bd4-94c0-4976-a681-dad3845f8174

  • Scaling Up/Scaling Down HPE Gateways

    Scaling Up/Scaling Down HPE Gateways
    Photo by Pixabay on Pexels.com

    High Performance Encryption (HPE) is an Aviatrix technology that enables 10 Gbps and higher IPsec performance between two single Aviatrix Gateway instances or between a single Aviatrix Gateway instance and on-prem Aviatrix appliance.

    You can change the Gateway Size if needed to adjust gateway throughput; the gateway will restart with the new instance size.

    IP addresses per network interface

    The following tables list the maximum number of network interfaces per instance type, and the maximum number of private IPv4 addresses and IPv6 addresses per network interface:

    Constraints

    • Although increasing the size of an Amazon EC2 instance for a gateway can be considered an online operation (traffic can be diverted to other spokes during the upgrade), a detach/re-attach step is still required to bring up the additional tunnels and let the gateway take advantage of the increased capacity.
    • Decreasing the size of an Amazon EC2 instance cannot be done online when it involves removing IP addresses from the network interface: the instance must be stopped and the network interface detached so the addresses no longer supported by the new instance type can be removed. In the example below, going from a c5n.18xlarge to a c5n.9xlarge requires removing 20 IP addresses.
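
    A quick way to see how close a gateway ENI is to this limit is to count its secondary private IPs with boto3; the ENI ID below is a placeholder.

    # Count secondary private IPv4 addresses on a gateway ENI (placeholder ENI ID).
    import boto3

    ec2 = boto3.client("ec2")


    def secondary_ip_count(eni_id: str) -> int:
        eni = ec2.describe_network_interfaces(NetworkInterfaceIds=[eni_id])["NetworkInterfaces"][0]
        return sum(1 for ip in eni["PrivateIpAddresses"] if not ip["Primary"])


    if __name__ == "__main__":
        print(secondary_ip_count("eni-0123456789abcdef0"))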

    Initial Scenario

    • gateways are c5n.large:
    • number of secondary IPs:
    • number of tunnels:

    Scale Up

    I’m going to scale to a c5n.9xlarge:

    • 1x Private IP and 29 Secondary IPs:
    • Number of tunnels:

    Tunnels are created or destroyed only during a detach/attach operation.

    • 14 tunnels per transit gateway after detaching and attaching:

    The number of tunnels depends on the transit gateway size.

    Scale Down

    From c5n.9xlarge to c5n.4xlarge:

    Decreasing to smaller sizes

    • c5n.2xlarge:
    • c5n.large:

    References

    https://docs.aviatrix.com/documentation/latest/building-your-network/gateway-settings.html?expand=true

    https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/MultipleIP.html

    https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html

  • Scaling Out Secure Dedicated Ingress on GCP

    Scaling Out Secure Dedicated Ingress on GCP
    close up photography of yellow green red and brown plastic cones on white lined surface
    Photo by Pixabay on Pexels.com

    Proposed Architecture

    The architecture presented below satisfies GCP customers’ requirement to insert third-party, compute-instance-based appliances into their traffic flows.

    The design uses HTTP(S) load balancers due to their advanced capabilities.

    Constraints

    • HTTP(S) load balancing supports ports 80, 8080, and 443.
    • The combination of instance (responsible for SNAT/DNAT of ingress traffic) and backend port can only be used a single time.
    • An instance may belong to at most one load-balanced instance group.

    GCP Load Balancers Decision Chart

    Chart from https://cloud.google.com/load-balancing/docs/load-balancing-overview

    Update DNS

    • Add the second app to Cloud DNS for proper name resolution (a sketch of this record change follows this list).
    • Create a second instance group and health check.
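
    A minimal sketch of the DNS change using the google-cloud-dns client, assuming a managed zone already exists; the project, zone name, record name, and IP address are placeholders.

    # Add an A record for the second app in an existing Cloud DNS managed zone.
    from google.cloud import dns

    client = dns.Client(project="my-project")           # placeholder project
    zone = client.zone("ingress-zone", "example.com.")  # placeholder zone and domain

    # Point app2 at the (placeholder) external load balancer frontend IP.
    record = zone.resource_record_set("app2.example.com.", "A", 300, ["203.0.113.10"])

    change = zone.changes()
    change.add_record_set(record)
    change.create()  # submit the change set to Cloud DNS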

    How to Scale Scenario 1

    • add a new external load balancer
    • add a new set of compute instances

    How to Scale Scenario 2

    • add a second back end using another set of compute instances
    • Use Routing Rules to forward traffic to the new back end

    How to Scale Scenario 3

    • add a new external HTTP(S) load balancer
    • create a new back end using the same instance group as before but using different ports
    • this step requires the creation of a new named port in the instance group (see the sketch after this list)
    • this step also requires properly configured, secure firewall rules
    • compute instance DNAT using SRC:DST port 81 and DST:DST port 80
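
    A hedged sketch of adding the extra named port from this scenario with the google-cloud-compute client (instanceGroups.setNamedPorts); the project, zone, instance group, and port names/numbers are placeholders.

    # Set the named ports on the existing instance group, keeping the original port
    # and adding a second one for the new backend service.
    from google.cloud import compute_v1

    PROJECT = "my-project"         # placeholder
    ZONE = "us-central1-a"         # placeholder
    INSTANCE_GROUP = "ingress-ig"  # placeholder

    request_body = compute_v1.InstanceGroupsSetNamedPortsRequest(
        named_ports=[
            compute_v1.NamedPort(name="http-app1", port=80),  # existing named port
            compute_v1.NamedPort(name="http-app2", port=81),  # new port for the second backend
        ]
    )

    compute_v1.InstanceGroupsClient().set_named_ports(
        project=PROJECT,
        zone=ZONE,
        instance_group=INSTANCE_GROUP,
        instance_groups_set_named_ports_request_resource=request_body,
    )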

    How to Scale Scenario 4

    • this scenario is a hybrid of scenarios 2 and 3
    • a new BE is created using port 82

    The health check (HC) remains the same as before, since we are still checking the health of the same compute instances:

    • routing rules
    • compute instance DNAT config:

    References

    https://research.google/pubs/pub44824/

    https://cloud.google.com/load-balancing/docs/load-balancing-overview

    https://cloud.google.com/load-balancing/docs/backend-service
