Cisco ACI Multi-Pod Part-1 || Overview

To understand how ACI Multi-Pod works and how it provides a fault-tolerant fabric, we need to understand its control plane and how it works under the hood.

Cisco ACI Multi-Pod control-plane protocols run independently in each pod, as follows:

  • Intermediate System-to-Intermediate System (IS-IS): for infra tunnel endpoint (TEP) reachability within a pod.

If IS-IS stops working in one pod, it does not affect IS-IS in the other pod, because IS-IS runs only between the spine and leaf switches within each pod.

For TEP reachability toward nodes in other pods, the spines learn the TEP range of each remote pod, rather than individual TEP IPs, via OSPF from the Inter-Pod Network (IPN), which we will discuss later.

IS-IS then advertises this range locally to all leaf switches in the pod.
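The benefit of advertising a summarized TEP range is that a spine only needs one prefix per remote pod to forward traffic. A minimal sketch of that lookup, using Python's standard `ipaddress` module (the pod names and TEP pool prefixes below are illustrative assumptions, not Cisco defaults):

```python
import ipaddress

# Hypothetical infra TEP pools per pod (illustrative values only).
pod_tep_pools = {
    "pod1": ipaddress.ip_network("10.1.0.0/16"),
    "pod2": ipaddress.ip_network("10.2.0.0/16"),
}

def next_hop_pod(dest_tep: str) -> str:
    """Return which pod's summarized TEP prefix covers a destination TEP.

    A spine only needs the per-pod summary (learned via OSPF from the IPN),
    not every individual TEP address in the remote pod.
    """
    addr = ipaddress.ip_address(dest_tep)
    for pod, pool in pod_tep_pools.items():
        if addr in pool:
            return pod
    raise LookupError(f"no pod TEP pool covers {dest_tep}")

print(next_hop_pod("10.2.37.5"))  # → pod2
```

Any individual TEP in a remote pod resolves through that pod's single summary prefix, which is why a failure of IS-IS inside one pod does not disturb the other pod's routing.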

  • Council of Oracle Protocol (COOP): for endpoint information learned within a pod. It is not affected if COOP stops working in the other pod, because COOP runs only between the leaf and spine switches in each pod.

Endpoint information learned in one pod is shared with the other pod and stored in that pod's COOP database via MP-BGP EVPN sessions between the spine switches of each pod across the IPN.

  • VPNv4/VPNv6 MP-BGP: for L3Out route distribution within a pod.

Likewise, if MP-BGP stops working in one pod, it does not affect MP-BGP in the other pod, because MP-BGP runs between the spine route reflectors and the leaf switches within each pod to distribute L3Out routes inside that pod.

On top of the MP-BGP sessions within a pod, Multi-Pod establishes additional MP-BGP VPNv4/VPNv6 sessions between the spine switches of each pod across the IPN, to share L3Out routes learned in one pod with the other pods.
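The two session layers described above can be sketched as a small model: intra-pod sessions from each spine route reflector to its local leaves, plus inter-pod sessions between the route-reflector spines of different pods over the IPN. The node names below are hypothetical, and the model deliberately ignores redundancy details:

```python
def mpbgp_sessions(pods):
    """Enumerate both MP-BGP session layers of a Multi-Pod fabric.

    `pods` maps pod name -> (route_reflector_spines, leaves).
    Illustrative model only; real fabrics add redundant RRs and sessions.
    """
    # Layer 1: intra-pod iBGP, spine RR to every local leaf.
    intra = [(rr, leaf, pod)
             for pod, (rrs, leaves) in pods.items()
             for rr in rrs for leaf in leaves]
    # Layer 2: inter-pod sessions between RR spines of each pod pair, over the IPN.
    pod_names = list(pods)
    inter = []
    for i, p1 in enumerate(pod_names):
        for p2 in pod_names[i + 1:]:
            for s1 in pods[p1][0]:
                for s2 in pods[p2][0]:
                    inter.append((s1, s2))
    return intra, inter

pods = {
    "pod1": (["spine1-1"], ["leaf1-1", "leaf1-2"]),
    "pod2": (["spine2-1"], ["leaf2-1"]),
}
intra, inter = mpbgp_sessions(pods)
print(len(intra), len(inter))  # → 3 1
```

Because the intra-pod sessions never cross the IPN, losing the inter-pod sessions only stops new L3Out routes from propagating between pods; each pod's internal route distribution keeps working.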


Failure Scenarios

Communication inside ACI depends on the COOP database, and, as we know, the database used by the Cisco APIC is split into several database units (shards). Each shard is replicated three times, with each copy assigned to a specific Cisco APIC. As a result, a Multi-Pod fabric may face different failure scenarios depending on how the APIC nodes are placed across the pods.

With a 3-node APIC cluster, each database shard is replicated on every APIC node in the cluster.

But with a 5-node cluster, each database shard is replicated on only three of the five nodes, so certain failures can impact the fabric.
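The difference between the two cluster sizes follows directly from placing 3 replicas on distinct nodes. A minimal sketch with the standard library (numbering the APICs 1..N for illustration):

```python
from itertools import combinations

REPLICAS = 3  # each APIC database shard has three copies

def replica_placements(cluster_size: int):
    """All possible ways to place the 3 replicas of one shard on distinct APICs."""
    return list(combinations(range(1, cluster_size + 1), REPLICAS))

# 3-node cluster: only one possible placement, so every node holds
# a copy of every shard.
print(replica_placements(3))       # → [(1, 2, 3)]

# 5-node cluster: ten possible placements, so any given shard lives
# on only 3 of the 5 nodes.
print(len(replica_placements(5)))  # → 10
```

This is why the 5-node cluster behaves differently in the failure scenarios below: which three nodes hold a given shard determines which pod can still write it.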

Split-Brain Failure Scenario:

Multi-Pod Split-brain Scenario

A split-brain failure scenario happens when the connectivity between the pods is interrupted.

In this scenario, all the APIC cluster nodes are up, but connectivity between the pods is down, so there is no communication between the APIC nodes in Pod 1 and the APIC node in Pod 2. With a 3-node cluster (two APICs in Pod 1, one in Pod 2), there is no issue with read-write configuration in Pod 1, because the majority of APIC nodes are in Pod 1. The APIC node in Pod 2, however, goes into read-only mode, which affects its operation: it cannot perform any configuration for its local pod.

If the APIC cluster has 5 nodes (e.g., 3 APICs in Pod 1, 2 APICs in Pod 2), the read/write or read-only mode is nondeterministic because of the shard replica distribution (3 replicas of each object). Some objects may have one replica on an APIC in Pod 1 and two replicas on APICs in Pod 2, so the Pod 2 APICs hold the majority and are in read/write mode for those objects, while other objects have their majority in Pod 1, where the Pod 1 APICs are in read/write mode.
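The per-shard nondeterminism can be made concrete with a small sketch: during a split, a pod can write a shard only if it holds at least 2 of that shard's 3 replicas. The APIC numbering and replica placements below are illustrative assumptions:

```python
def writable_pod(replica_nodes, pod_of):
    """Return the pod that can still write a shard during a split.

    A pod keeps write access to a shard only if it holds a majority
    (>= 2 of 3) of that shard's replicas.
    """
    counts = {}
    for node in replica_nodes:
        pod = pod_of[node]
        counts[pod] = counts.get(pod, 0) + 1
    for pod, count in counts.items():
        if count >= 2:
            return pod
    return None

# 5-node cluster: APICs 1-3 in Pod 1, APICs 4-5 in Pod 2 (illustrative).
pod_of = {1: "pod1", 2: "pod1", 3: "pod1", 4: "pod2", 5: "pod2"}

print(writable_pod((1, 2, 3), pod_of))  # shard fully in Pod 1 → pod1
print(writable_pod((1, 4, 5), pod_of))  # replica majority in Pod 2 → pod2
```

Because replica placement varies shard by shard, neither pod is uniformly read/write during the split: each side can modify only the objects whose shard majority it happens to hold.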

So it is very important to keep the connectivity between the two pods up, and to restore it as soon as possible if it fails.

Pod Failure Scenario


Consider a pod failure or disaster hitting one of the data centers. Assume a 3-node APIC cluster, and that the failure hits the pod holding the majority of APIC nodes (with each shard replicated across the three nodes). In this case, the pod with the single APIC node goes into read-only mode.

In such a case, we can add a standby APIC in Pod 2 and promote this controller to active, re-establishing the quorum of Cisco APICs.

In a 5-node APIC cluster scenario (3 APICs in Pod 1 and 2 APICs in Pod 2), if the failure hits Pod 1, which has the majority of the APICs, the same thing happens: the APICs in Pod 2 go into read-only mode, and we can add a standby controller to restore cluster majority and re-establish quorum.

The only difference here is that, even with a standby controller, this failure may lead to the loss of information for any shards whose three replicas all resided on the nodes in Pod 1 (the failed pod).
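Which shards are at risk can be sketched by checking whether every replica of a shard sat in the failed pod. The shard names and placements below are hypothetical examples:

```python
def lost_shards(shards, pod_of, failed_pod):
    """Return shards whose every replica resided on APICs in the failed pod.

    These shards have no surviving copy, so their data cannot be rebuilt
    by the remaining cluster and must come from a configuration backup.
    """
    return [name for name, replicas in shards.items()
            if all(pod_of[node] == failed_pod for node in replicas)]

# 5-node cluster: APICs 1-3 in Pod 1, APICs 4-5 in Pod 2 (illustrative).
pod_of = {1: "pod1", 2: "pod1", 3: "pod1", 4: "pod2", 5: "pod2"}
shards = {
    "shard-a": (1, 2, 3),  # all three replicas on Pod 1 APICs
    "shard-b": (1, 2, 4),  # one replica survives on APIC 4
    "shard-c": (3, 4, 5),  # two replicas survive in Pod 2
}

print(lost_shards(shards, pod_of, "pod1"))  # → ['shard-a']
```

In a 3-node cluster this set is always empty for a single-pod failure (every shard lives on all three nodes), which is exactly the difference the paragraph above describes.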

In such a scenario, we can recover the fabric from a configuration backup.


In the next article, we will discuss the Inter-Pod Network (IPN): how it works and its design considerations.

