Thomas Mangin edited this page Nov 15, 2025 · 1 revision

Data Center Interconnect (DCI)

ExaBGP enables Data Center Interconnect using EVPN, VPLS, and Layer 2 extension for multi-site deployments, disaster recovery, and workload mobility.


DCI Requirements

Data Center Interconnect solutions need:

  • Layer 2 extension: Stretch VLANs across sites for VM mobility
  • Layer 3 routing: Efficient inter-site routing and traffic engineering
  • Multi-tenancy: Isolated networks per customer/application
  • Disaster recovery: Seamless failover between sites
  • Workload mobility: Live migration of VMs/containers across sites
  • Geographic distribution: Active-active or active-standby deployments

DCI Technologies

ExaBGP supports multiple DCI approaches:

Technology          Layer   Use Case                        Complexity
EVPN                L2/L3   Modern DCI, VXLAN-based         Medium
VPLS                L2      Legacy DCI, MPLS-based          Medium
L3VPN               L3      IP-only DCI, no L2 extension    Low
Stretched Anycast   L3      Service availability, no state  Low

ExaBGP Role

ExaBGP enables DCI by:

  1. Route exchange: Advertise endpoints between data centers via BGP
  2. Topology discovery: Collect site information via BGP-LS
  3. Traffic engineering: Control inter-site traffic flow
  4. Automation: Dynamic provisioning via API

Important: ExaBGP exchanges routes but does NOT create tunnels or manipulate forwarding tables. Your application must configure VXLAN/GRE tunnels and install routes.
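For example, an application consuming ExaBGP's output could translate a received MAC/IP route into Linux forwarding state. The sketch below only builds the `bridge`/`ip` command strings; the field names, the VXLAN device name, and the idea of pre-populating the FDB and neighbor tables are illustrative assumptions, not ExaBGP behavior:

```python
#!/usr/bin/env python3
"""Sketch: turn one received MAC/IP route into Linux FDB/neighbor commands.

The route dict layout and the exact `bridge`/`ip` invocations are
illustrative assumptions -- adapt to your ExaBGP version and devices.
"""

def commands_for_mac_route(route, vxlan_dev="vxlan10000"):
    """Build (but do not run) the commands installing one remote endpoint."""
    mac = route["mac"]
    ip = route["ip"]
    vtep = route["vtep"]  # remote VTEP learned from the EVPN route
    return [
        # Point the MAC at the remote VTEP over the VXLAN device
        f"bridge fdb replace {mac} dev {vxlan_dev} dst {vtep}",
        # Pre-populate the neighbor table to avoid flood-and-learn
        f"ip neigh replace {ip} lladdr {mac} dev {vxlan_dev}",
    ]

if __name__ == "__main__":
    route = {"mac": "00:11:22:33:44:55", "ip": "192.168.1.10", "vtep": "10.2.0.1"}
    for cmd in commands_for_mac_route(route):
        print(cmd)
```

In production the commands would be executed (or replaced with netlink calls) and mirrored by a withdraw path that removes the entries.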

DCI Architectures

Pattern 1: Active-Active DCI

Both data centers serve traffic simultaneously:

[DC1: VTEP 10.1.0.1]  <--EVPN-->  [DC2: VTEP 10.2.0.1]
        |                                  |
    [VMs: VNI 10000]                  [VMs: VNI 10000]
        |                                  |
    [IP: 192.168.1.0/24]              [IP: 192.168.1.0/24]

Benefits:

  • Load distribution across sites
  • No idle capacity
  • Optimal resource utilization

Challenges:

  • Requires stretched Layer 2
  • Potential for MAC flapping
  • Need for loop prevention

Pattern 2: Active-Standby DCI

One site active, other standby for DR:

[DC1: Active]  <--EVPN-->  [DC2: Standby]
     |                           |
  [Services]              [Services (inactive)]

Benefits:

  • Simple failover
  • No split-brain issues
  • Clear ownership

Challenges:

  • Wasted standby capacity
  • Slower failover than active-active

Pattern 3: Multi-Site Fabric

Multiple data centers in full mesh:

    [DC1]
    /    \
[DC2]----[DC3]
    \    /
    [DC4]

Benefits:

  • Geographic distribution
  • Resilience to site failures
  • Flexible workload placement

Challenges:

  • Complex routing
  • Higher WAN bandwidth needs
  • Increased latency
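A full mesh of N sites needs N(N-1)/2 BGP sessions, so generating the per-site neighbor stanzas from a site inventory quickly pays off. A sketch, where the site names, VTEP addresses, and AS numbers are illustrative:

```python
#!/usr/bin/env python3
"""Sketch: emit ExaBGP neighbor stanzas for a full-mesh DCI fabric.
Site addressing and AS numbers are illustrative placeholders."""

SITES = {
    "dc1": {"vtep": "10.1.0.1", "asn": 65001},
    "dc2": {"vtep": "10.2.0.1", "asn": 65002},
    "dc3": {"vtep": "10.3.0.1", "asn": 65003},
    "dc4": {"vtep": "10.4.0.1", "asn": 65004},
}

def mesh_neighbors(local_site):
    """Return one neighbor block per remote site (full mesh)."""
    local = SITES[local_site]
    blocks = []
    for name, remote in SITES.items():
        if name == local_site:
            continue  # no session to ourselves
        blocks.append(
            f"neighbor {remote['vtep']} {{\n"
            f"    router-id {local['vtep']};\n"
            f"    local-address {local['vtep']};\n"
            f"    local-as {local['asn']};\n"
            f"    peer-as {remote['asn']};\n"
            f"    family {{\n        l2vpn evpn;\n    }}\n"
            f"}}\n"
        )
    return blocks

if __name__ == "__main__":
    print("\n".join(mesh_neighbors("dc1")))
```

Running the same generator at each site with its own `local_site` keeps the mesh consistent as sites are added or removed.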

EVPN for DCI

EVPN Type 2: MAC/IP Advertisement

Advertise endpoints across data centers:

#!/usr/bin/env python3
import sys

# Data center configuration
DC_ID = 1
VTEP_IP = f"10.{DC_ID}.0.1"
VNI = 10000

def announce_endpoint(mac, ip):
    """Announce endpoint to remote DC"""
    rd = f"{VTEP_IP}:1"
    rt = f"65000:{VNI}"

    print(f"announce route-distinguisher {rd} "
          f"mac {mac} ip {ip} "
          f"label {VNI} "
          f"route-target {rt}", flush=True)

# Announce local endpoints
announce_endpoint("00:11:22:33:44:55", "192.168.1.10")
announce_endpoint("00:11:22:33:44:56", "192.168.1.11")

while True:
    line = sys.stdin.readline()
    if not line:  # EOF: ExaBGP closed the pipe
        break

EVPN Type 5: IP Prefix Advertisement

For Layer 3 DCI without L2 extension:

#!/usr/bin/env python3
import sys

DC_ID = 1
VTEP_IP = f"10.{DC_ID}.0.1"
VNI = 10000

def announce_prefix(prefix):
    """Announce IP prefix to remote DC (EVPN Type 5)"""
    rd = f"{VTEP_IP}:1"
    rt = f"65000:{VNI}"

    # Announced here via the generic route syntax of the text API;
    # ExaBGP's native EVPN Type 5 (ip-prefix) syntax varies between
    # versions, so check the syntax your release accepts
    print(f"announce route {prefix} "
          f"next-hop {VTEP_IP} "
          f"route-target {rt} "
          f"label {VNI}", flush=True)

# Announce DC1 prefixes to DC2
announce_prefix("192.168.1.0/24")
announce_prefix("192.168.2.0/24")

while True:
    line = sys.stdin.readline()
    if not line:  # EOF: ExaBGP closed the pipe
        break

Gateway Redundancy

Use EVPN for gateway failover:

#!/usr/bin/env python3
import sys
import time
import requests

DC_ID = 1
VTEP_IP = f"10.{DC_ID}.0.1"
GATEWAY_IP = "192.168.1.1"
VNI = 10000

def check_gateway_health():
    """Check if local gateway is healthy"""
    try:
        # Check gateway reachability
        response = requests.get(f"http://{GATEWAY_IP}/health", timeout=2)
        return response.status_code == 200
    except requests.RequestException:
        return False

def announce_gateway(mac, ip):
    """Announce gateway IP to take over"""
    rd = f"{VTEP_IP}:1"
    rt = f"65000:{VNI}"

    print(f"announce route-distinguisher {rd} "
          f"mac {mac} ip {ip} "
          f"label {VNI} "
          f"route-target {rt}", flush=True)

def withdraw_gateway(mac, ip):
    """Withdraw gateway announcement"""
    rd = f"{VTEP_IP}:1"

    print(f"withdraw route-distinguisher {rd} "
          f"mac {mac} ip {ip}", flush=True)

# Monitor gateway and announce/withdraw
announced = False
GATEWAY_MAC = "00:11:22:33:44:01"

while True:
    healthy = check_gateway_health()

    if healthy and not announced:
        announce_gateway(GATEWAY_MAC, GATEWAY_IP)
        announced = True
    elif not healthy and announced:
        withdraw_gateway(GATEWAY_MAC, GATEWAY_IP)
        announced = False

    time.sleep(10)

VPLS for DCI

VPLS Overview

VPLS (Virtual Private LAN Service) provides Layer 2 VPN using MPLS:

#!/usr/bin/env python3
import sys

# VPLS configuration
VPLS_ID = 100
RD = "10.1.0.1:100"
RT = "65000:100"
PE_ADDRESS = "10.1.0.1"

def announce_vpls():
    """Announce VPLS endpoint"""
    # VPLS via BGP (RFC 4761)
    print(f"announce route-distinguisher {RD} "
          f"vpls {VPLS_ID} "
          f"endpoint {PE_ADDRESS} "
          f"route-target {RT}", flush=True)

announce_vpls()

while True:
    line = sys.stdin.readline()
    if not line:  # EOF: ExaBGP closed the pipe
        break

Multi-Site VPLS

Connect multiple data centers via VPLS:

#!/usr/bin/env python3
import sys

# Multi-site VPLS
SITES = {
    'dc1': {'pe': '10.1.0.1', 'vpls_id': 100},
    'dc2': {'pe': '10.2.0.1', 'vpls_id': 100},
    'dc3': {'pe': '10.3.0.1', 'vpls_id': 100}
}

SITE_NAME = 'dc1'  # Current site
RD_BASE = SITES[SITE_NAME]['pe']
VPLS_ID = SITES[SITE_NAME]['vpls_id']
RT = f"65000:{VPLS_ID}"

def announce_vpls_endpoint():
    """Announce VPLS endpoint for this site"""
    rd = f"{RD_BASE}:{VPLS_ID}"
    pe = SITES[SITE_NAME]['pe']

    print(f"announce route-distinguisher {rd} "
          f"vpls {VPLS_ID} "
          f"endpoint {pe} "
          f"route-target {RT}", flush=True)

announce_vpls_endpoint()

while True:
    line = sys.stdin.readline()
    if not line:  # EOF: ExaBGP closed the pipe
        break

Configuration Examples

EVPN DCI Configuration

Configuration (/etc/exabgp/dci-evpn.conf):

process dci-controller {
    run python3 /etc/exabgp/dci-evpn.py;
    encoder json;
}

# Connection to local spine switches
neighbor 10.1.0.254 {
    router-id 10.1.0.1;
    local-address 10.1.0.1;
    local-as 65001;
    peer-as 65001;

    family {
        l2vpn evpn;
        ipv4 unicast;
    }

    api {
        processes [ dci-controller ];
    }
}

# Connection to remote DC (via DCI circuit)
neighbor 10.2.0.1 {
    router-id 10.1.0.1;
    local-address 10.1.0.1;
    local-as 65001;
    peer-as 65002;

    family {
        l2vpn evpn;
    }

    api {
        processes [ dci-controller ];
    }
}

VPLS DCI Configuration

Configuration (/etc/exabgp/dci-vpls.conf):

process vpls-controller {
    run python3 /etc/exabgp/vpls-announce.py;
    encoder text;
}

neighbor 10.1.0.254 {
    router-id 10.1.0.1;
    local-address 10.1.0.1;
    local-as 65001;
    peer-as 65001;

    family {
        l2vpn vpls;
    }

    api {
        processes [ vpls-controller ];
    }
}

neighbor 10.2.0.1 {
    router-id 10.1.0.1;
    local-address 10.1.0.1;
    local-as 65001;
    peer-as 65002;

    family {
        l2vpn vpls;
    }

    api {
        processes [ vpls-controller ];
    }
}

Stretched Anycast

Simple anycast without L2 extension:

#!/usr/bin/env python3
import sys
import time
import requests

# Anycast service IP
ANYCAST_IP = "1.2.3.4/32"
NEXT_HOP = "10.1.0.1"

def check_service_health():
    """Check local service health"""
    try:
        response = requests.get("http://127.0.0.1:8080/health", timeout=2)
        return response.status_code == 200
    except requests.RequestException:
        return False

announced = False

while True:
    healthy = check_service_health()

    if healthy and not announced:
        print(f"announce route {ANYCAST_IP} next-hop {NEXT_HOP}", flush=True)
        announced = True
    elif not healthy and announced:
        print(f"withdraw route {ANYCAST_IP} next-hop {NEXT_HOP}", flush=True)
        announced = False

    time.sleep(10)

Troubleshooting

Issue: MAC Flapping

Problem: Same MAC appears in multiple locations

Debugging:

# Check for duplicate MAC
show bgp l2vpn evpn route-type 2 | grep "00:11:22:33:44:55"

# Verify Route Distinguishers are unique
exabgpcli show neighbor advertised-routes

Solution:

  • Ensure unique RD per VTEP
  • Check for VM migration loops
  • Verify no duplicate MAC addresses assigned

Issue: High Latency Between Sites

Problem: Slow inter-site communication

Debugging:

# Check underlay latency
ping 10.2.0.1

# Verify tunnel MTU
ip link show vxlan10

# Check for packet fragmentation
tcpdump -i eth0 -vv udp port 4789

Solution:

  • Optimize underlay routing
  • Reduce the overlay MTU for VXLAN overhead (50 bytes over an IPv4 underlay)
  • Use jumbo frames if supported
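The 50-byte figure is simply the sum of the headers VXLAN adds around each inner frame, so the safe overlay MTU is direct arithmetic:

```python
# VXLAN encapsulation overhead over an IPv4 underlay
# (an IPv6 outer header adds 20 more bytes)
OUTER_IPV4 = 20   # outer IPv4 header
OUTER_UDP = 8     # outer UDP header
VXLAN_HDR = 8     # VXLAN header
INNER_ETH = 14    # inner Ethernet header carried inside the tunnel
OVERHEAD = OUTER_IPV4 + OUTER_UDP + VXLAN_HDR + INNER_ETH  # 50 bytes

def overlay_mtu(underlay_mtu):
    """Largest inner MTU that fits without fragmenting the underlay."""
    return underlay_mtu - OVERHEAD

print(overlay_mtu(1500))  # standard underlay -> 1450
print(overlay_mtu(9000))  # jumbo-frame underlay -> 8950
```

With a jumbo-frame underlay you can keep the overlay at 1500 (or higher) and avoid touching guest MTUs at all.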

Issue: Split-Brain

Problem: Both sites think they're primary

Debugging:

# Check gateway announcements from both sites
show bgp l2vpn evpn | grep "192.168.1.1"

Solution:

  • Implement proper health checking
  • Use consensus mechanism (e.g., Raft, etcd)
  • Add manual override capability
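Short of a full consensus store, a deterministic tie-break already prevents dual announcement: when several sites report healthy, only the one with the best configured priority announces the gateway. A sketch, with site names and priorities as illustrative placeholders:

```python
def should_announce(local_site, healthy_sites, priority):
    """Announce only if this site has the best (lowest) priority
    among the sites currently reporting healthy.

    priority: mapping of site name -> integer (lower wins),
    configured identically at every site.
    """
    if local_site not in healthy_sites:
        return False
    best = min(healthy_sites, key=lambda s: priority[s])
    return best == local_site

PRIORITY = {"dc1": 10, "dc2": 20}  # illustrative; lower wins

# Both healthy: only dc1 announces, so no dual announcement
print(should_announce("dc1", {"dc1", "dc2"}, PRIORITY))
print(should_announce("dc2", {"dc1", "dc2"}, PRIORITY))
# dc1 down: dc2 takes over
print(should_announce("dc2", {"dc2"}, PRIORITY))
```

Note the caveat: this only avoids split-brain if both sites share a consistent view of who is healthy (e.g. via a third witness site); when the health views themselves can diverge, a real consensus mechanism is still required.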

Production Best Practices

Monitoring

Monitor DCI health:

#!/usr/bin/env python3
from prometheus_client import start_http_server, Gauge
import sys
import json

# Metrics
dci_link_status = Gauge('dci_link_status', 'DCI link status', ['site'])
dci_endpoints = Gauge('dci_endpoints_total', 'Total endpoints', ['site'])
dci_latency = Gauge('dci_latency_ms', 'Inter-site latency', ['site'])

start_http_server(9100)

# Update metrics from ExaBGP JSON on stdin
while True:
    line = sys.stdin.readline()
    if not line:  # EOF: ExaBGP closed the pipe
        break

    try:
        data = json.loads(line)
        # Count received updates per peer; the exact JSON layout varies
        # by ExaBGP version, so treat this parsing as a sketch
        if data.get('type') == 'update':
            peer = data.get('neighbor', {}).get('address', {}).get('peer', 'unknown')
            dci_endpoints.labels(site=peer).inc()
    except json.JSONDecodeError:
        pass

Capacity Planning

Consider DCI bandwidth requirements:

  • Control plane: BGP updates (typically < 1 Mbps)
  • Data plane: Actual workload traffic (depends on applications)
  • Overhead: VXLAN/GRE adds ~50 bytes per packet
  • Burst handling: Account for VM migration traffic

Security

Secure DCI links:

  1. Encryption: Use IPsec or MACsec for DCI links
  2. Authentication: Use BGP MD5 or TCP-AO
  3. Route filtering: Accept only expected routes
  4. Rate limiting: Protect against BGP route storms
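For point 2, ExaBGP can apply TCP MD5 per neighbor. A fragment for the remote-DC session (the password is a placeholder, and MD5 needs OS support for TCP_MD5SIG, e.g. on Linux):

```
neighbor 10.2.0.1 {
    router-id 10.1.0.1;
    local-address 10.1.0.1;
    local-as 65001;
    peer-as 65002;
    md5-password "dci-secret";   # placeholder -- use a real secret

    family {
        l2vpn evpn;
    }
}
```

The same password must be configured on the remote peer, or the TCP session will not establish.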

See Also

Address Families

Related Use Cases

Configuration

API


👻 Ghost written by Claude (Anthropic AI)
