Performance Tuning
Optimize ExaBGP for scale, performance, and reliability
ExaBGP is lightweight by design - most deployments run with minimal tuning
- Overview
- Performance Baseline
- BGP Session Tuning
- Route Scale Optimization
- Memory Management
- CPU Optimization
- Network Performance
- API Process Optimization
- Benchmarking
- Capacity Planning
- Best Practices
ExaBGP is designed for efficiency:
Typical resource usage (single BGP session, 100 routes):
- CPU: 2-5% on modern hardware
- Memory: 50-100 MB
- Network: < 1 Mbps
- Startup time: < 1 second
Scaling characteristics:
- BGP sessions: 100+ per instance
- Routes announced: 10,000+ per session
- Route updates: 1,000+ per second
- API processes: 10+ concurrent processes
Optimize when:
- Managing 50+ BGP sessions
- Announcing 1,000+ routes per session
- Receiving full Internet tables (800k+ routes)
- High route churn (100+ updates/second)
- Resource-constrained environments (embedded systems)
Don't optimize prematurely:
- Most deployments need no tuning
- Default settings work for 95% of use cases
- Measure first, optimize second
Before optimizing, measure current performance:
#!/bin/bash
# ExaBGP performance baseline script
echo "=== ExaBGP Performance Baseline ==="
echo
# Process info
echo "Process:"
ps aux | grep exabgp | grep -v grep | awk '{print "PID: "$2" CPU: "$3"% MEM: "$4"% RSS: "$6" KB"}'
echo
# BGP sessions
echo "BGP Sessions:"
grep "neighbor.*up" /var/log/exabgp.log | tail -5
echo
# Route counts
echo "Routes Announced:"
grep "announce route" /var/log/exabgp.log | wc -l
echo
# Memory usage
echo "Memory:"
pmap $(pgrep -f exabgp) | tail -1
echo
# Network connections
echo "Network:"
ss -tan | grep :179
Run baseline test:
./baseline.sh > baseline-$(date +%Y%m%d).txt
Default values:
# ExaBGP defaults (seconds)
hold-time 180
keepalive 60
For stable networks (optimize for efficiency):
neighbor 192.168.1.1 {
router-id 192.168.1.2;
local-as 65001;
peer-as 65000;
# Longer timers reduce overhead
hold-time 240; # 4 minutes (default: 180)
# keepalive = hold-time / 3 = 80 seconds
}
For unstable networks (optimize for fast detection):
neighbor 192.168.1.1 {
# Shorter timers detect failures faster
hold-time 90; # 1.5 minutes
# keepalive = 30 seconds
}
Trade-offs (a worked example follows this list):
- Longer timers: less CPU/network overhead, slower failure detection
- Shorter timers: faster failure detection, more overhead, higher risk of false positives
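To put numbers on the trade-off, the small sketch below computes both sides for a few hold-time values. It assumes keepalive = hold-time / 3 (as in the configurations above) and uses 19 bytes as the size of a BGP KEEPALIVE (the bare BGP header); the per-day figures are per session:
#!/usr/bin/env python3
"""Worked example: failure detection time vs. keepalive overhead per hold-time."""
KEEPALIVE_SIZE = 19  # bytes - a BGP KEEPALIVE is just the 19-byte header
for hold_time in (90, 180, 240):
    keepalive = hold_time // 3         # keepalive interval = hold-time / 3
    msgs_per_day = 86400 // keepalive  # keepalives sent per session per day
    kib_per_day = msgs_per_day * KEEPALIVE_SIZE / 1024
    print(f"hold-time {hold_time:3d}s: worst-case detection {hold_time}s, "
          f"keepalive every {keepalive}s, ~{msgs_per_day} msgs/day (~{kib_per_day:.0f} KiB/day)")
The output makes the point plainly: even the shortest timer costs only a few dozen KiB per day per session, so the real deciding factor is how quickly you need to detect a dead peer versus how tolerant you are of false positives.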
System-level TCP tuning:
# /etc/sysctl.d/99-exabgp.conf
# Increase TCP buffer sizes
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Enable TCP window scaling
net.ipv4.tcp_window_scaling = 1
# Reduce TIME_WAIT sockets
net.ipv4.tcp_tw_reuse = 1
# Increase connection backlog
net.core.somaxconn = 1024
# Apply changes
sysctl -p /etc/sysctl.d/99-exabgp.conf
Configure connection retry behavior:
# Environment variables
export exabgp_tcp_once=false # Keep retrying (default: true)
export exabgp_tcp_delay=5 # Retry delay in seconds (default: 5)
For flaky networks:
# Be patient with retries
export exabgp_tcp_delay=10
Configuration for large route announcements:
neighbor 192.168.1.1 {
router-id 192.168.1.2;
local-as 65001;
peer-as 65000;
family {
ipv4 unicast;
}
# Optimize for bulk route announcements
capability {
graceful-restart; # RFC 4724 - smooth restarts
add-path send/receive; # RFC 7911 - multiple paths
}
}
Batch route announcements:
✅ BEST: Use bulk announcements (ExaBGP 4.0+):
#!/usr/bin/env python3
"""
Optimal: Use bulk announcements for same attributes
"""
import sys
# Generate 10,000 routes
routes = [f"100.{i//256}.{i%256}.0/24" for i in range(10000)]
# Option 1: Announce ALL in one command (fastest)
# Good for routes with same attributes
sys.stdout.write(f"announce attributes next-hop self nlri {' '.join(routes)}\n")
sys.stdout.flush()
# Option 2: Announce in batches (if command line too long)
# Use batches of 1000 routes
batch_size = 1000
for i in range(0, len(routes), batch_size):
    batch = routes[i:i+batch_size]
    sys.stdout.write(f"announce attributes next-hop self nlri {' '.join(batch)}\n")
    sys.stdout.flush()
Performance: 10-100x faster than individual announce route commands.
Legacy method (ExaBGP 3.x or if attributes differ per route):
#!/usr/bin/env python3
"""
Legacy: Individual route announcements
"""
import sys
# Generate 10,000 routes
routes = [f"100.{i//256}.{i%256}.0/24" for i in range(10000)]
# Announce in batches
batch_size = 100
for i in range(0, len(routes), batch_size):
    batch = routes[i:i+batch_size]
    # Send batch
    for route in batch:
        sys.stdout.write(f"announce route {route} next-hop self\n")
    # Flush after batch
    sys.stdout.flush()
Filter unnecessary routes at ExaBGP:
neighbor 192.168.1.1 {
# ... neighbor config ...
api {
processes [filter];
receive {
parsed; # Receive parsed routes
update; # Receive updates
}
}
}
process filter {
run python3 /etc/exabgp/filter.py;
encoder text;
}
Filter script:
#!/usr/bin/env python3
"""
Filter incoming routes to reduce memory
"""
import sys
for line in sys.stdin:
    # Only process specific prefixes
    if 'route' in line:
        if any(prefix in line for prefix in ['10.0.0.0/8', '192.168.0.0/16']):
            # Log accepted routes
            sys.stderr.write(f"ACCEPT: {line}")
        else:
            # Drop others (saves memory)
            sys.stderr.write(f"DROP: {line}")
            continue
    # Forward accepted lines
    sys.stdout.write(line)
    sys.stdout.flush()
ExaBGP memory breakdown:
Component Memory Usage
-----------------------------------------
Base process ~30 MB
Per BGP session ~1-2 MB
Per announced route ~100 bytes
Per received route ~200 bytes
API process overhead ~5-10 MB each
Example calculation:
5 BGP sessions = 5-10 MB
1,000 announced routes = 0.1 MB
10,000 received routes = 2 MB
2 API processes = 10-20 MB
-----------------------------------------
Total estimate = ~50 MB
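The same arithmetic as a small helper you can adapt; the per-component numbers are the rough figures from the breakdown above (midpoints used for the ranges), not measured values, so treat the result as a ballpark:
#!/usr/bin/env python3
"""Rough ExaBGP memory estimate based on the per-component figures above."""

def estimate_memory_mb(sessions, routes_announced, routes_received, api_processes):
    base = 30.0                                    # base process, MB
    per_session = sessions * 1.5                   # ~1-2 MB per BGP session
    announced = routes_announced * 100 / 1024**2   # ~100 bytes per announced route
    received = routes_received * 200 / 1024**2     # ~200 bytes per received route
    api = api_processes * 7.5                      # ~5-10 MB per API process
    return base + per_session + announced + received + api

if __name__ == "__main__":
    # The example above: 5 sessions, 1,000 announced, 10,000 received, 2 API processes
    # Midpoint figures land a little above the ~50 MB estimate, which is expected.
    print(f"Estimated: ~{estimate_memory_mb(5, 1_000, 10_000, 2):.0f} MB")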
1. Disable unnecessary features:
neighbor 192.168.1.1 {
# ... neighbor config ...
# Don't receive routes if not needed
api {
processes [announcer];
receive {
# parsed; # Disable if not processing received routes
# update; # Disable if not needed
}
}
}
2. Use JSON compact format (if using JSON API):
process announcer {
run python3 /etc/exabgp/announce.py;
encoder json; # More compact than text for large routes
}
3. Limit route storage:
# Don't store routes in memory if not needed
import sys

def process_route(line):
    # Placeholder: react to the route here (update state, call an API, ...)
    sys.stderr.write(f"processing: {line}")

# Process routes immediately without storing
for line in sys.stdin:
    # Process immediately
    process_route(line)
    # Don't store in list/dict
Monitor memory usage:
#!/bin/bash
# Monitor ExaBGP memory over time
while true; do
pid=$(pgrep -f exabgp | head -1)
if [ -n "$pid" ]; then
mem=$(ps -p $pid -o rss= | awk '{printf "%.1f", $1/1024}')
echo "$(date +%H:%M:%S) ExaBGP Memory: ${mem} MB"
fi
sleep 60
done
What consumes CPU:
Activity CPU Impact
-----------------------------------------
BGP keepalives Very low
Route announcements Low-moderate
Route withdrawals Low-moderate
Route processing (API) Moderate-high
JSON parsing High
Large route updates High
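JSON parsing is the most expensive item in the table above when you run encoder json. One mitigation is a cheap substring check before json.loads, so only the messages you actually care about pay the full parsing cost. A minimal sketch - it assumes ExaBGP's JSON messages carry a top-level "type" field, and the exact spacing of that substring depends on how your ExaBGP version serialises its output, so verify it against real messages first:
#!/usr/bin/env python3
"""Cheap pre-filter before JSON parsing to cut CPU on busy sessions."""
import json
import sys

updates = 0
for line in sys.stdin:
    # Substring check is far cheaper than json.loads(); only parse what we need.
    # Assumption: messages expose a top-level "type" field with this spacing.
    if '"type": "update"' not in line:
        continue
    try:
        message = json.loads(line)  # the expensive step, now only for updates
    except json.JSONDecodeError:
        continue  # ignore malformed or partial lines
    updates += 1
    sys.stderr.write(f"parsed update #{updates} ({len(message)} top-level keys)\n")
The techniques below reduce CPU further inside the API process itself.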
1. Reduce API process CPU:
#!/usr/bin/env python3
"""
Efficient health check with minimal CPU
"""
import sys
import time
import subprocess
# Cache results to avoid repeated checks
last_check_time = 0
last_check_result = False
check_interval = 5 # Only check every 5 seconds
def check_service():
    """Cached health check"""
    global last_check_time, last_check_result
    now = time.time()
    if now - last_check_time < check_interval:
        # Return cached result
        return last_check_result
    # Perform check
    try:
        result = subprocess.run(
            ['curl', '-sf', '-m', '2', 'http://localhost/health'],
            capture_output=True,
            timeout=3
        )
        last_check_result = result.returncode == 0
    except Exception:
        last_check_result = False
    last_check_time = now
    return last_check_result

while True:
    if check_service():
        sys.stdout.write("announce route 100.10.0.100/32 next-hop self\n")
    else:
        sys.stdout.write("withdraw route 100.10.0.100/32\n")
    sys.stdout.flush()
    # Sleep to reduce CPU (critical!)
    time.sleep(10)
2. Use efficient data structures:
# ❌ SLOW: List lookup
routes = ['10.0.0.0/8', '192.168.0.0/16', ...]
if route in routes:  # O(n) lookup
    process()

# ✅ FAST: Set lookup
routes = {'10.0.0.0/8', '192.168.0.0/16', ...}
if route in routes:  # O(1) lookup
    process()
3. Avoid tight loops:
# ❌ BAD: No sleep = 100% CPU
while True:
    check_service()

# ✅ GOOD: Sleep between checks
while True:
    check_service()
    time.sleep(5)
Profile Python API processes:
# cProfile ships with the Python standard library (no install needed)
# Run with profiling
python3 -m cProfile -o profile.stats /etc/exabgp/healthcheck.py
# Analyze results
python3 -m pstats profile.stats
> sort cumulative
> stats 10
1. Interface MTU:
# Check current MTU
ip link show eth0
# Increase MTU if possible (reduces packet overhead)
ip link set eth0 mtu 9000 # Jumbo frames (if supported)
2. Disable unnecessary services on BGP interface:
# Dedicated BGP interface
# Disable unnecessary protocols
ethtool -K eth0 tso off # TCP segmentation offload
ethtool -K eth0 gso off # Generic segmentation offload
3. Use direct routing:
# Ensure BGP peers are directly connected or via L2
# Avoid routing BGP over complex topologies
ip route get 192.168.1.1
Track BGP network metrics:
#!/bin/bash
# Monitor BGP network traffic
interface="eth0"
while true; do
# BGP traffic counters on port 179 (requires iptables accounting rules
# matching dpt:179 / spt:179; without them these values will be empty)
rx=$(iptables -L -v -n -x | grep "dpt:179" | awk '{print $2}')
tx=$(iptables -L -v -n -x | grep "spt:179" | awk '{print $2}')
echo "$(date +%H:%M:%S) BGP RX: $rx TX: $tx"
sleep 60
done
Best practices for API scripts:
1. Minimize I/O operations:
# β
GOOD: Batch output
messages = []
for route in routes:
messages.append(f"announce route {route} next-hop self")
sys.stdout.write('\n'.join(messages) + '\n')
sys.stdout.flush()
# β BAD: Flush after every route
for route in routes:
sys.stdout.write(f"announce route {route} next-hop self\n")
sys.stdout.flush() # Too many flushes!2. Use bulk announcements (ExaBGP 4.0+):
# β
BEST: Use bulk announcements for same attributes
# Announce 1000 routes with same attributes in ONE command
prefixes = [f"10.{i//256}.{i%256}.0/24" for i in range(1000)]
sys.stdout.write(f"announce attributes next-hop self nlri {' '.join(prefixes)}\n")
sys.stdout.flush()
# β SLOW: Individual announcements
# 1000 separate commands = 10-100x slower
for i in range(1000):
sys.stdout.write(f"announce route 10.{i//256}.{i%256}.0/24 next-hop self\n")
sys.stdout.flush()Performance gain: 10-100x faster for bulk operations. Single command reduces parsing overhead significantly.
See also: Bulk Announcements Documentation
3. Use buffered I/O:
import sys
import io
# Use buffered stdout
output = io.TextIOWrapper(sys.stdout.buffer, line_buffering=False)
# Write messages
output.write("announce route 10.0.0.0/8 next-hop self\n")
output.flush() # Flush when ready
4. Avoid external commands:
# ❌ SLOW: External commands
import subprocess
result = subprocess.run(['curl', 'http://localhost'])

# ✅ FAST: Python libraries
import urllib.request
result = urllib.request.urlopen('http://localhost')
Limit concurrent API processes:
# Don't run too many API processes
neighbor 192.168.1.1 {
api {
# Limit to essential processes only
processes [healthcheck]; # Not 10 different processes
}
}
Create comprehensive benchmark:
#!/bin/bash
# ExaBGP benchmark suite
echo "=== ExaBGP Performance Benchmark ==="
echo "Started: $(date)"
echo
# Test 1: Startup time
echo "Test 1: Startup time"
start=$(date +%s.%N)
exabgp --test /etc/exabgp/exabgp.conf > /dev/null 2>&1
end=$(date +%s.%N)
echo "Config parse time: $(echo "$end - $start" | bc) seconds"
echo
# Test 2: Route announcement rate (individual commands)
echo "Test 2: Route announcement rate - Individual commands (1000 routes)"
cat > /tmp/bench-announce-individual.py <<'EOF'
#!/usr/bin/env python3
import sys
for i in range(1000):
    sys.stdout.write(f"announce route 100.{i//256}.{i%256}.0/24 next-hop self\n")
sys.stdout.flush()
EOF
chmod +x /tmp/bench-announce-individual.py
start=$(date +%s.%N)
/tmp/bench-announce-individual.py
end=$(date +%s.%N)
duration=$(echo "$end - $start" | bc)
rate=$(echo "1000 / $duration" | bc)
echo "Duration: $duration seconds"
echo "Rate: $rate routes/second"
echo
# Test 3: Route announcement rate (bulk announcements)
echo "Test 3: Route announcement rate - Bulk announcements (1000 routes)"
cat > /tmp/bench-announce-bulk.py <<'EOF'
#!/usr/bin/env python3
import sys
prefixes = [f"100.{i//256}.{i%256}.0/24" for i in range(1000)]
sys.stdout.write(f"announce attributes next-hop self nlri {' '.join(prefixes)}\n")
sys.stdout.flush()
EOF
chmod +x /tmp/bench-announce-bulk.py
start=$(date +%s.%N)
/tmp/bench-announce-bulk.py
end=$(date +%s.%N)
duration=$(echo "$end - $start" | bc)
rate=$(echo "1000 / $duration" | bc)
echo "Duration: $duration seconds"
echo "Rate: $rate routes/second"
echo "Note: Bulk announcements (ExaBGP 4.0+) significantly faster"
echo
# Test 4: Memory usage under load
echo "Test 4: Memory usage"
pid=$(pgrep -f exabgp | head -1)
if [ -n "$pid" ]; then
echo "Current RSS: $(ps -p $pid -o rss= | awk '{print $1/1024}') MB"
echo "Current VSZ: $(ps -p $pid -o vsz= | awk '{print $1/1024}') MB"
fi
echo
# Test 5: CPU usage
echo "Test 5: CPU usage (ps lifetime average)"
if [ -n "$pid" ]; then
cpu=$(ps -p $pid -o %cpu= | head -1)
echo "CPU: $cpu%"
fi
echo
echo "Completed: $(date)"Simulate high route churn:
#!/usr/bin/env python3
"""
Load test: Rapid route announcements/withdrawals
"""
import sys
import time
routes = [f"100.{i//256}.{i%256}.0/24" for i in range(1000)]
# Measure announcement speed
start = time.time()
for i in range(10):  # 10 iterations
    # Announce all
    for route in routes:
        sys.stdout.write(f"announce route {route} next-hop self\n")
    sys.stdout.flush()
    time.sleep(1)
    # Withdraw all
    for route in routes:
        sys.stdout.write(f"withdraw route {route}\n")
    sys.stdout.flush()
    time.sleep(1)
end = time.time()
duration = end - start
total_updates = 1000 * 10 * 2 # routes × iterations × (announce+withdraw)
rate = total_updates / duration
sys.stderr.write(f"Updates: {total_updates}\n")
sys.stderr.write(f"Duration: {duration:.2f}s\n")
sys.stderr.write(f"Rate: {rate:.0f} updates/sec\n")Benchmark different configurations:
#!/bin/bash
# Compare text vs JSON API
echo "=== Text API ==="
time exabgp /etc/exabgp/config-text.conf &
PID1=$!
sleep 30
kill $PID1
echo "=== JSON API ==="
time exabgp /etc/exabgp/config-json.conf &
PID2=$!
sleep 30
kill $PID2
# Compare memory/CPU
Resource requirements by deployment size:
Small deployment (1-5 BGP sessions, <100 routes):
CPU: 1 core @ 1 GHz
Memory: 256 MB
Network: 1 Mbps
Disk: 100 MB
Medium deployment (5-20 BGP sessions, <1,000 routes):
CPU: 2 cores @ 2 GHz
Memory: 512 MB
Network: 10 Mbps
Disk: 500 MB
Large deployment (20-50 BGP sessions, <10,000 routes):
CPU: 4 cores @ 2.5 GHz
Memory: 2 GB
Network: 100 Mbps
Disk: 1 GB
Extra-large deployment (50+ BGP sessions, receiving full tables):
CPU: 8 cores @ 3 GHz
Memory: 8 GB
Network: 1 Gbps
Disk: 5 GB
Plan for growth:
#!/usr/bin/env python3
"""
Capacity planning calculator
"""
# Current metrics
current_sessions = 10
current_routes_announced = 100
current_routes_received = 1000
# Growth projections (next 12 months)
growth_factor = 2.0 # 100% growth
# Projected metrics
projected_sessions = current_sessions * growth_factor
projected_routes_announced = current_routes_announced * growth_factor
projected_routes_received = current_routes_received * growth_factor
# Memory calculation
base_memory = 30 # MB
session_memory = projected_sessions * 2 # MB per session
route_ann_memory = (projected_routes_announced * 100) / 1024 / 1024 # MB
route_rcv_memory = (projected_routes_received * 200) / 1024 / 1024 # MB
total_memory = base_memory + session_memory + route_ann_memory + route_rcv_memory
print(f"Projected BGP sessions: {projected_sessions:.0f}")
print(f"Projected routes announced: {projected_routes_announced:.0f}")
print(f"Projected routes received: {projected_routes_received:.0f}")
print(f"Estimated memory requirement: {total_memory:.0f} MB")
print(f"Recommended server: {total_memory * 2:.0f} MB RAM") # 2x headroomConfiguration:
- Use appropriate BGP timers for your network
- Enable only necessary address families
- Disable route reception if not processing routes
- Use graceful restart for smooth failovers
API Processes:
- Always include time.sleep() in loops
- Use bulk announcements (announce attributes ... nlri) for same-attribute routes (ExaBGP 4.0+)
- Batch route announcements when possible
- Flush stdout after batch, not after each route
- Use efficient data structures (sets, dicts)
- Avoid external commands (use Python libraries)
- Cache results to avoid redundant checks
System:
- Tune TCP parameters for your workload
- Use appropriate MTU (jumbo frames if supported)
- Monitor resource usage over time
- Implement log rotation
- Use dedicated interface for BGP if possible
Monitoring:
- Track CPU and memory usage (see the sketch after this list)
- Monitor BGP session stability
- Alert on resource exhaustion
- Benchmark periodically
- Capacity plan for growth
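A minimal sketch of such a resource check. It is Linux-only (it reads VmRSS from /proc), and the 200 MB threshold and the use of pgrep are illustrative choices to adapt to your own capacity plan and alerting pipeline:
#!/usr/bin/env python3
"""Warn when ExaBGP memory (RSS) crosses a threshold. Linux /proc based."""
import subprocess
import sys
import time

THRESHOLD_MB = 200  # illustrative threshold - tune to your capacity plan

def exabgp_rss_mb():
    """Return RSS in MB of the first matching exabgp process, or None."""
    try:
        pid = subprocess.check_output(['pgrep', '-f', 'exabgp'], text=True).split()[0]
        with open(f"/proc/{pid}/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1]) / 1024  # VmRSS is reported in kB
    except (subprocess.CalledProcessError, OSError, IndexError, ValueError):
        pass
    return None

while True:
    rss = exabgp_rss_mb()
    if rss is None:
        sys.stderr.write("exabgp process not found\n")
    elif rss > THRESHOLD_MB:
        sys.stderr.write(f"WARNING: ExaBGP RSS {rss:.0f} MB exceeds {THRESHOLD_MB} MB\n")
    time.sleep(60)  # keep the monitor itself cheap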
Don't:
❌ Run API processes without sleep
❌ Flush stdout after every single route
❌ Use external commands when Python libraries exist
❌ Store all routes in memory unnecessarily
❌ Receive routes if not processing them
❌ Use overly aggressive BGP timers
❌ Run dozens of API processes simultaneously
❌ Parse logs with grep | awk | sed chains (use Python)
❌ Ignore resource monitoring
❌ Over-optimize before measuring
Diagnosis:
# Check which process is consuming CPU
top -p $(pgrep -f exabgp)
# Check API processes
ps aux | grep -E 'python|exabgp' | grep -v grep
Common causes:
- API process tight loop (missing sleep)
- High route churn
- JSON parsing overhead
- Inefficient health checks
Fix:
# Add sleep to API process
while True:
    do_work()
    time.sleep(5)  # Critical!
Diagnosis:
# Check memory details
pmap $(pgrep -f exabgp)
# Check for memory leaks
watch -n 5 'ps aux | grep exabgp | grep -v grep'
Common causes:
- Receiving large route tables
- Storing routes in API process
- Memory leak in API script
- Too many API processes
Fix:
# Disable route reception if not needed
api {
receive {
# parsed; # Disable if not processing
}
}Diagnosis:
# Measure convergence time
time_start=$(date +%s.%N)
# Trigger route change
time_end=$(date +%s.%N)
echo "Convergence: $(echo "$time_end - $time_start" | bc)s"Common causes:
- Long BGP timers
- Router-side processing delays
- Network latency
- API process delays
Fix:
# Reduce hold-time (carefully!)
hold-time 90; # Down from 180
See also:
- Monitoring - Monitor ExaBGP performance
- Debugging - Troubleshooting guide
- Service HA - High availability patterns
Need help optimizing? Join our Slack community.
Ghost written by Claude (Anthropic AI)