Summary
After the Babylon v4 upgrade, we propose conducting a top-down node benchmark as part of the optimization efforts for v5. The goal is to measure node performance under realistic workloads, identify key bottlenecks, and provide actionable insights for further optimization.
Proposal
We suggest applying a top-down benchmark analysis approach to the Babylon node:
- Methodology
- Perform benchmarking during block sync (catch-up) mode.
- In this mode, consensus logic is skipped, so the measurements primarily capture execution and storage overhead.
- This provides a clean view of where CPU cycles and I/O are consumed.
- Measurement Tools
- Use `pprof` to capture CPU usage and resource consumption.
- `pprof` captures aggregated CPU and memory profiles.
- It highlights which functions consume the most resources over time.
- `runtime/trace` captures fine-grained event traces such as goroutine scheduling, network blocking, syscall latency, garbage collection pauses, and channel operations. This provides visibility into why overhead occurs (e.g., lock contention, scheduling stalls), complementing the aggregated view from `pprof`.
- Collect data over a full catch-up run from historical blocks up to the latest height.
- Custom Logging
- For detailed analysis of bottlenecks revealed by `pprof` or `runtime/trace`, we can add custom log points in critical functions.
- Typical insertion points include:
- Function entry/exit (to measure execution latency)
- `BeginBlocker` and `EndBlocker` (to measure per-block overhead)
- `AnteHandler` (to capture transaction preprocessing overhead such as signature checks, fee deduction, and sequence number validation)
- Transaction handlers like `AddFinalitySig` (to capture DB read/write counts and duration)
- `VoteExtension` processing (to measure overhead from extending votes with additional metadata, often involving serialization and validation)
- By correlating custom logs with profiling data, we can analyze fine-grained operation costs (e.g., per-signature verification latency, state read amplification).
- Scope
- Start with the v4 fixed binary after the upgrade is complete.
- Compare results against prior benchmarks (e.g., v2.2.0) to track progress and identify regressions or improvements.
- Expected Hotspots
- From prior v2.2.0 catch-up simulations, `AddFinalitySig` was observed as a significant contributor to CPU overhead.
- This path involves repeated state reads, such as:
- `GetBlock` (fetching block data)
- `GetTimestampedPubRandCommitForHeight` (fetching committed randomness)
- These are expected to remain critical areas for optimization in v5.
Additional Considerations
- As the blockchain state grows, certain operations may scale proportionally with state size.
- If execution time for these operations increases linearly with state growth, it could lead to longer block times and delayed finalization.
- It is therefore critical to:
- Use custom logging to identify which operations scale with state size.
- Ensure that even as blocks accumulate, block execution time does not grow significantly, preventing performance degradation over time.