perf: top-down benchmark testing for v4

# Summary

After the **Babylon v4 upgrade**, we propose conducting a **top-down node benchmark** as part of the optimization efforts for v5. The goal is to measure node performance under realistic workloads, identify key bottlenecks, and provide actionable insights for further optimization.

# Proposal

We suggest applying a **top-down benchmark analysis approach** to the Babylon node:

1. **Methodology**
    - Perform benchmarking during **block sync (catch-up) mode**.
    - In this mode, consensus logic is skipped, so the measurements primarily capture **execution and storage overhead**.
    - This provides a clean view of where CPU cycles and I/O are consumed.
2. **Measurement Tools**
    - **pprof**
        - Captures aggregated CPU and memory profiles.
        - Highlights *which functions* consume the most resources over time.
    - **runtime/trace**
        - Captures fine-grained event traces such as goroutine scheduling, network blocking, syscall latency, garbage collection pauses, and channel operations.
        - Provides visibility into *why* overhead occurs (e.g., lock contention, scheduling stalls), complementing the aggregated view from pprof.
    - Collect data over a full catch-up run from historical blocks up to the latest height.
    - **Custom Logging**
        - For detailed analysis of bottlenecks revealed by `pprof` or `runtime/trace`, we can add **custom log points** in critical functions.
        - Typical insertion points include:
            - Function entry/exit (to measure execution latency)
            - `BeginBlocker` and `EndBlocker` (to measure per-block overhead)
            - `AnteHandler` (to capture transaction preprocessing overhead such as signature checks, fee deduction, and sequence number validation)
            - Transaction handlers like `AddFinalitySig` (to capture DB read/write counts and duration)
            - `VoteExtension` processing (to measure overhead from extending votes with additional metadata, often involving serialization and validation)
        - By correlating custom logs with profiling data, we can analyze **fine-grained operation costs** (e.g., per-signature verification latency, state read amplification).
3. **Scope**
    - Start with the **v4 fixed binary** after the upgrade is complete.
    - Compare results against prior benchmarks (e.g., v2.2.0) to track progress and identify regressions or improvements.
4. **Expected Hotspots**
    - From prior v2.2.0 catch-up simulations, **`AddFinalitySig`** was observed as a significant contributor to CPU overhead.
    - This path involves repeated state reads, such as:
        - [GetBlock](https://github.com/babylonlabs-io/babylon/blob/77c301cc72b9851bd0ff9054370781028cf5a6c9/x/finality/keeper/msg_server.go#L74) (fetching block data)
        - [GetTimestampedPubRandCommitForHeight](https://github.com/babylonlabs-io/babylon/blob/77c301cc72b9851bd0ff9054370781028cf5a6c9/x/finality/keeper/msg_server.go#L136) (fetching committed randomness)
    - These are expected to remain critical areas for optimization in v5.
    

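The custom log points described in the proposal could take a form like the following sketch, which times a handler from entry to exit and counts the state reads attributed to it. The `recorder` type, its methods, and `handleFinalitySig` are hypothetical names introduced for illustration; they are not part of the Babylon codebase, and the sleep merely stands in for real handler work.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// opStats accumulates call count, state-read count, and total latency
// for one instrumented operation.
type opStats struct {
	Calls int
	Reads int
	Total time.Duration
}

// recorder is a hypothetical in-memory collector keyed by operation name.
type recorder struct {
	mu    sync.Mutex
	stats map[string]opStats
}

func newRecorder() *recorder {
	return &recorder{stats: make(map[string]opStats)}
}

// add merges one observation into the named operation's totals.
func (r *recorder) add(name string, calls, reads int, d time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	s := r.stats[name]
	s.Calls += calls
	s.Reads += reads
	s.Total += d
	r.stats[name] = s
}

// track records the entry time and returns a func that records the exit.
// Intended use: defer r.track("op")()
func (r *recorder) track(name string) func() {
	start := time.Now()
	return func() { r.add(name, 1, 0, time.Since(start)) }
}

// countRead notes one state read attributed to the named operation.
func (r *recorder) countRead(name string) { r.add(name, 0, 1, 0) }

// snapshot returns a copy of the totals for one operation.
func (r *recorder) snapshot(name string) opStats {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.stats[name]
}

// handleFinalitySig is a stand-in for a handler such as AddFinalitySig.
func handleFinalitySig(r *recorder) {
	defer r.track("AddFinalitySig")()
	r.countRead("AddFinalitySig") // e.g. a block lookup
	r.countRead("AddFinalitySig") // e.g. a randomness-commit lookup
	time.Sleep(time.Millisecond)  // placeholder for real work
}

func main() {
	r := newRecorder()
	for i := 0; i < 3; i++ {
		handleFinalitySig(r)
	}
	s := r.snapshot("AddFinalitySig")
	fmt.Printf("AddFinalitySig calls=%d reads=%d avg=%s\n",
		s.Calls, s.Reads, s.Total/time.Duration(s.Calls))
}
```

Aggregating in memory and logging a summary per block, rather than one line per call, keeps the instrumentation itself from distorting the measurements.
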
# Additional Considerations

- As the **blockchain state grows**, certain operations may scale proportionally with state size.
- If execution time for these operations increases linearly with state growth, it could lead to **longer block times and delayed finalization**.
- It is therefore critical to:
    - Use custom logging **to identify which operations** scale with state size.
    - Ensure that even as blocks accumulate, **block execution time does not grow significantly**, preventing performance degradation over time.
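
One simple way to check whether an operation's cost grows with state is to fit a line through per-block execution times collected by the custom logs and inspect the slope. The sketch below does this with an ordinary least-squares fit; the `blockSample` type, the `growthSlope` function, and the sample values in `main` are made up for illustration and are not measurements.

```go
package main

import "fmt"

// blockSample is a hypothetical record of one block's measured execution time.
type blockSample struct {
	Height int64
	ExecMs float64
}

// growthSlope returns the least-squares slope of ExecMs against Height,
// in milliseconds per unit of height. The boolean is false when the fit
// is undefined (fewer than two samples, or all heights identical).
func growthSlope(samples []blockSample) (float64, bool) {
	if len(samples) < 2 {
		return 0, false
	}
	n := float64(len(samples))
	var sumX, sumY float64
	for _, s := range samples {
		sumX += float64(s.Height)
		sumY += s.ExecMs
	}
	meanX, meanY := sumX/n, sumY/n

	var sxy, sxx float64
	for _, s := range samples {
		dx := float64(s.Height) - meanX
		sxy += dx * (s.ExecMs - meanY)
		sxx += dx * dx
	}
	if sxx == 0 {
		return 0, false
	}
	return sxy / sxx, true
}

func main() {
	// Illustrative values only, not real benchmark data.
	samples := []blockSample{
		{Height: 100000, ExecMs: 12.0},
		{Height: 200000, ExecMs: 14.1},
		{Height: 300000, ExecMs: 15.9},
		{Height: 400000, ExecMs: 18.0},
	}
	if slope, ok := growthSlope(samples); ok {
		fmt.Printf("growth: %.3f ms per 100k blocks\n", slope*100000)
	}
}
```

A slope near zero suggests the operation is insensitive to chain height over the sampled range, while a consistently positive slope flags a candidate for closer inspection. Height is only a proxy for state size, so a per-operation breakdown is still needed to attribute the growth.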
