Skip to content

Conversation

@LouisTsai-Csie
Copy link
Collaborator

@LouisTsai-Csie LouisTsai-Csie commented Nov 7, 2025

🗒️ Description

Enhance the code generator for BenchmarkTest wrapper:

  • Add initial balance setting for JumpLoopGenerator, which already supports in the other code generator, this could simplify test_create.
  • Add safety checks for ExtCallGenerator, when calculating the max iterations, it should deduct the stack delta that comes from setup section, or it might lead to stack overflow issue.
  • Most of the contract will be filled with repeated pattern, this PR further fills the remaining space with STOP opcode. This could help simplify some cases like test_codecopy & test_return_revert.
  • For ExtCallGenerator, there are two different account: target contract and loop contract. The loop contract will repeatedly calls into target contract via STATICCALL. For some operations that involves CALLDATA interactions, we should forward the CALLDATA from loop contract to target contract via (1) forward the CALLDATA from loop contract and (2) Configure the memory from CALLDATA in target contract.

Test Case Enhancement:

  • test_ext_account_query_warm: Add initial balance / initial storage parametrization.
  • test_blockhash: Add extra cases for fixed block index and dynamic block index
  • test_pc_op: Add missing benchmark test
  • test_keccak: Add this case to parametrize memory size / offset for gas repricing effort.
  • test_tload: Optimize the benchmark by switching to ExtCallGenerator
  • test_block_full_of_ether_transfers: Add scenario that the receiver is not an empty account.

🔗 Related Issues or PRs

N/A.

✅ Checklist

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    uvx tox -e static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered adding an entry to CHANGELOG.md.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
  • Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry the list.
  • Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

Cute Animal Picture

Put a link to a cute animal picture inside the parenthesis-->

@LouisTsai-Csie LouisTsai-Csie force-pushed the benchmark/add-missing-cases branch from f9fdb17 to 01e4d6b Compare November 10, 2025 10:02
@LouisTsai-Csie LouisTsai-Csie self-assigned this Nov 10, 2025
@LouisTsai-Csie LouisTsai-Csie added E-medium Experience: of moderate difficulty P-medium C-refactor Category: refactor A-test-benchmark Area: Tests Benchmarks—Performance measurement (eg. `tests/benchmark/*`, `p/t/s/e/benchmark/*`) labels Nov 10, 2025
@LouisTsai-Csie LouisTsai-Csie force-pushed the benchmark/add-missing-cases branch from 01e4d6b to 1b44dea Compare November 10, 2025 10:05
)


def test_pc_op(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We miss the benchmark for PC opcode

Comment on lines +70 to +72
@pytest.mark.parametrize("mem_alloc", [b"", b"ff", b"ff" * 32])
@pytest.mark.parametrize("offset", [0, 31, 1024])
def test_keccak(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of finding the optimal length, add a new case that parametrized initial memory layout and access offset. Request by gas repricing effort.

],
)
@pytest.mark.parametrize("balance", [0, 1])
def test_block_full_of_ether_transfers(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a new scenario for ether transfer benchmark: whether the receiver is an empty account.


benchmark_test(
code_generator=JumpLoopGenerator(
code_generator=ExtCallGenerator(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By updating fro JumpLoopGenerator to ExtCallGenerator, we could get rid of the POP opcode each operation, enhancing the TLOAD density.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might break the cases using CALLVALUE because we are now running this in the context of the subcall, and since we are calling using STATICCALL to call that subcontract, CALLVALUE is always going to be zero.
Perhaps replacing this with CALLDATASIZE, since we are passing along the data from the transaction?

# Create 256 dummy blocks to fill the blockhash window.
blocks = [Block()] * 256

block_number = Op.AND(Op.GAS, 0xFF) if index is None else index
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding extra BLOCKHASH operation for benchmarking, as it was one of the slowest operations.
Cases:

  • Valid block index
  • Invalid block index
  • Dynamic block index

Comment on lines 375 to 377
empty_account: bool,
initial_balance: bool,
initial_storage: bool,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add new test scenario for gas repricing effort:

  • Accessing empty / non-empty account
  • Accessing account contains zero / non-zero balance
  • Accessing account contains zero / non-zero storage

@LouisTsai-Csie LouisTsai-Csie marked this pull request as ready for review November 10, 2025 10:38
@marioevz marioevz self-requested a review November 10, 2025 17:03
Copy link
Member

@marioevz marioevz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

In general, I liked the refactoring and changes, I've just left comments on multiple files just to make everything more polished.

code = setup + Op.JUMPDEST + repeated_code * max_iterations + cleanup
code += Op.JUMP(len(setup)) if len(setup) > 0 else Op.PUSH0 + Op.JUMP
# Pad the code to the maximum code size.
code += Op.STOP * (max_code_size - len(code))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make this optional via a new flag in the BenchmarkCodeGenerator class:

code_padding_opcode: Op | None = None

So only tests that need to pad the code to the max size do this.

sender=pre.fund_eoa(),
benchmark_test(
code_generator=JumpLoopGenerator(
setup=setup, attack_block=attack_block
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
setup=setup, attack_block=attack_block
setup=setup, attack_block=attack_block, code_padding_opcode=Op.STOP

Following previous suggestion.

)
@pytest.mark.parametrize(
"absent_target",
"empty_account",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The meaning of "empty_account" seems confusing IMO. From what I'm reading below, it seems like it's rather "empty_code"?

Comment on lines 388 to 390
target_addr = pre.fund_eoa(
storage={0: 0x1337} if initial_storage else {0: 0}
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unconditionally sets the balance to a non-zero value, even when initial_balance==False.

Also, {0: 0}, given the current fund_eoa logic, is still going to try to touch the storage (arguably a bug but besides the point).

We should instead build a kwargs and then call pre.fund_eoa(**kwargs) to make this cleaner.

Comment on lines 385 to 387
if not initial_balance and not initial_storage:
target_addr = pre.empty_account()
elif initial_balance or initial_storage:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should refactor this logic a bit:

Suggested change
if not initial_balance and not initial_storage:
target_addr = pre.empty_account()
elif initial_balance or initial_storage:
if not initial_balance and not initial_storage and empty_code:
target_addr = pre.empty_account()
else:

And in the else branch we construct the kwargs to either call pre.fund_eoa or pre.deploy_contract.

benchmark_test(
code_generator=ExtCallGenerator(
setup=Op.MLOAD(Op.SELFBALANCE) + Op.POP,
setup=Op.POP(Op.MLOAD(Op.SELFBALANCE)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I'm guessing this helps with the stack push/pop elements calculations.

attack_block = Op.POP(Op.TLOAD(Op.DUP1))
code_key_mut = Op.POP + Op.GAS
code_val_mut = Op.TSTORE(Op.DUP2, Op.GAS)
setup = Op.GAS + Op.TSTORE(Op.DUP2, Op.GAS)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While trying to read this code, I realize that key_mut -> fixed_key, val_mut -> fixed_value, and flipping the logic, would make it easier to read IMO.


benchmark_test(
code_generator=JumpLoopGenerator(
code_generator=ExtCallGenerator(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might break the cases using CALLVALUE because we are now running this in the context of the subcall, and since we are calling using STATICCALL to call that subcontract, CALLVALUE is always going to be zero.
Perhaps replacing this with CALLDATASIZE, since we are passing along the data from the transaction?

)
executable_code = mem_preparation + opcode(size=return_size)
code = executable_code
if return_non_zero_data:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is lost because we are now by default returning code full of Op.STOP, which is all zeros. This can also be addressed by using the code_padding_opcode I mentioned in a previous comment (doing code_padding_opcode=Op.INVALID instead).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, i do not notice STOP is 0x00. This is helpful.

yield receiver


@pytest.fixture
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could still work as a fixture if we parametrize balance=0 on the rest of the test cases.

@LouisTsai-Csie LouisTsai-Csie force-pushed the benchmark/add-missing-cases branch from c720a28 to 2b6823a Compare November 11, 2025 06:35
@LouisTsai-Csie
Copy link
Collaborator Author

Sorry for adding complexity to this PR, I’ve added one more commit that optimizes the DUP and CALLDATASIZE operations.

  • test_dup: The setup stack delta is now handled in ExtCallGenerator, which simplifies the test logic.
  • test_calldatasize: This test is also simplified using ExtCallGenerator, as the generator now forwards calldata to the target contract.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-test-benchmark Area: Tests Benchmarks—Performance measurement (eg. `tests/benchmark/*`, `p/t/s/e/benchmark/*`) C-refactor Category: refactor E-medium Experience: of moderate difficulty P-medium

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants