Conversation

@codeflash-ai codeflash-ai bot commented Nov 13, 2025

📄 10% (0.10x) speedup for sweep_htlctx_output in electrum/lnsweep.py

⏱️ Runtime : 996 microseconds → 907 microseconds (best of 250 runs)

📝 Explanation and details

The optimization achieves a 9% speedup by caching expensive method calls that were being repeated. Specifically:

Key changes:

  1. Cached `ctx.outputs()` - Stored in local variable `outputs` to avoid redundant method calls when accessing `outputs[output_idx].value`
  2. Cached `ctx.txid()` - Stored in local variable `txid` to avoid redundant computation in `TxOutpoint(txid=bfh(txid), out_idx=output_idx)`

Why this optimization works:

  • The `ctx.outputs()` and `ctx.txid()` methods likely involve expensive operations like transaction deserialization and hash computation (as seen in the dependency code)
  • In the original code, `ctx.outputs()[output_idx].value` calls `outputs()` every time, and `bfh(ctx.txid())` calls `txid()` every time
  • By caching these results, we eliminate redundant function call overhead and potentially expensive recomputation; a minimal before/after sketch follows this list
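
A minimal before/after sketch of the caching pattern described in these bullets, assuming a `Transaction`-like object whose `txid()` and `outputs()` methods do non-trivial work. The `_Tx`/`_Out` stubs are illustrative (they mirror the stubs in the generated tests below); this is not the actual electrum diff:

```python
class _Out:
    def __init__(self, value):
        self.value = value

class _Tx:
    """Stand-in for a Transaction whose txid()/outputs() are expensive."""
    def __init__(self, txid_hex, outputs):
        self._txid_hex, self._outputs = txid_hex, outputs
    def txid(self):       # imagine: serialize + hash the transaction
        return self._txid_hex
    def outputs(self):    # imagine: deserialize the output list
        return list(self._outputs)

def prevout_fields_before(ctx, output_idx):
    # Original shape: a separate method call per field access.
    txid_bytes = bytes.fromhex(ctx.txid())     # like bfh(ctx.txid())
    value = ctx.outputs()[output_idx].value    # like ctx.outputs()[output_idx].value
    return txid_bytes, output_idx, value

def prevout_fields_after(ctx, output_idx):
    # Optimized shape: call each method once and reuse the locals.
    txid = ctx.txid()
    outputs = ctx.outputs()
    return bytes.fromhex(txid), output_idx, outputs[output_idx].value

ctx = _Tx("ab" * 32, [_Out(1000), _Out(2000)])
assert prevout_fields_before(ctx, 1) == prevout_fields_after(ctx, 1)
```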

Impact based on function references:
This optimization is particularly valuable given the function references show `sweep_htlctx_output` is called in hot paths (a hedged caller sketch follows this list):

  • Lightning Network justice transactions - Called repeatedly in loops when sweeping multiple HTLC outputs during breach remediation
  • Channel force-close scenarios - Called for each HTLC output when processing commitment transactions
  • Watchtower operations - Used in automated breach response where performance matters for timely transaction broadcasting
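
A hypothetical caller loop illustrating that hot-path usage; the `htlc_sweeps` container, its field names, and `build_sweep_txins` are placeholders rather than electrum code, and only the `sweep_htlctx_output` keyword arguments are taken from the generated tests below:

```python
from electrum.lnsweep import sweep_htlctx_output

def build_sweep_txins(htlc_sweeps, privkey, *, is_revocation, to_self_delay=None):
    """Build one sweep input per (htlc_tx, output_idx, witness_script) item."""
    txins = []
    for htlc_tx, output_idx, witness_script in htlc_sweeps:
        txins.append(sweep_htlctx_output(
            htlc_tx=htlc_tx,
            output_idx=output_idx,
            htlctx_witness_script=witness_script,
            privkey=privkey,
            is_revocation=is_revocation,
            # to_self_delay is only required for the non-revocation path
            to_self_delay=None if is_revocation else to_self_delay,
        ))
    return txins
```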

Test case performance:
The optimization shows consistent 6-11% improvements across most test cases, with the largest gains (10-11%) in scenarios involving multiple calls or larger transactions - exactly the use cases indicated by the function references where this code runs in loops processing multiple HTLCs.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 434 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
import pytest
from electrum.lnsweep import sweep_htlctx_output


# Minimal TxOutput stub
class TxOutput:
    def __init__(self, value):
        self.value = value

# Minimal Transaction stub
class Transaction:
    def __init__(self, txid, outputs):
        self._txid = txid
        self._outputs = outputs

    def outputs(self):
        return self._outputs

    def txid(self):
        return self._txid
from electrum.lnsweep import sweep_htlctx_output

# -----------------------
# Unit Tests for sweep_htlctx_output
# -----------------------

# ---- Basic Test Cases ----

def test_basic_revocation_true():
    # Test with is_revocation True, no to_self_delay
    tx = Transaction("aabbcc"*5, [TxOutput(1000), TxOutput(2000)])
    witness_script = b'\x51'  # OP_1
    privkey = b'\x01'*32
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=1,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=True,
    ); txin = codeflash_output # 9.31μs -> 9.11μs (2.25% faster)
    # Check make_witness lambda
    sig = b'\x02'*64
    witness = txin.make_witness(sig)

def test_basic_revocation_false():
    # Test with is_revocation False, requires to_self_delay
    tx = Transaction("deadbeef"*8, [TxOutput(5000)])
    witness_script = b'\x52'  # OP_2
    privkey = b'\x02'*32
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=0,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=False,
        to_self_delay=123,
    ); txin = codeflash_output # 6.84μs -> 6.15μs (11.3% faster)
    sig = b'\x03'*64
    witness = txin.make_witness(sig)

def test_basic_multiple_outputs():
    # Test with multiple outputs, selecting the correct one
    tx = Transaction("cafebabe"*8, [TxOutput(1), TxOutput(2), TxOutput(3)])
    witness_script = b'\x53'
    privkey = b'\x03'*32
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=2,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=True,
    ); txin = codeflash_output # 6.09μs -> 5.77μs (5.51% faster)

# ---- Edge Test Cases ----

def test_edge_output_idx_out_of_range():
    # output_idx too high should raise IndexError
    tx = Transaction("abcdef"*8, [TxOutput(100)])
    witness_script = b'\x54'
    privkey = b'\x04'*32
    with pytest.raises(IndexError):
        sweep_htlctx_output(
            htlc_tx=tx,
            output_idx=5,
            htlctx_witness_script=witness_script,
            privkey=privkey,
            is_revocation=True,
        ) # 1.62μs -> 1.66μs (1.81% slower)

def test_edge_empty_witness_script():
    # witness_script empty should trigger assertion
    tx = Transaction("123456"*8, [TxOutput(100)])
    privkey = b'\x05'*32
    with pytest.raises(AssertionError):
        sweep_htlctx_output(
            htlc_tx=tx,
            output_idx=0,
            htlctx_witness_script=b'',
            privkey=privkey,
            is_revocation=True,
        ) # 6.76μs -> 6.43μs (5.10% faster)

def test_edge_to_self_delay_missing_when_required():
    # is_revocation False but to_self_delay is missing
    tx = Transaction("654321"*8, [TxOutput(100)])
    witness_script = b'\x55'
    privkey = b'\x06'*32
    with pytest.raises(AssertionError):
        sweep_htlctx_output(
            htlc_tx=tx,
            output_idx=0,
            htlctx_witness_script=witness_script,
            privkey=privkey,
            is_revocation=False,
        ) # 6.24μs -> 5.70μs (9.44% faster)

def test_edge_to_self_delay_wrong_type():
    # to_self_delay is not int
    tx = Transaction("abcdef"*8, [TxOutput(100)])
    witness_script = b'\x56'
    privkey = b'\x07'*32
    with pytest.raises(AssertionError):
        sweep_htlctx_output(
            htlc_tx=tx,
            output_idx=0,
            htlctx_witness_script=witness_script,
            privkey=privkey,
            is_revocation=False,
            to_self_delay="not_an_int",
        ) # 6.03μs -> 5.66μs (6.59% faster)

def test_edge_zero_value_output():
    # Output value is zero
    tx = Transaction("feedface"*8, [TxOutput(0)])
    witness_script = b'\x57'
    privkey = b'\x08'*32
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=0,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=True,
    ); txin = codeflash_output # 5.76μs -> 5.39μs (6.94% faster)

def test_edge_negative_value_output():
    # Output value is negative (should not happen, but test anyway)
    tx = Transaction("badc0de"*8, [TxOutput(-100)])
    witness_script = b'\x58'
    privkey = b'\x09'*32
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=0,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=True,
    ); txin = codeflash_output # 5.47μs -> 5.00μs (9.37% faster)

def test_edge_privkey_empty():
    # privkey is empty bytes
    tx = Transaction("deadbeef"*8, [TxOutput(100)])
    witness_script = b'\x59'
    privkey = b''
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=0,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=True,
    ); txin = codeflash_output # 5.40μs -> 5.03μs (7.33% faster)


def test_large_many_outputs():
    # Transaction with many outputs, select last one
    outputs = [TxOutput(i) for i in range(1000)]
    tx = Transaction("0123456789abcdef"*4, outputs)
    witness_script = b'\x60'
    privkey = b'\x11'*32
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=999,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=True,
    ); txin = codeflash_output # 9.55μs -> 8.96μs (6.54% faster)

def test_large_unique_values():
    # Transaction with unique output values
    outputs = [TxOutput(i*3) for i in range(500)]
    tx = Transaction("abcdef0123456789"*4, outputs)
    witness_script = b'\x61'
    privkey = b'\x12'*32
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=250,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=False,
        to_self_delay=250,
    ); txin = codeflash_output # 7.09μs -> 6.55μs (8.28% faster)

def test_large_randomized_delays():
    # Transaction with randomized to_self_delay values
    outputs = [TxOutput(1000) for _ in range(100)]
    tx = Transaction("1234567890abcdef"*4, outputs)
    witness_script = b'\x62'
    privkey = b'\x13'*32
    for i in range(100):
        codeflash_output = sweep_htlctx_output(
            htlc_tx=tx,
            output_idx=i,
            htlctx_witness_script=witness_script,
            privkey=privkey,
            is_revocation=False,
            to_self_delay=i,
        ); txin = codeflash_output # 215μs -> 195μs (10.0% faster)

def test_large_all_revocation():
    # All outputs swept as revocation
    outputs = [TxOutput(i+1) for i in range(200)]
    tx = Transaction("fedcba9876543210"*4, outputs)
    witness_script = b'\x63'
    privkey = b'\x14'*32
    for i in range(200):
        codeflash_output = sweep_htlctx_output(
            htlc_tx=tx,
            output_idx=i,
            htlctx_witness_script=witness_script,
            privkey=privkey,
            is_revocation=True,
        ); txin = codeflash_output # 404μs -> 364μs (11.0% faster)

def test_large_all_non_revocation():
    # All outputs swept as non-revocation with increasing delays
    outputs = [TxOutput(i+2) for i in range(100)]
    tx = Transaction("0123456789abcdef"*4, outputs)
    witness_script = b'\x64'
    privkey = b'\x15'*32
    for i in range(100):
        codeflash_output = sweep_htlctx_output(
            htlc_tx=tx,
            output_idx=i,
            htlctx_witness_script=witness_script,
            privkey=privkey,
            is_revocation=False,
            to_self_delay=i+10,
        ); txin = codeflash_output # 211μs -> 191μs (10.1% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from electrum.lnsweep import sweep_htlctx_output


# Minimal TxOutput stub
class TxOutput:
    def __init__(self, value):
        self.value = value

def bfh(s):
    # Converts hex string to bytes
    if isinstance(s, bytes):
        return s
    return bytes.fromhex(s)

# Minimal Transaction stub for testing
class Transaction:
    def __init__(self, outputs, txid="00"*32):
        self._outputs = outputs
        self._txid = txid
    def outputs(self):
        return self._outputs
    def txid(self):
        return self._txid
from electrum.lnsweep import sweep_htlctx_output

# -------------------- UNIT TESTS --------------------

# ---- Basic Test Cases ----

def test_basic_revocation_true():
    # Test sweep with is_revocation True
    tx = Transaction([TxOutput(1000), TxOutput(2000)], txid="11"*32)
    witness_script = bfh("aa"*20)
    privkey = bfh("bb"*32)
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=1,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=True,
    ); txin = codeflash_output # 5.35μs -> 5.07μs (5.67% faster)
    # make_witness lambda returns correct tuple
    sig = bfh("cc"*64)
    witness = txin.make_witness(sig)

def test_basic_revocation_false():
    # Test sweep with is_revocation False and to_self_delay
    tx = Transaction([TxOutput(5000)], txid="22"*32)
    witness_script = bfh("ff"*10)
    privkey = bfh("dd"*32)
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=0,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=False,
        to_self_delay=123
    ); txin = codeflash_output # 4.80μs -> 4.79μs (0.271% faster)
    # make_witness lambda returns correct tuple
    sig = bfh("ee"*64)
    witness = txin.make_witness(sig)

# ---- Edge Test Cases ----

def test_output_idx_zero():
    # Test output_idx at zero
    tx = Transaction([TxOutput(999)], txid="33"*32)
    witness_script = bfh("01"*32)
    privkey = bfh("02"*32)
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=0,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=True,
    ); txin = codeflash_output # 4.93μs -> 4.62μs (6.66% faster)

def test_output_idx_last():
    # Test output_idx at last position
    tx = Transaction([TxOutput(1), TxOutput(2), TxOutput(3)], txid="44"*32)
    witness_script = bfh("03"*16)
    privkey = bfh("04"*32)
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=2,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=False,
        to_self_delay=456
    ); txin = codeflash_output # 4.75μs -> 4.73μs (0.338% faster)

def test_invalid_output_idx_raises():
    # Test invalid output_idx raises IndexError
    tx = Transaction([TxOutput(1)], txid="55"*32)
    witness_script = bfh("05"*16)
    privkey = bfh("06"*32)
    with pytest.raises(IndexError):
        sweep_htlctx_output(
            htlc_tx=tx,
            output_idx=1,
            htlctx_witness_script=witness_script,
            privkey=privkey,
            is_revocation=True,
        ) # 1.55μs -> 1.55μs (0.064% slower)

def test_missing_to_self_delay_raises():
    # Test missing to_self_delay when is_revocation is False raises AssertionError
    tx = Transaction([TxOutput(10)], txid="66"*32)
    witness_script = bfh("07"*16)
    privkey = bfh("08"*32)
    with pytest.raises(AssertionError):
        sweep_htlctx_output(
            htlc_tx=tx,
            output_idx=0,
            htlctx_witness_script=witness_script,
            privkey=privkey,
            is_revocation=False,
            # to_self_delay missing
        ) # 6.00μs -> 5.87μs (2.23% faster)

def test_invalid_to_self_delay_type_raises():
    # Test to_self_delay not int raises AssertionError
    tx = Transaction([TxOutput(10)], txid="77"*32)
    witness_script = bfh("09"*16)
    privkey = bfh("10"*32)
    with pytest.raises(AssertionError):
        sweep_htlctx_output(
            htlc_tx=tx,
            output_idx=0,
            htlctx_witness_script=witness_script,
            privkey=privkey,
            is_revocation=False,
            to_self_delay="not_an_int"
        ) # 5.90μs -> 5.49μs (7.44% faster)

def test_empty_witness_script_raises():
    # Test empty witness_script raises AssertionError
    tx = Transaction([TxOutput(10)], txid="88"*32)
    privkey = bfh("11"*32)
    with pytest.raises(AssertionError):
        sweep_htlctx_output(
            htlc_tx=tx,
            output_idx=0,
            htlctx_witness_script=b'',
            privkey=privkey,
            is_revocation=True,
        ) # 5.84μs -> 5.47μs (6.71% faster)

def test_non_bytes_witness_script():
    # Test witness_script as bytes, not str
    tx = Transaction([TxOutput(10)], txid="99"*32)
    privkey = bfh("12"*32)
    witness_script = b"hello world"
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=0,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=True,
    ); txin = codeflash_output # 5.47μs -> 5.20μs (5.23% faster)

def test_zero_value_output():
    # Test output with zero value
    tx = Transaction([TxOutput(0)], txid="aa"*32)
    witness_script = bfh("13"*16)
    privkey = bfh("14"*32)
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=0,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=True,
    ); txin = codeflash_output # 5.17μs -> 4.97μs (3.94% faster)

# ---- Large Scale Test Cases ----

def test_many_outputs():
    # Test with a transaction with many outputs
    outputs = [TxOutput(i) for i in range(1000)]
    tx = Transaction(outputs, txid="ab"*32)
    witness_script = bfh("15"*16)
    privkey = bfh("16"*32)
    # Pick a high index
    idx = 999
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=idx,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=False,
        to_self_delay=789
    ); txin = codeflash_output # 6.62μs -> 6.17μs (7.43% faster)

def test_large_witness_script():
    # Test with a large witness_script
    tx = Transaction([TxOutput(12345)], txid="cd"*32)
    witness_script = bfh("17"*500)  # 500 bytes
    privkey = bfh("18"*32)
    codeflash_output = sweep_htlctx_output(
        htlc_tx=tx,
        output_idx=0,
        htlctx_witness_script=witness_script,
        privkey=privkey,
        is_revocation=True,
    ); txin = codeflash_output # 5.89μs -> 5.50μs (7.15% faster)
    sig = bfh("19"*64)
    witness = txin.make_witness(sig)

def test_performance_many_calls():
    # Test performance with many calls (not exceeding 1000)
    tx = Transaction([TxOutput(i) for i in range(10)], txid="ef"*32)
    witness_script = bfh("20"*16)
    privkey = bfh("21"*32)
    for idx in range(10):
        codeflash_output = sweep_htlctx_output(
            htlc_tx=tx,
            output_idx=idx,
            htlctx_witness_script=witness_script,
            privkey=privkey,
            is_revocation=(idx % 2 == 0),
            to_self_delay=idx if idx % 2 == 1 else None
        ); txin = codeflash_output # 26.2μs -> 23.7μs (10.5% faster)
        # nsequence correctness
        if idx % 2 == 0:
            pass
        else:
            pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from electrum.lnsweep import sweep_htlctx_output
from electrum.transaction import SerializationError, Transaction
import pytest

def test_sweep_htlctx_output():
    with pytest.raises(SerializationError, match="a\\ bytes\\-like\\ object\\ is\\ required,\\ not\\ 'SymbolicByteArray'"):
        sweep_htlctx_output(htlc_tx=Transaction((v1 := b'')), output_idx=0, htlctx_witness_script=v1, privkey=v1, is_revocation=False, to_self_delay=0)

To edit these changes, `git checkout codeflash/optimize-sweep_htlctx_output-mhwrxshw` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 13, 2025 01:52
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Nov 13, 2025