@codeflash-ai codeflash-ai bot commented Oct 29, 2025

⚡️ This pull request contains optimizations for PR #1654

If you approve this dependent PR, these changes will be merged into the original PR branch fix/stitch-block-image-dimensions.

This PR will be automatically closed if the original PR is merged.


📄 274% (2.74x) speedup for manage_crops_metadata in inference/core/workflows/core_steps/fusion/detections_stitch/v1.py

⏱️ Runtime : 1.33 milliseconds → 356 microseconds (best of 385 runs)
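
The headline figures are consistent with computing speedup as (old - new) / new. That convention is an assumption inferred from the numbers in this report, not taken from Codeflash documentation, and `speedup_percent` is a hypothetical helper used only for illustration:

```python
def speedup_percent(old_us: float, new_us: float) -> float:
    """Relative speedup in percent, assuming the convention (old - new) / new."""
    return (old_us - new_us) / new_us * 100.0


# Headline runtime from this report: 1.33 ms before, 356 µs after.
headline = speedup_percent(1330.0, 356.0)
print(f"{headline:.0f}% speedup")  # about 274%, in line with the reported figure
```

Under the same convention, the "N% faster" figures on the large tests are in line with the per-test timings up to rounding of the displayed values.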

📝 Explanation and details

Optimization summary:

  • Replaced np.array([dimensions] * n) with np.full((n, 2), dimensions), which allocates and fills the (n, 2) array directly instead of first building an intermediate Python list of n tuples.
  • For PARENT_ID_KEY, replaced np.array([parent_id] * n) with np.empty(n, dtype=object) followed by .fill(parent_id). This avoids creating an n-element Python list only to convert it to a NumPy array.
  • All other behavior (exception raising, data mutation, types, and function signature) is preserved exactly as requested.
  • The gain depends on n: in the generated tests below, calls with about 1000 detections run roughly 9x to 15x faster, while calls with 1 to 3 detections run about 13% to 44% slower.
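
A minimal sketch of the two replacements described above, using made-up values for `n`, `dimensions`, and `parent_id`. It is not the code from `v1.py`; it only checks that the old and new constructions hold equal elements. One caveat worth verifying against the real function: without an explicit dtype, `np.array([parent_id] * n)` produces a fixed-width Unicode array, while the `np.empty(..., dtype=object)` variant produces an object array.

```python
import numpy as np

n = 4                    # hypothetical number of detections
dimensions = (640, 480)  # hypothetical image dimensions
parent_id = "parent-1"   # hypothetical parent identifier

# Old pattern: build a Python list of n items, then convert it.
dims_old = np.array([dimensions] * n)
ids_old = np.array([parent_id] * n)

# New pattern: allocate the final array directly and fill it.
dims_new = np.full((n, 2), dimensions)  # the 2-tuple broadcasts across rows
ids_new = np.empty(n, dtype=object)
ids_new.fill(parent_id)

print(np.array_equal(dims_old, dims_new))      # True: same shape and values
print(ids_old.dtype.kind, ids_new.dtype.kind)  # U O: Unicode vs object dtype
```

If downstream code depends on the string dtype of the parent-id array, that difference would need to be checked separately; the element values themselves are the same.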

Correctness verification report:

| Test | Status |
| --- | --- |
| ⏪ Replay Tests | 🔘 None Found |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 31 Passed |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
# imports
import numpy as np
import pytest
from inference.core.workflows.core_steps.fusion.detections_stitch.v1 import \
    manage_crops_metadata


# Mocks for external dependencies (sv.Detections and constants)
class MockDetections:
    def __init__(self, length, data=None):
        self._len = length
        self.data = data if data is not None else {}
    def __len__(self):
        return self._len
    def __getitem__(self, key):
        # For simplicity, treat keys as direct access to .data
        return self.data[key]

# Constants as per the function
IMAGE_DIMENSIONS_KEY = "image_dimensions"
PARENT_COORDINATES_KEY = "parent_coordinates"
ROOT_PARENT_COORDINATES_KEY = "root_parent_coordinates"
PARENT_ID_KEY = "parent_id"
SCALING_RELATIVE_TO_PARENT_KEY = "scaling_relative_to_parent"

# ------------------- UNIT TESTS -------------------

# 1. Basic Test Cases

def test_empty_detections_returns_input():
    """If detections is empty, should return the same object unchanged."""
    det = MockDetections(0)
    codeflash_output = manage_crops_metadata(det, offset=None, dimensions=(100, 200), parent_id="abc"); result = codeflash_output # 1.17μs -> 1.16μs (0.430% faster)

def test_no_scaling_no_parent_coords():
    """Basic case: no scaling, no parent/root coords; metadata is set correctly."""
    det = MockDetections(2)
    codeflash_output = manage_crops_metadata(det, offset=np.array([10, 20]), dimensions=(50, 60), parent_id="xyz"); result = codeflash_output # 6.46μs -> 10.3μs (37.2% slower)

def test_parent_coordinates_are_offset():
    """If parent coordinates are present, they are offset by the given offset."""
    det = MockDetections(
        2,
        data={PARENT_COORDINATES_KEY: np.array([[10, 20], [30, 40]])}
    )
    codeflash_output = manage_crops_metadata(det, offset=np.array([1, 2]), dimensions=(5, 6), parent_id="id1"); result = codeflash_output # 13.4μs -> 16.6μs (19.5% slower)
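    expected = np.array([[9, 18], [29, 38]])
    # Assumes the function updates det.data (the summary mentions data mutation)
    np.testing.assert_array_equal(det.data[PARENT_COORDINATES_KEY], expected)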

def test_root_parent_coordinates_are_offset():
    """If root parent coordinates are present, they are offset by the given offset."""
    det = MockDetections(
        2,
        data={ROOT_PARENT_COORDINATES_KEY: np.array([[100, 200], [300, 400]])}
    )
    codeflash_output = manage_crops_metadata(det, offset=np.array([10, 20]), dimensions=(1, 2), parent_id="root"); result = codeflash_output # 12.2μs -> 15.3μs (20.1% slower)
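    expected = np.array([[90, 180], [290, 380]])
    # Assumes the function updates det.data (the summary mentions data mutation)
    np.testing.assert_array_equal(det.data[ROOT_PARENT_COORDINATES_KEY], expected)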

def test_both_parent_and_root_coordinates():
    """Both parent and root parent coordinates are offset."""
    det = MockDetections(
        2,
        data={
            PARENT_COORDINATES_KEY: np.array([[10, 10], [20, 20]]),
            ROOT_PARENT_COORDINATES_KEY: np.array([[100, 100], [200, 200]])
        }
    )
    codeflash_output = manage_crops_metadata(det, offset=np.array([1, 2]), dimensions=(3, 4), parent_id="pid"); result = codeflash_output # 13.4μs -> 16.5μs (18.6% slower)

# 2. Edge Test Cases

def test_offset_none_raises_value_error():
    """If detections is not empty and offset is None, should raise ValueError."""
    det = MockDetections(1)
    with pytest.raises(ValueError) as e:
        manage_crops_metadata(det, offset=None, dimensions=(10, 20), parent_id="abc") # 1.90μs -> 1.80μs (5.79% faster)

def test_scaling_relative_to_parent_key_1_0_ok():
    """If scaling_relative_to_parent is exactly 1.0, function should proceed."""
    det = MockDetections(
        2,
        data={SCALING_RELATIVE_TO_PARENT_KEY: np.array([1.0, 1.0])}
    )
    codeflash_output = manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=(1, 1), parent_id="id"); result = codeflash_output # 8.80μs -> 12.3μs (28.3% slower)

def test_scaling_relative_to_parent_key_not_1_0_raises():
    """If scaling_relative_to_parent is not 1.0, should raise ValueError."""
    det = MockDetections(
        2,
        data={SCALING_RELATIVE_TO_PARENT_KEY: np.array([1.01, 1.01])}
    )
    with pytest.raises(ValueError) as e:
        manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=(1, 1), parent_id="id") # 4.04μs -> 4.04μs (0.198% faster)

def test_scaling_relative_to_parent_key_almost_1_0_ok():
    """If scaling_relative_to_parent is within tolerance of 1.0, function should proceed."""
    det = MockDetections(
        2,
        data={SCALING_RELATIVE_TO_PARENT_KEY: np.array([1.00001, 1.00001])}
    )
    codeflash_output = manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=(1, 1), parent_id="id"); result = codeflash_output # 8.53μs -> 12.1μs (29.5% slower)

def test_parent_id_and_dimensions_types():
    """Check that parent_id is string and dimensions is tuple of two ints."""
    det = MockDetections(1)
    codeflash_output = manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=(123, 456), parent_id="parent"); result = codeflash_output # 5.11μs -> 9.12μs (44.0% slower)

def test_offset_broadcasting():
    """Test that offset with shape (2,) can be subtracted from Nx2 arrays."""
    det = MockDetections(
        3,
        data={PARENT_COORDINATES_KEY: np.array([[1, 2], [3, 4], [5, 6]])}
    )
    offset = np.array([1, 1])
    codeflash_output = manage_crops_metadata(det, offset=offset, dimensions=(1, 1), parent_id="id"); result = codeflash_output # 13.8μs -> 15.9μs (13.0% slower)

def test_offset_with_negative_values():
    """Offset can be negative, and subtraction works correctly."""
    det = MockDetections(
        2,
        data={PARENT_COORDINATES_KEY: np.array([[0, 0], [10, 10]])}
    )
    offset = np.array([-5, -5])
    codeflash_output = manage_crops_metadata(det, offset=offset, dimensions=(2, 2), parent_id="neg"); result = codeflash_output # 12.3μs -> 15.1μs (18.5% slower)

def test_offset_shape_mismatch_raises():
    """If offset shape doesn't match coordinate shape, numpy will raise."""
    det = MockDetections(
        2,
        data={PARENT_COORDINATES_KEY: np.array([[1, 2], [3, 4]])}
    )
    offset = np.array([1, 2, 3])
    with pytest.raises(ValueError):
        manage_crops_metadata(det, offset=offset, dimensions=(1, 1), parent_id="id") # 11.4μs -> 11.2μs (1.86% faster)

# 3. Large Scale Test Cases

def test_large_number_of_detections():
    """Test function with a large number of detections (performance and correctness)."""
    N = 1000
    det = MockDetections(
        N,
        data={PARENT_COORDINATES_KEY: np.random.randint(0, 1000, size=(N, 2))}
    )
    offset = np.array([5, 10])
    before = det.data[PARENT_COORDINATES_KEY].copy()
    codeflash_output = manage_crops_metadata(det, offset=offset, dimensions=(640, 480), parent_id="big"); result = codeflash_output # 283μs -> 24.7μs (1047% faster)

def test_large_scale_with_both_coords_and_scaling():
    """Large scale, both parent and root coords, scaling exactly 1.0."""
    N = 999
    det = MockDetections(
        N,
        data={
            PARENT_COORDINATES_KEY: np.random.randint(0, 500, size=(N, 2)),
            ROOT_PARENT_COORDINATES_KEY: np.random.randint(0, 1000, size=(N, 2)),
            SCALING_RELATIVE_TO_PARENT_KEY: np.ones(N)
        }
    )
    offset = np.array([2, 3])
    before_parent = det.data[PARENT_COORDINATES_KEY].copy()
    before_root = det.data[ROOT_PARENT_COORDINATES_KEY].copy()
    codeflash_output = manage_crops_metadata(det, offset=offset, dimensions=(128, 256), parent_id="large"); result = codeflash_output # 288μs -> 31.6μs (812% faster)

def test_large_scale_scaling_not_1_0_raises():
    """Large scale, scaling not 1.0, should raise ValueError."""
    N = 500
    det = MockDetections(
        N,
        data={SCALING_RELATIVE_TO_PARENT_KEY: np.full(N, 0.95)}
    )
    with pytest.raises(ValueError) as e:
        manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=(1, 1), parent_id="fail") # 4.48μs -> 4.26μs (5.21% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
# imports
import numpy as np
import pytest  # used for our unit tests
from inference.core.workflows.core_steps.fusion.detections_stitch.v1 import \
    manage_crops_metadata


# Minimal mock of supervision.Detections for testing
class MockDetections:
    def __init__(self, num=0, data=None):
        self.data = data if data is not None else {}
        self._num = num

    def __len__(self):
        return self._num

    def __getitem__(self, key):
        # For simplicity, return the full array stored under the key
        return self.data[key]

# Constants as per the original function
IMAGE_DIMENSIONS_KEY = "image_dimensions"
PARENT_COORDINATES_KEY = "parent_coordinates"
PARENT_ID_KEY = "parent_id"
ROOT_PARENT_COORDINATES_KEY = "root_parent_coordinates"
SCALING_RELATIVE_TO_PARENT_KEY = "scaling_relative_to_parent"

# unit tests

# -------------------- BASIC TEST CASES --------------------

def test_empty_detections_returns_input():
    # Should return input unchanged if detections is empty
    det = MockDetections(num=0)
    codeflash_output = manage_crops_metadata(det, offset=np.array([1, 2]), dimensions=(100, 200), parent_id="abc"); result = codeflash_output # 1.24μs -> 1.22μs (1.39% faster)

def test_offset_none_with_nonempty_raises():
    # Should raise ValueError if offset is None and detections is not empty
    det = MockDetections(num=1)
    with pytest.raises(ValueError, match="offset is needed"):
        manage_crops_metadata(det, offset=None, dimensions=(100, 200), parent_id="abc") # 1.87μs -> 1.92μs (2.40% slower)

def test_basic_metadata_assignment():
    # Should assign IMAGE_DIMENSIONS_KEY and PARENT_ID_KEY correctly
    det = MockDetections(num=2)
    manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=(10, 20), parent_id="xyz") # 6.82μs -> 10.8μs (36.8% slower)

def test_parent_and_root_coordinates_are_offset():
    # Should subtract offset from parent and root coordinates
    parent_coords = np.array([[10, 20], [30, 40]])
    root_coords = np.array([[100, 200], [300, 400]])
    det = MockDetections(num=2, data={
        PARENT_COORDINATES_KEY: parent_coords.copy(),
        ROOT_PARENT_COORDINATES_KEY: root_coords.copy()
    })
    offset = np.array([1, 2])
    manage_crops_metadata(det, offset=offset, dimensions=(1, 1), parent_id="id") # 14.2μs -> 17.1μs (16.8% slower)

def test_no_parent_or_root_coordinates():
    # Should not fail if neither parent nor root coordinates are present
    det = MockDetections(num=1)
    manage_crops_metadata(det, offset=np.array([1, 2]), dimensions=(1, 2), parent_id="id") # 5.25μs -> 9.02μs (41.8% slower)

# -------------------- EDGE TEST CASES --------------------

def test_scaling_relative_to_parent_key_exactly_1():
    # Should not raise if scale is exactly 1.0
    det = MockDetections(num=1, data={SCALING_RELATIVE_TO_PARENT_KEY: np.array([1.0])})
    manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=(1, 1), parent_id="id") # 7.72μs -> 11.1μs (30.7% slower)

def test_scaling_relative_to_parent_key_near_1():
    # Should not raise if scale is very close to 1.0 (within tolerance)
    det = MockDetections(num=1, data={SCALING_RELATIVE_TO_PARENT_KEY: np.array([1.00009])})
    manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=(1, 1), parent_id="id") # 7.00μs -> 10.4μs (32.7% slower)

def test_scaling_relative_to_parent_key_out_of_tolerance_raises():
    # Should raise if scale is outside tolerance
    det = MockDetections(num=1, data={SCALING_RELATIVE_TO_PARENT_KEY: np.array([1.0002])})
    with pytest.raises(ValueError, match="Scaled bounding boxes were passed"):
        manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=(1, 1), parent_id="id") # 4.11μs -> 3.92μs (5.00% faster)

def test_offset_is_zero_vector():
    # Offset of zero should not change coordinates
    parent_coords = np.array([[5, 5], [10, 10]])
    det = MockDetections(num=2, data={PARENT_COORDINATES_KEY: parent_coords.copy()})
    offset = np.array([0, 0])
    manage_crops_metadata(det, offset=offset, dimensions=(1, 1), parent_id="id") # 13.0μs -> 16.1μs (19.3% slower)

def test_dimensions_are_zero():
    # Should assign zero dimensions
    det = MockDetections(num=2)
    manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=(0, 0), parent_id="id") # 5.80μs -> 8.60μs (32.6% slower)

def test_parent_id_is_empty_string():
    # Should assign empty string as parent_id
    det = MockDetections(num=2)
    manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=(1, 1), parent_id="") # 6.06μs -> 8.77μs (30.9% slower)


def test_large_number_of_detections():
    # Should handle up to 1000 detections efficiently
    n = 1000
    parent_coords = np.random.randint(0, 1000, size=(n, 2))
    root_coords = np.random.randint(0, 1000, size=(n, 2))
    det = MockDetections(num=n, data={
        PARENT_COORDINATES_KEY: parent_coords.copy(),
        ROOT_PARENT_COORDINATES_KEY: root_coords.copy()
    })
    offset = np.array([5, 10])
    dimensions = (640, 480)
    parent_id = "large_test"
    manage_crops_metadata(det, offset=offset, dimensions=dimensions, parent_id=parent_id) # 291μs -> 31.3μs (829% faster)

def test_large_empty_detections():
    # Should return empty input for large empty detections
    det = MockDetections(num=0)
    codeflash_output = manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=(1, 1), parent_id="id"); result = codeflash_output # 1.20μs -> 1.21μs (0.332% slower)

def test_large_scale_scaling_key():
    # Should raise for large scale with scaling key out of tolerance
    n = 500
    det = MockDetections(num=n, data={SCALING_RELATIVE_TO_PARENT_KEY: np.array([1.0005] * n)})
    with pytest.raises(ValueError):
        manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=(1, 1), parent_id="id") # 4.80μs -> 4.65μs (3.25% faster)

def test_large_scale_assignments_are_correct():
    # Should assign metadata correctly for large number of detections
    n = 999
    det = MockDetections(num=n)
    dimensions = (256, 256)
    parent_id = "parentX"
    manage_crops_metadata(det, offset=np.array([0, 0]), dimensions=dimensions, parent_id=parent_id) # 272μs -> 17.6μs (1443% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
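
The per-test timings above split by input size: tests with one to three detections are reported as slower, and tests with about a thousand as much faster. A rough way to observe that crossover locally is a `timeit` sketch like the one below. It times only the two array constructions named in the optimization summary, not `manage_crops_metadata` itself; the helper names and sizes are illustrative assumptions, and absolute numbers will vary by machine and NumPy version.

```python
import timeit

import numpy as np


def build_old(n, dimensions, parent_id):
    """List-expansion pattern described as the original approach."""
    return np.array([dimensions] * n), np.array([parent_id] * n)


def build_new(n, dimensions, parent_id):
    """Direct-allocation pattern described as the optimized approach."""
    ids = np.empty(n, dtype=object)
    ids.fill(parent_id)
    return np.full((n, 2), dimensions), ids


def time_per_call_us(fn, n, repeats=200):
    """Best-of-5 average time per call, in microseconds."""
    runs = timeit.repeat(lambda: fn(n, (640, 480), "pid"), number=repeats, repeat=5)
    return min(runs) / repeats * 1e6


for n in (2, 1000):
    old_us = time_per_call_us(build_old, n)
    new_us = time_per_call_us(build_new, n)
    print(f"n={n}: old {old_us:.2f} µs, new {new_us:.2f} µs")
```

Taking the minimum over several repeats reduces noise from other processes, which matters most for the microsecond-scale small-n case.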

To edit these changes git checkout codeflash/optimize-pr1654-2025-10-29T18.47.01 and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 29, 2025
@codeflash-ai codeflash-ai bot added the 🎯 Quality: High Optimization Quality according to codeflash label Oct 29, 2025
Base automatically changed from fix/stitch-block-image-dimensions to main October 31, 2025 18:07
@codeflash-ai codeflash-ai bot closed this Oct 31, 2025
codeflash-ai bot commented Oct 31, 2025

This PR has been automatically closed because the original PR #1654 by brunopicinin was closed.

@codeflash-ai codeflash-ai bot deleted the codeflash/optimize-pr1654-2025-10-29T18.47.01 branch October 31, 2025 18:07