@codeflash-ai codeflash-ai bot commented Nov 7, 2025

📄 12% (0.12x) speedup for VertexGeminiConfig._map_response_schema in litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py

⏱️ Runtime : 13.8 milliseconds → 12.3 milliseconds (best of 66 runs)

📝 Explanation and details

The optimization achieves a 12% speedup by addressing two key performance bottlenecks identified in the line profiler:

**Primary Optimization - Module-Level Caching:**
The biggest improvement comes from moving `set(get_type_hints(Schema).keys())` to a module-level constant `_VALID_SCHEMA_FIELDS`. The line profiler shows this operation consumed 7.1% of total runtime (3.68ms out of 52ms) and was called 17 times. By caching this computation at import time, we eliminate the repeated overhead entirely - the optimized version shows negligible time (7.1μs) for the assignment.

**Secondary Optimization - Conditional Dict Processing:**
The original code always called `parameters.pop("$defs", {})`, which allocates an empty dict even when no "$defs" key exists. The optimization calls `parameters.pop("$defs", None)` and only processes defs if they exist. While this saves minimal time in the profiler (the pop operation was already fast), it avoids unnecessary dict operations for schemas without definitions.
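The conditional-pop pattern can be sketched as follows; `resolve_defs` is a hypothetical helper name, and the actual reference-resolution logic is elided:

```python
def resolve_defs(parameters: dict) -> dict:
    """Illustrative sketch of conditional "$defs" processing."""
    # Original pattern: defs = parameters.pop("$defs", {})
    #   -- allocates a fresh dict literal on every call, even when absent.
    # Optimized pattern: pop with None and guard, so schemas without
    # "$defs" skip the processing branch entirely.
    defs = parameters.pop("$defs", None)
    if defs is not None:
        for name, sub_schema in defs.items():
            pass  # reference-resolution work would happen here
    return parameters
```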

**Performance Analysis:**
- The `get_type_hints()` call involves expensive reflection operations that inspect the `Schema` TypedDict structure
- This operation scales poorly when called repeatedly in hot paths
- Module-level caching transforms this from an O(n) reflection cost on every call into a one-time cost amortized across all calls
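A rough micro-benchmark sketch of the tradeoff. Absolute numbers depend on the machine and Python version, and this `Schema` is a simplified stand-in, but the cached lookup should beat per-call reflection by a wide margin:

```python
import timeit
from typing import TypedDict, get_type_hints

class Schema(TypedDict, total=False):
    type: str
    properties: dict

# Original pattern: reflection on every call.
per_call = timeit.timeit(lambda: set(get_type_hints(Schema).keys()), number=1000)

# Optimized pattern: compute once at import time, then reuse the cached set.
_VALID_SCHEMA_FIELDS = set(get_type_hints(Schema).keys())
cached = timeit.timeit(lambda: _VALID_SCHEMA_FIELDS, number=1000)

print(f"per-call reflection: {per_call:.5f}s  cached: {cached:.5f}s")
```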

**Test Case Performance:**
The annotated tests show consistent 200-300% speedups across all schema types, with particularly strong gains for simple schemas (287% faster for type lists), where the `get_type_hints` overhead was proportionally larger. Complex schemas with many properties still benefit (6-26% improvement) because the fixed overhead is amortized across more processing work.

This optimization is especially valuable since _build_vertex_schema appears to be called from schema mapping operations that could be invoked frequently during API request processing.

**Correctness verification report:**

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 32 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 62.5% |
🌀 Generated Regression Tests and Runtime
from copy import deepcopy

# imports
import pytest
from litellm.llms.vertex_ai.gemini.vertex_and_google_ai_studio_gemini import \
    VertexGeminiConfig

# Function to test: VertexGeminiConfig._map_response_schema
DEFAULT_MAX_RECURSE_DEPTH = 20  # mirrors the recursion limit used by the module
# ========== UNIT TESTS ==========

@pytest.fixture
def config():
    return VertexGeminiConfig()

# -------------------- BASIC TEST CASES --------------------

def test_simple_flat_schema(config):
    # Test a flat schema with a single string property
    schema = {
        "type": "object",
        "properties": {
            "foo": {"type": "string"}
        },
        "required": ["foo"]
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 117μs -> 34.0μs (245% faster)

def test_schema_with_enum(config):
    # Test schema with enum, including empty string (should convert to None)
    schema = {
        "type": "object",
        "properties": {
            "color": {"type": "string", "enum": ["red", "green", ""]}
        }
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 102μs -> 31.0μs (229% faster)

def test_schema_with_array(config):
    # Test schema with an array property
    schema = {
        "type": "object",
        "properties": {
            "numbers": {"type": "array", "items": {"type": "number"}}
        }
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 100μs -> 30.1μs (233% faster)

def test_schema_with_anyof(config):
    # Test schema with anyOf for a property
    schema = {
        "type": "object",
        "properties": {
            "value": {
                "anyOf": [
                    {"type": "string"},
                    {"type": "number"}
                ]
            }
        }
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 105μs -> 35.2μs (201% faster)
    anyof = result["properties"]["value"]["anyOf"]
    types = {item["type"] for item in anyof}

def test_schema_with_nullable_anyof(config):
    # Test schema with anyOf including null, should set nullable=True
    schema = {
        "type": "object",
        "properties": {
            "maybe": {
                "anyOf": [
                    {"type": "string"},
                    {"type": "null"}
                ]
            }
        }
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 103μs -> 34.0μs (205% faster)
    anyof = result["properties"]["maybe"]["anyOf"]

# -------------------- EDGE TEST CASES --------------------

def test_schema_with_empty_enum(config):
    # Enum with only empty string, should become [None]
    schema = {
        "type": "object",
        "properties": {
            "empty": {"type": "string", "enum": [""]}
        }
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 96.3μs -> 29.1μs (231% faster)

def test_schema_with_empty_items(config):
    # Array with empty items dict should become items={"type": "object"}
    schema = {
        "type": "object",
        "properties": {
            "arr": {"type": "array", "items": {}}
        }
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 95.2μs -> 29.3μs (225% faster)

def test_schema_with_type_list(config):
    # OpenAI style: type is a list, should become anyOf
    schema = {
        "type": ["string", "number"]
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 92.1μs -> 23.8μs (287% faster)
    types = {item["type"] for item in result["anyOf"]}

def test_schema_with_required_none(config):
    # required=None should be removed
    schema = {
        "type": "object",
        "properties": {"foo": {"type": "string"}},
        "required": None
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 94.4μs -> 25.8μs (266% faster)

def test_schema_with_defs_and_refs(config):
    # Schema using $defs and $ref
    schema = {
        "type": "object",
        "properties": {
            "bar": {"$ref": "#/$defs/barType"}
        },
        "$defs": {
            "barType": {
                "type": "string",
                "enum": ["a", "b"]
            }
        }
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 110μs -> 41.8μs (165% faster)
    bar = result["properties"]["bar"]

def test_schema_with_circular_ref(config):
    # Circular $defs reference should not cause infinite loop
    schema = {
        "type": "object",
        "properties": {
            "self": {"$ref": "#/$defs/selfType"}
        },
        "$defs": {
            "selfType": {
                "type": "object",
                "properties": {
                    "self": {"$ref": "#/$defs/selfType"}
                }
            }
        }
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 127μs -> 57.3μs (122% faster)

def test_schema_with_anyof_only_null(config):
    # anyOf with only null should raise ValueError
    schema = {
        "type": "object",
        "properties": {
            "foo": {
                "anyOf": [
                    {"type": "null"}
                ]
            }
        }
    }
    with pytest.raises(ValueError):
        config._map_response_schema(schema) # 87.2μs -> 18.7μs (366% faster)


def test_large_schema_many_properties(config):
    # Schema with 500 properties
    schema = {
        "type": "object",
        "properties": {f"field_{i}": {"type": "string"} for i in range(500)},
        "required": [f"field_{i}" for i in range(500)]
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 1.86ms -> 1.74ms (6.81% faster)

def test_large_schema_deep_nesting(config):
    # Schema with 50 levels of nesting (well below recursion limit)
    schema = {}
    curr = schema
    for i in range(50):
        curr["type"] = "object"
        curr["properties"] = {"inner": {}}
        curr = curr["properties"]["inner"]
    curr["type"] = "string"
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 372μs -> 294μs (26.6% faster)
    # Check that all outer levels have type "object" and propertyOrdering
    curr = result
    for i in range(50):
        curr = curr["properties"]["inner"]

def test_large_schema_many_enums(config):
    # Schema with 200 properties, each with a large enum
    schema = {
        "type": "object",
        "properties": {
            f"e{i}": {"type": "string", "enum": [str(j) for j in range(100)]}
            for i in range(200)
        }
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 9.03ms -> 8.69ms (3.90% faster)
    for i in range(200):
        enum = result["properties"][f"e{i}"]["enum"]

def test_large_schema_with_arrays(config):
    # Schema with 100 array properties, each with complex items
    schema = {
        "type": "object",
        "properties": {
            f"arr{i}": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "val": {"type": "integer"}
                    }
                }
            }
            for i in range(100)
        }
    }
    codeflash_output = config._map_response_schema(schema); result = codeflash_output # 1.28ms -> 1.16ms (9.56% faster)
    for i in range(100):
        arr = result["properties"][f"arr{i}"]
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from copy import deepcopy
from typing import Any, Dict, List, Optional, Set

# imports
import pytest
from litellm.llms.vertex_ai.gemini.vertex_and_google_ai_studio_gemini import \
    VertexGeminiConfig
# _build_vertex_schema is assumed to live in common_utils; adjust the import
# path if the helper is located elsewhere in the litellm tree.
from litellm.llms.vertex_ai.common_utils import _build_vertex_schema

# The real _map_response_schema is a method of VertexGeminiConfig, but its
# logic is all in _build_vertex_schema, so we wrap it as a standalone
# function for direct testing.
def _map_response_schema(value: dict) -> dict:
    # Standalone copy of the logic from VertexGeminiConfig._map_response_schema
    old_schema = deepcopy(value)
    if isinstance(old_schema, list):
        for i, item in enumerate(old_schema):
            if isinstance(item, dict):
                old_schema[i] = _build_vertex_schema(
                    parameters=item, add_property_ordering=True
                )
    elif isinstance(old_schema, dict):
        old_schema = _build_vertex_schema(
            parameters=old_schema, add_property_ordering=True
        )
    return old_schema

# --------------------------------------------------------
# Unit Test Suite for _map_response_schema
# --------------------------------------------------------

# 1. Basic Test Cases

To edit these changes, `git checkout codeflash/optimize-VertexGeminiConfig._map_response_schema-mhoe8qzj` and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 7, 2025 05:07
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Nov 7, 2025