⚡️ Speed up method _AppConfig.asdict by 827%
#618
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
📄 827% (8.27x) speedup for
_AppConfig.asdictinmarimo/_ast/app_config.py⏱️ Runtime :
5.81 milliseconds→627 microseconds(best of250runs)📝 Explanation and details
The optimization replaces the expensive
asdict(self)call with direct attribute access throughgetattr(self, k)while iterating overself.__dataclass_fields__.Key Performance Changes:
dataclasses.asdict(self)which creates a complete dictionary of all fields, then filters it. The optimized version directly accesses only the needed attributes.dataclasses.asdict()performs additional work like handling nested dataclasses and type conversions that aren't needed here.Why This Works:
The line profiler shows the optimization dramatically reduces time spent in the dictionary comprehension (from 89.6% to 3.6% of total time). The
asdict(self).items()call in the original code is expensive because it processes all dataclass fields, even those that will be filtered out.Test Results Impact:
The annotated tests show consistent 5-6x speedups across all scenarios:
This optimization is particularly effective for dataclasses with many fields or when called frequently, as it eliminates the overhead of the full dataclass serialization process while maintaining identical functionality.
✅ Correctness verification report:
⚙️ Existing Unit Tests and Runtime
_ast/test_app.py::TestApp.test_app_config_auto_download_ast/test_app.py::TestApp.test_app_config_extra_args_ignored_ast/test_app.py::test_app_config_ast/test_app.py::test_app_config_extra_args_ignored_ast/test_app_config.py::test_app_config_asdict🌀 Generated Regression Tests and Runtime
from dataclasses import dataclass, field
from typing import Any, Optional
imports
import pytest
from marimo._ast.app_config import _AppConfig
---- Mocking dependencies for a self-contained test ----
These are minimal mocks to allow the test to run without external dependencies.
Simulate valid types for the config fields
WidthType = str # could be "compact", "full", etc.
ExportType = str # could be "html", "md", etc.
SqlOutputType = str # could be "auto", "table", etc.
from marimo._ast.app_config import _AppConfig
-------------------- UNIT TESTS --------------------
1. Basic Test Cases
def test_default_config_asdict():
"""Test asdict() on default configuration."""
cfg = _AppConfig()
codeflash_output = cfg.asdict(); d = codeflash_output # 13.6μs -> 2.23μs (509% faster)
def test_auto_download_empty_and_nonempty():
"""Test asdict() with empty and non-empty auto_download lists."""
# Empty list (default)
cfg = _AppConfig()
codeflash_output = cfg.asdict(); d = codeflash_output # 13.3μs -> 2.15μs (519% faster)
# Non-empty list
cfg2 = _AppConfig(auto_download=["md"])
codeflash_output = cfg2.asdict(); d2 = codeflash_output # 8.88μs -> 1.22μs (626% faster)
def test_asdict_return_type_and_keys():
"""Test that asdict returns a dict with correct keys and types."""
cfg = _AppConfig()
codeflash_output = cfg.asdict(); d = codeflash_output # 16.3μs -> 2.56μs (535% faster)
expected_keys = [
"width", "app_title", "layout_file", "css_file", "html_head_file", "auto_download", "sql_output"
]
codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from dataclasses import asdict, dataclass, field
from typing import Any, Optional
imports
import pytest
from marimo._ast.app_config import _AppConfig
Dummy enum types to simulate the real ones for testing
class ExportType(str):
HTML = "html"
MARKDOWN = "markdown"
PDF = "pdf"
class SqlOutputType(str):
AUTO = "auto"
TABLE = "table"
RAW = "raw"
class WidthType(str):
COMPACT = "compact"
FULL = "full"
CUSTOM = "custom"
from marimo._ast.app_config import _AppConfig
unit tests
----------- Basic Test Cases -----------
def test_asdict_default_values():
"""Test asdict with all default values."""
cfg = _AppConfig()
codeflash_output = cfg.asdict(); d = codeflash_output # 14.5μs -> 2.37μs (511% faster)
def test_asdict_ignores_private_fields():
"""Test that asdict does not include private fields (starting with '_')."""
# Dynamically add a private field to the instance
cfg = _AppConfig()
cfg._experimental = 123
# asdict() should not include '_experimental'
codeflash_output = cfg.asdict(); d = codeflash_output # 12.7μs -> 1.93μs (557% faster)
def test_asdict_large_number_of_fields_with_private():
"""Test asdict ignores many private fields even if present."""
cfg = _AppConfig()
# Add many private fields
for i in range(100):
setattr(cfg, f"_private{i}", i)
codeflash_output = cfg.asdict(); d = codeflash_output # 14.3μs -> 2.39μs (497% faster)
# None of the private fields should be present
for i in range(100):
pass
codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
To edit these changes
git checkout codeflash/optimize-_AppConfig.asdict-mhvkqyeoand push.