⚡️ Speed up function any_none by 148%
#347
Open
📄 148% (1.48x) speedup for `any_none` in `pandas/core/common.py`
⏱️ Runtime: 238 microseconds → 96.1 microseconds (best of 123 runs)
📝 Explanation and details
The optimization replaces a generator expression with the `in` operator to check for None values. The original code, `any(arg is None for arg in args)`, creates a generator that iterates through each argument, checks whether it is None, and feeds the results to the `any()` builtin. The optimized version, `None in args`, applies Python's `in` operator directly to the tuple of arguments.

Key optimization: The `in` operator for tuples is implemented in C and checks identity before falling back to equality, which avoids the overhead of creating a generator object and iterating at the Python level. When None is found, both approaches short-circuit, but the optimized version does so at the C level rather than the Python level.

Performance characteristics: The test results show consistent 60-190% speedups across all scenarios. The optimization is particularly effective for:
- … the `in` operator finds it immediately

Impact on workloads: Based on the function reference, `any_none()` is called in pandas' `date_range()` function for parameter validation. Since `date_range()` is a frequently used pandas function, this micro-optimization can provide measurable performance improvements in data analysis workflows that create many date ranges. The speedup is most beneficial when users pass multiple parameters to `date_range()`, because the validation check runs faster.

Test case suitability: The optimization performs well across all test scenarios, with the largest gains in cases with many arguments or when None appears early or late in the argument list, making it broadly applicable.
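The two expressions compared above can be sketched side by side. This is a minimal illustration, not the code from the PR diff: the names `any_none_original`, `any_none_optimized`, `AlwaysEqual`, and `compare_timings` are hypothetical and exist only for this example. It also shows one behavior worth checking during review: because `in` falls back to `==` after the identity test, the two versions can disagree for an argument whose `__eq__` is overridden.

```python
import timeit


def any_none_original(*args) -> bool:
    # Generator expression: one identity check per argument, driven by any().
    return any(arg is None for arg in args)


def any_none_optimized(*args) -> bool:
    # Tuple containment: identity check first, then == as a fallback.
    return None in args


class AlwaysEqual:
    """Hypothetical object whose __eq__ reports equality with anything."""

    def __eq__(self, other):
        return True


def compare_timings(args=(1, "a", 2.5, None), number=100_000):
    # Returns (original_seconds, optimized_seconds); values vary by machine.
    original = timeit.timeit(lambda: any_none_original(*args), number=number)
    optimized = timeit.timeit(lambda: any_none_optimized(*args), number=number)
    return original, optimized


# For ordinary arguments the two versions agree.
same_for_plain_values = (
    any_none_original(1, None, 3) == any_none_optimized(1, None, 3)
    and any_none_original(1, 2, 3) == any_none_optimized(1, 2, 3)
)

# With an overridden __eq__, only the containment version reports a match.
differs_for_custom_eq = (
    any_none_original(AlwaysEqual()) != any_none_optimized(AlwaysEqual())
)
```

Calling `compare_timings()` gives a rough local comparison; the absolute numbers will not match the figures reported in this PR. Whether the `==` fallback matters depends on what callers pass to `any_none()`, which this sketch does not attempt to establish.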
✅ Correctness verification report:
⚙️ Existing Unit Tests and Runtime
test_common.py::test_any_none
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-any_none-mi3z7b53` and push.