@zhengruifeng
Contributor

@zhengruifeng zhengruifeng commented Nov 6, 2025

What changes were proposed in this pull request?

Add integration tests for Scalar Pandas Iterator UDFs.

Why are the changes needed?

To improve test coverage.
Many UDF types are tested only on the Python side and are missing on the SQL side. So
if the results (tested in Python) stay the same but the plan is unintentionally changed, we will not be aware of the change.
This PR adds coverage for SQL_SCALAR_PANDAS_ITER_UDF.

Does this PR introduce any user-facing change?

No, test-only.

How was this patch tested?

CI

Was this patch authored or co-authored using generative AI tooling?

no

@zhengruifeng
Contributor Author

merged to master

@zhengruifeng zhengruifeng deleted the add_pandas_iter_integ_test branch November 7, 2025 00:22
@pan3793
Member

pan3793 commented Nov 7, 2025

@zhengruifeng, how much time do the newly added test cases take? The "sql - extended tests" job seems likely to time out after this commit.

@zhengruifeng
Contributor Author

@pan3793 I think it shouldn't take too much time, but let me revert it first to check.

zhengruifeng added a commit that referenced this pull request Nov 7, 2025
…alar Pandas Iterator UDF"

revert #52916 to check the test timeout issue

Closes #52935 from zhengruifeng/revert_pandas_iter_test.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
@zhengruifeng
Contributor Author

@pan3793 reverted

https://github.com/apache/spark/actions/runs/19158064744/job/54762986982

Filtered by 'Scalar Pandas Iterator UDF', the total duration of the new tests is around 80 seconds.

So I think the timeout is caused by another issue.
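
For anyone who wants to repeat this kind of check, here is a minimal sketch of filtering a test log by name and summing the reported durations. It assumes ScalaTest-style lines that end with a duration in parentheses. The sample lines, file names, and helper names (`duration_ms`, `total_ms`) are hypothetical and are not taken from the linked CI run, whose log format may differ.

```python
import re

# Hypothetical ScalaTest-style log lines; the real CI log format may differ.
SAMPLE_LOG = """\
[info] - udf/udf-union.sql - Scalar Pandas Iterator UDF (1 second, 250 milliseconds)
[info] - udf/udf-union.sql - Scala UDF (900 milliseconds)
[info] - udf/udf-having.sql - Scalar Pandas Iterator UDF (750 milliseconds)
"""

# Matches one "<number> <unit>" pair inside the trailing "(...)" duration.
_PART = re.compile(r"(\d+)\s+(millisecond|second|minute)s?")
_UNIT_MS = {"millisecond": 1, "second": 1_000, "minute": 60_000}


def duration_ms(line: str) -> int:
    """Return the duration in the trailing parentheses of a log line, in ms."""
    match = re.search(r"\(([^()]*)\)\s*$", line)
    if match is None:
        return 0  # no trailing duration on this line
    return sum(int(n) * _UNIT_MS[unit] for n, unit in _PART.findall(match.group(1)))


def total_ms(log: str, needle: str) -> int:
    """Sum durations over the lines whose text contains `needle`."""
    return sum(duration_ms(line) for line in log.splitlines() if needle in line)


print(total_ms(SAMPLE_LOG, "Scalar Pandas Iterator UDF"))  # prints 2000
```

The substring filter is case-sensitive, so the "Scala UDF" line is not counted when filtering by 'Scalar Pandas Iterator UDF'.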
