@zhengruifeng
Contributor

@zhengruifeng zhengruifeng commented Nov 7, 2025

Reapply #52916

What changes were proposed in this pull request?

Add integrated tests for Scalar Pandas Iterator UDF

Why are the changes needed?

To improve test coverage.
Many UDF types are tested only on the Python side and are missing on the SQL side. So if the results (tested in Python) stay the same but the plan is unintentionally changed, we will not be aware of the change.
This PR adds coverage for SQL_SCALAR_PANDAS_ITER_UDF.
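
For readers unfamiliar with this UDF type, here is a minimal sketch of what a Scalar Pandas Iterator UDF looks like and how it could be exposed to SQL. It is an illustration only, not the test code from this PR: the names `plus_one` and `register_plus_one` are hypothetical, and the sketch assumes pandas (and, for the registration helper, PySpark) is installed.

```python
from typing import Iterator

import pandas as pd


def plus_one(batches: Iterator[pd.Series]) -> Iterator[pd.Series]:
    """Scalar iterator function: consumes batches of a column, yields batches."""
    # One-time setup (e.g. loading a model) would go here, before the loop.
    for batch in batches:
        yield batch + 1


def register_plus_one(spark):
    """Hypothetical helper: wrap plus_one as a pandas UDF and register it for SQL.

    PySpark is imported lazily so the plain function above can be exercised
    without a Spark installation.
    """
    from pyspark.sql.functions import pandas_udf

    # The Iterator[pd.Series] -> Iterator[pd.Series] type hints let PySpark
    # treat this as a scalar iterator pandas UDF.
    udf = pandas_udf(plus_one, returnType="long")
    spark.udf.register("plus_one", udf)
    return udf
```

Once registered, the function can be called from SQL, for example `spark.sql("SELECT plus_one(id) FROM range(3)")`, which is the path an SQL-side test would exercise in addition to the result check done on the Python side.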

Does this PR introduce any user-facing change?

no, test-only

How was this patch tested?

ci

Was this patch authored or co-authored using generative AI tooling?

no

@dongjoon-hyun dongjoon-hyun changed the title from Reapply "[SPARK-54218][PYTHON][SQL][TESTS] Add integrated tests for Scalar Pandas Iterator UDF" to [SPARK-54218][PYTHON][SQL][TESTS] Add integrated tests for Scalar Pandas Iterator UDF on Nov 7, 2025
…calar Pandas Iterator UDF"

This reverts commit 81be5fb.
@zhengruifeng zhengruifeng force-pushed the reapply_pandas_iter_test branch from fedb012 to d201425 Compare November 10, 2025 02:31
@zhengruifeng zhengruifeng marked this pull request as ready for review November 10, 2025 05:22
@zhengruifeng
Contributor Author

cc @pan3793

@pan3793
Member

pan3793 commented Nov 10, 2025

@zhengruifeng do you have any modifications on this? the CI seems to be stable after your previous reverting.

@zhengruifeng
Contributor Author

> @zhengruifeng do you have any modifications on this? the CI seems to be stable after your previous reverting.

no, the PR is exactly the same as the previous one

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM.

Since this is only for Apache Spark 4.2.0, we can merge this PR based on the PR builder's successful result.

For the other CIs, we can monitor the stability.

@zhengruifeng
Contributor Author

thanks, merged to master

@zhengruifeng zhengruifeng deleted the reapply_pandas_iter_test branch November 10, 2025 08:08
@zhengruifeng
Contributor Author

zhengruifeng commented Nov 10, 2025

@pan3793 @dongjoon-hyun
I see the sql-extended job failed again on master:
https://github.com/apache/spark/actions/runs/19224906148/job/54949909383

I am not sure why it would be related to this PR, and I cannot reproduce it locally. I am going to revert it again...

@zhengruifeng
Contributor Author

reverted in 21c3122

@pan3793
Member

pan3793 commented Nov 10, 2025

This MIGHT (not sure) indicate that the VM got killed due to OOM:

Session terminated, killing shell...
Error: The operation was canceled.

@dongjoon-hyun
Member

That's too bad. Anyway, thank you for testing and reverting.
