@elliot-barn
Contributor

Updating two read-related release data tests to use Python 3.10:
read_tfrecords
read_images_comparison_microbenchmark_single_node

Passing run here: https://buildkite.com/ray-project/release/builds/65424#_

Signed-off-by: elliot-barn <[email protected]>
@elliot-barn requested a review from aslonnie on November 6, 2025, 04:12
Contributor

@gemini-code-assist bot left a comment

Code Review

This pull request upgrades two release data tests, read_tfrecords and read_images_comparison_microbenchmark_single_node, to use Python 3.10. The changes are straightforward, adding python: "3.10" to the respective test configurations in release/release_data_tests.yaml. The modifications are consistent with other tests in the file and appear correct. Given the passing CI run, the changes look good to merge.

Contributor

@gemini-code-assist bot left a comment

Code Review

This pull request updates two release data tests, read_tfrecords and read_images_comparison_microbenchmark_single_node, to use Python 3.10. The changes are straightforward and correct. To improve long-term maintainability, I've suggested using YAML anchors for the Python version. This will help in keeping Python versions consistent across tests and simplify future updates.

s3://anyscale-imagenet/ILSVRC/Data/CLS-LOC/ --format image --iter-bundles
- name: read_tfrecords
  python: "3.10"
Contributor

medium

To improve maintainability and ensure consistency, you can define a YAML anchor for the Python version. This makes it easier to update the version across multiple tests in the future. You can then use this anchor in other tests being updated in this PR.

  python: &python-3-10 "3.10"

script: python read_from_uris_benchmark.py

- name: read_images_comparison_microbenchmark_single_node
  python: "3.10"
Contributor

medium

To improve maintainability, you can use the YAML anchor python-3-10 defined for the read_tfrecords test. This ensures that the Python version is consistent and easy to update in one place.

  python: *python-3-10
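
Taken together, the two suggestions define the anchor once and reference it by alias afterwards. The sketch below is illustrative only: it shows just the name and python fields of the two tests touched by this PR and omits every other field of each entry. It assumes both entries live in the same YAML document and that the anchor appears before the alias, since a YAML alias can only refer to an anchor defined earlier in the same document. Whether the release test tooling loads the file in a way that preserves this behavior is an assumption, not something confirmed in this PR.

```yaml
# Sketch only: all other fields of each test entry are omitted.
- name: read_tfrecords
  # Define the anchor once, on the first test that pins the version.
  python: &python-3-10 "3.10"

- name: read_images_comparison_microbenchmark_single_node
  # Reuse the anchored value; a YAML parser resolves this alias to "3.10".
  python: *python-3-10
```

With this layout, changing the quoted value on the anchored line would change the version for every test that uses the alias.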

Contributor

@gemini-code-assist bot left a comment

Code Review

This pull request correctly updates two release data tests, read_tfrecords and read_images_comparison_microbenchmark_single_node, to explicitly use Python 3.10. The changes are straightforward and look good. I have added one comment regarding a potential inconsistency with other tests in the same file that could be addressed for better maintainability.

s3://anyscale-imagenet/ILSVRC/Data/CLS-LOC/ --format image --iter-bundles
- name: read_tfrecords
  python: "3.10"
Contributor

medium

While adding python: "3.10" here is correct for the read_tfrecords test, it creates an inconsistency with other tests. Several other tests in this file also use the read_and_consume_benchmark.py script (e.g., read_parquet_{{scaling}}, read_images_{{scaling}}, write_parquet) but do not have an explicit Python version specified. To ensure consistent behavior and improve maintainability, consider updating all tests that use this script to Python 3.10, either in this PR or in a follow-up.

@aslonnie added the "go" label (add ONLY when ready to merge, run all tests) on Nov 6, 2025
@aslonnie
Collaborator

aslonnie commented Nov 6, 2025

why are the other ones failing?

@ray-gardener bot added the "data" (Ray Data-related issues) and "release-test" (release test) labels on Nov 6, 2025
Collaborator

@aslonnie left a comment

maybe investigate what is happening with the other tests.

3 participants