Skip to content

Conversation

@pseudo-rnd-thoughts
Copy link
Member

Description

In improving the SingleEnvRunner.make_env, I found that some of the tests could be flaky.
This PR improves the testing, in particular, to sample to ensure that the tests don't fail occasionally and the documentation to reflect this.

The primary flaky problem I found is that sample(num_timesteps=X) will not always return a total of X timesteps, rather at least X timesteps up to the number of environments more.
I'm updated the documentation to clarify this for users.

In addition, I've added tests for when neither the number of timesteps or episodes are given and for the force_reset argument

@pseudo-rnd-thoughts pseudo-rnd-thoughts requested a review from a team as a code owner November 4, 2025 14:22
@pseudo-rnd-thoughts pseudo-rnd-thoughts added rllib RLlib related issues rllib-envrunners Issues around the sampling backend of RLlib labels Nov 4, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves the tests for SingleAgentEnvRunner.sample to prevent flakiness and updates the documentation to reflect the correct behavior. The changes are generally good and address the stated goals. I've identified a couple of areas in the tests where the assertions could be more precise to better validate the functionality. My review includes suggestions to tighten these assertions.

cursor[bot]

This comment was marked as outdated.

Signed-off-by: Mark Towers <[email protected]>
@pseudo-rnd-thoughts pseudo-rnd-thoughts changed the title Improve test_single_agent_env_runner to prevent flaky tests [rllib] Improve test_single_agent_env_runner to prevent flaky tests Nov 4, 2025
Signed-off-by: Mark Towers <[email protected]>
Copy link
Contributor

@simonsays1980 simonsays1980 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the refinement @pseudo-rnd-thoughts !


# Sample n timesteps.
if num_timesteps is not None:
assert num_timesteps >= 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

num_timesteps=10, num_episodes=10, random_actions=True
),
)
# Verify that an error is raised if a negative number is used
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool!

env_runner.stop()

def test_env_context(self):
"""Tests, whether SingleAgentEnvRunner can pass kwargs to the environments correctly."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Thanks for adding this. This was missing but is very important.

@simonsays1980 simonsays1980 added the go add ONLY when ready to merge, run all tests label Nov 14, 2025
@simonsays1980 simonsays1980 merged commit cf9f783 into ray-project:master Nov 17, 2025
6 checks passed
simonsays1980 pushed a commit that referenced this pull request Nov 17, 2025
…_env` (#58410)

## Description
Allow users to use environments that are already vectorized for
`SingleAgentEnvRunner`
With `gymnasium.make_vec`, users have the option to either use the
`SyncVectorEnv` to vectorize a base environment or to directly create a
vector environment using the `vectorize_mode: gymnasium.VectorizeMode`.
This PR utilises the `env_runners(gym_env_vectorize_mode=...)` argument
to support `VectorizeMode.VECTOR_ENTRY_POINT`

```
import gymnasium as gym

config = ...
config.env_runners(
  gym_env_vectorize_mode=gym.VectorizeMode.VECTOR_ENTRY_POINT,
)
```

An important change related to this PR is that the values accepted for
the vectorize mode is either the enum (`VectorizedMode.ASYNC`, etc) or
the enum values (`"async"`, etc) as before it was the string version was
the enum name (`"ASYNC"`) rather than the enum value itself.

## Related issues
Completion of #57643, 

## Additional information
#58397 must be merged first

We should apply a similar change to the `MultiAgentEnvRunner.make_env`

---------

Signed-off-by: Mark Towers <[email protected]>
Co-authored-by: Mark Towers <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

go add ONLY when ready to merge, run all tests rllib RLlib related issues rllib-envrunners Issues around the sampling backend of RLlib

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants