Conversation

@EwoutH EwoutH commented Oct 7, 2025

Summary

This PR updates batch_run to provide explicit control over random seeds for multiple replications. A new rng keyword argument is added that accepts a single seed value or an iterable of seed values, while the iterations parameter is deprecated.

Motive

When using batch_run() with a single seed value and multiple iterations, all iterations used the same seed, producing identical results instead of independent replications (see #2835). This made it impossible to run meaningful parameter sweeps with reproducible but varied random states across iterations.

```python
# Previous problematic behavior
parameters = {'seed': 42}
batch_run(MyModel, parameters, iterations=10)
# All 10 iterations use seed=42 → identical results
```

Implementation

  • Added new rng keyword argument to batch_run that accepts either a single seed value or an iterable of seed values
  • Deprecated the iterations parameter with a DeprecationWarning directing users to the migration guide
  • Added validation to prevent using both iterations and rng simultaneously
  • The implementation inspects the model's signature to determine whether to use seed or rng as the parameter name when passing to the model
  • Each seed value in the iterable creates a separate run, giving users explicit control over reproducibility
  • Updated tests to cover the new functionality and edge cases
  • Updated the batch_run tutorial notebook with the new recommended usage
  • Added migration guide entry for Mesa 3.4.0
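The handling of the new `rng` argument described above can be sketched roughly as follows. This is a simplified illustration, not the actual mesa implementation; `normalize_rng` is a hypothetical helper name:

```python
from collections.abc import Iterable


def normalize_rng(rng):
    """Hypothetical sketch: turn the ``rng`` argument into a list of seeds.

    A single seed value becomes a one-element list; an iterable of seeds
    is used as-is, one seed per replication.
    """
    if rng is None:
        return [None]  # one run, OS-provided randomness
    if isinstance(rng, Iterable) and not isinstance(rng, (str, bytes)):
        return list(rng)  # one run per seed value
    return [rng]  # single seed -> single run
```

Each entry in the resulting list then drives one replication, which is what gives users explicit control over reproducibility.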

Usage Examples

```python
import sys

import numpy as np

import mesa

# Create 5 random seed values
rng = np.random.default_rng(42)
seed_values = rng.integers(0, sys.maxsize, size=(5,))

results = mesa.batch_run(
    MoneyModel,
    parameters=params,
    rng=seed_values.tolist(),  # pass the 5 seed values
    max_steps=100,
    number_processes=1,
    data_collection_period=1,
    display_progress=True,
)
```

For a single run with a specific seed:

```python
results = mesa.batch_run(MyModel, {}, rng=42)
```

Additional Notes

  • Using the deprecated iterations parameter will issue a DeprecationWarning but continues to work (passing None as the seed for each iteration, maintaining backward compatibility)
  • The results now include a seed column showing which seed was used for each run
  • This is a behavioral change for users who relied on the implicit behavior of iterations; they should update their code to explicitly provide seed values via rng

@EwoutH EwoutH added the bug and breaking labels Oct 7, 2025
github-actions bot commented Oct 7, 2025

Performance benchmarks:

| Model | Size | Init time [95% CI] | Run time [95% CI] |
|---|---|---|---|
| BoltzmannWealth | small | 🔵 +0.5% [-0.2%, +1.2%] | 🔵 -0.1% [-0.2%, +0.1%] |
| BoltzmannWealth | large | 🔵 -0.3% [-1.4%, +0.8%] | 🔵 +0.1% [-3.2%, +3.4%] |
| Schelling | small | 🔵 -0.9% [-1.2%, -0.6%] | 🔵 -2.1% [-2.5%, -1.6%] |
| Schelling | large | 🔵 -1.5% [-2.2%, -0.6%] | 🔵 -2.6% [-6.8%, +1.4%] |
| WolfSheep | small | 🔵 -0.9% [-1.3%, -0.6%] | 🔵 +1.0% [+0.7%, +1.2%] |
| WolfSheep | large | 🔵 +2.1% [+1.1%, +3.0%] | 🔴 +5.4% [+4.1%, +6.7%] |
| BoidFlockers | small | 🔵 -1.5% [-2.3%, -0.5%] | 🔵 -0.7% [-1.0%, -0.4%] |
| BoidFlockers | large | 🔵 -1.3% [-2.2%, -0.4%] | 🔵 -0.9% [-1.6%, -0.1%] |

@EwoutH EwoutH requested a review from quaquel October 7, 2025 19:08
quaquel commented Oct 7, 2025

I agree that this needs to be fixed. However, using subsequent integers with Mersenne Twister, Python's default RNG, is a bad idea.

From Wikipedia: "A consequence of poor diffusion is that two instances of the generator, started with initial states that are almost the same, will usually output nearly the same sequence for many iterations". Using a seed whose bit pattern is mostly zeros (like 42) is bad as well. One option is to use time.time() every single time and return this seed value for reproducibility.

As an aside, numpy's rng is much better and I believe we should move all mesa code over to using this while deprecating the use of python's stdlib random library.
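As an illustration of why numpy's machinery sidesteps the problem described above: numpy's SeedSequence hashes user-supplied entropy so that nearby seeds map to well-separated internal states, and can spawn independent child streams. A minimal sketch:

```python
import numpy as np

# SeedSequence turns nearby or low-entropy user seeds (like 42) into
# well-spread internal states, avoiding the Mersenne Twister
# poor-diffusion issue quoted above.
ss = np.random.SeedSequence(42)
child_seeds = ss.spawn(5)  # five statistically independent children
streams = [np.random.default_rng(s) for s in child_seeds]
```

This is the kind of hierarchical seeding later suggested in this thread; the spawned streams are deterministic given the parent seed, so runs stay reproducible.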

tpike3 commented Oct 8, 2025

Considering how important this is, maybe we should just go all in and do the switch to numpy and its rng and then have seed options like system time and hierarchical seeding?

Does it have to be a breaking change? Could we keep the old behavior and just add a warning?

quaquel commented Oct 8, 2025

> Does it have to be a breaking change? Could we keep the old behavior and just add a warning?

Moving the internals over should be possible as a non-breaking change.

```python
seed_value = kwargs["seed"]
if isinstance(seed_value, (int, float)) and not isinstance(seed_value, bool):
    kwargs = kwargs.copy()
    kwargs["seed"] = int(seed_value) + iteration
```
Suggested change:

```diff
-kwargs["seed"] = int(seed_value) + iteration
+kwargs["seed"] = seed_value + time.time()
```

This is all that is needed to ensure a much better spread of seeding values and thus better randomness.

EwoutH commented Oct 28, 2025

> As an aside, numpy's rng is much better and I believe we should move all mesa code over to using this while deprecating the use of python's stdlib random library.

> Considering how important this is, maybe we should just go all in and do the switch to numpy and its rng and then have seed options like system time and hierarchical seeding?

Considering this, do we want to move forward with this PR?

> return this seed value for reproducibility.

Where/how should we do this (without breaking API)?

EwoutH and others added 2 commits October 28, 2025 19:03
When using batch_run() with a single seed value and multiple iterations, all iterations were using the same seed, producing identical results instead of independent replications. This defeats the purpose of running multiple iterations.

This commit modifies _model_run_func to automatically increment the seed for each iteration (seed, seed+1, seed+2, ...) when a numeric seed is provided. This ensures:

- Each iteration produces different random outcomes
- Results remain reproducible (same base seed → same sequence)
- Backward compatibility with seed arrays (no modification if seed is already an iterable passed via parameters)
- Unchanged behavior when no seed is specified (each iteration gets random seed from OS)

The fix only applies when:
1. A 'seed' parameter exists in kwargs
2. The seed value is not None
3. The iteration number is > 0
4. The seed is a single numeric value (int/float, not bool)
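The conditions listed above can be sketched as a small helper. This is an illustrative reconstruction of the logic the commit describes, not the actual `_model_run_func` code; `adjust_seed` is a hypothetical name:

```python
def adjust_seed(kwargs: dict, iteration: int) -> dict:
    """Sketch of the seed-increment fix: bump a single numeric seed by
    the iteration number so each replication differs but stays
    reproducible (seed, seed+1, seed+2, ...)."""
    seed = kwargs.get("seed")
    if (
        "seed" in kwargs               # 1. a 'seed' parameter exists
        and seed is not None           # 2. the seed value is not None
        and iteration > 0              # 3. iteration number is > 0
        and isinstance(seed, (int, float))
        and not isinstance(seed, bool) # 4. single numeric, not bool
    ):
        kwargs = kwargs.copy()         # leave the caller's dict untouched
        kwargs["seed"] = int(seed) + iteration
    return kwargs
```

Non-numeric seeds (e.g. a seed array passed via parameters) fall through unchanged, which is what preserves backward compatibility.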
quaquel commented Oct 28, 2025

> Considering this, do we want to move forward with this PR?

Shifting to numpy's rng requires changes in, e.g., CellCollection and AgentSet; it is independent of this PR.

> Where/how should we do this (without breaking API)?

The return type is List[Dict[str, Any]], so in principle you could just insert a seed key into each dict.

Alternatively, you can keep stuff as is and just document the behavior.

Or, perhaps even better: raise a ValueError if seed and iterations don't match. So, if you do iterations=10 and seed=[5,] you raise a ValueError because the number of iterations and the number of seeds don't match.

In my view, we might even consider deprecating iterations in favor of only seed, where seed is either a single SeedLike or a list of SeedLike.
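The ValueError suggestion above could look roughly like this. A minimal sketch with a hypothetical helper name, not the shipped implementation:

```python
def validate_rng(rng, iterations):
    """Sketch of the proposed check: if both a list of seeds and an
    iteration count are supplied, their lengths must agree."""
    seeds = list(rng) if isinstance(rng, (list, tuple)) else [rng]
    if iterations is not None and len(seeds) != iterations:
        raise ValueError(
            f"number of seeds ({len(seeds)}) does not match "
            f"iterations ({iterations})"
        )
    return seeds
```

So `iterations=10` with `rng=[5]` would raise, while either argument alone stays valid.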

@quaquel quaquel mentioned this pull request Nov 10, 2025
quaquel commented Nov 12, 2025

@EwoutH, I have updated this PR as discussed yesterday. I added a new keyword argument rng and deprecated iterations. I have also updated the tests accordingly.

While reviewing the code, I noticed the current use of run_id and iteration in the return value from batch_run. run_id is just a running count across all runs, while iteration gives the iteration-id of a particular experiment. I am inclined to modify this (probably in a separate PR). In my mind, it makes sense to have 3 pieces of information: an identifier for the experiment (i.e., the exact parameter settings), an identifier for the iteration/replication (this might also just be the seed value instead of an integer starting from 0), and possibly an identifier for which run in total it is. Currently, batch_run only gives you the second and third, so grouping by experiment can be a bit tricky, while this is critical when calculating, e.g., the average across replications.

@EwoutH EwoutH added the deprecation label Nov 12, 2025
@EwoutH EwoutH left a comment

Thanks!

Since this is now a new, breaking feature, we should do it the proper way.

Could you add a section in the migration guide and link to it in the DeprecationWarning? See #2872 for a recent example.

We also have to be careful about the order of our arguments. Some people use positional arguments (stupid, I know), and the positional order changes once we remove iterations.

Also we should properly explain this in the tutorial (can be a separate PR).

quaquel commented Nov 12, 2025

  1. It's technically not breaking because iterations will continue to work. It just issues a deprecation warning.
  2. Yes, I'll take a look at adding it to the migration guide.
  3. If you use positional arguments instead of keyword arguments, you deserve what you get :). Also, iterations is still there and still functions, just with a warning. So there is no problem.
  4. Yes, I was planning to update the docs.

EwoutH commented Nov 16, 2025

Good stuff.

I was thinking (not for this PR), wouldn't it be useful if the results weren't just a dataframe, but an object?

quaquel commented Nov 16, 2025

> I was thinking (not for this PR), wouldn't it be useful if the results weren't just a dataframe, but an object?

I am not sure. With the workbench, I have never seen the need for anything other than a dataframe with the experiments and a dictionary with the outcome names as keys and a numpy array as values. It might be different, however, if you want to store agent-level data over time. Still, for those use cases, you don't want to keep everything in memory.

quaquel commented Nov 17, 2025

@EwoutH, this is ready for review. I added the docs, migration guide, and fixed the tests.

@EwoutH EwoutH left a comment


Looks good and complete, thanks a lot. I have a few minor comments/suggestions.

quaquel and others added 4 commits November 17, 2025 12:37
EwoutH commented Nov 17, 2025

Thanks, I think I can resolve the last open comments and merge.

Can you make sure the PR description represents the final state of this PR?

quaquel commented Nov 18, 2025

updated the original starting post

@quaquel quaquel added the trigger-benchmarks label Nov 18, 2025
@github-actions
Performance benchmarks:

| Model | Size | Init time [95% CI] | Run time [95% CI] |
|---|---|---|---|
| BoltzmannWealth | small | 🔵 +2.0% [+1.2%, +2.7%] | 🔵 +0.2% [+0.1%, +0.3%] |
| BoltzmannWealth | large | 🔵 -0.3% [-1.1%, +0.3%] | 🔵 -0.1% [-0.9%, +0.8%] |
| Schelling | small | 🔵 +0.5% [-0.3%, +1.4%] | 🔵 +0.5% [+0.2%, +0.8%] |
| Schelling | large | 🔵 -2.1% [-2.5%, -1.8%] | 🔵 +1.1% [+0.7%, +1.4%] |
| WolfSheep | small | 🔵 -0.2% [-0.4%, -0.0%] | 🔵 -0.8% [-0.9%, -0.6%] |
| WolfSheep | large | 🔵 +0.7% [+0.1%, +1.3%] | 🔵 -0.5% [-1.2%, +0.2%] |
| BoidFlockers | small | 🔵 +2.7% [+2.1%, +3.4%] | 🔵 +1.1% [+0.8%, +1.3%] |
| BoidFlockers | large | 🔵 +2.1% [+1.4%, +2.7%] | 🔵 +0.4% [+0.1%, +0.6%] |

@quaquel quaquel removed the trigger-benchmarks label Nov 26, 2025
@quaquel quaquel left a comment


I think this is ready to go. @EwoutH, I'll leave the merge up to you.

EwoutH commented Nov 26, 2025

Thanks, I will give it a final sweep tonight.

EwoutH commented Nov 26, 2025

I fixed a small layout thing in the migration guide and extended the PR description a bit.

@quaquel one last thing, I noticed the parameter inspection logic:

```python
model_parameters = inspect.signature(Model).parameters
rng_kwarg_name = "rng"
if "seed" in model_parameters:
    rng_kwarg_name = "seed"
```

This inspects the base Model class, which has both seed and rng in its signature:

```python
def __init__(
    self,
    *args: Any,
    seed: float | None = None,
    rng: RNGLike | SeedLike | None = None,
    **kwargs: Any,
) -> None:
```

Since seed is always present in Model, the condition is always True, so rng_kwarg_name will always be "seed". The code effectively behaves the same as just hardcoding rng_kwarg_name = "seed".

Was the intent to inspect model_cls (the user's model) instead of Model (the base class) to adapt to different model signatures? Or is always using "seed" the desired behavior and this could be simplified?
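For comparison, inspecting the user's class instead would behave like the sketch below. The `Model`/`MyModel` classes here are hypothetical stand-ins mirroring the signatures quoted above, and `pick_rng_kwarg` is an illustrative name, not mesa code:

```python
import inspect


class Model:
    """Stand-in for mesa's base Model signature quoted above."""
    def __init__(self, *args, seed=None, rng=None, **kwargs): ...


class MyModel(Model):
    """Hypothetical user model that only exposes rng."""
    def __init__(self, rng=None):
        super().__init__(rng=rng)


def pick_rng_kwarg(model_cls):
    # Inspect the *user's* class rather than the base Model, so a model
    # whose __init__ only accepts rng actually gets rng.
    params = inspect.signature(model_cls).parameters
    return "seed" if "seed" in params else "rng"
```

Under this reading, `pick_rng_kwarg(MyModel)` would select `"rng"` while the current code always selects `"seed"`.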
