@sujalgawas
Contributor

@sujalgawas sujalgawas commented Nov 27, 2025

Summary

Introduces the Sugarscape G1MT (Growback 1, Metabolism, Trade) simulation to the examples directory. This is a port of the classic Epstein & Axtell artificial society model, adapted to use Large Language Models (LLMs) via mesa-llm. The simulation demonstrates emergent economic behavior and survival instincts using gemini-2.5-flash.

Motive

The goal was to replicate a classic Agent-Based Model (ABM) using Generative AI to test if LLM agents can handle biological survival constraints (hunger/metabolism) and economic decision-making (trading based on Marginal Rate of Substitution) without hard-coded rule sets for behavior.

This implementation also serves as a benchmark for running mesa-llm with Google's Gemini Free Tier, addressing specific challenges regarding API rate limits and concurrency.

Implementation

I attempted to mirror the structure of the epstein_civil_violence example while implementing the specific logic of Sugarscape.

Agents (agents.py):

    Trader: An LLMAgent equipped with STLTMemory. It calculates its own Marginal Rate of Substitution (MRS) based on inventory levels and updates its internal_state string to "feel" hunger or resource abundance.

    Resource: A standard CellAgent (non-LLM) that regrows sugar/spice capacity by 1 unit per step.
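
For readers unfamiliar with MRS, here is a minimal sketch of the calculation, assuming the classic Epstein & Axtell form (spice held per unit of spice metabolism, divided by sugar held per unit of sugar metabolism). The function name, parameter names, and the zero-sugar guard are illustrative assumptions and are not taken from this PR's `calculate_mrs`:

```python
import math


def marginal_rate_of_substitution(
    sugar: float,
    spice: float,
    sugar_metabolism: float,
    spice_metabolism: float,
) -> float:
    """Return (spice / spice_metabolism) / (sugar / sugar_metabolism).

    A value above 1 means the agent is relatively short of sugar,
    a value below 1 means it is relatively short of spice.
    """
    if sugar_metabolism <= 0 or spice_metabolism <= 0:
        raise ValueError("metabolism rates must be positive")
    if sugar <= 0:
        # Assumed guard: with no sugar left, sugar is infinitely valuable.
        return math.inf
    return (spice / spice_metabolism) / (sugar / sugar_metabolism)
```

Under this definition, two agents with different MRS values can both gain from a trade in which the higher-MRS agent receives sugar and gives up spice.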

Tools (tools.py):

    move_to_best_resource: Scans the agent's vision radius for the highest resource concentration. Crucially, this tool includes logic to automatically harvest resources upon movement to ensure agent survival.

    propose_trade: Allows agents to exchange Sugar for Spice (and vice-versa) if their calculated MRS indicates a mutually beneficial trade.

Reasoning: Uses ReActReasoning to allow agents to observe their inventory, determine a plan (e.g., "I am starving, I need sugar"), and select the appropriate tool.

Optimization for Free Tier:

    Configured for low agent count (initial_traders=2) to respect Token Per Minute (TPM) limits.

    Sequential stepping (threading disabled in config) to avoid 429 Too Many Requests errors common with the Gemini Flash free tier.

Usage Examples

To run the visualization with Solara:
Bash

cd examples/sugarscrap_g1mt
solara run app.py

To run the headless simulation (CLI):

python -m examples.sugarscrap_g1mt.model

Configuration (Gemini Free Tier): In app.py, the defaults are set to ensure stability:

model_params = {
    "initial_traders": 2,  # Low count to prevent rate limiting
    "parallel_stepping": False,  # Sequential processing for safety
    "llm_model": "gemini/gemini-2.5-flash-lite",
}

Screenshot 2025-11-27 105147

Additional Notes

Survival Stability: Early iterations resulted in mass starvation. The current logic ensures agents successfully harvest food. A run of 50+ steps confirmed that agents can reach a state of self-sufficiency, with wealth (Sugar/Spice) increasing over time.

Models Tested: Verified working on gemini-2.5-flash and gemini-2.5-flash-lite.

Latency: Due to sequential processing required for the free tier, simulation steps may take ~500ms to several seconds depending on the chain-of-thought length.

Summary by CodeRabbit

  • New Features
    • Added a SugarScape simulation example with interactive visualization interface.
    • Introduced trader and resource agents with autonomous decision-making and trading capabilities.
    • Traders can move across the grid, harvest resources, and propose trades with other agents based on economic principles.
    • Real-time tracking of inventory levels and resource metrics through an interactive dashboard.

@coderabbitai

coderabbitai bot commented Nov 27, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This pull request introduces a complete SugarScape simulation system featuring LLM-based trading agents. It includes agent definitions (Trader and Resource), a Mesa-based model framework, visualization via Solara, and behavior tools for resource gathering and inter-agent trading.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Core Package & Initialization | examples/sugarscrap_g1mt/__init__.py | Adds top-level import to trigger tool registration via examples.sugarscrap_g1mt.tools module. |
| Agent Definitions | examples/sugarscrap_g1mt/agents.py | Introduces TraderState enum, Trader class (LLMAgent-based with STLTMemory, MRS calculation, metabolism, and step lifecycle), and Resource class (Mesa CellAgent with replenishment). Establishes global trader_tool_manager and resource_tool_manager for tool integration. |
| Agent Behavior Tools | examples/sugarscrap_g1mt/tools.py | Adds move_to_best_resource() (locates and harvests nearby resources) and propose_trade() (compares MRS and executes mutually beneficial trades between agents). Both registered with trader tool manager. |
| Simulation Model | examples/sugarscrap_g1mt/model.py | Introduces SugarScapeModel (Mesa Model subclass) that initializes MultiGrid, spawns Resource and Trader agents, registers reporters for metrics tracking (sugar, spice, MRS, counts), implements step() with agent shuffling and DataCollector integration, and provides a non-graphics run demonstration. |
| Visualization Interface | examples/sugarscrap_g1mt/app.py | Defines Mesa Solara visualization with model parameter configuration (seed, trader/resource counts, grid size, reasoning backend, LLM model, vision range, parallel stepping), custom portrayal functions for Trader/Resource rendering, chart tracking of Total_Sugar and Total_Spice, and SolaraViz server initialization. |

Sequence Diagram

sequenceDiagram
    participant Model
    participant Trader
    participant Resource
    participant Tools

    Model->>Trader: initialize<br/>(sugar, spice, metabolism)
    Model->>Resource: initialize<br/>(capacity, regrowth)
    
    loop Each Simulation Step
        Model->>Model: shuffle agents
        
        alt Trader Step
            Trader->>Trader: consume metabolism
            Trader->>Trader: update_internal_metrics
            Trader->>Trader: generate observation
            Trader->>Trader: aplan (LLM reasoning)
            
            alt move_to_best_resource
                Trader->>Tools: find best resource cell
                Tools->>Resource: query current_amount
                Trader->>Resource: harvest (drain to 0)
                Trader->>Trader: increase sugar/spice
            end
            
            alt propose_trade
                Trader->>Trader: query another trader
                Tools->>Trader: compare MRS values
                alt MRS favorable
                    Trader->>Trader: exchange sugar/spice
                    Tools->>Trader: confirm trade
                else
                    Tools->>Trader: reject trade
                end
            end
        end
        
        alt Resource Step
            Resource->>Resource: replenish current_amount
        end
        
        Model->>Model: collect data<br/>(sugar, spice, MRS, counts)
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Key focus areas:
    • agents.py: Verify MRS calculation logic and safeguards for division-by-zero; validate LLMAgent memory integration and step/astep symmetry
    • tools.py: Ensure move_to_best_resource correctly filters vision range and propose_trade properly validates agent types and trade conditions before state mutations
    • model.py: Check DataCollector reporter definitions align with agent metrics; validate agent placement and initialization order
    • app.py: Confirm portrayal functions handle edge cases (missing attributes, empty grids) and parameter passing to model constructor
    • Integration: Verify tool manager registration in __init__.py properly wires tools to agents before simulation starts

Poem

🐰 A trader hops through sugared fields, seeking spice so fine,
LLM minds make deals and moves, when MRS does align,
Resources regrow with care, on grids where agents roam,
From sugar trades to charts that glow—a simulation home!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
| Title check | ❓ Inconclusive | The title 'Sugarscrap g1mt' is vague and uses an acronym/code name without clearly describing the main change or feature being added to the repository. | Consider a more descriptive title that explains the main contribution, such as 'Add Sugarscape G1MT example with LLM agents' or 'Implement Sugarscape simulation with LLM-based trading agents'. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@codecov

codecov bot commented Nov 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.27%. Comparing base (89801c8) to head (366323a).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #35   +/-   ##
=======================================
  Coverage   86.27%   86.27%           
=======================================
  Files          19       19           
  Lines        1311     1311           
=======================================
  Hits         1131     1131           
  Misses        180      180           

@colinfrisch
Collaborator

@coderabbitai review

@coderabbitai

coderabbitai bot commented Nov 27, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@colinfrisch colinfrisch added the documentation Improvements or additions to documentation label Nov 27, 2025
@colinfrisch
Collaborator

colinfrisch commented Nov 27, 2025

@sujalgawas it looks good, great job ! I'm going to take a bit more time to review this, as there is quite a bit of code here. In the meantime, could you add a README.md file with some explanation and a few insights (a bit like you did in this PR). Don't hesitate to add a few meaningful screenshots from the steps displayed in the terminal if you think it's relevant as well as some comparison to the original g1mt simulation from mesa ;)

Also the coderabbit review should arrive in a few minutes, it usually gives some good insights.

Thanks again for the quality of your contribution :)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
examples/sugarscrap_g1mt/app.py (1)

32-46: Default parallel_stepping=True conflicts with stated goal of sequential stepping

You set "parallel_stepping": True here, but the PR description emphasizes sequential stepping on Gemini Free Tier to avoid 429s. For consistency and safer defaults, it’s worth flipping this to False and letting users opt in to parallel stepping via the UI.

examples/sugarscrap_g1mt/tools.py (1)

10-15: Duplicate TYPE_CHECKING import block

You import LLMAgent under TYPE_CHECKING twice back‑to‑back. One of these blocks can be removed with no behavior change.

examples/sugarscrap_g1mt/agents.py (1)

13-16: TraderState enum appears unused

TraderState is defined but not referenced elsewhere in the provided context. If you don’t plan to use it for reporters or UI, consider removing it (or documenting intended usage) to keep the example minimal.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c35af05 and f1e7469.

📒 Files selected for processing (5)
  • examples/sugarscrap_g1mt/__init__.py (1 hunks)
  • examples/sugarscrap_g1mt/agents.py (1 hunks)
  • examples/sugarscrap_g1mt/app.py (1 hunks)
  • examples/sugarscrap_g1mt/model.py (1 hunks)
  • examples/sugarscrap_g1mt/tools.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
examples/sugarscrap_g1mt/model.py (3)
examples/sugarscrap_g1mt/agents.py (6)
  • Resource (143-162)
  • Trader (19-140)
  • calculate_mrs (66-78)
  • update_internal_metrics (80-102)
  • step (104-122)
  • step (155-159)
mesa_llm/reasoning/reasoning.py (1)
  • Reasoning (53-115)
mesa_llm/recording/record_model.py (1)
  • record_model (45-112)
examples/sugarscrap_g1mt/app.py (3)
examples/sugarscrap_g1mt/agents.py (2)
  • Resource (143-162)
  • Trader (19-140)
examples/sugarscrap_g1mt/model.py (1)
  • SugarScapeModel (14-106)
mesa_llm/reasoning/react.py (1)
  • ReActReasoning (17-137)
examples/sugarscrap_g1mt/tools.py (3)
examples/sugarscrap_g1mt/agents.py (3)
  • Resource (143-162)
  • Trader (19-140)
  • calculate_mrs (66-78)
mesa_llm/tools/tool_decorator.py (1)
  • tool (310-392)
mesa_llm/llm_agent.py (1)
  • LLMAgent (23-241)
🔇 Additional comments (3)
examples/sugarscrap_g1mt/__init__.py (1)

1-1: Side‑effect import for tool registration looks good

Importing examples.sugarscrap_g1mt.tools at package import time is a reasonable way to ensure tool registration happens before any agents/models use the tool managers.
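
The side-effect registration pattern described above can be sketched in isolation. This is a generic illustration using a hypothetical `ToolRegistry` class and `register` decorator; it is not mesa-llm's tool manager or its `tool` decorator, whose signatures are not shown in this conversation:

```python
from typing import Callable, Dict


class ToolRegistry:
    """Hypothetical stand-in for a tool manager (not mesa-llm's class)."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable] = {}

    def register(self, func: Callable) -> Callable:
        # Registration happens when the decorated function is defined,
        # i.e. as soon as the module containing it is imported.
        self.tools[func.__name__] = func
        return func


trader_tools = ToolRegistry()


@trader_tools.register
def move_to_best_resource() -> str:
    """Placeholder tool body used only for this illustration."""
    return "moved"
```

Because the decorator runs at definition time, importing the tools module from the package `__init__.py` is enough to populate the registry before any agent looks up a tool.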

examples/sugarscrap_g1mt/model.py (1)

119-127: The repository is currently inaccessible via shell clone, and web search results don't provide specific verification for this version of mesa-llm. However, I can analyze the review comment's internal consistency:

Internal analysis of the review comment:

The review comment makes these specific claims:

  1. SugarScapeModel.__init__ takes reasoning: type[Reasoning] (a class, not instance)
  2. The parameter is passed to Trader.create_agents
  3. In LLMAgent, it's called as reasoning(agent=self) (instantiating the class)
  4. The code passes reasoning=Reasoning() (an instance of abstract base)
  5. This will fail because the instance is not callable

Plausibility assessment:

The logic is internally consistent: if the signature expects type[Reasoning] and the code tries to call it later like reasoning(agent=self), then passing an instance would indeed fail with a TypeError. The review comment provides:

  • Clear type mismatch explanation
  • Specific runtime failure mode (instance not callable)
  • A concrete fix (pass ReActReasoning class instead)
  • Reference to app.py as a working example

Limitation:

Without repository access, I cannot verify:

  • The actual signature of SugarScapeModel.__init__
  • Whether Reasoning is actually abstract
  • Whether ReActReasoning exists and is correct
  • How app.py actually uses it

Given that the repository is inaccessible and the review comment's claims are logical and self-consistent but unverified, I must flag this for manual verification.


Headless run passes a Reasoning instance where a class is expected

SugarScapeModel.__init__ takes reasoning: type[Reasoning] and passes it through to Trader.create_agents, where LLMAgent expects to call it like reasoning(agent=self). In the __main__ example you instead pass an instance:

reasoning=Reasoning(),  # instance of the abstract base

This will fail at runtime because the instance is not callable (and Reasoning is intended as a base, not a concrete planner).

Consider updating the demo to pass a concrete reasoning class, consistent with app.py, e.g.:

@@ if __name__ == "__main__":
-    from mesa_llm.reasoning.reasoning import Reasoning
+    from mesa_llm.reasoning.react import ReActReasoning

     model = SugarScapeModel(
         initial_traders=5,
         initial_resources=20,
         width=10,
         height=10,
-        reasoning=Reasoning(),
+        reasoning=ReActReasoning,
         llm_model="openai/gpt-4o-mini",
         vision=2,
     )
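
The failure mode described above can be reproduced with plain Python. This is a minimal sketch using hypothetical stand-in classes (`Reasoning`, `ReActLike`, `Agent`) that only mimic the "agent instantiates the reasoning class" pattern; they are not mesa-llm's real classes:

```python
from typing import Optional


class Reasoning:
    """Hypothetical stand-in for a reasoning base class."""

    def __init__(self, agent=None):
        self.agent = agent


class ReActLike(Reasoning):
    """Hypothetical concrete planner."""


class Agent:
    def __init__(self, reasoning):
        # The agent builds its own reasoning object, so it needs a class.
        self.reasoning = reasoning(agent=self)


def build_error(reasoning) -> Optional[str]:
    """Return the TypeError message raised while building an Agent, if any."""
    try:
        Agent(reasoning)
    except TypeError as exc:
        return str(exc)
    return None


class_result = build_error(ReActLike)       # a class: construction succeeds
instance_result = build_error(Reasoning())  # an instance: it is not callable
```

Passing the class works because `reasoning(agent=self)` constructs a new object, while passing an instance fails because instances of these classes define no `__call__`.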
examples/sugarscrap_g1mt/agents.py (1)

56-64: ReActReasoning class does not exist in mesa-llm; verify the example location and verify whether the referenced code actually uses the pattern described.

After extensive verification attempts, I found that:

  1. ReActReasoning class not found: Web searches of the projectmesa/mesa-llm repository confirm that no class named ReActReasoning exists in the codebase.

  2. Example location unclear: The file path examples/sugarscrap_g1mt/agents.py referenced in the review cannot be located. Sugarscape examples appear to be in the mesa-examples or main mesa repositories, not mesa-llm.

  3. Core concern unverifiable: Since ReActReasoning doesn't exist in mesa-llm, the claim that "ReActReasoning's plan() replaces self.agent.llm.system_prompt" cannot be verified and may be based on incorrect assumptions about the framework's architecture.

The review comment references code patterns and classes that do not appear to exist in the referenced repository. Before addressing the suggested refactoring, verify:

  • The correct repository containing the example code
  • The actual reasoning class or pattern being used in the agents
  • Whether the system_prompt duplication concern still applies given the actual implementation

Likely an incorrect or invalid review comment.

Comment on lines +48 to +56
        harvested = 0
        cell_contents = agent.model.grid.get_cell_list_contents(best_cell)
        for obj in cell_contents:
            if isinstance(obj, Resource):
                harvested = obj.current_amount
                obj.current_amount = 0  # Harvest all available resource

        agent.sugar += harvested
        agent.spice += harvested

⚠️ Potential issue | 🟠 Major

Harvesting from multiple Resource agents in a cell loses resources

In move_to_best_resource, if more than one Resource happens to occupy best_cell (possible given random placement), you:

harvested = obj.current_amount
obj.current_amount = 0

inside a loop, so harvested is overwritten instead of accumulated. Earlier resources are zeroed but their value never reaches the trader.

Consider accumulating the total:

         harvested = 0
         cell_contents = agent.model.grid.get_cell_list_contents(best_cell)
         for obj in cell_contents:
             if isinstance(obj, Resource):
-                harvested = obj.current_amount
-                obj.current_amount = 0  # Harvest all available resource
+                harvested += obj.current_amount
+                obj.current_amount = 0  # Harvest all available resource
🤖 Prompt for AI Agents
In examples/sugarscrap_g1mt/tools.py around lines 48 to 56, the loop over
cell_contents assigns harvested = obj.current_amount for each Resource and thus
overwrites previous values, losing resources when multiple Resource agents
occupy the same cell; change the logic to accumulate a running total (e.g.,
initialize harvested = 0 once, then inside the loop do harvested +=
obj.current_amount and set obj.current_amount = 0) and then add that total to
agent.sugar and agent.spice so all resources harvested from the cell are
correctly transferred to the agent.
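
The accumulation fix can be checked in isolation. This is a minimal sketch with a stand-in `Resource` dataclass and a hypothetical `harvest_cell` helper; neither is the PR's actual code, and the cell contents are passed in as a plain list rather than read from a Mesa grid:

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Resource:
    """Stand-in for the example's Resource agent (illustration only)."""

    current_amount: int


def harvest_cell(cell_contents: Iterable[object]) -> int:
    """Drain every Resource in the cell and return the combined amount."""
    harvested = 0
    for obj in cell_contents:
        if isinstance(obj, Resource):
            harvested += obj.current_amount  # accumulate, do not overwrite
            obj.current_amount = 0
    return harvested


cell = [Resource(3), "a trader placeholder", Resource(4)]
total = harvest_cell(cell)
```

With plain assignment instead of `+=`, the same cell would yield only the last resource's amount while still zeroing both resources.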

Comment on lines +78 to +94
    other_agent = next(
        (a for a in agent.model.agents if a.unique_id == other_agent_id), None
    )

    if other_agent is None:
        return f"Agent {other_agent} not found."

    if not isinstance(other_agent, Trader):
        return f"agent {other_agent_id} is not a valid trader."

    # Simple trade acceptance logic for demonstration
    if other_agent.calculate_mrs() > agent.calculate_mrs():
        agent.sugar -= sugar_amount
        agent.spice += spice_amount
        other_agent.sugar += sugar_amount
        other_agent.spice -= spice_amount
        return f"agent {agent.unique_id} traded {sugar_amount} sugar for {spice_amount} spice with agent {other_agent_id}."

⚠️ Potential issue | 🟠 Major

Trade helper has incorrect error message and no balance checks

Two issues in propose_trade:

  1. When the target agent isn’t found, the message uses other_agent instead of other_agent_id, so it always prints None:
    if other_agent is None:
        return f"Agent {other_agent} not found."
  2. There are no checks that agent and other_agent actually have enough sugar/spice for the proposed amounts; this can drive inventories negative and immediately kill traders via the metabolism checks.

A minimal fix could be:

     other_agent = next(
         (a for a in agent.model.agents if a.unique_id == other_agent_id), None
     )

     if other_agent is None:
-        return f"Agent {other_agent} not found."
+        return f"agent {other_agent_id} not found."

     if not isinstance(other_agent, Trader):
         return f"agent {other_agent_id} is not a valid trader."

     # Simple trade acceptance logic for demonstration
-    if other_agent.calculate_mrs() > agent.calculate_mrs():
-        agent.sugar -= sugar_amount
-        agent.spice += spice_amount
-        other_agent.sugar += sugar_amount
-        other_agent.spice -= spice_amount
+    if sugar_amount <= 0 or spice_amount <= 0:
+        return "sugar_amount and spice_amount must be positive."
+
+    if agent.sugar < sugar_amount or other_agent.spice < spice_amount:
+        return (
+            f"agent {agent.unique_id} or agent {other_agent_id} "
+            "does not have enough resources for this trade."
+        )
+
+    if other_agent.calculate_mrs() > agent.calculate_mrs():
+        agent.sugar -= sugar_amount
+        agent.spice += spice_amount
+        other_agent.sugar += sugar_amount
+        other_agent.spice -= spice_amount
         return f"agent {agent.unique_id} traded {sugar_amount} sugar for {spice_amount} spice with agent {other_agent_id}."
🤖 Prompt for AI Agents
In examples/sugarscrap_g1mt/tools.py around lines 78 to 94, the propose_trade
helper returns the wrong value when the target agent isn't found and lacks
balance validation: change the not-found message to use other_agent_id, and
before mutating inventories validate that sugar_amount and spice_amount are
positive and that both agent and other_agent have at least the required
sugar and spice respectively; if any check fails, return a clear error string
(e.g., insufficient sugar/spice or invalid amounts) and only perform the
sugar/spice transfers when all checks pass.
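
The validate-then-mutate ordering suggested above can be sketched without Mesa. The `Inventory` dataclass and the `validate_trade` and `execute_trade` helpers below are hypothetical names introduced for illustration; the MRS comparison from the PR is deliberately left out so the example covers only the balance checks:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inventory:
    """Stand-in for a trader's holdings (illustration only)."""

    sugar: int
    spice: int


def validate_trade(
    proposer: Inventory, counterparty: Inventory, sugar_amount: int, spice_amount: int
) -> Optional[str]:
    """Return an error message, or None when the trade is feasible."""
    if sugar_amount <= 0 or spice_amount <= 0:
        return "sugar_amount and spice_amount must be positive."
    if proposer.sugar < sugar_amount:
        return "proposer does not have enough sugar."
    if counterparty.spice < spice_amount:
        return "counterparty does not have enough spice."
    return None


def execute_trade(
    proposer: Inventory, counterparty: Inventory, sugar_amount: int, spice_amount: int
) -> str:
    """Swap sugar for spice only after every check has passed."""
    error = validate_trade(proposer, counterparty, sugar_amount, spice_amount)
    if error is not None:
        return error  # nothing has been mutated at this point
    proposer.sugar -= sugar_amount
    proposer.spice += spice_amount
    counterparty.sugar += sugar_amount
    counterparty.spice -= spice_amount
    return "trade executed."
```

Keeping all checks ahead of the first mutation means a rejected trade leaves both inventories untouched, so no trader can be pushed below zero by a malformed proposal.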
