
[Question] How to handle MultiDiscrete action spaces in TorchRL #3197

@AnastasiaPsarou

I have created a custom PettingZoo Parallel API environment with MultiDiscrete action spaces, and the env.action_spec() call succeeds.

I am following TorchRL's Multi-Agent PPO tutorial, but I'm struggling to understand how to modify the architecture so that it supports MultiDiscrete action spaces. Specifically, I'd like to know how to correctly adapt the MultiAgentMLP, TensorDictModule, and ProbabilisticActor so that the policy network outputs a MultiDiscrete (or, equivalently, MultiCategorical) action distribution for each agent. A rough sketch of what I have so far is below.
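For concreteness, here is a minimal sketch of how I currently imagine wiring the policy network. The agent count, observation size, nvec values, and the ("agents", "observation") / ("agents", "logits") key names are assumptions I made up for illustration:

```python
import torch
from tensordict.nn import TensorDictModule
from torchrl.modules import MultiAgentMLP

# Hypothetical example values: 3 agents, 8-dim observations, and a
# MultiDiscrete action space with nvec = [3, 4, 2] for every agent.
n_agents = 3
obs_dim = 8
nvec = [3, 4, 2]

# One logit per category of every sub-action, i.e. sum(nvec) outputs
# per agent; these get split into per-dimension slices later on.
net = MultiAgentMLP(
    n_agent_inputs=obs_dim,
    n_agent_outputs=sum(nvec),
    n_agents=n_agents,
    centralised=False,
    share_params=True,
    depth=2,
    num_cells=64,
    activation_class=torch.nn.Tanh,
)

policy_module = TensorDictModule(
    net,
    in_keys=[("agents", "observation")],
    out_keys=[("agents", "logits")],
)
```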

Should I create as many ProbabilisticActor modules as there are dimensions in the MultiDiscrete action space? If a single ProbabilisticActor is used instead, which distribution class should replace Categorical to support a MultiDiscrete action space? Is there an existing script or tutorial in TorchRL that demonstrates how to handle MultiDiscrete action spaces (or MultiCategorical distributions) in a multi-agent setup?
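In case it helps clarify the question, this is the kind of single-actor solution I was imagining: a hand-rolled MultiCategorical distribution made of independent torch.distributions.Categorical heads, passed to ProbabilisticActor through distribution_class. The nvec value and the ("agents", "logits") / ("agents", "action") keys are the same assumptions as in the sketch above, and I have no idea whether this is the intended way to do it:

```python
import torch
from torch.distributions import Categorical, Distribution
from torchrl.modules import ProbabilisticActor


class MultiCategorical(Distribution):
    """Independent Categorical over each dimension of a MultiDiscrete space.

    ``logits`` has shape [..., sum(nvec)] and is split into one slice of
    size nvec[i] per sub-action.
    """

    def __init__(self, logits, nvec):
        self.nvec = list(nvec)
        self.dists = [
            Categorical(logits=chunk)
            for chunk in torch.split(logits, self.nvec, dim=-1)
        ]
        super().__init__(
            batch_shape=logits.shape[:-1],
            event_shape=torch.Size([len(self.nvec)]),
            validate_args=False,
        )

    def sample(self, sample_shape=torch.Size()):
        # One integer per sub-action, stacked on the last dimension.
        return torch.stack([d.sample(sample_shape) for d in self.dists], dim=-1)

    def log_prob(self, actions):
        # Joint log-probability is the sum over the independent dimensions.
        return sum(d.log_prob(actions[..., i]) for i, d in enumerate(self.dists))

    def entropy(self):
        return sum(d.entropy() for d in self.dists)

    @property
    def mode(self):
        # Greedy per-dimension action for deterministic evaluation.
        return torch.stack([d.probs.argmax(dim=-1) for d in self.dists], dim=-1)


policy = ProbabilisticActor(
    module=policy_module,  # the TensorDictModule from the sketch above
    in_keys=[("agents", "logits")],
    out_keys=[("agents", "action")],
    distribution_class=MultiCategorical,
    distribution_kwargs={"nvec": [3, 4, 2]},  # same hypothetical nvec
    return_log_prob=True,
)
```

If TorchRL already provides a distribution or a recommended pattern for this, a pointer would be much appreciated.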
