Description
I have created a custom PettingZoo Parallel API environment with MultiDiscrete action spaces; the `env.action_spec()` call succeeds.
I am following TorchRL's Multi-Agent PPO tutorial, but I'm struggling to work out how to modify the architecture to support MultiDiscrete action spaces. Specifically, I'd like to know how to correctly adapt the `MultiAgentMLP`, `TensorDictModule`, and `ProbabilisticActor` so that the policy network outputs a MultiDiscrete (equivalently, MultiCategorical) action distribution for each agent.
Should I create as many `ProbabilisticActor` modules as there are dimensions in the MultiDiscrete action space? If a single `ProbabilisticActor` module is used instead, which distribution class should replace `Categorical` to support a MultiDiscrete action space? And is there an existing script or tutorial in TorchRL that demonstrates how to handle MultiDiscrete action spaces (or MultiCategorical distributions) in a multi-agent setup?
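For concreteness, here is a minimal sketch of the direction I have been exploring: a single actor whose network outputs the concatenated logits of all sub-actions, consumed by a hand-rolled `MultiCategorical` distribution that treats the heads as independent. All dimensions (`nvec`, `n_agents`, `obs_dim`) are hypothetical placeholders, and `MultiCategorical` below is my own helper, not a TorchRL class:

```python
import torch
from tensordict.nn import TensorDictModule
from torch import nn
from torch.distributions import Categorical
from torchrl.modules import MultiAgentMLP, ProbabilisticActor


class MultiCategorical(torch.distributions.Distribution):
    """Joint distribution over several independent Categorical heads,
    one per MultiDiscrete sub-action (hand-rolled, not a TorchRL built-in)."""

    arg_constraints = {}

    def __init__(self, logits, nvec):
        self.nvec = nvec
        # Split the flat logit vector into one chunk per sub-action.
        self.dists = [
            Categorical(logits=chunk) for chunk in logits.split(list(nvec), dim=-1)
        ]
        super().__init__(batch_shape=logits.shape[:-1], validate_args=False)

    def sample(self, sample_shape=torch.Size()):
        return torch.stack([d.sample(sample_shape) for d in self.dists], dim=-1)

    def log_prob(self, value):
        # Independent heads: the joint log-prob is the sum over sub-actions.
        return sum(d.log_prob(value[..., i]) for i, d in enumerate(self.dists))

    def entropy(self):
        # Summed head entropies, used by PPO's entropy bonus.
        return sum(d.entropy() for d in self.dists)

    @property
    def mode(self):
        # Greedy action per head, for deterministic evaluation.
        return torch.stack([d.mode for d in self.dists], dim=-1)


# Hypothetical dimensions; replace with the values from your env's specs.
nvec = [3, 5, 2]   # a per-agent MultiDiscrete([3, 5, 2]) action space
n_agents = 4
obs_dim = 16

# One network per agent emitting the concatenated logits of all heads.
net = MultiAgentMLP(
    n_agent_inputs=obs_dim,
    n_agent_outputs=sum(nvec),  # 3 + 5 + 2 logits
    n_agents=n_agents,
    centralised=False,
    share_params=True,
    depth=2,
    num_cells=256,
    activation_class=nn.Tanh,
)
module = TensorDictModule(
    net, in_keys=[("agents", "observation")], out_keys=[("agents", "logits")]
)
policy = ProbabilisticActor(
    module=module,
    in_keys=[("agents", "logits")],
    out_keys=[("agents", "action")],
    distribution_class=MultiCategorical,
    distribution_kwargs={"nvec": nvec},
    return_log_prob=True,
)
```

As far as I can tell, splitting one logit vector per agent keeps the tutorial's single `("agents", "action")` / `"sample_log_prob"` layout intact, so the rest of the PPO pipeline shouldn't need changes. I'm also aware that tensordict ships a `CompositeDistribution` that can aggregate several `Categorical` heads under separate action keys, but I'm unsure whether that is the intended approach here.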