Motivation
Some multi-agent environments, like VmasEnv, stack the tensors for observations, rewards, etc. of the different agents when those agents have identical specs. For instance, in one of these stacked environments, if there are 2 agents that each have 8 observations, the observation spec might look like this:
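A minimal sketch of that stacked layout, assuming a recent torchrl where the spec classes are exposed as `Composite` and `Unbounded` (older releases call them `CompositeSpec` and `UnboundedContinuousTensorSpec`); the `"agents"` group key mirrors what VmasEnv uses, but the exact nesting may differ per environment:

```python
import torch
from torchrl.data import Composite, Unbounded

# Both agents live under a single "agents" group with a leading agent
# dimension of size 2 and 8 observations each.
stacked_obs_spec = Composite(
    agents=Composite(
        observation=Unbounded(shape=torch.Size([2, 8]), dtype=torch.float32),
        shape=torch.Size([2]),
    ),
)
print(stacked_obs_spec)
```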
In contrast, other environments, like UnityMLAgentsEnv, have separate keys for each agent, even if the agents' specs are identical. For instance, with 2 agents that each have 8 observations, the observation spec might look like this:
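A sketch of the per-agent layout, with hypothetical agent key names (`"agent_0"`, `"agent_1"`); UnityMLAgentsEnv derives its keys from the Unity behavior/agent ids, so the real names will differ:

```python
import torch
from torchrl.data import Composite, Unbounded

# Each agent gets its own top-level key, even though the specs are identical.
unstacked_obs_spec = Composite(
    agent_0=Composite(
        observation=Unbounded(shape=torch.Size([8]), dtype=torch.float32),
    ),
    agent_1=Composite(
        observation=Unbounded(shape=torch.Size([8]), dtype=torch.float32),
    ),
)
print(unstacked_obs_spec)
```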
It is not easy to apply the same training script to two environments that use these two different formats. For instance, applying the multi-agent PPO tutorial to a Unity env is not straightforward.
Solution
If we had an environment transform that could stack all the data from different keys, we could convert an environment that uses the unstacked format into an environment that uses the stacked format. Then it should be straightforward to use the same (or almost the same) training script on the two different environments.
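A hedged sketch of what such a transform could look like, built on torchrl's `Transform` base class and tensordict stacking. The key names, the stacking dimension (which assumes an unbatched env), and the class name are placeholders, and spec remapping and reset handling are omitted:

```python
import torch
from tensordict import TensorDictBase
from torchrl.envs.transforms import Transform


class StackAgentGroups(Transform):
    """Stack identical per-agent sub-tensordicts (e.g. "agent_0", "agent_1")
    into one batched entry (e.g. "agents"), emulating the stacked layout."""

    def __init__(self, agent_keys: list, out_key: str = "agents"):
        super().__init__(in_keys=agent_keys, out_keys=[out_key])
        self.agent_keys = agent_keys
        self.out_key = out_key

    def _call(self, tensordict: TensorDictBase) -> TensorDictBase:
        # Pop each agent's sub-tensordict and stack them along a new leading
        # "agent" dimension (valid for an unbatched env; a batched env would
        # need to stack after the batch dims instead).
        stacked = torch.stack(
            [tensordict.pop(key) for key in self.agent_keys], dim=0
        )
        tensordict.set(self.out_key, stacked)
        return tensordict

    # A complete transform would also implement transform_observation_spec,
    # transform_reward_spec, _reset, etc. so that the specs and the reset
    # path match the new stacked layout; omitted here for brevity.
```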
Alternatives
Additional context
Checklist
I have checked that there is no similar issue in the repo (required)
Have you taken a look at the group_map argument? When set to MarlGroupMapType.ALL_IN_ONE_GROUP the environment should return all agents in a single group (when possible, otherwise in more than one group).
If you are using this setting, then imo there's an issue in the implementation of the grouping of agents in UnityMLAgentsEnv.
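For reference, this is roughly how that setting would be requested. `MarlGroupMapType.ALL_IN_ONE_GROUP` is a real torchrl enum value, but the constructor arguments shown here (e.g. `file_name`) are illustrative, and UnityMLAgentsEnv may expose `group_map` differently:

```python
from torchrl.envs import UnityMLAgentsEnv
from torchrl.envs.utils import MarlGroupMapType

env = UnityMLAgentsEnv(
    file_name="path/to/unity_build",  # hypothetical Unity build path
    group_map=MarlGroupMapType.ALL_IN_ONE_GROUP,
)
# With a single group, all agents should appear under one stacked group key.
print(env.observation_spec)
```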