[Feature Request] Transform that stacks data for agents with identical specs #2566

Open
kurtamohler opened this issue Nov 14, 2024 · 1 comment · May be fixed by #2567
Labels
enhancement New feature or request

Comments

@kurtamohler (Collaborator)

Motivation

Some multi-agent environments, like VmasEnv, stack the observation, reward, and other tensors of all agents that have identical specs into single batched tensors. For instance, in one of these stacked environments, if there are 2 agents that each have 8 observations, the observation spec might look like this:

Composite(
    group_0: Composite(
        observation: UnboundedContinuous(
            shape=torch.Size([2, 8]),
            ...),
        shape=torch.Size([2])),
    shape=torch.Size([]))

In contrast, other environments, like UnityMLAgentsEnv, have separate keys for each agent, even if the agents' specs are identical. For instance, with 2 agents that each have 8 observations, the observation spec might look like this:

Composite(
    group_0: Composite(
        agent_0: Composite(
            observation: UnboundedContinuous(
                shape=torch.Size([8]),
                ...),
            shape=torch.Size([])),
        agent_1: Composite(
            observation: UnboundedContinuous(
                shape=torch.Size([8]),
                ...),
            shape=torch.Size([])),
        shape=torch.Size([])),
    shape=torch.Size([]))
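
To make the difference concrete, here is how observations would be indexed under each layout. A minimal sketch using tensordict; the tensordicts here are constructed by hand for illustration, rather than coming from an actual env rollout:

import torch
from tensordict import TensorDict

# Stacked layout (e.g. VmasEnv): one batched tensor per group.
td_stacked = TensorDict(
    {"group_0": {"observation": torch.randn(2, 8)}}, [])
td_stacked["group_0", "observation"].shape  # torch.Size([2, 8])

# Per-agent layout (e.g. UnityMLAgentsEnv): one tensor per agent key.
td_per_agent = TensorDict(
    {"group_0": {
        "agent_0": {"observation": torch.randn(8)},
        "agent_1": {"observation": torch.randn(8)}}}, [])
td_per_agent["group_0", "agent_0", "observation"].shape  # torch.Size([8])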

It is not easy to apply the same training script to two environments that use these two different formats. For instance, applying the multi-agent PPO tutorial to a Unity env is not straightforward.

Solution

If we had an environment transform that could stack all the data from different keys, we could convert an environment that uses the unstacked format into an environment that uses the stacked format. Then it should be straightforward to use the same (or almost the same) training script on the two different environments.
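
As a rough illustration, here is a minimal sketch of the core operation such a transform might perform. This is not a real torchrl Transform (stack_agent_keys and the key names are hypothetical); it only shows the data-stacking step, relying on tensordict's support for stacking sub-tensordicts with torch.stack:

import torch
from tensordict import TensorDict

def stack_agent_keys(td, in_keys, out_key):
    # Stack the per-agent sub-tensordicts into one lazily stacked
    # entry with a leading agent dimension, then drop the originals.
    stacked = torch.stack([td[key] for key in in_keys], dim=0)
    td = td.exclude(*in_keys)
    td.set(out_key, stacked)
    return td

td = TensorDict(
    {"group_0": {
        "agent_0": {"observation": torch.randn(8)},
        "agent_1": {"observation": torch.randn(8)}}}, [])
out = stack_agent_keys(
    td,
    in_keys=[("group_0", "agent_0"), ("group_0", "agent_1")],
    out_key="group_0")
out["group_0", "observation"].shape  # torch.Size([2, 8])

A real transform would also need to transform the specs accordingly (and unstack actions on the way in), but the data path would look roughly like this.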

Alternatives

Additional context

Checklist

  • I have checked that there is no similar issue in the repo (required)
@kurtamohler kurtamohler added the enhancement New feature or request label Nov 14, 2024
@kurtamohler kurtamohler self-assigned this Nov 14, 2024
@kurtamohler kurtamohler linked a pull request Nov 14, 2024 that will close this issue
@thomasbbrunner (Contributor)

thomasbbrunner commented Nov 15, 2024

Have you taken a look at the group_map argument? When set to MarlGroupMapType.ALL_IN_ONE_GROUP the environment should return all agents in a single group (when possible, otherwise in more than one group).

If you are using this setting, then imo there's an issue in the implementation of the grouping of agents in UnityMLAgentsEnv.
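
For reference, this is how that setting would be passed at construction time. A sketch, assuming UnityMLAgentsEnv accepts a group_map argument like other multi-agent wrappers do (the file path is a placeholder):

from torchrl.envs import UnityMLAgentsEnv
from torchrl.envs.utils import MarlGroupMapType

# Placeholder path to a Unity build; ALL_IN_ONE_GROUP asks the wrapper
# to put every agent in a single group where possible.
env = UnityMLAgentsEnv(
    file_name="path/to/unity/executable",
    group_map=MarlGroupMapType.ALL_IN_ONE_GROUP,
)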
