An OpenAI Gym wrapper for PyReason to use in a reinforcement learning Grid World setting.
This is an OpenAI Gym environment for reinforcement learning in a grid world setting using PyReason as a simulator.
- There are two teams: Red and Blue
- There are two bases: Red Base and Blue Base
- There are a certain number of agents in each team
There are 9 actions an agent can take:
- Move Up
- Move Down
- Move Left
- Move Right
- Shoot Up
- Shoot Down
- Shoot Left
- Shoot Right
- Do Nothing
The objective of the game is to kill all enemy agents, that is, to bring their health to 0. The game terminates (signals done=True) when this happens. To change when the game should be over, modify the is_done() function in grid_world.py.
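As a rough illustration of the termination condition described above, here is a minimal sketch that checks it against an observation dictionary of the form shown later in this README. The function names team_eliminated and is_game_over are hypothetical; the actual is_done() in grid_world.py may be implemented differently.

```python
from typing import Any, Dict, List


def _health(agent: Dict[str, Any]) -> int:
    """Read an agent's health, which the sample observation wraps in a list ([1])."""
    h = agent['health']
    return h[0] if isinstance(h, (list, tuple)) else h


def team_eliminated(team: List[Dict[str, Any]]) -> bool:
    """Return True when every agent on the team has health 0 (hypothetical helper)."""
    return all(_health(agent) <= 0 for agent in team)


def is_game_over(observation: Dict[str, Any]) -> bool:
    """Assumed rule: the game ends as soon as either team has no living agents."""
    return (team_eliminated(observation['red_team'])
            or team_eliminated(observation['blue_team']))
```

A different objective (for example, reaching the enemy base) would replace the body of is_game_over while keeping the same True/False result.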
The reward function is currently not defined: a reward of 0 is given at each step. You can modify this in the _get_rew function in grid_world.py.
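One possible starting point for a custom reward is sketched below. It is only an assumption about what a useful reward might look like: it scores a team by the health the enemy lost minus the health it lost itself between two consecutive observations. The name health_difference_reward and its arguments are hypothetical and do not reflect the signature of _get_rew.

```python
from typing import Any, Dict, List


def _team_health(team: List[Dict[str, Any]]) -> int:
    """Sum the health of all agents; health may be wrapped in a list as in the sample observation."""
    total = 0
    for agent in team:
        h = agent['health']
        total += h[0] if isinstance(h, (list, tuple)) else h
    return total


def health_difference_reward(prev_obs: Dict[str, Any], obs: Dict[str, Any],
                             team: str = 'red_team', enemy: str = 'blue_team') -> float:
    """Hypothetical reward: enemy health lost minus own health lost since the previous step."""
    enemy_lost = _team_health(prev_obs[enemy]) - _team_health(obs[enemy])
    own_lost = _team_health(prev_obs[team]) - _team_health(obs[team])
    return float(enemy_lost - own_lost)
```

Because the reward is symmetric, calling it with the team and enemy arguments swapped gives the opposing team's reward.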
Make sure pyreason==1.5.1 has been installed by following the PyReason installation instructions.
Clone the repository, and install:
git clone https://github.com/lab-v2/pyreason-gym
pip install -e pyreason-gym
NOTE: Do not install this package using setup.py; this will not work. Use the instructions above to install.
To run the environment and get a feel for things, you can run the test.py file, which performs random actions in the grid world for 50 steps.
python test.py
This Grid World scenario needs a graph in GraphML format to run. A graph file has already been generated in the graphs folder. However, if you wish to change certain parameters such as
- Number of agents per team
- Start locations of the agents
- Obstacle locations in the grid
- The Grid World size (height, width)
- The locations of the Red and Blue bases
you will need to regenerate the graph file using the generate_graph.py script after changing the appropriate parameters. This will generate the graph in the appropriate location for PyReason to find. NOTE: This is optional if you just want to try out the package; you can use the graph file already provided.
This is an OpenAI Gym custom environment.
The interface is just like a normal Gym environment. To create an environment and start using it, insert the following into your Python script. Make sure you've installed this package first.
import gym
import pyreason_gym
env = gym.make('PyReasonGridWorld-v0')
# Reset the environment
obs, _ = env.reset()
# Take a random action and get observation, rewards, done signal etc.
# This will sample a random action from the action space of the environment
action = env.action_space.sample()
obs, rew, done, _, _ = env.step(action)
# Keep using `env.step(action)` and `env.reset()` to get observations and run the grid world game.
A tutorial on how to interact with Gym environments can be found in the Gym documentation.
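The reset/step cycle above is usually wrapped in an episode loop. The sketch below shows one way to do that against any object that follows the same five-value step() convention. StubEnv is a hypothetical stand-in used only so the example runs without PyReason installed; it is not part of this package, and run_episode and random_policy are illustrative names.

```python
import random


class StubEnv:
    """Hypothetical stand-in that mimics reset() and the five-value step() return."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return {'t': self.t}, {}

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        # observation, reward, done, truncated, info
        return {'t': self.t}, 0, done, False, {}


def random_policy(obs):
    """Pick one of the 9 actions (0-8) at random for a single agent per team."""
    return {'red_team': [random.randrange(9)], 'blue_team': [random.randrange(9)]}


def run_episode(env, policy, max_steps=50):
    """Step the environment until it signals done/truncated or max_steps is reached."""
    obs, _ = env.reset()
    total_reward = 0
    steps = 0
    for _ in range(max_steps):
        obs, rew, done, truncated, _ = env.step(policy(obs))
        total_reward += rew
        steps += 1
        if done or truncated:
            break
    return steps, total_reward


steps, total = run_episode(StubEnv(episode_length=5), random_policy)
print(steps, total)
```

With the package installed, the environment returned by gym.make('PyReasonGridWorld-v0') should be usable in place of StubEnv, assuming one agent per team.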
The action space is currently a list for each team with discrete numbers representing each action:
- Move Up is represented by 0
- Move Down is represented by 1
- Move Left is represented by 2
- Move Right is represented by 3
- Shoot Up is represented by 4
- Shoot Down is represented by 5
- Shoot Left is represented by 6
- Shoot Right is represented by 7
- Do Nothing is represented by 8
A sample action with 1 agent per team is of the form:
# Sample action. The list will increase with the number of agents per team
action = {
'red_team': [0],
'blue_team': [2]
}
# Send the action to the environment
obs, rew, done, _, _ = env.step(action)
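To avoid scattering bare integers through your code, the action numbers listed above can be given names. The Action enum and the make_action helper below are hypothetical conveniences, not part of the package; only the integer values come from this README.

```python
from enum import IntEnum


class Action(IntEnum):
    """Named aliases for the discrete action numbers listed in this README."""
    MOVE_UP = 0
    MOVE_DOWN = 1
    MOVE_LEFT = 2
    MOVE_RIGHT = 3
    SHOOT_UP = 4
    SHOOT_DOWN = 5
    SHOOT_LEFT = 6
    SHOOT_RIGHT = 7
    DO_NOTHING = 8


def make_action(red, blue):
    """Build the action dictionary, with one entry per agent on each team."""
    return {
        'red_team': [int(a) for a in red],
        'blue_team': [int(a) for a in blue],
    }


# Same action as the sample above: red moves up, blue moves left
action = make_action([Action.MOVE_UP], [Action.MOVE_LEFT])
print(action)
```

The resulting dictionary contains plain integers, so it can be passed to env.step(action) exactly like the hand-written sample.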
Observations contain information about each player's position in the grid ([x,y]) and their health, as well as blue and red bullet information, including the position of each bullet in the grid ([x,y]) and its direction.
A sample observation with 1 agent per team is a dictionary of the form:
observation = {
'red_team': [{'pos': [1,3], 'health': [1]}],
'blue_team': [{'pos': [7,2], 'health': [1]}],
'red_bullets': [{'pos': [2,3], 'dir': 1}, {'pos': [5,3], 'dir': 3}],
'blue_bullets': [{'pos': [7,1], 'dir': 2}]
}
Information about agent positions, health, bullet positions and direction can be extracted from this observation space.
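As an example of extracting information from an observation, the sketch below pulls out agent positions and computes the Manhattan distance from each agent to its nearest enemy. The helper names are hypothetical, and the observation literal is the sample shown above.

```python
def agent_positions(observation, team):
    """Return the (x, y) position of every agent on the given team."""
    return [tuple(agent['pos']) for agent in observation[team]]


def manhattan(a, b):
    """Grid distance between two (x, y) cells when only axis-aligned moves are allowed."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def nearest_enemy_distances(observation, team, enemy):
    """For each agent on `team`, the distance to the closest agent on `enemy`."""
    enemies = agent_positions(observation, enemy)
    return [min(manhattan(p, e) for e in enemies)
            for p in agent_positions(observation, team)]


observation = {
    'red_team': [{'pos': [1, 3], 'health': [1]}],
    'blue_team': [{'pos': [7, 2], 'health': [1]}],
    'red_bullets': [{'pos': [2, 3], 'dir': 1}, {'pos': [5, 3], 'dir': 3}],
    'blue_bullets': [{'pos': [7, 1], 'dir': 2}],
}

print(agent_positions(observation, 'red_team'))
print(nearest_enemy_distances(observation, 'red_team', 'blue_team'))
```

Features like these can be fed to a policy directly when a full image observation is not needed.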
There are a few render modes supported:
- human: Creates a PyGame visualization of the grid world and actions
- None: No rendering, interaction only through actions and observations
- rgb_array: An RGB array of the screen that would have been displayed using render_mode='human'. This can be used alongside CNNs etc.
These can be used when creating the environment:
env = gym.make('PyReasonGridWorld-v0', render_mode='human')
# Or
env = gym.make('PyReasonGridWorld-v0', render_mode=None)
# Or
env = gym.make('PyReasonGridWorld-v0', render_mode='rgb_array')
If you're using render_mode='rgb_array', you have to call env.render(observation) after each env.step() call, passing in the observation it returned, to get the RGB data.
If you've generated the graph using the generate_graph.py script with a custom grid size and custom number of agents per team, you can pass these parameters to the grid world while creating the environment:
env = gym.make('PyReasonGridWorld-v0', grid_size=8, num_agents_per_team=1)
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
If you used this software in your work, please cite our paper:
@inproceedings{aditya_pyreason_2023,
title = {{PyReason}: Software for Open World Temporal Logic},
booktitle = {{AAAI} Spring Symposium},
author = {Aditya, Dyuman and Mukherji, Kaustuv and Balasubramanian, Srikar and Chaudhary, Abhiraj and Shakarian, Paulo},
year = {2023}}
This repository is licensed under BSD-3-Clause
Dyuman Aditya - [email protected]
Kaustuv Mukherji - [email protected]
Paulo Shakarian - [email protected]