Example: build an Agent
All agents extend a common class: the Agent interface. This class contains all the methods that the MatchManager state machine will call during a game. The methods that need to be implemented are the following four.
The first one, chooseAction(board, state), is used to choose the next action to perform. The method receives the current Game Board and Game Status, and the agent has to generate an Action. Choosing the next action is basically what an Agent needs to do. There it is possible to use simple algorithms, heuristics, or complex Deep Learning methods: the important thing is to return a valid Action.
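As a bare-bones illustration, an agent could simply pick a random valid action. The class name RandomAgent and the import path below are assumptions of this sketch, not part of the framework:

import random

from agents import Agent  # hypothetical import path


class RandomAgent(Agent):
    # a minimal sketch: pick a random valid action each turn

    def chooseAction(self, board, state):
        # collect every action available to the figures that can be activated
        all_actions = []
        for figure in state.getFiguresCanBeActivated(self.team):
            all_actions += [self.gm.actionPassFigure(figure)]
            all_actions += self.gm.buildMovements(board, state, figure)
            all_actions += self.gm.buildAttacks(board, state, figure)
        if not all_actions:
            # lets the state machine pass the turn to the other team
            raise ValueError('No action given')
        return random.choice(all_actions)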
As an example, the basic MLAgent, an abstract agent that allows the implementation of different kinds of scoring functions based on Machine Learning, implements a very simple approach to find the next Action to perform.
First, it generates all the possible actions for all the figures:
all_actions = []
for figure in state.getFiguresCanBeActivated(self.team):
    actions = [self.gm.actionPassFigure(figure)] + \
              self.gm.buildAttacks(board, state, figure) + \
              self.gm.buildMovements(board, state, figure)
    all_actions += actions
Then it checks whether there are available actions. If not, it is useful to raise a ValueError exception: when raised, the GameManager catches it, considers that the player cannot do anything, and passes to the other team. This mechanism is particularly useful with responses, or when a player has fewer figures, hence fewer choices, than the other player.
if not all_actions:
    logger.warning('No actions available: no action given')
    raise ValueError('No action given')
Finally, the agent uses its internal methods to find the next best action to perform. Once again, if no action is found, the ValueError exception is raised.
# assign a score to each available action
scores = self.scores(board, state, all_actions)

# find the optimal action
bestScore, bestAction = self.bestAction(scores)

if not bestAction:
    logger.warning('No best action found: no action given')
    raise ValueError('No action given')
Check the implementation of this method for the MLAgent as an example.
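The scores() and bestAction() helpers are internal to the MLAgent. As a rough sketch only, assuming scores is a list of (score, action) pairs (an assumption of this example, not the documented format), a bestAction() helper could look like:

def bestAction(self, scores):
    # assumption: `scores` is a list of (score, action) pairs
    bestScore, bestAction = 0.0, None
    for score, action in scores:
        # keep the pair with the highest score seen so far
        if bestAction is None or score > bestScore:
            bestScore, bestAction = score, action
    return bestScore, bestAction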
The chooseResponse(board, state) method is practically the same as chooseAction(board, state), but it is called for a Response. As an example, in the MLAgent the method is implemented exactly like chooseAction(), but on a different set of "actions":
all_actions = []
for figure in state.getFiguresCanBeActivated(self.team):
    actions = [self.gm.actionPassResponse(self.team)] + \
              self.gm.buildResponses(board, state, figure)
    all_actions += actions
Check the implementation of this method for the MLAgent as an example, and compare it with chooseAction() to see the differences and similarities.
Some scenarios (like Junction) permit a player to choose where to place its initial figures. In this method the agent is allowed to freely move its units inside a limited area. The changes applied to the state are kept and used when the game starts.
The first thing that the implementation of this method does is find the placement area and the figures from the current state:
x, y = np.where(state.placement_zone[self.team] > 0)
figures = state.getFigures(self.team)
It is best practice to perform a deepcopy of the state of the game and do all the computation needed to find the best initial position on the copy:
s = deepcopy(state)
When the positions are definitive, just use the state.moveFigure(figure, dst) method directly on the original state:
for j in range(len(figures)):
    figure = figures[j]
    dst = Hex(x[optimal_position[j]], y[optimal_position[j]]).cube()
    state.moveFigure(figure, dst=dst)
Check the implementation of this method for the GreedyAgent as an example.
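Putting these pieces together, a placement method could look like the sketch below. The name placeFigures, the random placement strategy, and the skipped evaluation step are assumptions of this example; a real agent would score candidate positions on the deepcopy before committing them to the original state.

import numpy as np

def placeFigures(self, board, state):  # hypothetical method name
    # find the placement area and the figures to place
    x, y = np.where(state.placement_zone[self.team] > 0)
    figures = state.getFigures(self.team)

    # hypothetical strategy: a distinct random cell for every figure
    indices = np.random.choice(len(x), size=len(figures), replace=False)

    # apply the definitive positions to the original state
    for j, figure in enumerate(figures):
        dst = Hex(x[indices[j]], y[indices[j]]).cube()  # Hex as in the snippet above
        state.moveFigure(figure, dst=dst)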
Some scenarios allow the choice between groups of figures. These are fixed positions of different "colors" in a scenario that an agent can choose to use. The list of available colors is given directly by the state object:
colors = list(state.choices[self.team].keys())
The choice is made by calling the state.choose(team, color) method directly on the state object:
state.choose(self.team, color)
Check the implementation of this method for the GreedyAgent as an example.
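For instance, a trivial implementation could pick one of the available colors at random (the use of random.choice is an assumption of this sketch):

import random

# pick one of the colors offered by the scenario at random
colors = list(state.choices[self.team].keys())
state.choose(self.team, random.choice(colors))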
Avoid polluting the GameState object by working on a deepcopy of it:
from utils.copy import deepcopy
new_state = deepcopy(state)
Use the internal GameManager utility class to test the result of an action:
s1, outcome = self.gm.activate(board, state, action)
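Combining these two utilities, a one-step lookahead could simulate every candidate action on a copy of the state and keep the best one. The evaluate() scoring function below is hypothetical, not part of the framework:

from utils.copy import deepcopy

bestScore, bestAction = None, None
for action in all_actions:
    # simulate the action on a copy so the real state is not polluted
    s1, outcome = self.gm.activate(board, deepcopy(state), action)
    score = self.evaluate(board, s1)  # hypothetical scoring function
    if bestScore is None or score > bestScore:
        bestScore, bestAction = score, action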
To generate the actions for a figure:
figurePass = self.gm.actionPassFigure(figure)
figureMove = self.gm.actionMove(board, state, figure, destination=Cube(1, 2, 3))
figureAttack = self.gm.actionAttack(board, state, figure, target, weapon)
To build the available actions:
movements = self.gm.buildMovements(board, state, figure)
attacks = self.gm.buildAttacks(board, state, figure)
responses = self.gm.buildResponses(board, state, figure)
When one wants to analyze the performance or the behavior of an agent, it is useful to keep track of and check the history of its actions. The Agent interface offers three methods to store and retrieve the history of actions performed by an agent:
Usage is pretty straightforward. When an agent finds the best action or response, use the register() method:
...
self.register(state, [bestAction])
...
Then, to generate the Pandas DataFrame of the history, use the createDataFrame() method:
df = red_agent.createDataFrame()
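The resulting DataFrame can then be inspected or persisted with the standard Pandas API, for example:

# save the history for later analysis
df.to_csv('red_agent_history.csv', index=False)
print(df.head())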
The basic implementation is very... limited. For this reason, many agent implementations build a store() method that wraps the register() method.
As an example, we can check the implementation of the AlphaBeta agent:
def store(self, state: GameState, bestScore: float, bestAction: Action) -> None:
    data = [bestScore, type(bestAction).__name__, self.maxDepth]
    self.register(state, data)
Instead of using the raw data list, the wrapper takes as arguments some useful information (the score of the best action and the action itself) and builds some extra values. To avoid issues with the generation of the Pandas DataFrame, the dataFrameInfo() method is also extended with the names of the additional columns:
def dataFrameInfo(self) -> List[str]:
    return super().dataFrameInfo() + ['score', 'action', 'depth']
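With the two overrides in place, the agent can record its decision whenever it settles on an action, for example:

bestScore, bestAction = self.bestAction(scores)
self.store(state, bestScore, bestAction)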