Monte-Carlo Tree Search (MCTS) basic implementation.
The goal is to implement a tic-tac-toe player using MCTS and pit it against my previously written reinforcement learning agent (Q-learning). I chose that opponent because I know the Q-learner can learn to play optimally, so it serves as a baseline for evaluating and validating the MCTS player.
For now, the MCTS policy is deliberately very simple. Later I'll extend it with more sophisticated exploration/exploitation policies and look into other features.
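
As a rough illustration of what a more sophisticated exploration/exploitation policy could look like, here is a minimal Python sketch of UCB1 child selection, the rule used by the common UCT variant of MCTS. This is an assumption for illustration, not necessarily what this implementation uses or will use; the names `ucb1_score` and `select_child`, the `(move, total_reward, visits)` tuple layout, and the constant `c` are all hypothetical.

```python
import math


def ucb1_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 value of a child node (hypothetical helper).

    Unvisited children score +inf so each one is tried at least once.
    """
    if visits == 0:
        return math.inf
    exploit = total_reward / visits                             # average reward so far
    explore = c * math.sqrt(math.log(parent_visits) / visits)   # bonus for rarely tried moves
    return exploit + explore


def select_child(children, parent_visits, c=math.sqrt(2)):
    """Return the move with the highest UCB1 score.

    `children` is a list of (move, total_reward, visits) tuples;
    this layout is an assumption made for the sketch.
    """
    best = max(children, key=lambda ch: ucb1_score(ch[1], ch[2], parent_visits, c))
    return best[0]
```

A larger `c` favours exploring rarely visited moves, while `c = 0` reduces the rule to picking the move with the best average reward so far.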
Detailed writeup is available here: https://thevoid.ghost.io/monte-carlo-tree-search/