wwetnaroh/Gomoku

EECE E6892-final-project Gomoku application based on AlphaZero

Final project for EECE E6892 at Columbia University
Authors: Xiaotian Geng, Jing Peng

Brief introduction of our project

Our project will implement the AlphaZero algorithm to play the board game Gomoku. We will also explore how several factors influence the trained AI model: the number of training iterations, the model architecture, the number of simulated games, and the calculation of the Upper Confidence Bound (UCB). We will measure this by comparing winning percentages as each factor is changed. In addition, we will implement an application that lets users visualize Gomoku game results and upload their own RL models to compare their performance. A human-vs-AI mode will also let users play Gomoku against an RL-based AI. The application has two modes:

  • Model comparison: Users can upload any models they want to compare. The application will show how the two models play Gomoku against each other and report the final result.
  • Human-model competition: Users can play Gomoku against their own uploaded, trained model.

Our technology stack is as follows: we will build the cross-platform desktop application with Electron and React on the front end, using hooks for data interaction and the antd framework for UI design, and we will use Flask to implement the back-end API.
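
To make the UCB factor mentioned in the introduction concrete, here is a minimal sketch of the PUCT-style selection score commonly used in AlphaZero-style tree search. It is an illustration only, not the code in this repository: the function names, the dictionary layout, and the default `c_puct` value are all assumptions, and the project may use a different UCB variant.

```python
import math


def puct_score(q_value, prior, parent_visits, child_visits, c_puct=5.0):
    """Return Q + U for one child node (hypothetical helper, not from the repo).

    q_value:       mean value of the child from past simulations
    prior:         move probability predicted by the policy network
    parent_visits: visit count of the parent node
    child_visits:  visit count of the child node
    c_puct:        exploration constant (assumed default)
    """
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q_value + exploration


def select_child(children, parent_visits, c_puct=5.0):
    """Pick the move with the highest PUCT score.

    children maps move -> (q_value, prior, visit_count).
    """
    return max(
        children,
        key=lambda move: puct_score(
            children[move][0], children[move][1],
            parent_visits, children[move][2], c_puct,
        ),
    )
```

A larger `c_puct` favors rarely visited moves with a high prior. With `c_puct=0` the search simply picks the move with the best mean value. Changing the constant or the form of the exploration term is one way to vary the UCB calculation.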

Description of code

  • Gomoku-client: Front-end of the whole project.
  • Gomoku-server: Back-end of the whole project.
    • PVKeras: Keras implementations of the models: the original 3-layer model, a 4-layer CNN model, a 4-layer RNN model, and a ResNet-18 model.
    • GomokuBoard: Generates and visualizes the Gomoku board.
    • PVPytorch: PyTorch implementation of the model.
    • model: Files of trained models.
    • play.py: Runs AI vs. AI Gomoku competitions and explores winning strategies.
    • train.py: Trains the models.
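
As a rough illustration of the kind of rule check a Gomoku board module needs, the sketch below tests whether the stone at a given position completes a line of five. It is not taken from GomokuBoard: the function name `has_five`, the list-of-lists board layout (0 for empty, 1 and 2 for the two players), and the square board are assumptions made for this example.

```python
def has_five(board, row, col):
    """Return True if the stone at (row, col) is part of five or more in a line.

    board is a square list of lists: 0 = empty, 1 or 2 = a player's stone.
    (Hypothetical helper for illustration only.)
    """
    player = board[row][col]
    if player == 0:
        return False
    size = len(board)
    # Check the four line directions: horizontal, vertical, two diagonals.
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1  # the stone at (row, col) itself
        for sign in (1, -1):  # walk both ways along the line
            r, c = row + sign * dr, col + sign * dc
            while 0 <= r < size and 0 <= c < size and board[r][c] == player:
                count += 1
                r += sign * dr
                c += sign * dc
        if count >= 5:
            return True
    return False
```

Only the most recently placed stone needs checking after each move, which keeps the test cheap during self-play. This sketch treats lines longer than five as wins. Rule variants that require exactly five would need a stricter comparison.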

Run the application

Environment requirement

  • Python 3
  • Node.js installed
  • PyTorch installed

Note: only tested on macOS so far.

Run the shell scripts to start the front end and the back end:

  • Start the front end:
sh start_frontend.sh
  • Start the back end:
sh start_backend.sh

Demo

videoplayback.mp4

Related docs

  • DeepMind's paper on AlphaZero: Silver, David, et al. "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play." Science 362.6419 (2018): 1140-1144.
  • AlphaGo: Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489.
  • AlphaGo Zero: Silver, David, et al. "Mastering the game of Go without human knowledge." Nature 550.7676 (2017): 354-359.
  • MuZero, DeepMind's algorithm that generalizes the AlphaZero work, playing both Atari and board games without knowledge of the rules or representations of the game: Schrittwieser, Julian, et al. "Mastering Atari, Go, chess and shogi by planning with a learned model." Nature 588.7839 (2020): 604-609.
  • Comparison with other Monte Carlo tree search methods and training: Silver, David, et al. "Mastering chess and shogi by self-play with a general reinforcement learning algorithm." arXiv preprint arXiv:1712.01815 (2017).
