I am applying two reinforcement learning methods to the game of Blackjack: Monte Carlo first-visit prediction, and Monte Carlo Exploring Starts (ES), a type of on-policy control.
This repo contains a Monte Carlo simulation of Blackjack built on the gym environment.
Dependencies:
- gym
- numpy
- matplotlib
- seaborn
- tqdm
- pathlib (standard library)
- pickle (standard library)
The main goal of this simulation is to determine the best possible action (either to hit or stick) based on the current hand of the player, the visible card of the dealer, and whether or not the player has a usable ace.
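To make the three state components concrete, here is a minimal sketch of how a hand could be turned into a (player sum, dealer card, usable ace) tuple. The helper names `hand_value` and `make_state` are hypothetical and not taken from this repository; cards are assumed to be encoded as integers with ace = 1 and face cards = 10.

```python
def hand_value(cards):
    """Return (total, usable_ace) for a list of card values (ace = 1, faces = 10)."""
    total = sum(cards)
    # An ace is "usable" if counting it as 11 does not bust the hand.
    usable_ace = 1 in cards and total + 10 <= 21
    return (total + 10 if usable_ace else total), usable_ace


def make_state(player_cards, dealer_showing):
    """Build the (player sum, dealer card, usable ace) tuple used as the state."""
    total, usable_ace = hand_value(player_cards)
    return total, dealer_showing, usable_ace


print(make_state([1, 6], 10))     # (17, 10, True)
print(make_state([1, 6, 9], 10))  # (16, 10, False)
```

In the second hand the ace can no longer count as 11 without busting, so the same cards plus a 9 yield a lower total and no usable ace.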
The repository contains functions to:
- Play a single game of blackjack.
- Define dealer and player policies.
- Run on-policy Monte Carlo.
- Run Monte Carlo with exploring starts.
- Visualize the results.
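As an illustration of the first two items, the following is a rough, self-contained sketch of a player policy, a dealer policy, and a function that plays one game. It does not use gym and is not the repository's code: the `play(policy, rng)` signature, the "stick on 20 or 21" player rule, the infinite deck, and the omission of any natural-blackjack bonus are all assumptions made for the example.

```python
import random

HIT, STICK = 1, 0  # same action encoding as gym's Blackjack environment


def draw_card(rng):
    """Draw from an infinite deck: ace = 1, face cards count as 10."""
    return min(rng.randint(1, 13), 10)


def hand_value(cards):
    """Return (total, usable_ace), counting one ace as 11 when it does not bust."""
    total = sum(cards)
    usable = 1 in cards and total + 10 <= 21
    return (total + 10 if usable else total), usable


def player_policy(state):
    """Hypothetical fixed policy: stick on 20 or 21, otherwise hit."""
    player_sum, _dealer_card, _usable_ace = state
    return STICK if player_sum >= 20 else HIT


def dealer_policy(dealer_sum):
    """Fixed dealer rule: hit until the total reaches 17."""
    return STICK if dealer_sum >= 17 else HIT


def play(policy, rng):
    """Play one game; return the visited (state, action) pairs and the final reward."""
    player = [draw_card(rng), draw_card(rng)]
    dealer = [draw_card(rng), draw_card(rng)]
    trajectory = []
    while True:
        total, usable = hand_value(player)
        state = (total, dealer[0], usable)
        action = policy(state)
        trajectory.append((state, action))
        if action == STICK:
            break
        player.append(draw_card(rng))
        if hand_value(player)[0] > 21:
            return trajectory, -1  # player busts
    # Player stuck: the dealer now plays out the hand.
    while dealer_policy(hand_value(dealer)[0]) == HIT:
        dealer.append(draw_card(rng))
    player_total, dealer_total = hand_value(player)[0], hand_value(dealer)[0]
    if dealer_total > 21 or player_total > dealer_total:
        return trajectory, 1
    return trajectory, (0 if player_total == dealer_total else -1)


trajectory, reward = play(player_policy, random.Random(42))
print(trajectory, reward)
```

Passing the random generator explicitly keeps games reproducible, which is handy when comparing policies.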
The code in this repository primarily leverages the following algorithms and tools:
- Monte Carlo method: Used to estimate the value of states in the Blackjack environment. The algorithm samples complete episodes and averages the observed returns to estimate state values.
- OpenAI's gym library: A toolkit for developing and comparing reinforcement learning algorithms. The code uses the BlackjackEnv from the toy text environments provided by gym.
- Seaborn and Matplotlib: Python data visualization libraries used to plot the results of the Monte Carlo simulations.
- NumPy: Supports large, multi-dimensional arrays and matrices, along with a collection of mathematical functions that operate on them.
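The Monte Carlo value estimate mentioned in the list above can be sketched in a few lines. This is a generic first-visit Monte Carlo prediction routine, not the repository's implementation: the function name `first_visit_mc`, the episode format (a list of `(state, reward)` pairs), and the two hand-written episodes are assumptions chosen to keep the example independent of gym.

```python
from collections import defaultdict


def first_visit_mc(episodes, gamma=1.0):
    """Estimate V(s) as the average first-visit return over the given episodes.

    Each episode is a list of (state, reward) pairs, where reward is the
    reward received after acting in that state.
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        # Record the first time step at which each state appears.
        first_visit = {}
        for t, (state, _reward) in enumerate(episode):
            first_visit.setdefault(state, t)
        # Walk backwards so the return G accumulates in a single pass.
        g = 0.0
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            g = gamma * g + reward
            if first_visit[state] == t:
                returns_sum[state] += g
                returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


# Two hand-written episodes; states are (player sum, dealer card, usable ace).
episodes = [
    [((13, 10, False), 0), ((20, 10, False), 1)],  # hit to 20, stick, win
    [((13, 10, False), -1)],                       # hit and bust
]
values = first_visit_mc(episodes)
print(values[(13, 10, False)])  # 0.0
print(values[(20, 10, False)])  # 1.0
```

The state (13, 10, False) is followed by one win and one loss, so its estimated value averages to zero.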
To get started, make sure all the dependencies are installed. The third-party packages can be installed with pip:
pip install gym numpy seaborn matplotlib tqdm
- Set up the environment:
import gym.envs.toy_text.blackjack as bj
env = bj.BlackjackEnv()
env.reset()
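env.action_space.sample()  # random action: 0 = stick, 1 = hit
env.observation_space[0].n  # size of the player-sum dimension
env.observation_space[1].n  # size of the dealer-card dimension
env.observation_space[2].n  # size of the usable-ace dimension
env.seed(42)  # older gym releases only; gym 0.26+ uses env.reset(seed=42)
- Play a single game:
print('Initial state:', env.reset())
print('Playing one game...')
play(env, player_policy)
- Run the simulations:
run_monte_carlo_on_policy()
run_monte_with_exploring_starts(num_episodes_es)
- Plot the results:
plot_monte_carlo_on_policy(states, titles)
plot_monte_carlo_with_exploring_starts(policy_values, titles)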
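For readers who want a sense of what an exploring-starts run involves, here is a compact sketch of Monte Carlo ES on a simplified Blackjack model. It is not the code behind `run_monte_with_exploring_starts`: the function names, the infinite deck, the restriction of start states to player sums 12 to 21, and the omission of a natural-blackjack bonus are all assumptions made for the example.

```python
import random
from collections import defaultdict

HIT, STICK = 1, 0


def draw(rng):
    return min(rng.randint(1, 13), 10)  # infinite deck, faces count as 10


def add_card(total, usable, card):
    """Add a card to a hand tracked as (total, usable_ace)."""
    if card == 1 and total + 11 <= 21:
        return total + 11, True
    total += card
    if total > 21 and usable:
        return total - 10, False  # demote the ace from 11 to 1
    return total, usable


def dealer_total(showing, rng):
    """Dealer hits until reaching 17 or more."""
    total, usable = add_card(0, False, showing)
    while total < 17:
        total, usable = add_card(total, usable, draw(rng))
    return total


def run_episode(policy, rng):
    """Exploring start: random initial state and first action, then follow policy."""
    state = (rng.randint(12, 21), rng.randint(1, 10), rng.random() < 0.5)
    action = rng.choice([HIT, STICK])
    pairs = []
    while True:
        pairs.append((state, action))
        total, showing, usable = state
        if action == STICK:
            d = dealer_total(showing, rng)
            reward = 1 if d > 21 or total > d else (0 if total == d else -1)
            return pairs, reward
        total, usable = add_card(total, usable, draw(rng))
        if total > 21:
            return pairs, -1
        state = (total, showing, usable)
        action = policy[state]


def monte_carlo_es(num_episodes, seed=0):
    rng = random.Random(seed)
    policy = defaultdict(lambda: STICK)  # arbitrary initial policy
    q_sum = defaultdict(float)
    q_count = defaultdict(int)

    def q(state, action):
        n = q_count[state, action]
        return q_sum[state, action] / n if n else 0.0

    for _ in range(num_episodes):
        pairs, reward = run_episode(policy, rng)
        # A state cannot repeat within a Blackjack episode, so every visit is
        # a first visit; with no discounting each pair's return is `reward`.
        for state, action in pairs:
            q_sum[state, action] += reward
            q_count[state, action] += 1
            # Policy improvement: act greedily with respect to the current Q.
            policy[state] = max((HIT, STICK), key=lambda a: q(state, a))
    return policy


policy = monte_carlo_es(num_episodes=20_000)
print("Action at (21, 10, no usable ace):", policy[(21, 10, False)])
```

Choosing the first state-action pair at random is what guarantees that every pair keeps being sampled, even though the policy itself is greedy.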
Pull requests are welcome. For major changes, please open an issue first to discuss what you'd like to change.