DeepLearning MonteCarloFirstVisit ExploringStarts

This project applies two Monte Carlo reinforcement learning methods to the game of Blackjack: First-Visit Monte Carlo prediction, and Monte Carlo Exploring Starts (ES), a form of on-policy control.

Blackjack Monte Carlo Simulation

This repo contains a Monte Carlo simulation for the game of Blackjack using the gym environment.

Dependencies

gym
numpy
matplotlib
seaborn
tqdm
pathlib (Python standard library)
pickle (Python standard library)

Overview

The main goal of this simulation is to determine the best possible action (either to hit or stick) based on the current hand of the player, the visible card of the dealer, and whether or not the player has a usable ace.

The repository contains functions to:

Play a single game of blackjack.
Define dealer and player policies.
Run on-policy Monte Carlo.
Run Monte Carlo with exploring starts.
Visualize the results.
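As an illustration of the state and action described above, here is a minimal sketch of a player policy. The name threshold_policy and the stick_at parameter are hypothetical and are not taken from this repository. The sketch assumes the state is the tuple (player sum, dealer's visible card, usable ace) and that actions are encoded as 0 for stick and 1 for hit, as in gym's Blackjack environment.

```python
from typing import Tuple

# (player_sum, dealer_showing, usable_ace), mirroring the Blackjack observation
State = Tuple[int, int, bool]

STICK, HIT = 0, 1  # action encoding assumed to match gym's Blackjack environment


def threshold_policy(state: State, stick_at: int = 20) -> int:
    """Hypothetical policy: stick once the player's sum reaches stick_at, else hit."""
    player_sum, _dealer_showing, _usable_ace = state
    return STICK if player_sum >= stick_at else HIT


print(threshold_policy((20, 10, False)))  # player holds 20
print(threshold_policy((13, 2, True)))    # player holds a soft 13
```

A fixed policy like this one is a common starting point for Monte Carlo prediction, because its state values can be estimated simply by playing many games with it.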

Primary Algorithm Toolset

The code in this repository primarily leverages the following algorithms and tools:

Monte Carlo Method: Used for estimating the value of states in the Blackjack environment. The algorithm plays many sampled episodes and averages the returns observed after visiting each state.
OpenAI's gym Library: This is a toolkit for developing and comparing reinforcement learning algorithms. In our code, we use the BlackjackEnv from the toy text environments provided by gym.
Seaborn and Matplotlib: These Python data visualization libraries are used for visualizing the results of the Monte Carlo simulations.
Numpy: This library supports large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays.
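The Monte Carlo estimate described in the list above can be sketched as first-visit prediction. This is a minimal illustration under stated assumptions, not the code from this repository: the function name first_visit_mc_prediction and the episode format (a list of (state, reward) pairs, where reward is what was received after leaving that state) are chosen here for the example.

```python
from collections import defaultdict


def first_visit_mc_prediction(episodes, gamma=1.0):
    """Estimate V(s) as the average return following the first visit to s.

    episodes: iterable of episodes, each a list of (state, reward) pairs
    (an assumed format for this sketch).
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        # Record the first time step at which each state occurs.
        first_visit = {}
        for t, (state, _reward) in enumerate(episode):
            first_visit.setdefault(state, t)
        # Walk backwards so the return can be accumulated incrementally.
        g = 0.0
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            g = gamma * g + reward
            if first_visit[state] == t:  # only the first visit counts
                returns_sum[state] += g
                returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


# Two hand-written Blackjack-style episodes: one win and one loss.
episodes = [
    [((13, 10, False), 0.0), ((18, 10, False), 1.0)],
    [((13, 10, False), -1.0)],
]
values = first_visit_mc_prediction(episodes)
print(values)
```

In Blackjack the reward only arrives when the game ends, so with gamma=1.0 the return from every state in an episode equals the final outcome of that game.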

To get started, install the third-party dependencies with pip (pathlib and pickle ship with Python):

pip install gym numpy seaborn matplotlib tqdm

How to Use

Set up the environment:

import gym.envs.toy_text.blackjack as bj
env = bj.BlackjackEnv()

Reset the environment:

env.reset()

Sample action and observation spaces:

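env.action_space.sample()    # draw a random action (stick or hit)
env.observation_space[0].n   # number of discrete values for the player's current sum
env.observation_space[1].n   # number of discrete values for the dealer's visible card
env.observation_space[2].n   # number of discrete values for the usable-ace flag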

Play a single game:

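# Seed the environment for reproducibility. Newer gym releases removed
# env.seed(); there, pass the seed to env.reset(seed=42) instead.
env.seed(42)
print('Initial state:', env.reset())
print('Playing one game...')
play(env, player_policy)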

Execute Monte Carlo simulations:

run_monte_carlo_on_policy()
run_monte_with_exploring_starts(num_episodes_es)
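For readers who want to see the idea behind exploring starts without the gym dependency, here is a self-contained sketch. It is not the implementation behind the functions above. Every name in it (draw, add_card, dealer_total, generate_episode, monte_carlo_es) is hypothetical, and the game is simplified: an infinite deck, a dealer who draws until reaching 17 or more, and no special payout for a natural. Each episode starts from a random state with a random first action, then follows the current greedy policy.

```python
import random
from collections import defaultdict

STICK, HIT = 0, 1  # same action encoding as gym's Blackjack environment


def draw(rng):
    """Draw from an infinite deck: ace is returned as 1, face cards as 10."""
    return min(rng.randint(1, 13), 10)


def add_card(total, usable_ace, card):
    """Add a card, counting one ace as 11 whenever that does not bust."""
    if card == 1 and total + 11 <= 21:
        total, usable_ace = total + 11, True
    else:
        total += card
    if total > 21 and usable_ace:
        total, usable_ace = total - 10, False  # demote the ace from 11 to 1
    return total, usable_ace


def dealer_total(showing, rng):
    """Assumed dealer rule: keep drawing until the total is 17 or more."""
    total, usable_ace = add_card(0, False, showing)
    while total < 17:
        total, usable_ace = add_card(total, usable_ace, draw(rng))
    return total


def generate_episode(policy, rng):
    """Exploring start: random state and first action, then follow policy."""
    state = (rng.randint(12, 21), rng.randint(1, 10), rng.random() < 0.5)
    action = rng.choice((STICK, HIT))
    trajectory = []
    while True:
        trajectory.append((state, action))
        total, showing, usable_ace = state
        if action == STICK:
            dealer = dealer_total(showing, rng)
            if dealer > 21 or total > dealer:
                return trajectory, 1.0
            return trajectory, (0.0 if total == dealer else -1.0)
        total, usable_ace = add_card(total, usable_ace, draw(rng))
        if total > 21:
            return trajectory, -1.0  # player busts
        state = (total, showing, usable_ace)
        # States not yet improved fall back to "stick only on 20 or 21".
        action = policy.get(state, STICK if total >= 20 else HIT)


def q_value(q_sum, q_count, state, action):
    """Average return for (state, action); 0.0 if the pair was never tried."""
    n = q_count.get((state, action), 0)
    return q_sum.get((state, action), 0.0) / n if n else 0.0


def monte_carlo_es(num_episodes, seed=0):
    rng = random.Random(seed)
    q_sum, q_count, policy = defaultdict(float), defaultdict(int), {}
    for _ in range(num_episodes):
        trajectory, reward = generate_episode(policy, rng)
        # The only reward comes at the end, so with no discounting the
        # return from every step equals the final reward. A (state, action)
        # pair cannot repeat within one hand, so each visit is a first visit.
        for state, action in trajectory:
            q_sum[(state, action)] += reward
            q_count[(state, action)] += 1
            # Policy improvement: act greedily with respect to the averages.
            policy[state] = max(
                (STICK, HIT), key=lambda a: q_value(q_sum, q_count, state, a)
            )
    return policy, q_sum, q_count


policy, q_sum, q_count = monte_carlo_es(20000, seed=0)
print("Action at hard 21 vs dealer 10:", policy[(21, 10, False)])
```

Starting every episode from a random state-action pair is what lets a greedy policy still gather data on actions it would never choose itself, which is the point of the exploring-starts assumption.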

Visualize the results:

plot_monte_carlo_on_policy(states, titles)
plot_monte_carlo_with_exploring_starts(policy_values, titles)

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you'd like to change.
