I used machine learning to build a logistic regression model with scikit-learn, pandas, NumPy, seaborn and Matplotlib to predict the results of the 2018 FIFA World Cup.

itsmuriuki/FIFA-2018-World-cup-predictions


FIFA-2018-World-cup-predictions

FIFA World Cup 2018 Winner Predictions

Goal

  1. Use machine learning to predict which team will win the 2018 FIFA World Cup.

  2. Predict the outcome of individual matches for the entire competition.

  3. Run a simulation of the knockout matches, i.e., the quarter-finals, semi-finals and final.

These goals present a unique real-world Machine Learning prediction problem and involve solving various Machine Learning tasks: data integration, feature modelling and outcome prediction.
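The knockout simulation in goal 3 can be sketched as follows. This is only an illustration: the team names, the `rating` values and the `win_probability` helper are hypothetical stand-ins for the probabilities a trained model would supply, and the rule of always advancing the more likely winner is an assumption rather than the method used in the notebook.

```python
# Made-up ratings for eight placeholder teams (not real data).
rating = {"A": 60, "B": 75, "C": 90, "D": 55, "E": 70, "F": 65, "G": 80, "H": 50}


def win_probability(team_a, team_b):
    """Hypothetical stand-in for a model's predicted chance that team_a beats team_b."""
    return rating[team_a] / (rating[team_a] + rating[team_b])


def play_round(teams):
    """Pair adjacent teams and advance the more likely winner of each tie."""
    winners = []
    for i in range(0, len(teams), 2):
        team_a, team_b = teams[i], teams[i + 1]
        winners.append(team_a if win_probability(team_a, team_b) >= 0.5 else team_b)
    return winners


def simulate_knockout(teams):
    """Play rounds until a single team is left, printing the winners of each round."""
    while len(teams) > 1:
        teams = play_round(teams)
        print(teams)
    return teams[0]


quarter_finalists = ["A", "B", "C", "D", "E", "F", "G", "H"]
champion = simulate_knockout(quarter_finalists)
```

Swapping `win_probability` for a call to a fitted classifier's `predict_proba` would connect the bracket to real predictions. Sampling winners at random according to those probabilities, instead of always taking the favourite, would give a distribution over possible champions.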

Data

I used two datasets from Kaggle: the results of matches since 1930 and the World Cup 2018 dataset. The historical results cover all participating teams since the first championship in 1930.
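A minimal sketch of the data integration step, assuming the historical results have columns named `home_team`, `away_team`, `home_score` and `away_score`. These column names, the tiny in-memory table and the team list are illustrative assumptions. The actual Kaggle files may be laid out differently.

```python
import pandas as pd

# Hypothetical miniature of the historical results table (illustrative rows only).
results = pd.DataFrame({
    "home_team": ["Brazil", "Germany", "Kenya", "France"],
    "away_team": ["Germany", "France", "Uganda", "Brazil"],
    "home_score": [1, 2, 1, 0],
    "away_score": [7, 2, 0, 3],
})

# Hypothetical subset of the teams taking part in the tournament.
world_cup_teams = ["Brazil", "Germany", "France"]


def label_outcome(row):
    """Return 2 for a home win, 1 for a draw and 0 for an away win."""
    if row["home_score"] > row["away_score"]:
        return 2
    if row["home_score"] == row["away_score"]:
        return 1
    return 0


# Keep matches that involve at least one participating team.
involves_participant = (
    results["home_team"].isin(world_cup_teams)
    | results["away_team"].isin(world_cup_teams)
)
matches = results[involves_participant].copy()
matches["outcome"] = matches.apply(label_outcome, axis=1)
print(matches)
```

The three-way `outcome` label is one possible target encoding. A binary win or not-win label would also work with logistic regression.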

Environment and tools

  1. Jupyter Notebook
  2. NumPy
  3. Pandas
  4. Seaborn
  5. Matplotlib
  6. Scikit-learn

I chose logistic regression for my model and got an accuracy of 57% on the training set and 55% on the test set. I also used a dataset of the FIFA rankings as of April 2018 and a dataset containing the group-stage fixtures of the tournament.
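The training and evaluation step might look like the sketch below. It runs on synthetic matches, so the team list, the made-up `strength` values, the one-hot encoding of team names and the binary `home_win` target are all assumptions for illustration. It does not reproduce the accuracies reported above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Made-up team strengths used only to generate synthetic match results.
teams = ["Brazil", "Germany", "France", "Spain", "Argentina", "Belgium"]
strength = dict(zip(teams, [1.0, 0.8, 0.7, 0.6, 0.5, 0.4]))

rows = []
for _ in range(600):
    home, away = rng.choice(teams, size=2, replace=False)
    p_home = 1.0 / (1.0 + np.exp(-3.0 * (strength[home] - strength[away])))
    rows.append({
        "home_team": home,
        "away_team": away,
        "home_win": int(rng.random() < p_home),
    })
matches = pd.DataFrame(rows)

# One-hot encode the team names so the model can learn a weight per team.
X = pd.get_dummies(matches[["home_team", "away_team"]], prefix=["home", "away"])
y = matches["home_win"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
print(f"Training accuracy: {train_acc:.3f}")
print(f"Test accuracy: {test_acc:.3f}")
```

Comparing the training and test scores, as done with the 57% and 55% figures above, is a quick check for overfitting: a small gap suggests the model generalises about as well as it fits.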

According to this model, Brazil is the most likely winner of this World Cup.

Areas for further research and improvement

  1. Dataset: to improve the dataset, you could use player ratings from FIFA (the video game, not the organisation) to assess the quality of each team's players.

  2. A confusion matrix would help to analyse which games the model got wrong.

  3. We could try ensembling, that is, stacking several models together to improve accuracy.
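Items 2 and 3 above could be explored with a sketch like the following, which stacks two scikit-learn classifiers and prints a confusion matrix. The data is synthetic (generated with `make_classification` as a stand-in for three match outcomes), and the choice of a random forest as the second base model is an assumption, not something the project used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic three-class data standing in for away win / draw / home win.
X, y = make_classification(
    n_samples=600,
    n_features=6,
    n_informative=4,
    n_redundant=0,
    n_classes=3,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Stack two base models; a logistic regression combines their predictions.
stack = StackingClassifier(
    estimators=[
        ("logreg", LogisticRegression(max_iter=1000)),
        ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)
stack.fit(X_train, y_train)
predictions = stack.predict(X_test)

# Rows are the true classes, columns are the predicted classes.
cm = confusion_matrix(y_test, predictions, labels=[0, 1, 2])
print(cm)
print(f"Test accuracy: {stack.score(X_test, y_test):.3f}")
```

The off-diagonal cells of the matrix show which outcomes get confused with each other, for example draws predicted as wins.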
