This repository contains source code and data for our project done as part of the Data Analytics (UE19CS312) course at PES University.
Our goal is to understand how Internet inclusivity and accessibility affect developmental indices, and thus how increased access to the Internet can influence different spheres of life in a particular country. These could include economic growth, the degree of inequality in a country, the happiness of the people, and many others.
The particular developmental indices we studied were:
- GINI Coefficient
- World Democracy Index
- Global Peace Index
- Corruption Perceptions Index
- UN E-Government Development Index
The Internet inclusivity data is collected by The Economist every year. The dataset used for this project can be found here: https://www.kaggle.com/kwamsahortor/internet-inclusivity-index-2017-2021 .
The literature survey conducted before finalising the problem statement is included in this repository. It consists of summaries of the related work we studied, a report of the EDA (cleaning and visualisation), and the finalised problem statement with additional context.
The EDA Notebook must be run on the above dataset to obtain the cleaned and transformed dataset. This cleaned dataset must then be loaded into the Models_DA Notebook to build and evaluate the four models that were used to study the data:
- Multiple Linear Regression (MLR)
- Lasso Regression
- Ridge Regression
- Elastic-Net Regression
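
As a rough illustration of how the four models listed above can be fitted and compared, here is a minimal sketch using scikit-learn. It is not the code from the Models_DA Notebook: the data is synthetic (random features standing in for Internet inclusivity indicators, and a made-up target standing in for a developmental index), and the `alpha` and `l1_ratio` values are arbitrary assumptions rather than tuned settings.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cleaned dataset (assumption): 200 rows,
# 5 hypothetical inclusivity features, 1 hypothetical developmental index.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_coef = np.array([1.5, 0.0, -2.0, 0.0, 0.5])
y = X @ true_coef + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Regularised models are sensitive to feature scale, so standardise
# using statistics from the training split only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Hyperparameters below are illustrative assumptions, not tuned values.
models = {
    "MLR": LinearRegression(),
    "Lasso": Lasso(alpha=0.1),
    "Ridge": Ridge(alpha=1.0),
    "Elastic-Net": ElasticNet(alpha=0.1, l1_ratio=0.5),
}

# Fit each model and record its R^2 on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test_s))
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

On real data, the regularisation strengths would normally be chosen by cross-validation (for example with `LassoCV`, `RidgeCV` and `ElasticNetCV`) rather than fixed by hand.
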
The evaluation of the above four models, as well as the inferences we drew from the final models, is detailed in the final report.