predicting_customer_churn_12GB_AWS_EMR

Sparkify is a fictional music app company, and this dataset is provided as part of a Udacity course. We analyse a small dataset first and then a larger one (12 GB).

We analyse this dataset and train a model to predict which customers are likely to churn. The features used to train the model include each user's last session, thumbs-up and thumbs-down counts, average session duration, and many others.

Tools Used:

PySpark
AWS EMR

Steps:

We first analyse the small subset of the data provided.
We then use AWS EMR for the 12 GB dataset.
We used a 4-node cluster and uploaded our script file to S3.

jaswant7/predicting_customer_churn_12GB_AWS_EMR