Movie-Genre Analysis with Topic-Specific Pagerank

This project shows how to compute continuous genre scores for movies from a rating dataset (movielens-25m), using ideas similar to Topic-Specific Pagerank and Trustrank. It was my course project for the graduate-level Dynamic and Social Network Analysis course at Bilkent University.

For more information about the approach, you can read this blog post.

Introduction

The standard representation of movie genres is categorical. When we view a movie's information, we see a list such as Western, Sci-fi. For each genre, the film either has that genre or does not. Can we convert this categorical representation into a continuous vector? Instead of recording only whether a genre is present or absent, can we assign a real number that shows how strongly that genre applies to a particular movie?
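As a rough illustration of the idea, the sketch below runs a PageRank power iteration on a tiny user-movie graph where the random walk teleports only to movies already labeled with a given genre. It is a minimal sketch under stated assumptions, not the code in this repository (the project's own generator is written in C++): the function name `topic_specific_pagerank`, the toy edge list, the seed sets, and the damping factor of 0.85 are all hypothetical choices made for the example.

```python
from collections import defaultdict


def topic_specific_pagerank(edges, seeds, damping=0.85, iterations=100):
    """PageRank by power iteration, teleporting only to the `seeds` nodes.

    `edges` is a list of (user, movie) pairs; every seed is assumed to
    appear in at least one edge.
    """
    # Build an undirected adjacency list from the (user, movie) pairs.
    neighbors = defaultdict(set)
    for user, movie in edges:
        neighbors[user].add(movie)
        neighbors[movie].add(user)

    nodes = list(neighbors)
    # Teleport vector: uniform over the seed nodes, zero everywhere else.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(teleport)

    for _ in range(iterations):
        # Each node restarts at a seed with probability (1 - damping) ...
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        # ... and otherwise spreads its score evenly over its neighbors.
        for n in nodes:
            share = damping * scores[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        scores = nxt
    return scores


# Toy data (made up): a chain A - u1 - B - u2 - C - u3 - D,
# where movie A is labeled sci-fi and movie D is labeled western.
edges = [("u1", "A"), ("u1", "B"), ("u2", "B"),
         ("u2", "C"), ("u3", "C"), ("u3", "D")]

scifi = topic_specific_pagerank(edges, seeds={"A"})
western = topic_specific_pagerank(edges, seeds={"D"})

for movie in ["B", "C"]:
    print(movie, "sci-fi:", round(scifi[movie], 4),
          "western:", round(western[movie], 4))
```

In this toy graph, the unlabeled movie B sits closer to the sci-fi seed and C sits closer to the western seed, so each ends up with a higher score for the nearer genre. Running one such iteration per genre gives every movie one real-valued score per genre, which is the continuous vector described above.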

The figure below shows one benefit of computing a continuous genre score vector for each movie. Each row in the figure shows the genre scores of one movie, and each column shows the scores of all movies for one genre. This lets us compare the genre scores of a single movie (What is the dominant genre of this movie?) and compare different films within a particular genre (Is movie A or movie B more of a sci-fi movie?).

Setup & Run

Prerequisites: C++11 and Python 3.9

I used Python 3.9, but any Python 3.6+ version should be fine.

Clone the repository and go into the directory
Install OpenMP libraries:

sudo apt install libomp-dev

Install Python dependencies:

pip install -r requirements.txt

Build the C++ code in the cpp directory with make. After building, you should see an executable file named generate.

cd cpp
make

Open the notebook movie-genre-analysis.ipynb with Jupyter Notebook and follow the code.

cd ../
jupyter notebook

Examples

One application of the resulting genre vectors is displaying a genre pie chart for each movie.
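A pie chart needs each genre's share of the total, so one possible preprocessing step is to scale a movie's genre scores to percentages. The snippet below is only a sketch of that step: the helper name `genre_shares` and the score values are made up for illustration and are not taken from the notebook.

```python
def genre_shares(scores):
    """Scale non-negative genre scores so that they sum to 100."""
    total = sum(scores.values())
    if total == 0:
        # No genre signal at all: every slice is empty.
        return {genre: 0.0 for genre in scores}
    return {genre: 100.0 * value / total for genre, value in scores.items()}


# Made-up genre scores for a single movie.
movie_scores = {"Sci-Fi": 0.6, "Western": 0.3, "Drama": 0.1}
shares = genre_shares(movie_scores)

for genre, percent in shares.items():
    print(f"{genre}: {percent:.1f}%")
```

The resulting values and their genre labels could then be passed to a plotting function such as matplotlib's `pyplot.pie`, assuming matplotlib is installed.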
