Advanced-Database-Systems

This project was implemented for the undergraduate course Advanced Topics of Database Systems @ECE, NTUA, GR. Given a large dataset of movies and information about them, the purpose of the exercise was to:

Use the Spark framework to implement a set of queries with both the RDD API and Spark SQL.
Support .csv and .parquet input files for the SQL queries.
Compare the time needed to get a response from each query for every setup: RDD, SQL on .csv, and SQL on .parquet.
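
The timing comparison above can be sketched with a small harness. This is a minimal pure-Python illustration, not the project's actual measurement code: the names time_query and compare_setups are hypothetical, and the three workloads are stand-ins for the real Spark jobs. Since Spark evaluates lazily, each callable passed in would need to trigger an action (for example, collecting the result) so that the measured time covers the actual execution.

```python
import time


def time_query(run_query, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()  # must force execution of the query, not just define it
        timings.append(time.perf_counter() - start)
    return min(timings)


def compare_setups(setups, repeats=3):
    """Time every named setup and return a {name: seconds} dictionary."""
    return {name: time_query(run, repeats) for name, run in setups.items()}


if __name__ == "__main__":
    # Placeholder workloads (assumption): replace each lambda with a function
    # that runs the same query through the corresponding setup.
    setups = {
        "rdd": lambda: sum(range(200_000)),
        "sql_csv": lambda: sum(range(100_000)),
        "sql_parquet": lambda: sum(range(50_000)),
    }
    for name, seconds in compare_setups(setups).items():
        print(f"{name}: {seconds:.4f}s")
```

Taking the minimum over a few runs is one way to reduce noise from warm-up and caching; reporting the mean or all individual runs would be equally reasonable.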

=============================================================================================

Create a function that implements repartition join
Create a function that implements broadcast join
Compare the running time of the two joins on the given data.
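
To make the difference between the two joins concrete, here is a minimal single-machine sketch in plain Python. It is not the project's Spark implementation: the function names, the num_partitions parameter, and the toy movie and rating records are assumptions made for illustration. The repartition join routes records from both inputs to partitions by key and joins inside each partition, while the broadcast join builds a hash table from the small input and probes it while streaming the large one.

```python
from collections import defaultdict
from itertools import product


def repartition_join(left, right, num_partitions=4):
    """Reduce-side join of two lists of (key, value) pairs (inner join)."""
    # Map phase: tag each record with its source (0 = left, 1 = right)
    # and route it to a partition chosen by the hash of its key.
    partitions = [defaultdict(lambda: ([], [])) for _ in range(num_partitions)]
    for tag, records in ((0, left), (1, right)):
        for key, value in records:
            partitions[hash(key) % num_partitions][key][tag].append(value)

    # Reduce phase: inside each partition, pair the two sides for every key.
    joined = []
    for part in partitions:
        for key, (left_vals, right_vals) in part.items():
            joined.extend((key, (l, r)) for l, r in product(left_vals, right_vals))
    return joined


def broadcast_join(large, small):
    """Map-side join; assumes the small input fits in memory (inner join)."""
    # Build an in-memory hash table from the small input (the broadcast copy).
    lookup = defaultdict(list)
    for key, value in small:
        lookup[key].append(value)

    # Stream the large input and probe the table; no repartitioning is needed.
    return [
        (key, (value, match))
        for key, value in large
        for match in lookup.get(key, [])
    ]


if __name__ == "__main__":
    # Toy data (assumption): (movie_id, title) and (movie_id, rating) pairs.
    movies = [(1, "Heat"), (2, "Alien"), (3, "Brazil")]
    ratings = [(1, 4.5), (1, 3.0), (2, 5.0), (4, 2.0)]
    print(sorted(repartition_join(ratings, movies)))
    print(sorted(broadcast_join(ratings, movies)))
```

Both functions return the same set of joined records, possibly in a different order. In a distributed setting the broadcast variant avoids shuffling the large input, which is why it tends to win when one side is small enough to be copied to every worker.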

SETUP

All queries were run on a cluster of two nodes (master/slave), each with 2GB of RAM. The VMs were provided by the Okeanos project @NTUA.

TODO

Query descriptions will be uploaded in English and in Greek :)
