web-archive-group/WALK-PageRank

WALK-PageRank

This repository modifies the WALK templates to run PageRank on collections. Dynamic PageRank is probably too computationally intensive, so as a starting point we run a fixed 15 iterations per collection.
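To make the fixed-iteration idea concrete, here is a minimal pure-Python sketch of PageRank that stops after 15 passes instead of iterating until convergence. It is not the Warcbase/Spark code used in this repository: the `pagerank` function, the `(source, target)` edge-list input, and the damping factor of 0.85 are all assumptions made for illustration.

```python
def pagerank(edges, iterations=15, damping=0.85):
    """Fixed-iteration PageRank over a list of (source, target) links.

    The damping factor of 0.85 is a conventional default, assumed here;
    only the iteration count of 15 comes from this README.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    ranks = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(ranks[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n if n else 0.0
        new_ranks = {node: base for node in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue
            # Each node passes an equal share of its rank to every target.
            share = damping * ranks[src] / len(targets)
            for dst in targets:
                new_ranks[dst] += share
        ranks = new_ranks
    return ranks


# Hypothetical link graph: two sites both link to a third.
example_edges = [("a.example", "c.example"), ("b.example", "c.example")]
example_ranks = pagerank(example_edges)
```

In this toy graph `c.example` ends up with the highest rank because it is the only node with incoming links. A fixed iteration count trades some accuracy for a predictable running time per collection.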

This uses Warcbase's Spark Network Analysis, developed by the amazing team of Alice Ran Zhou (University of Waterloo), Jeremy Wiebe (University of Waterloo), Shane Martin (York University), and Eric Oosenbrug (York University), in part during the Archives Unleashed hackathon.

Visualizing Them

  1. Clone the repository
  2. In the /link-vis directory, run the following shell script (requires Python 3):

./display.sh <collection>

Where collection is one of:

ALBERTA_alberta_education_curriculum-graph
ALBERTA_alberta_floods_2013-graph
ALBERTA_alberta_oil_sands-graph
ALBERTA_canadian_business_grey_literature-graph
ALBERTA_edmonton_public_library-graph
ALBERTA_energy_environment-graph
ALBERTA_fort_mcmurray_wildfire_2016-graph
ALBERTA_government_information-graph
ALBERTA_hcf_alberta_online_encyclopedia-graph
ALBERTA_health_sciences_grey_literature-graph
ALBERTA_heritage_community_foundation-graph
ALBERTA_humanities_computing-graph

For example: ./display.sh ALBERTA_alberta_floods_2013-graph

The graph will then be available at http://localhost:4321.
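The contents of display.sh are not reproduced in this README. As a rough sketch only, assuming the script does something like start Python 3's built-in HTTP server over a collection's graph directory, the equivalent in Python might look like the following. The `make_server` helper and the directory layout are hypothetical; only port 4321 comes from the text above.

```python
import functools
from http.server import HTTPServer, SimpleHTTPRequestHandler


def make_server(directory, port=4321):
    """Build (but do not start) a static file server for one graph directory.

    Hypothetical helper: display.sh may work differently.
    """
    # Bind the handler to the given directory instead of the current one.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return HTTPServer(("localhost", port), handler)


# Usage (blocks until interrupted with Ctrl+C):
# make_server("ALBERTA_alberta_floods_2013-graph").serve_forever()
```

If port 4321 is already in use, the server will fail to start; stopping any earlier display.sh run should free it.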

Next Step

Making these legible. :)

About

Calculating PageRank on WALK Collections
