Dayviews-Scraper

A scraper that downloads all the posts from a non-password-protected Dayviews (formerly known as Bilddagboken) account.

Dependencies

There's only one major dependency for this script: [PhantomJS](http://phantomjs.org/). Make sure to install it, otherwise nothing's going to run.

Installation

Installing the Dayviews-Scraper can be super easy, or super hard - it all depends on your experience with the command line and basic sys-admin work. Here are the steps:

1. Install PhantomJS. On Windows, download the latest executable from phantomjs.org and add it to your PATH variable. On macOS, the easiest way is to run the following command from your terminal:

       brew update && brew install phantomjs

2. Clone this repo.
3. Run the script as described under "Usage".

Usage

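Run "phantomjs scrape.js http://dayviews.com/username/firstImageId/" from your terminal, replacing username and firstImageId with the account name and the ID of the image to start from. Make sure you are in the folder that contains scrape.js.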

Arguments

First argument (required): URL to the entry point, in the form shown under "Usage".

Second argument (optional): Offset number.
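
To illustrate the two arguments above, here is a minimal sketch of how they could be validated. It is not taken from scrape.js, which may handle its arguments differently. The function name `parseArgs`, the default offset of 0, and the rule that the offset is a non-negative integer are all assumptions made for this example.

```javascript
// Hypothetical helper; NOT the code in scrape.js.
// In PhantomJS the raw values would come from require('system').args.slice(1),
// because index 0 of system.args holds the script name.
function parseArgs(args) {
  var entryUrl = args[0];

  // The first argument is required and must look like an http(s) URL.
  if (typeof entryUrl !== 'string' || !/^https?:\/\/\S+$/.test(entryUrl)) {
    throw new Error('First argument must be the entry point URL.');
  }

  // Assumption: a missing offset means 0.
  var offset = 0;
  if (args.length > 1) {
    // Assumption: the offset is a non-negative integer.
    if (!/^\d+$/.test(String(args[1]))) {
      throw new Error('Second argument must be a non-negative integer.');
    }
    offset = parseInt(args[1], 10);
  }

  return { entryUrl: entryUrl, offset: offset };
}
```

Called with only the URL, this sketch returns an offset of 0. Called with the URL and "25", it returns an offset of 25. A missing URL or a non-numeric offset throws an error instead of letting the run continue with bad input.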

Known issues

The script throws a TypeError on some page loads when evaluating a third-party ad script.
The script sometimes mistakes a Facebook (or Casumo) JS URL for the actual image we want. No worries - the script will try again.
Special characters (Å, Ä, Ö) are not yet supported. Fixing this is a priority.
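
For the mistaken-URL issue above, one possible guard is to check the file extension before treating a URL as the image. The sketch below is hypothetical and is not how scrape.js currently works. It assumes that the wanted images end in a common image extension, which has not been verified against Dayviews. The names `IMAGE_EXTENSIONS` and `isLikelyImageUrl` are made up for this example.

```javascript
// Hypothetical helper; NOT part of scrape.js.
// Assumption: the images we want end in one of these extensions.
var IMAGE_EXTENSIONS = ['jpg', 'jpeg', 'png', 'gif'];

function isLikelyImageUrl(url) {
  if (typeof url !== 'string') {
    return false;
  }

  // Drop the fragment and query string so "?v=2" or "#top" cannot hide the extension.
  var path = url.split('#')[0].split('?')[0];

  // Capture whatever follows the last dot at the very end of the path.
  var match = /\.([a-z0-9]+)$/i.exec(path);
  if (!match) {
    return false;
  }

  return IMAGE_EXTENSIONS.indexOf(match[1].toLowerCase()) !== -1;
}

// isLikelyImageUrl('http://example.com/photos/1.JPG?x=1');        // true
// isLikelyImageUrl('https://connect.facebook.net/en_US/sdk.js');  // false
```

A check like this would let the script skip a script URL right away rather than discover the mistake later, at the cost of rejecting any image served without a recognisable extension.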
