A scraper that downloads all the posts from a Dayviews (formerly known as Bilddagboken) account that is not password-protected.
There's only one major dependency for this script: [PhantomJS](http://phantomjs.org/). Make sure to install it, otherwise nothing's going to run.
Installing the Dayviews-Scraper can be super easy, or super hard - it all depends on your experience with the command line and basic sys-admin work. Here are the steps:
-
Install PhantomJS. On Windows, this means downloading the latest executable from phantomjs.org and adding it to your PATH variable. On macOS, the easiest way is to run the following command from your terminal:
brew update && brew install phantomjs
-
Clone this repo.
-
Run the script as described under "Usage".
Run "phantomjs scrape.js http://dayviews.com/username/firstImageId/" from your terminal. Make sure you are in the folder that contains scrape.js.
First argument (required): URL of the entry point.
Second argument (optional): offset number.
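
As a rough illustration of how the two arguments could be read, here is a minimal sketch. It is not taken from scrape.js: the helper name `parseArgs` is hypothetical, and since this README does not say what the offset means, the sketch only assumes it is a non-negative integer that defaults to 0.

```javascript
// Hypothetical helper, not the actual scrape.js code.
// In PhantomJS the argument list is available as require('system').args,
// where index 0 is the script name and the user arguments start at index 1.
function parseArgs(args) {
  var url = args[1];
  if (!url) {
    throw new Error('Usage: phantomjs scrape.js <url> [offset]');
  }
  // Assumption: the offset is a non-negative integer and defaults to 0.
  var offset = args.length > 2 ? parseInt(args[2], 10) : 0;
  if (isNaN(offset) || offset < 0) {
    throw new Error('Offset must be a non-negative integer');
  }
  return { url: url, offset: offset };
}
```

Inside a PhantomJS script this would be called as `parseArgs(require('system').args)`.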
- The script reports a TypeError on some page loads when evaluating a third-party ad script.
- The script sometimes mistakes a Facebook (or Casumo) JS URL for the actual image we want. No worries - the script will try again.
- Special characters (Å, Ä, Ö) are not yet supported. Fixing this is a priority.
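
For the mistaken-URL issue, one possible guard is to check the file extension before accepting a URL as the image. The sketch below is only an idea, not what scrape.js currently does: the function name `looksLikeImage` is hypothetical, and it assumes that the images end in a common extension such as .jpg, .png or .gif.

```javascript
// Hypothetical check, not part of scrape.js.
// Assumption: real image URLs end in a common image extension.
function looksLikeImage(url) {
  // Drop the query string and fragment before checking the extension.
  var path = url.split(/[?#]/)[0];
  return /\.(jpe?g|png|gif)$/i.test(path);
}
```

A URL ending in `.js` fails the check, so a caller could skip it and retry without treating it as the image.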