Save to web.archive.org


Like my work?

Tip me


Description

Scrapes the given website for internal links and saves the pages it finds to web.archive.org.
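
The actual logic lives in main.go; the following is only a minimal sketch of the idea, under the assumption that a single start page is fetched, internal links are collected with goquery, and each link is submitted to the Wayback Machine's public https://web.archive.org/save/ endpoint:

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"

	"github.com/PuerkitoBio/goquery"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: go run main.go http[s]://[yourwebsite.com]")
		os.Exit(1)
	}
	start := os.Args[1]

	base, err := url.Parse(start)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// Fetch the start page and parse its HTML with goquery.
	resp, err := http.Get(start)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()

	doc, err := goquery.NewDocumentFromReader(resp.Body)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// Collect links pointing to the same host ("internal" links)
	// and ask the Wayback Machine to archive each of them.
	doc.Find("a[href]").Each(func(_ int, s *goquery.Selection) {
		href, _ := s.Attr("href")
		link, err := base.Parse(href) // resolves relative links against the base URL
		if err != nil || link.Host != base.Host {
			return
		}
		saveResp, err := http.Get("https://web.archive.org/save/" + link.String())
		if err != nil {
			fmt.Fprintln(os.Stderr, "could not save", link, ":", err)
			return
		}
		saveResp.Body.Close()
		fmt.Println("saved", link)
	})
}
```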

Installation

This guide assumes you have already installed Go (see the Go installation manual).

Dependencies

Download the dependencies via go get.

Execute the following two commands:

go get -u github.com/simonfrey/proxyfy
go get -u github.com/PuerkitoBio/goquery

Download tool

Just clone the Git repo:

git clone https://github.com/simonfrey/save_to_web.archive.org.git

Execution

Navigate into the directory of the Git repo.

Execute with:

go run main.go http[s]://[yourwebsite.com]

Replace http[s]://[yourwebsite.com] with the URL of the website you want to scrape and save.

Additional command line arguments:

-p for proxying the requests

-i for also crawling internal URLs (e.g. /test/foo)

So if you also want to crawl internal links and route the requests through a proxy, the command would be:

go run main.go -p -i http[s]://[yourwebsite.com]
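
The real flag handling is defined in main.go; purely as an illustration (the names and wiring below are assumptions, not taken from the actual source), the -p and -i switches could be parsed with Go's standard flag package like this:

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

func main() {
	// Hypothetical flag wiring; the real main.go may differ.
	useProxy := flag.Bool("p", false, "route requests through a proxy")
	crawlInternal := flag.Bool("i", false, "also crawl internal URLs (e.g. /test/foo)")
	flag.Parse()

	if flag.NArg() < 1 {
		fmt.Fprintln(os.Stderr, "usage: go run main.go [-p] [-i] http[s]://[yourwebsite.com]")
		os.Exit(1)
	}
	startURL := flag.Arg(0)

	fmt.Println("start URL:", startURL, "proxy:", *useProxy, "crawl internal:", *crawlInternal)
}
```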