A web service for turning HTML pages into traversable JSON documents
This project is at a very early stage of development. If you have a feature request, please open an issue on the project.
Running the server locally
lein uberjar
docker build -t falkor .
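# Publish the port to the host (assumes the server listens on 5000 inside the container)
docker run -t -p 5000:5000 falkor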
# Visit http://localhost:5000
- Better error handling
- CORS
- Query filtering (return only certain attributes)
- Fetching multiple elements in a single request (e.g. [h1 > a, .subtitle])
Get all the title links from the Reddit.com home page
https://falkor-api.herokuapp.com/api/query?url=http://reddit.com&query=a.title
Grab all the news stories from Digg.com
https://falkor-api.herokuapp.com/api/query?url=http://digg.com&query=.story-title%20a
Extract all the images from Digg.com
https://falkor-api.herokuapp.com/api/query?url=http://digg.com&query=img[src]
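The example URLs above share one shape: a `url` parameter naming the page to fetch and a `query` parameter holding a CSS selector. A minimal Python sketch of building such URLs with percent-encoding is shown here. The helper name `build_query_url` is hypothetical and not part of the project; only the endpoint and the two parameter names are taken from the examples.

```python
from urllib.parse import urlencode, quote

# Endpoint taken from the examples above.
API_ENDPOINT = "https://falkor-api.herokuapp.com/api/query"


def build_query_url(page_url, selector, endpoint=API_ENDPOINT):
    """Hypothetical helper: return a query URL for a page and a CSS selector."""
    # quote_via=quote encodes spaces as %20 (as in the examples) instead of '+'.
    params = urlencode({"url": page_url, "query": selector}, quote_via=quote)
    return f"{endpoint}?{params}"


if __name__ == "__main__":
    print(build_query_url("http://digg.com", ".story-title a"))
    print(build_query_url("http://digg.com", "img[src]"))
```

The generated URLs encode more characters than the hand-written ones (for example `:` and `/` in the page address, and `[` and `]` in the selector), but they should decode to the same parameter values.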
Filters to remove some of the attribute cruft
For example, to extract only the text of an element and ignore its other attributes, append the following to the query URL:
&filter=[text]
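As a rough sketch, the filter can be added to a query URL in the same way as the other parameters. This assumes `filter` is passed alongside `url` and `query` on the same endpoint, which the fragment above suggests but does not state outright. The helper name `build_filtered_url` is hypothetical, and only the `[text]` filter shown above is used, since the syntax for other filters is not documented here.

```python
from urllib.parse import urlencode, quote

# Endpoint taken from the query examples in this README.
API_ENDPOINT = "https://falkor-api.herokuapp.com/api/query"


def build_filtered_url(page_url, selector, filter_expr="[text]"):
    """Hypothetical helper: query URL that also carries a filter expression."""
    # Assumption: the filter is sent as a third query-string parameter.
    params = {"url": page_url, "query": selector, "filter": filter_expr}
    return f"{API_ENDPOINT}?{urlencode(params, quote_via=quote)}"


if __name__ == "__main__":
    print(build_filtered_url("http://reddit.com", "a.title"))
```

The brackets are percent-encoded as `%5B` and `%5D` in the result, which should decode to the same `[text]` value on the server.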
Copyright © 2015 Forward Digital Limited
Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.