This repository contains the source code of an AI-based structured web data extractor.
- 👨💻 Author: Jan Joneš
- 📜 Thesis: PDF, assignment, submission, slides
- 🚀 Demo: live, Docker Hub, examples below
- 🗃️ Data: SWDE with visuals
- 📂 awe/: Python module (data manipulation and machine learning). See awe/README.md.
- 📂 js/: Node.js app (visual attribute extractor and inference demo). See js/README.md.
- 📂 docs/
  - 📂 dev/
    - 📄 data.md: dataset preparation.
    - 📄 extractor.md: running the visual extractor.
    - 📄 train.md: training instructions.
    - 📄 release.md: release instructions.
  - 📂 demo/
- 📂
docker pull janjones/awe-demo
docker run --rm -it -p 3000:3000 janjones/awe-demo
Open a web browser and navigate to http://localhost:3000/.
For more details, see docs/demo/run.md.
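The demo container may take a moment to start serving requests after `docker run`. As a minimal sketch (not part of this repository), the following Python helper polls the demo URL until it answers. The function names `is_up` and `wait_for_demo`, the retry count, and the delay are illustrative assumptions; only the URL comes from the instructions above.

```python
import time
import urllib.request


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError and HTTPError are subclasses of OSError, so refused
        # connections, timeouts, and 4xx/5xx responses all end up here.
        return False


def wait_for_demo(
    url: str = "http://localhost:3000/",  # address from the instructions above
    attempts: int = 30,                   # hypothetical default
    delay: float = 1.0,                   # hypothetical default, in seconds
    probe=is_up,
    sleep=time.sleep,
) -> bool:
    """Probe `url` up to `attempts` times; return True once a probe succeeds."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_for_demo()` with no arguments checks http://localhost:3000/ for roughly 30 seconds and returns `False` if the demo never responds. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running container.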
docker pull janjones/awe-gradient
docker run --rm -it -v awe:/storage -p 3000:3000 janjones/awe-gradient bash
Then, run inside the Docker container:
git clone https://github.com/jjonescz/awe .
git clone https://github.com/jjonescz/swde-visual data/swde
python -m awe.training.params
python -m awe.training.train
# Model is trained, now you can run the demo.
cd js
pnpm install
DEBUG=1 pnpm run server
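The two training commands above are meant to run in order, and the second is only useful if the first succeeded. A minimal sketch of a wrapper that enforces this, assuming it is started from the repository root inside the container: the module names come from the commands above, while `TRAINING_STEPS`, `run_steps`, and the use of `sys.executable` in place of `python` are illustrative assumptions, not part of the repository.

```python
import subprocess
import sys

# Module names are taken from the commands above; the rest is hypothetical.
TRAINING_STEPS = [
    [sys.executable, "-m", "awe.training.params"],
    [sys.executable, "-m", "awe.training.train"],
]


def run_steps(steps, run=subprocess.run) -> int:
    """Run each command in order and stop at the first non-zero exit code.

    Returns 0 if every step succeeded, otherwise the failing step's code.
    """
    for command in steps:
        result = run(command)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Running `run_steps(TRAINING_STEPS)` executes both steps and returns `0` only if both exit cleanly. The `run` parameter is there so the ordering logic can be checked without actually starting training.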
For more details, see
Generated by the live demo.