This project was developed to learn the ETL (extract, transform, load) process, using the website https://books.toscrape.com as the data source.
It contains three scripts of increasing scope: the first scrapes a single book, the second scrapes all books in a chosen category, and the third scrapes all books in every category. The scraped data is saved to CSV files.
Dependencies: Requests, BeautifulSoup, urllib3
git clone https://github.com/PlantBasedStudio/WebScraping.git
cd WebScraping
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
scrap_a_book: Scrapes a single book and generates a CSV file (change the URL in the script to use it).
scrap_a_category: Scrapes all books from a chosen category and generates a CSV file (change the URL in the script to use it).
scrap_all_books_per_category: Scrapes all books from the website and generates one CSV file per category.
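As a rough illustration of the single-book case, the sketch below shows one way the extract, transform, and load steps could fit together with Requests and BeautifulSoup. It is not the code from this repository: the function names (fetch_book, write_books_csv, scrape_book_to_csv), the column names, and the CSS selectors are all assumptions made for the example, and the selectors depend on the site's markup staying as it is.

```python
import csv

# Column names are illustrative; the project's actual CSV layout may differ.
FIELDS = ["product_page_url", "title", "price", "availability"]


def fetch_book(url):
    """Extract and transform: download one product page, return a dict."""
    # Third-party imports live here so write_books_csv works without them.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    # Pass raw bytes so BeautifulSoup detects the page encoding itself.
    soup = BeautifulSoup(response.content, "html.parser")

    # Assumed selectors for a books.toscrape.com product page.
    main = soup.select_one("div.product_main")
    return {
        "product_page_url": url,
        "title": main.select_one("h1").get_text(strip=True),
        "price": main.select_one("p.price_color").get_text(strip=True),
        "availability": main.select_one("p.availability").get_text(strip=True),
    }


def write_books_csv(rows, fileobj):
    """Load: write a header row, then one CSV row per book dict."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def scrape_book_to_csv(url, path="book.csv"):
    """Run the three steps for a single book."""
    # newline="" prevents blank lines between rows on Windows.
    with open(path, "w", newline="", encoding="utf-8") as f:
        write_books_csv([fetch_book(url)], f)
```

Calling scrape_book_to_csv with a product page URL from the site would write a one-row book.csv in the current directory. Keeping the CSV writer separate from the fetching code means the same writer could be reused for a whole category by passing it a longer list of rows.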
PlantBasedStudio : https://github.com/PlantBasedStudio