nitrospaz/python_web_scraper

About -

This project is a Python script that scrapes data from a given URL and stores it in a CSV file.

It handles multiple pages, creates a books.csv file, handles errors, and has unit tests.

The URL that is scraped is http://books.toscrape.com/
It is hardcoded in main.py.

This works on a standard Python installation; there is no need to run any install commands.
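
As an illustration of the approach described above, here is a minimal standard-library sketch of scraping a paginated listing into a CSV file. It is not the code from main.py. The names BookParser, parse_page, scrape_all, and write_csv are hypothetical, and so are the CSV column names. The sketch assumes each book title sits in the title attribute of a link, each price in a p element with class price_color, and the link to the following page inside an li element with class next.

```python
import csv
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

BASE_URL = "http://books.toscrape.com/"


class BookParser(HTMLParser):
    """Collects [title, price] pairs and the next-page link from one page.

    Assumed markup (not confirmed by this README): <a title="...">,
    <p class="price_color">, and <li class="next"><a href="...">.
    """

    def __init__(self):
        super().__init__()
        self.books = []        # one [title, price] entry per book
        self.next_href = None  # relative link to the next page, if any
        self._title = None
        self._in_price = False
        self._in_next = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a":
            if self._in_next and "href" in attrs:
                self.next_href = attrs["href"]
            elif "title" in attrs:
                self._title = attrs["title"]
        elif tag == "p" and "price_color" in classes:
            self._in_price = True
        elif tag == "li" and "next" in classes:
            self._in_next = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_price = False
        elif tag == "li":
            self._in_next = False

    def handle_data(self, data):
        # Pair the price text with the most recently seen title.
        if self._in_price and self._title is not None:
            self.books.append([self._title, data.strip()])
            self._title = None


def parse_page(html):
    """Return (books, next_href) for the HTML of one listing page."""
    parser = BookParser()
    parser.feed(html)
    parser.close()
    return parser.books, parser.next_href


def scrape_all(start_url=BASE_URL, max_pages=100):
    """Follow next-page links from start_url, stopping on the first error."""
    rows, url = [], start_url
    for _ in range(max_pages):
        try:
            with urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError as exc:  # URLError and socket timeouts are OSErrors
            print(f"Could not fetch {url}: {exc}")
            break
        books, next_href = parse_page(html)
        rows.extend(books)
        if next_href is None:
            break
        url = urljoin(url, next_href)
    return rows


def write_csv(rows, path="books.csv"):
    """Write a header row plus one row per book."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "price"])  # hypothetical column names
        writer.writerows(rows)
```

With these helpers, write_csv(scrape_all()) would fetch every page and write the result to books.csv. Keeping parse_page separate from the network code means the parsing can be unit tested on a small HTML string without an internet connection.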

Prerequisites -

1. Standard Python installation
2. Internet connection

Instructions to run -

1. Download the zip with all the files.
2. Extract all files.
3. Navigate to the folder with the extracted files in the command line.
	- The commands below use py, the Python launcher for Windows; on macOS or Linux, python3 is the usual equivalent.
4. (optional) Run the tests
	- py scrape_v2_test_short.py
	- py scrape_v2_test_long.py
	- They should both pass.
5. Run main.py
	- py main.py
	- This should generate books.csv in the same directory with 1000 rows of information.
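
To confirm the result of step 5, the rows in the generated file can be counted with a few lines of Python. This is a small sketch and not part of the repository. The helper name count_csv_rows is hypothetical, and whether books.csv starts with a header row is an assumption, so the has_header flag lets you count either way.

```python
import csv


def count_csv_rows(path, has_header=True):
    """Count the rows in a CSV file, optionally skipping one header row."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    if has_header and rows:
        return len(rows) - 1
    return len(rows)
```

For example, print(count_csv_rows("books.csv")) run from the same folder reports the number of data rows, which can be compared against the expected 1000.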
