nitrospaz/python_web_scraper: Python web scraper - books example with data pulled from http://books.toscrape.com/

About -

This project is a Python script that scrapes data from a given URL and stores it in a CSV file.

It handles multiple pages, creates a books.csv file, handles errors, and has unit tests.
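
As a rough illustration of the scrape-and-store step, the sketch below parses listing markup using only the standard library and writes the result as CSV. It is not the code in scrape_v2.py. The names `BookListParser`, `parse_books` and `write_books_csv`, the column names, and the assumed markup (a title attribute on the link inside an `h3`, and a `p` with class `price_color`) are all assumptions made for the example, and the sample data is invented.

```python
import csv
import io
from html.parser import HTMLParser


class BookListParser(HTMLParser):
    """Collect title/price pairs from listing markup (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.books = []         # finished rows: {"title": ..., "price": ...}
        self._title = None      # title waiting for its matching price
        self._in_h3 = False
        self._in_price = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h3":
            self._in_h3 = True
        elif tag == "a" and self._in_h3 and "title" in attrs:
            self._title = attrs["title"]
        elif tag == "p" and "price_color" in (attrs.get("class") or "").split():
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "h3":
            self._in_h3 = False
        elif tag == "p":
            self._in_price = False

    def handle_data(self, data):
        # The first text inside the price element completes the current book.
        if self._in_price and self._title is not None:
            self.books.append({"title": self._title, "price": data.strip()})
            self._title = None


def parse_books(html):
    """Return a list of book dicts found in one page of HTML."""
    parser = BookListParser()
    parser.feed(html)
    parser.close()
    return parser.books


def write_books_csv(books, out):
    """Write book dicts to an open text file object as CSV with a header row."""
    writer = csv.DictWriter(out, fieldnames=["title", "price"])
    writer.writeheader()
    writer.writerows(books)


# Invented sample markup, shaped like the assumed listing structure.
SAMPLE_HTML = """
<article class="product_pod">
  <h3><a href="book-one/index.html" title="Example Book One">Example Book ...</a></h3>
  <p class="price_color">£10.00</p>
</article>
<article class="product_pod">
  <h3><a href="book-two/index.html" title="Example Book Two">Example Book Two</a></h3>
  <p class="price_color">£12.50</p>
</article>
"""

books = parse_books(SAMPLE_HTML)
buffer = io.StringIO()          # stands in for open("books.csv", "w", newline="")
write_books_csv(books, buffer)
print(len(books), "books parsed")
```

When writing to a real file, opening it with `newline=""` lets the `csv` module control line endings itself.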

The URL that is scraped is http://books.toscrape.com/
It is hardcoded in main.py.
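
A minimal sketch of how a hardcoded start URL, multi-page handling, and error handling might fit together with only the standard library is shown below. It is an illustration, not the repository's implementation: `START_URL`, `fetch`, `crawl`, the `max_pages` guard, and the assumption that each page links to the next one through an `li` element with class `next` are all hypothetical. A regular expression is used to find that link purely to keep the example short. The demo runs against an in-memory dictionary of fake pages, so it makes no network requests.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

START_URL = "http://books.toscrape.com/"  # hardcoded, as the README describes

# Assumed markup for the "next page" link; adjust if the real page differs.
NEXT_LINK = re.compile(r'<li class="next">\s*<a href="([^"]+)"')


def fetch(url, timeout=10):
    """Return the page body as text, or None if the request fails."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError) as exc:
        # URLError, HTTPError and socket timeouts are all OSError subclasses;
        # ValueError covers malformed URLs.
        print(f"Could not fetch {url}: {exc}")
        return None


def crawl(start_url, fetch_page=fetch, max_pages=100):
    """Yield (url, html) for each page, following "next" links in order."""
    url = start_url
    seen = set()
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        html = fetch_page(url)
        if html is None:
            break  # stop cleanly when a page cannot be fetched
        yield url, html
        match = NEXT_LINK.search(html)
        # Relative links are resolved against the page they appear on.
        url = urljoin(url, match.group(1)) if match else None


# Offline demo: a dict lookup stands in for the network.
FAKE_PAGES = {
    "http://example.test/":
        '<li class="next"><a href="catalogue/page-2.html">next</a></li>',
    "http://example.test/catalogue/page-2.html":
        '<li class="next"><a href="page-3.html">next</a></li>',
    "http://example.test/catalogue/page-3.html":
        "<p>last page, no next link</p>",
}

visited = [url for url, _ in crawl("http://example.test/", fetch_page=FAKE_PAGES.get)]
print(visited)
```

Passing the fetch function as a parameter is a design choice that makes the paging logic testable without an internet connection; calling `crawl(START_URL)` would use the real `fetch`.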

This works on a standard Python installation; there is no need to run any install commands.

Prerequisites -

1. Standard Python installation
2. Internet connection

Instructions to run -

1. Download the zip with all the files.
2. Extract all files.
3. Navigate to the folder with the extracted files in the command line.
4. (optional) Run the tests 
	- py scrape_v2_test_short.py
	- py scrape_v2_test_long.py
	- They should both pass.
5. Run main.py
	- py main.py
	- This should generate books.csv in the same directory, with 1000 rows of information.
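
One way to check the result of the last step is to count the rows in the generated file. The sketch below is an assumption-laden helper, not part of the repository: `count_csv_rows` is a hypothetical name, and whether books.csv starts with a header row or is UTF-8 encoded is not stated here, so both are exposed as parameters. The demo writes a small temporary file with invented data rather than reading books.csv.

```python
import csv
import os
import tempfile


def count_csv_rows(path, has_header=True, encoding="utf-8"):
    """Count the rows in a CSV file, optionally excluding one header row."""
    with open(path, newline="", encoding=encoding) as handle:
        total = sum(1 for _ in csv.reader(handle))
    if has_header and total > 0:
        return total - 1
    return total


# Demo on a throwaway file with invented rows.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.csv")
    with open(demo_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows([
            ["title", "price"],
            ["Example Book One", "10.00"],
            ["Example Book Two", "12.50"],
        ])
    data_rows = count_csv_rows(demo_path)                    # header excluded
    all_rows = count_csv_rows(demo_path, has_header=False)   # header included

print(data_rows, all_rows)
```

After running main.py, calling `count_csv_rows("books.csv")` from the same directory gives a figure to compare against the expected 1000.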

Repository files -

- packages
- README.md
- main.py
- requirements.txt
- requirements_generator.py
- scrape_v2.py
- scrape_v2_test_long.py
- scrape_v2_test_short.py
