Work-in-progress code to scrape and parse contracting data from departments' Proactive Disclosure websites.
- PHP 7.0+
- Composer, which can be downloaded from https://getcomposer.org/download/
- Clone the repository.
- In the repository folder, run Composer to install the Guzzle dependency:
composer update
You're ready to go!
The scrapers are located in contracts-scraper.php, which can be run with composer run-script scrape
By default, it will download 2 quarters and 2 contract files from each department that has a scraper function.
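The default limits described above could be applied along these lines. This is only a minimal sketch: `selectDownloads`, its parameters, and the stubbed `$listContracts` callback are hypothetical names for illustration and are not taken from contracts-scraper.php. It also assumes the contract limit applies per quarter.

```php
<?php
// Hypothetical sketch: none of these names come from contracts-scraper.php.

/**
 * Pick which pages to download for one department, stopping after
 * $quarterLimit quarters and $contractLimit contracts per quarter.
 *
 * @param string[] $quarterUrls   URLs of the quarterly index pages
 * @param callable $listContracts takes a quarter URL, returns contract URLs
 * @return array quarter URL => list of contract URLs to download
 */
function selectDownloads(array $quarterUrls, callable $listContracts, $quarterLimit = 2, $contractLimit = 2)
{
    $selected = [];
    foreach (array_slice($quarterUrls, 0, $quarterLimit) as $quarterUrl) {
        $contractUrls = $listContracts($quarterUrl);
        $selected[$quarterUrl] = array_slice($contractUrls, 0, $contractLimit);
    }
    return $selected;
}

// Demo with a stub in place of a real HTTP request.
$stub = function ($quarterUrl) {
    return [$quarterUrl . '/c1', $quarterUrl . '/c2', $quarterUrl . '/c3'];
};

$plan = selectDownloads(['q1', 'q2', 'q3'], $stub);
print_r($plan);
```

In a real scraper, the callback would presumably fetch the quarter page with Guzzle (for example `$client->request('GET', $url)->getBody()`) and pull the contract links out of the returned HTML. Passing the fetcher in as a callable keeps the limit logic testable without network access.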
The parsers, which extract data from the HTML files downloaded by the scraper, are located in contracts-parser.php, which can be run with composer run-script parse
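As a rough illustration of the parsing step, the sketch below reads label/value pairs out of an HTML table using PHP's built-in DOM extension. The function name `parseContractHtml`, the sample markup, and the field labels are assumptions made for this example. Each department's pages are likely to need their own handling, and the actual code in contracts-parser.php may work differently.

```php
<?php
// Hypothetical sketch: the function name and the HTML layout are assumptions.

/**
 * Extract label/value pairs from the first two cells of each table row.
 *
 * @param string $html contents of one downloaded contract page
 * @return array label => value
 */
function parseContractHtml($html)
{
    // Collect libxml warnings internally, since real-world HTML is often malformed.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $values = [];
    foreach ($xpath->query('//tr') as $row) {
        // Header and data cells of this row, in document order.
        $cells = $xpath->query('./th|./td', $row);
        if ($cells->length >= 2) {
            $label = trim($cells->item(0)->textContent);
            $values[$label] = trim($cells->item(1)->textContent);
        }
    }
    return $values;
}

$sample = '<table>'
    . '<tr><th>Vendor Name</th><td>Example Corp.</td></tr>'
    . '<tr><th>Contract Value</th><td>$12,345.00</td></tr>'
    . '</table>';

print_r(parseContractHtml($sample));
```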
To keep track of which departments are scraped/parsed, check out this spreadsheet.