Crawlers for various websites, mostly news providers.

Basic idea: fetch the content of a web page and examine the text present, extracting matching keywords or text, e.g. by file extension or domain.
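The extraction step described above could look roughly like the sketch below. It is not taken from this repository: `LinkCollector`, `extract_links`, and the example URLs are hypothetical names chosen for illustration, and it uses only the Python 3 standard library. It assumes that "matching" means a link's file extension or host appears in a caller-supplied filter.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import posixpath


class LinkCollector(HTMLParser):
    """Collect href/src attribute values from a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def extract_links(html, base_url, extensions=(), domains=()):
    """Return absolute links whose file extension or host matches a filter."""
    collector = LinkCollector()
    collector.feed(html)
    matches = []
    for raw in collector.links:
        url = urljoin(base_url, raw)  # resolve relative links against the page URL
        parsed = urlparse(url)
        ext = posixpath.splitext(parsed.path)[1].lower().lstrip(".")
        if ext in extensions or parsed.hostname in domains:
            matches.append(url)
    return matches


# Made-up page content; a real crawler would fetch this over HTTP first.
sample = (
    '<a href="/docs/report.pdf">report</a>'
    '<a href="/about">about</a>'
    '<img src="http://img.example.org/a.png">'
)
found = extract_links(
    sample,
    "http://news.example.com/",
    extensions=("pdf",),
    domains=("img.example.org",),
)
```

Here `found` keeps the PDF link (matched by extension) and the image link (matched by domain), and drops `/about`.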

Once links are extracted, those that point to files are either downloaded directly or queued up on the cloud for workers to perform the downloads.
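The queue-and-workers pattern can be sketched locally as follows. This is only an illustration under stated assumptions: an in-process `queue.Queue` with threads stands in for whatever cloud queue the project actually uses, and `run_workers` and `fake_download` are hypothetical names, not functions from this repository.

```python
import queue
import threading


def run_workers(urls, download, num_workers=2):
    """Feed urls through a queue to worker threads; return outcomes keyed by url."""
    jobs = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            url = jobs.get()
            if url is None:  # sentinel: no more work for this worker
                jobs.task_done()
                break
            try:
                outcome = download(url)
            except Exception as err:  # record the failure, keep the worker alive
                outcome = err
            with lock:
                results[url] = outcome
            jobs.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for url in urls:
        jobs.put(url)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    jobs.join()
    for t in threads:
        t.join()
    return results


def fake_download(url):
    # Stand-in for a real fetch so the sketch runs without network access.
    return len(url)


done = run_workers(
    ["http://example.com/a.pdf", "http://example.com/b.pdf"],
    fake_download,
)
```

Swapping `fake_download` for a real fetch function, or replacing the in-process queue with a remote one, would leave the producer and worker roles unchanged.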

To use the local downloader:

++ Works on Python 2.x and later

python fileDownloader.py

To use the cloud-based job queuer:

++ So far built for Python 3.x

python3 targetForCloud.py
