Scrapers list
benoit74 edited this page Oct 8, 2024
·
5 revisions
Note: this list might be outdated. Check openZIM's GitHub repositories for updates.
- zimit, a generic website-to-ZIM scraper using Webrecorder Browsertrix Crawler and warc2zim (below).
- warc2zim for converting Web ARChive files.
- ted for TED.
- youtube for YouTube.
- nautilus for generic files collections (PDFs, ...).
- ifixit for iFixit guides.
- devdocs for DevDocs documentation.
- WIP: libretexts for LibreTexts libraries.
- wikihow for wikiHow websites.
- sotoki for Stack Exchange projects.
- gutenberg for Project Gutenberg.
- openedx for MOOCs hosted on Open edX instances.
- kolibri for Kolibri channels.
- Archived: education-numerique for Éducation & Numérique.
- python-libzim, the libzim binding for Python.
- python-scraperlib, common scraping tools for Python scrapers.
- zimfarm, platform spawning scrape runs for all scrapers.
- WIP: librechef to create a Kolibri channel from LibreTexts websites.
- mwoffliner for MediaWiki.
- phet for PhET.
- Archived: zip2zim for static HTML websites (online service).
- node-libzim, the libzim binding for Node.js.