sotoki (stackoverflow to kiwix) is an OpenZIM scraper that creates offline versions of Stack Exchange websites such as Stack Overflow.
It is based on Stack Exchange's Data Dumps hosted by The Internet Archive.
sotoki works off a domain that you must provide: the domain name of the Stack Exchange website you want to scrape. Run sotoki --list-all to get a list of available domains.
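To illustrate what "domain" means here, the following is a minimal sketch that reduces a Stack Exchange URL to its bare host name. The helper domain_from_url is hypothetical and not part of sotoki; it only shows the kind of value the scraper expects, and it does not check that the result appears in the output of sotoki --list-all.

```python
from urllib.parse import urlparse


def domain_from_url(url: str) -> str:
    """Return the bare host name of a URL (hypothetical helper, not part of sotoki)."""
    # urlparse only fills in the host when a scheme or a leading "//" is present,
    # so prepend "//" to inputs such as "sports.stackexchange.com".
    if "//" not in url:
        url = "//" + url
    host = urlparse(url).hostname  # lowercased, without port or path
    if not host:
        raise ValueError(f"no host name found in {url!r}")
    return host


if __name__ == "__main__":
    print(domain_from_url("https://sports.stackexchange.com/questions/1"))
```

Both a full question URL and an already bare domain yield the same host name, so the helper can be applied to whatever form of the address is at hand.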
Note: when running off the git repository, you'll need to download a few external dependencies that we bundle in Python releases. To fetch them, run: python src/sotoki/dependencies.py
docker run -v my_dir:/output openzim/sotoki sotoki --help
sotoki is a Python 3 program. If you are not using the Docker image, you are advised to run it in a virtual environment to avoid installing its dependencies system-wide.
python3 -m venv env # Create virtualenv
source env/bin/activate # Activate the virtualenv
pip3 install sotoki # Install sotoki and its dependencies
sotoki --help # Display sotoki help
Call deactivate to quit the virtual environment.
See requirements.txt for the list of Python dependencies.