Archive Binge

AB was a webcomic aggregator and reader. As the original developer is no longer able to work on the project, the source code is made available here for use, reproduction, modification, display, distribution, and community contribution. Please see the license for details on what you may do with this source code. If you use this code, you must provide access to the source, either by linking to this repo (if unmodified) or by linking to your own public repo.

Requirements

  • PHP 7.3+
  • Python 2.7
  • MySQL (preferably MariaDB 10.2+)

Installation

git clone git@github.com:Respheal/archivebinge.git
cd ./archivebinge/
sudo apt-get install python-dev python-pip libxml2-dev libxslt1-dev zlib1g-dev libffi-dev libssl-dev
python -m virtualenv ./crawler/crawlerenv
source ./crawler/crawlerenv/bin/activate
pip install -U pip
pip install -r ./crawler/requirements.txt
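
To confirm the environment built correctly, you can check that scrapy (which the crawler commands below rely on, and which requirements.txt presumably installs) is available inside the virtualenv:

./crawler/crawlerenv/bin/scrapy version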

Before use, update the following strings throughout the codebase. For the frontend site, these variables are managed in includes/conf.inc.php, but they may also need to be updated in the Python scripts; a quick way to search for them follows the list.

Any instance of:

  • '/full/path/to/archivebinge/crawler/crawlerenv/' should be updated to point to the virtualenv created above
  • DATABASE_HOST should be your database host (probably 'localhost')
  • DATABASE_USER should be your database user
  • DATABASE_PASSWORD should be your database password
  • DATABASE_NAME should be your database name
  • SECRET_KEY should be a unique key used for encryption
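
A sketch of one way to locate every remaining placeholder before editing (adjust the grep patterns to taste):

grep -rn -e 'DATABASE_' -e 'SECRET_KEY' -e '/full/path/to/archivebinge' --include='*.php' --include='*.py' .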

Files to rename:

mv ./crawler/minisup.sample.py ./crawler/minisup.py
mv ./crawler/supervisor.sample.py ./crawler/supervisor.py
mv ./includes/conf.inc.sample.php ./includes/conf.inc.php
mv ./includes/tos.inc.sample.php ./includes/tos.inc.php
mv ./includes/privacy.inc.sample.php ./includes/privacy.inc.php

In includes/conf.inc.php, update the SUPPORT_EMAIL, FEEDBACK_EMAIL, and ABUSE_EMAIL variables to your contact information.

To use the social media logins, you will need to configure their OAuth settings in includes/conf.inc.php:

Facebook:
See: https://developers.facebook.com/docs/facebook-login/web/

Twitter:
See: https://developer.twitter.com/en/docs/authentication/guides/log-in-with-twitter

Google:
See: https://developers.google.com/identity/protocols/oauth2

Lastly, although you may create a database yourself to your own specifications, I've included a dump of an empty database which you may import: ./mysql_dump/ab_database.sql
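
Assuming the placeholder credentials above (substitute your real values), the dump can be imported with the standard MySQL client:

mysqladmin -h DATABASE_HOST -u DATABASE_USER -p create DATABASE_NAME
mysql -h DATABASE_HOST -u DATABASE_USER -p DATABASE_NAME < ./mysql_dump/ab_database.sql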

Usage Notes

  1. Whichever user has an ID of 1 in the database is the admin user.
  2. I make absolutely no promises about the functionality, readability, or usability of any code. Use at your own risk.
  3. Not all files included are necessary for functionality (see: some tutorial files that got left in)
  4. Updates to this repo may be pushed to archivebinge.com and are available for use in derivative sites and/or repos

Crons

In order to collect comic updates, AB relies on two cron jobs:

23,53 * * * * cd /path/to/public_html/crawler; ./minisup.py 2>> /path/to/public_html/crawler/minisuplog
*/15 * * * * cd /path/to/public_html/crawler; ./supervisor.py 2>> /path/to/public_html/crawler/supervisorlog

minisup.py collects updates for existing comics; supervisor.py collects updates for newly-added comics. You may set them to run at whatever intervals you like. Do check on them occasionally, though: some comics may trigger an infinite-crawl bug, leaving multiple processes running, which may result in server resource overages.
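
If that happens, one way to spot and stop runaway crawls (a sketch using standard process tools; adjust the match pattern to your setup):

pgrep -af 'scrapy crawl'    # list any crawler processes still running
pkill -f 'scrapy crawl'     # stop them all if a crawl has run away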

Crawlers

All scrapy crawlers are stored under ./crawler/archivebinger/spiders. You can run the spiders manually like so:

./crawler/crawlerenv/bin/scrapy crawl typefinder -a starturl='https://comic.com/first-page' -a secondurl='https://comic.com/second-page' -a cid='crawldata.json'

This spider finds a reference to the second page on the first page of a comic and records it for future crawling. The output file, crawldata.json, contains the variables used in the following:

./crawler/crawlerenv/bin/scrapy crawl superbinge -a starturl="https://comic.com/any-page" -a position="inner" -a tag="rel" -a identifier="next"

This will launch a crawler through all of the pages of the referenced comic.
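
The position, tag, and identifier arguments above presumably carry the values typefinder wrote to crawldata.json. Purely as a hypothetical illustration (the actual format is defined by the spider), such a file might look like:

cat crawldata.json
{"position": "inner", "tag": "rel", "identifier": "next"}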

License

This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at https://mozilla.org/MPL/2.0/.
