ImageMonkey is a free, public, open source dataset. With all the great machine learning frameworks available, it's pretty easy to retrain pre-trained machine learning models with your own image dataset. However, in order to do so you need a lot of images, and that's usually the point where it gets tricky. You either have to create the training images yourself or scrape them together from various data sources. ImageMonkey aims to solve this problem by providing a platform where users can drop their photos, tag them with a label, and put them into the public domain.
There are basically two ways to set up your own ImageMonkey instance. You can either set up everything by hand, which gives you the flexibility to choose your own Linux distribution, monitoring tools and scripts, or you can use our Dockerfile to spin up a new ImageMonkey instance within just a few minutes.
The docker image is for development only - do NOT use it in production!
The following section contains some notes on how to set up your own instance to host ImageMonkey yourself. It should only give you an idea of how you could configure your system; you are of course free to choose a different Linux distribution, tools and scripts. If you are only interested in how to compile ImageMonkey, you can jump directly to the Build Application section.
Info: Some commands are distribution (Debian 10) specific and may not work on your system.
- create a new user `imagemonkey` with `adduser imagemonkey`
- disable root login via ssh by changing the `PermitRootLogin` line in `/etc/ssh/sshd_config` to `PermitRootLogin no`
- block all ports except port 22, 443 and 80 (on eth0) with:
#!bash
iptables -P INPUT DROP && iptables -A INPUT -i eth0 -p tcp --dport 22 -j ACCEPT
iptables -A INPUT -i eth0 -p tcp --dport 443 -j ACCEPT
iptables -A INPUT -i eth0 -p tcp --dport 80 -j ACCEPT
- allow all established connections with:
#!bash
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
- allow all loopback access with:
#!bash
iptables -A INPUT -i lo -j ACCEPT
iptables -A OUTPUT -o lo -j ACCEPT
- install `iptables-persistent` to load firewall rules at startup
- save firewall rules with `iptables-save > /etc/iptables/rules.v4`
- verify that rules are loaded with `iptables -L`
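The firewall steps above can be collected into one reviewable script. The following is a minimal sketch, not part of the ImageMonkey repository: the function name `firewall_rules` and its interface argument are assumptions made for illustration. It only prints the commands, and it lists the ACCEPT rules before the default policy is switched to DROP, which should lower the risk of cutting off your own SSH session while the rules are applied.

```shell
# Sketch only: prints the iptables commands from the list above so they can
# be reviewed before anything is applied. firewall_rules and its interface
# argument are illustrative names, not part of the ImageMonkey repository.
firewall_rules() {
    iface="${1:-eth0}"
    # Loopback traffic and already established connections first.
    echo "iptables -A INPUT -i lo -j ACCEPT"
    echo "iptables -A OUTPUT -o lo -j ACCEPT"
    echo "iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT"
    echo "iptables -A OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT"
    # SSH, HTTP and HTTPS on the public interface.
    for port in 22 80 443; do
        echo "iptables -A INPUT -i ${iface} -p tcp --dport ${port} -j ACCEPT"
    done
    # Switch the default policy last.
    echo "iptables -P INPUT DROP"
}
```

Running `firewall_rules eth0` shows the commands; piping that output to `sh` as root would apply them, after which the rules can be saved and verified as described above.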
- install PostgreSQL
- edit `/etc/postgresql/9.6/main/postgresql.conf` and set `listen_addresses = 'localhost'` (the `9.6` path segment is the PostgreSQL major version; adjust it to the version installed on your system)
- restart PostgreSQL service with `service postgresql restart` to apply changes
- create database by applying schema `/env/postgres/schema.sql` with `psql -f schema.sql`
- create new postgres user `monkey` by executing the following in psql:
CREATE USER monkey WITH PASSWORD 'your_password';
\connect imagemonkey
GRANT ALL PRIVILEGES ON DATABASE imagemonkey to monkey;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO monkey;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO monkey;
GRANT USAGE ON SCHEMA blog TO monkey;
- test if newly created user works with `psql -d imagemonkey -U monkey -h 127.0.0.1`
- populate labels with `go run populate_labels.go common.go web_secrets.go`
- add donation image provider with `insert into image_provider(name) values('donation');`
- build the `temporal_tables` extension, as described here: https://github.com/arkhipov/temporal_tables
- connect to the imagemonkey database and execute `CREATE EXTENSION temporal_tables;`
- connect to the imagemonkey database and execute `CREATE EXTENSION "uuid-ossp";` (the double quotes are required because the extension name contains a hyphen)
- connect to the imagemonkey database and execute `CREATE EXTENSION postgis;`
- apply `defaults.sql`
- apply `indexes.sql`
- apply SQL functions from the `env/functions` directory
- apply SQL stored procedures from the `env/stored_procs` directory
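Applying the SQL files from a directory one by one gets tedious, so the last two steps can be sketched as a small loop. This is only an illustration under a few assumptions: `apply_sql_dir` and the `PSQL` override are hypothetical names, the files are assumed to be safe to apply in alphabetical order, and `psql` is assumed to be able to connect with its default settings.

```shell
# Sketch: apply every .sql file in a directory to the imagemonkey database.
# apply_sql_dir and the PSQL override are illustrative names, not part of
# the repository. Set PSQL="echo psql" to preview the commands without
# touching the database.
apply_sql_dir() {
    sql_dir="$1"
    # The shell expands the glob in alphabetical order.
    for sql_file in "$sql_dir"/*.sql; do
        # Skip the unexpanded pattern when the directory has no .sql files.
        [ -e "$sql_file" ] || continue
        # ON_ERROR_STOP makes psql exit non-zero on the first failing statement.
        ${PSQL:-psql} -v ON_ERROR_STOP=1 -d imagemonkey -f "$sql_file" || return 1
    done
}
```

With that in place, `apply_sql_dir env/functions && apply_sql_dir env/stored_procs` would cover both directories and stop at the first file that fails.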
- install redis with `apt-get install redis-server`
- make sure that redis only listens on localhost
- change redis.conf and set `maxmemory` (e.g. `500mb`) and set `maxmemory-policy` to `allkeys-lru`
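To double-check the three Redis settings above after editing the file, a small checker along these lines may help. It is a sketch under assumptions: `check_redis_conf` is a hypothetical helper, and it only inspects the config file itself (on Debian typically `/etc/redis/redis.conf`), not the settings of a running server.

```shell
# Sketch: verify the redis.conf settings described above.
# check_redis_conf is an illustrative helper, not part of the repository.
check_redis_conf() {
    conf="$1"
    # Collect the addresses of every active (uncommented) bind directive.
    addrs="$(sed -n 's/^[[:space:]]*bind[[:space:]]\{1,\}//p' "$conf")"
    if [ -z "$addrs" ]; then
        echo "no bind directive found"
        return 1
    fi
    # Every listed address has to be a loopback address.
    for addr in $addrs; do
        case "$addr" in
            127.0.0.1|::1|-::1) ;;
            *) echo "non-loopback bind address: $addr"; return 1 ;;
        esac
    done
    # maxmemory must be set to a non-zero value such as 500mb.
    if ! grep -Eq '^[[:space:]]*maxmemory[[:space:]]+[1-9]' "$conf"; then
        echo "maxmemory is not set"
        return 1
    fi
    if ! grep -Eq '^[[:space:]]*maxmemory-policy[[:space:]]+allkeys-lru' "$conf"; then
        echo "maxmemory-policy is not allkeys-lru"
        return 1
    fi
    echo "ok"
}
```

Calling `check_redis_conf /etc/redis/redis.conf` prints `ok` when all three settings are in place and names the first missing one otherwise.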
Windows:
- install MSYS2
- open MSYS2 terminal and install zlib and pkgconfig via pacman
- download libvips v8.6.5 from https://github.com/libvips/libvips/releases
- set the `PKG_CONFIG_PATH` environment variable to the folder where `vips.pc` resides, e.g.:
PKG_CONFIG_PATH=/c/Users/Bernhard/Downloads/vips-dev-w64-all-8.6.5/vips-dev-8.6/lib/pkgconfig
- build bimg with:
/c/Go/bin/go get -u gopkg.in/h2non/bimg.v1
Linux:
- install nginx with `apt-get install nginx`
- install nginx-extras with `apt-get install nginx-extras`
- install letsencrypt certbot with `apt-get install certbot`
- add an A record DNS entry which points to the IP address of your instance
- run `certbot certonly` to obtain a certificate for your registered domain
- modify `conf/nginx/nginx.conf` and replace `imagemonkey.io` and `api.imagemonkey.io` with your own domain names, copy it to `/etc/nginx/nginx.conf` and reload nginx with `service nginx reload`
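The domain replacement in the last step can be done with `sed` instead of by hand. The sketch below assumes that the two host names appear literally in the config; `rewrite_domains` and its arguments are hypothetical names chosen for illustration.

```shell
# Sketch: print a copy of the nginx config with the ImageMonkey domains
# replaced. rewrite_domains is an illustrative helper, not part of the
# repository.
rewrite_domains() {
    src="$1"
    web_domain="$2"
    api_domain="$3"
    # Replace api.imagemonkey.io first; otherwise the second expression
    # would turn it into api.<web_domain>.
    sed -e "s/api\.imagemonkey\.io/${api_domain}/g" \
        -e "s/imagemonkey\.io/${web_domain}/g" "$src"
}
```

For example, `rewrite_domains conf/nginx/nginx.conf example.org api.example.org > /tmp/nginx.conf` writes the adjusted file to a temporary location for review. After copying it to `/etc/nginx/nginx.conf`, `nginx -t` checks the configuration syntax before the reload.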
Minimum required Go version: v1.11.10
- install git with `apt-get install git`
- install golang with `apt-get install golang`
- clone repository
- set GOPATH with `export GOPATH=$HOME/go`
- set GOBIN with `export GOBIN=$HOME/bin`
- install all dependencies with `go get -d ./...`
- install API application with `go install api.go api_secrets.go common.go imagedb.go`
- install web application with `go install web.go web_secrets.go common.go imagedb.go`
- copy `wordlists/en/misc.txt` to `/home/imagemonkey/wordlists/en/misc.txt`
- create donation directories with:
mkdir -p /home/imagemonkey/donations
mkdir -p /home/imagemonkey/unverified_donations
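Since a distribution's `golang` package can lag behind the minimum version stated above, it may be worth checking the installed version before building. The following is a rough sketch: `parse_go_version` and `version_ge` are hypothetical helpers, and the comparison only handles purely numeric release versions (not betas or release candidates).

```shell
# Sketch: compare the installed Go version against the required minimum.
# parse_go_version and version_ge are illustrative helpers, not part of
# the repository.

# parse_go_version: read `go version` output on stdin, print e.g. 1.11.10.
parse_go_version() {
    awk '{ sub(/^go/, "", $3); print $3 }'
}

# version_ge A B: succeed when dotted version A is greater than or equal
# to B. Fields are compared numerically; missing fields count as 0.
version_ge() {
    a="$1"
    b="$2"
    while [ -n "$a" ] || [ -n "$b" ]; do
        a_head="${a%%.*}"
        b_head="${b%%.*}"
        # Drop the field that is about to be compared.
        case "$a" in *.*) a="${a#*.}" ;; *) a="" ;; esac
        case "$b" in *.*) b="${b#*.}" ;; *) b="" ;; esac
        [ "${a_head:-0}" -gt "${b_head:-0}" ] && return 0
        [ "${a_head:-0}" -lt "${b_head:-0}" ] && return 1
    done
    return 0
}
```

A possible check before running the `go install` steps: `version_ge "$(go version | parse_go_version)" 1.11.10 || echo "Go is older than 1.11.10"`.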
- install supervisor with `apt-get install supervisor`
- add `imagemonkey` user to supervisor group with `adduser imagemonkey supervisor`
- create logging directories with `mkdir -p /var/log/imagemonkey-api`, `mkdir -p /var/log/imagemonkey-web`, `mkdir -p /var/log/imagemonkey-statworker`, `mkdir -p /var/log/imagemonkey-bot`, `mkdir -p /var/log/imagemonkey-blog-subscription-worker`, `mkdir -p /var/log/imagemonkey-data-processor`, `mkdir -p /var/log/imagemonkey-labelsdownloader`, `mkdir -p /var/log/imagemonkey-trending-labels-worker`
- copy `conf/supervisor/imagemonkey-api.conf` to `/etc/supervisor/conf.d/imagemonkey-api.conf`
- copy `conf/supervisor/imagemonkey-web.conf` to `/etc/supervisor/conf.d/imagemonkey-web.conf`
- copy `conf/supervisor/imagemonkey-statworker.conf` to `/etc/supervisor/conf.d/imagemonkey-statworker.conf`
- copy `conf/supervisor/imagemonkey-blog-subscription-worker.conf` to `/etc/supervisor/conf.d/imagemonkey-blog-subscription-worker.conf`
- copy `conf/supervisor/imagemonkey-bot.conf` to `/etc/supervisor/conf.d/imagemonkey-bot.conf`
- copy `conf/supervisor/imagemonkey-labels-downloader.conf` to `/etc/supervisor/conf.d/imagemonkey-labels-downloader.conf`
- copy `conf/supervisor/imagemonkey-trending-labels-worker.conf` to `/etc/supervisor/conf.d/imagemonkey-trending-labels-worker.conf`
- add `EnvironmentFile=/etc/environment` to the service section of the systemd unit file for supervisor (see https://stackoverflow.com/questions/47083582/supervisor-not-using-etc-environment)
- run `systemctl daemon-reload` and `systemctl restart supervisor`
- run `supervisorctl reread && supervisorctl update && supervisorctl restart all`
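The repetitive `mkdir` and copy steps above lend themselves to two small loops. This is a sketch rather than a script shipped with ImageMonkey: the function and variable names are made up for illustration, while the service and log directory names are copied verbatim from the steps above (including the `labelsdownloader` spelling of the log directory).

```shell
# Sketch: create the log directories and copy the supervisor configs.
# All function and variable names here are illustrative.

# Names exactly as they appear in the steps above.
LOG_NAMES="api web statworker bot blog-subscription-worker data-processor labelsdownloader trending-labels-worker"
SERVICES="api web statworker blog-subscription-worker bot labels-downloader trending-labels-worker"

# create_log_dirs ROOT: create ROOT/imagemonkey-<name> for every log name.
create_log_dirs() {
    log_root="$1"
    for name in $LOG_NAMES; do
        mkdir -p "$log_root/imagemonkey-$name" || return 1
    done
}

# install_supervisor_confs SRC DEST: copy each service config file and
# stop at the first one that is missing or cannot be copied.
install_supervisor_confs() {
    src_dir="$1"
    dest_dir="$2"
    for svc in $SERVICES; do
        cp "$src_dir/imagemonkey-$svc.conf" "$dest_dir/imagemonkey-$svc.conf" || return 1
    done
}
```

Run as root from the repository root, `create_log_dirs /var/log` and `install_supervisor_confs conf/supervisor /etc/supervisor/conf.d` would correspond to the manual steps.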
On the imagemonkey-playground instance:
- install `rsync` with `apt-get install rsync`
- create a new user `backupuser` with `adduser backupuser` (use a strong password)
- change to user `backupuser` with `su backupuser` and create a new SSH key with `ssh-keygen -t ed25519 -a 100`
- copy SSH public key to imagemonkey instance with:
ssh-copy-id -i ~/.ssh/your_generated_id.pub backupuser@imagemonkey-host
- give `backupuser` permissions to write to `/home/playground/donations` with `chgrp backupuser /home/playground/donations && chmod g+rwx /home/playground/donations`
- add a new cronjob for the user `backupuser` with `crontab -u backupuser -e` and add the following line (runs rsync every 15 minutes):
*/15 * * * * rsync -a backupuser@imagemonkey-host:/home/imagemonkey/donations/ /home/playground/donations/
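The cron entry starts a new rsync every 15 minutes regardless of whether the previous transfer has finished. If overlapping runs are a concern, one option is a simple lock around the command. The sketch below is an assumption on top of the original setup: `run_exclusive` is a hypothetical helper that uses `mkdir` as an atomic lock, and a lock left behind by a crashed run would have to be removed by hand.

```shell
# Sketch: run a command only if no previous run holds the lock.
# run_exclusive is an illustrative helper, not part of the repository.
run_exclusive() {
    lock_dir="$1"
    shift
    # mkdir is atomic: it fails if another run already created the directory.
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        return 0
    fi
    status=0
    "$@" || status=$?
    # Release the lock and pass on the exit status of the command.
    rmdir "$lock_dir"
    return "$status"
}
```

A wrapper script called from cron could then run something like `run_exclusive /tmp/imagemonkey-backup.lock rsync -a backupuser@imagemonkey-host:/home/imagemonkey/donations/ /home/playground/donations/`, where the lock path is only an example.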