This project is built on Big Data Europe - Hadoop in Docker.
-
Clone the repository
-
Go to the project's main directory and set up a Python virtualenv:
python3 -m venv ./venv
-
Activate the virtualenv:
source venv/bin/activate
-
Install all the dependencies into the virtualenv:
python3 -m ensurepip --default-pip
python3 -m pip install -r requirements.txt
-
Build Hadock images
python3 hadock.py install
-
Set up the config:
python3 hadock.py setup $HADOOP_DIST_HOME
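$HADOOP_DIST_HOME is generally located at $HADOOP_REPOSITORY/hadoop-dist/target/hadoop-$VERSION, where $HADOOP_REPOSITORY is your Hadoop source checkout and $VERSION is the Hadoop version you built.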
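If $HADOOP_DIST_HOME is not already set, the path pattern above can be assembled in the shell before running the setup step. This is a minimal sketch: the helper name hadoop_dist_home, the default checkout location $HOME/hadoop, and the version 3.3.6 are placeholders chosen for illustration and are not part of Hadock.

```shell
# Hypothetical helper (not part of Hadock): builds the distribution path
# from the pattern $HADOOP_REPOSITORY/hadoop-dist/target/hadoop-$VERSION.
hadoop_dist_home() {
    # $1: path to the Hadoop source checkout, $2: Hadoop version string
    printf '%s/hadoop-dist/target/hadoop-%s\n' "$1" "$2"
}

# Placeholder defaults; substitute your own checkout path and version.
HADOOP_REPOSITORY="${HADOOP_REPOSITORY:-$HOME/hadoop}"
VERSION="${VERSION:-3.3.6}"
HADOOP_DIST_HOME="$(hadoop_dist_home "$HADOOP_REPOSITORY" "$VERSION")"

# Report whether the computed directory exists before passing it to setup.
if [ -d "$HADOOP_DIST_HOME" ]; then
    echo "Found Hadoop distribution at $HADOOP_DIST_HOME"
else
    echo "No directory at $HADOOP_DIST_HOME; build Hadoop first or adjust the variables" >&2
fi
```

Once the directory is reported as found, the same variable can be passed unchanged to python3 hadock.py setup.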
-
Run Hadock
python3 hadock.py run