This repo houses code for Tarteel's machine learning tasks. 🔬
Specifically, things like:
- Model selection ✅
- Preprocessing of data 🔉
- Model training, validation, and iteration 🔁
- Demos 🚀
Code here is mostly experimental, so check back regularly for updates.
If you found this repo helpful, please keep its contributors in your duaa 🙌.
🔥 To see our technology live in action, visit tarteel.io. 🔥
We use Python 3.7 for our development.
However, any Python above 3.6 should work.
For audio pre-processing, we use ffmpeg and ffprobe.
Make sure you install these using your system package manager.
macOS
brew install ffmpeg
Linux
sudo apt install ffmpeg
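To confirm that both tools are visible to Python after installing them, a quick check along these lines may help. This is a minimal sketch and not part of the repo: the helper name `find_missing_tools` is hypothetical, and it only tests whether the executables are on the `PATH`, not whether they work.

```python
import shutil

# Tools the audio pre-processing relies on, per this README.
REQUIRED_TOOLS = ("ffmpeg", "ffprobe")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of tools that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are both available.")
```

If anything is reported missing, reinstall it with your system package manager and open a new shell so the updated `PATH` is picked up.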
Then install the Python dependencies from requirements.txt.
pip3 install -r requirements.txt
Use the -h/--help flag for more info on how to use each script.
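For readers unfamiliar with where the help flag comes from: Python's `argparse` adds `-h`/`--help` automatically to any parser. The sketch below illustrates that with a made-up script. The `--output-dir` option and its default are hypothetical and are not taken from the actual scripts in this repo.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical script; -h/--help is added by argparse."""
    parser = argparse.ArgumentParser(
        description="Hypothetical example script (not part of this repo)."
    )
    # Hypothetical option, shown only to illustrate how help text is declared.
    parser.add_argument(
        "--output-dir",
        default="data",
        help="Directory to write output files to.",
    )
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    print("Writing to " + args.output_dir)


if __name__ == "__main__":
    main()
```

Running such a script with `-h` prints the description and each option's help string, then exits.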
This repo is structured as follows:
Root
- download.py: Download the Tarteel dataset.
- create_train_test_split.py: Create train/test/validation split CSV files.
- generate_alphabet|vocabulary.py: Generate all unique letters/ayahs in the Quran in a text file.
- generate_csv_deepspeech.py: Create a CSV file for training with DeepSpeech.
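As a rough illustration of what a train/test/validation split into CSV files involves, here is a minimal sketch. It is not the implementation in create_train_test_split.py: the function names, the 80/10/10 fractions, and the column names are all assumptions made for the example.

```python
import csv
import random


def split_rows(rows, train_frac=0.8, test_frac=0.1, seed=0):
    """Shuffle rows and split them into train, test, and validation lists.

    Whatever is left after the train and test portions goes to validation.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_train = int(len(shuffled) * train_frac)
    n_test = int(len(shuffled) * test_frac)
    return {
        "train": shuffled[:n_train],
        "test": shuffled[n_train:n_train + n_test],
        "validation": shuffled[n_train + n_test:],
    }


def write_csv(path, header, rows):
    """Write a header line followed by the given rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


if __name__ == "__main__":
    # Hypothetical rows: (audio filename, transcript placeholder).
    rows = [("clip_{}.wav".format(i), "text {}".format(i)) for i in range(10)]
    for name, subset in split_rows(rows).items():
        write_csv(name + ".csv", ["filename", "transcript"], subset)
```

Fixing the seed keeps the split stable across runs, which makes results from different training runs comparable.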
Check out the wiki for instructions on how to download and pre-process the data, as well as how to start training models.
Check out CONTRIBUTING.md to start contributing to Tarteel-ML!