BirdCLEF 2022 training code for group 2A.
We recommend using a new Anaconda environment with Python 3.10. First clone the repo with git clone
and then create the environment with
conda create -n birdclef python=3.10
conda activate birdclef
Our pipeline was tested under several Ubuntu environments. Windows is not recommended because some of the required libraries do not work well on it.
Install the required modules with pip
pip install -r requirements.txt
The quickest way is to download our premade dataset directly from Kaggle.
Put the dataset in a datasets folder in the project root directory.
Alternatively, you can preprocess the data yourself. First download the original dataset from Kaggle and unpack it to a separate folder in data_processing. We chose the folder name birdclef-2022.
Then change the root and out_folder variables in generate_specs.py to the train_audio path (found in the 2022 dataset) and your desired output folder name, respectively. Also provide a root folder containing background noise audio files for data augmentation, or comment out the noise processing.
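As a rough illustration, the two variables might look like the following after editing. The values are assumptions based on the folder name chosen above (specs is a made-up output name), and the surrounding code in generate_specs.py is not shown here.

```python
# Hypothetical values for the two variables named above; adjust to your setup.
# Paths are relative to data_processing, where the script is started.
root = "birdclef-2022/train_audio"  # train_audio folder of the unpacked Kaggle data
out_folder = "specs"                # assumed name; any output folder name works
```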
Start the dataset generation by moving to the processing directory and executing the script with
cd data_processing
python3 generate_specs.py
Note that this takes a very long time. Afterwards, move the generated folders inside the chosen spectrogram root folder (data and noise) to the Kaggle dataset folder (birdclef-2022 in our case).
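The move can be done by hand or with a few lines of Python. The helper below is a minimal sketch, not part of the repo: the function name is hypothetical, and it refuses to overwrite an existing folder so that a previous run is not silently replaced.

```python
import shutil
from pathlib import Path


def move_generated_folders(spec_root, dataset_root, names=("data", "noise")):
    """Move each generated subfolder from spec_root into dataset_root."""
    spec_root, dataset_root = Path(spec_root), Path(dataset_root)
    moved = []
    for name in names:
        src = spec_root / name
        dst = dataset_root / name
        if not src.is_dir():
            raise FileNotFoundError(f"expected generated folder: {src}")
        if dst.exists():
            raise FileExistsError(f"refusing to overwrite: {dst}")
        # Create the dataset folder if it does not exist yet.
        dataset_root.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
        moved.append(dst)
    return moved


# Example call with placeholder paths (replace with your own folders):
# move_generated_folders("<spectrogram root>", "birdclef-2022")
```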
Since we changed the file structure, we must recalculate the metadata and provide group information for the k-fold splitting. Change the directory to model. Then change the root variable in generate_augmented_df.py to your dataset root folder, which contains the csv files and the generated data and noise folders.
cd model
python3 generate_augmented_df.py
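To illustrate why group information is needed for k-fold splitting: samples that belong to the same group should all land in the same fold, so that related samples never appear on both the training and the validation side. The sketch below shows one simple way to do this, assuming each group is the recording a sample was cut from. It is not the logic of generate_augmented_df.py; the function name and the greedy balancing strategy are assumptions for illustration.

```python
from collections import Counter


def assign_group_folds(groups, n_splits=3):
    """Return one fold index per sample; all samples of a group share a fold."""
    sizes = Counter(groups)
    fold_sizes = [0] * n_splits
    group_to_fold = {}
    # Place the largest groups first, always into the currently smallest fold.
    for group, size in sorted(sizes.items(), key=lambda kv: (-kv[1], str(kv[0]))):
        fold = fold_sizes.index(min(fold_sizes))
        group_to_fold[group] = fold
        fold_sizes[fold] += size
    return [group_to_fold[g] for g in groups]


# Hypothetical group labels, one per sample:
# folds = assign_group_folds(["a.ogg", "a.ogg", "b.ogg", "c.ogg"], n_splits=3)
```

Libraries such as scikit-learn offer a GroupKFold splitter that serves the same purpose.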
The dataset is now ready for training.
Run the training with the following command
python3 train.py --data_path ../datasets/birdclef-2022 --load_weights False
The --data_path argument is the relative path to the dataset. The --load_weights argument controls whether precalculated weights, obtained by pretraining on another dataset such as the 2021 data (the dataset we used), are loaded.
For 3-fold models we have example weights that match the default config.
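For readers wondering how a value like False can be passed on the command line, here is a minimal sketch of an argument parser that accepts the two flags shown above. It is a stand-in, not the parser in train.py: the helper str2bool and the defaults are assumptions. The explicit string conversion matters because Python's built-in bool("False") evaluates to True.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False'-style strings; plain bool('False') would be True."""
    text = str(value).strip().lower()
    if text in {"true", "1", "yes", "y"}:
        return True
    if text in {"false", "0", "no", "n"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    # Hypothetical stand-in for the arguments accepted by train.py.
    parser = argparse.ArgumentParser(description="Training arguments (sketch)")
    parser.add_argument("--data_path", type=str, required=True,
                        help="relative path to the dataset")
    parser.add_argument("--load_weights", type=str2bool, default=False,
                        help="whether to load pretrained weights")
    return parser


# Example:
# args = build_parser().parse_args(
#     ["--data_path", "../datasets/birdclef-2022", "--load_weights", "False"])
```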
All the necessary configurations can be set in the config file. To use WANDB, provide your API key in the config file and set up the correct project settings in the train script.