Calm your dog with machine learning

Listen for barks

To listen continuously for dog barks and play rain.wav whenever a bark is detected, run the command below. I plan to add the ability to pass in the microphone location.

python ornithokrites.py --stream 'stream'

Set up the database

createdb calm_dog
psql -d calm_dog

CREATE TABLE data (
    text_label      text,
    label           integer,
    features        float8[]
);

To restore the database:

pg_restore --verbose --clean --no-acl --no-owner -h localhost -U pi -d calm_dog calm_dog.dump

To dump the database:

pg_dump --username pi --verbose --clean --no-owner --no-acl --format=c calm_dog > calm_dog2.dump

Load categorized data into db

Sound files must be in either /barks or /non_barks. Until this is fixed, padding must first be added to the beginning and end of each sound clip, because the clips from the stream recording are very short and otherwise aren't picked up as segments. Edit add_wav_padding.sh and run it for both barks and non_barks.

./add_wav_padding.sh
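
The contents of add_wav_padding.sh are not reproduced here. As a rough illustration of what padding a clip involves, the sketch below uses Python's standard wave module to add silence to both ends of an uncompressed PCM WAV file. The function name pad_wav and the 0.5 s default are hypothetical and not taken from the script.

```python
import io
import wave


def pad_wav(src, dst, pad_seconds=0.5):
    """Copy a PCM WAV from src to dst with silence added at both ends.

    src and dst may be file paths or binary file-like objects.
    """
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames = reader.readframes(reader.getnframes())

    # One frame holds one sample per channel.
    frame_size = params.sampwidth * params.nchannels
    pad_frames = round(params.framerate * pad_seconds)
    # 8-bit PCM WAV is unsigned (silence is 0x80); wider samples are signed (silence is 0x00).
    silence_byte = b"\x80" if params.sampwidth == 1 else b"\x00"
    silence = silence_byte * (frame_size * pad_frames)

    with wave.open(dst, "wb") as writer:
        writer.setparams(params)
        # The frame count in the header is corrected when the writer closes.
        writer.writeframes(silence + frames + silence)
```

For example, pad_wav("barks/clip.wav", "barks/clip-padded.wav") would write a copy of a hypothetical clip.wav that is one second longer than the original.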

If you have many sound clips to load, it's much faster to join the wav clips together before processing. I found that it errors with files above ~12 MB, though.

sox *.wav joined-barks.wav

To load categorized sound clips into the database:

python ornithokrites.py -d data/categorized_data/

Generate a new model

To regenerate the scaler and the model, run the command below. The scaler is used to normalize the sound.

python create_model.py
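
create_model.py is not shown here, and this README does not say which scaler it uses. Assuming it standardizes each feature column to zero mean and unit variance (which is what scikit-learn's StandardScaler does), the idea can be sketched in plain Python as follows. The names fit_scaler and transform are hypothetical.

```python
from statistics import mean, pstdev


def fit_scaler(rows):
    """Return a (mean, std) pair per column for equal-length feature rows."""
    columns = list(zip(*rows))
    # A constant column has zero spread; fall back to 1.0 so division is safe.
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def transform(rows, scaler):
    """Scale every column with the statistics computed by fit_scaler."""
    return [[(x - m) / s for x, (m, s) in zip(row, scaler)] for row in rows]
```

The statistics are fitted once on the training data and then reused unchanged on new recordings, which is why the scaler has to be regenerated and saved together with the model.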

Python debugger

import code; code.interact(local=dict(globals(), **locals()))

How it works (Based on the wonderful Ornithokrites project by Lukasz Tracewski)

After the recordings are ready, the following steps take place:

Apply a high-pass filter. This step reduces the strength of any signal below 1500 Hz. Previous experiments have demonstrated that kiwi rarely vocalize below this value. It also helps to eliminate bird calls that are of no interest to the user, e.g. morepork.
Find Regions of Interest (ROIs), defined as any signal different from background noise. Since the length of a single kiwi call is roughly constant, the ROI length is fixed at one second. First, onsets are found by calculating the local energy of the input spectral frame and taking those above a certain dynamically assessed threshold. Then, from each detected onset, a delay of -0.2 s is taken to compensate for possible discontinuities. The end of the ROI is defined as +0.8 s after the onset, giving a 1 s interval. The algorithm is made sensitive, since the potential cost of not including a kiwi candidate in the set of ROIs is much higher than that of adding a noise-only ROI.
Reduce noise. Once ROIs are identified, Noise-Only Regions (NORs) can be estimated as anything outside the ROIs (including some margin). Based on the NORs, spectral subtraction is performed: knowing the noise spectrum, we can try to eliminate noise over the whole sample.
Calculate Audio Features. These features serve as a kiwi audio signature, making it possible to discriminate a male kiwi from a female, and a kiwi from any other animal. Audio Features are calculated with the Yaafe library; its project page has a complete description of these features. For each ROI, the following features are calculated:
- spectral flatness
- perceptual spread
- spectral rolloff
- spectral decrease
- spectral shape statistics
- spectral slope
- Linear Predictive Coding (LPC)
- Line Spectral Pairs (LSP)
Perform kiwi identification. At this stage the Audio Features have been extracted from the recording. Based on these, a Machine Learning algorithm, namely a Support Vector Machine (SVM), tries to classify each ROI as kiwi male, kiwi female, or not a kiwi. Additional rules are then applied, employing our knowledge of the repetitive character of kiwi calls. Kiwi presence is marked only if a sufficiently long set of calls is identified.
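
The fixed one-second ROI window and the Noise-Only Regions described in the steps above can be sketched as follows. This is only an illustration, not the project's code: onset detection is out of scope, so onsets are taken as a given list of times in seconds, and the function names, the clamping to the recording bounds, and the 0.1 s margin are all assumptions.

```python
def rois_from_onsets(onsets, duration, before=0.2, after=0.8):
    """Turn onset times (seconds) into fixed-length (start, end) ROIs."""
    rois = []
    for onset in sorted(onsets):
        # Start 0.2 s before the onset and end 0.8 s after it,
        # clamped so the window never leaves the recording (assumption).
        start = max(0.0, onset - before)
        end = min(duration, onset + after)
        rois.append((start, end))
    return rois


def noise_only_regions(rois, duration, margin=0.1):
    """Return the gaps between ROIs, shrunk by a margin next to each ROI."""
    regions = []
    cursor = 0.0  # earliest time not yet covered by an ROI or its margin
    for start, end in sorted(rois):
        gap_end = start - margin
        if gap_end > cursor:
            regions.append((cursor, gap_end))
        cursor = max(cursor, end + margin)
    if duration > cursor:
        regions.append((cursor, duration))
    return regions
```

With onsets at 1.0 s and 5.0 s in a 10 s recording, the ROIs are roughly 0.8 to 1.8 s and 4.8 to 5.8 s, and everything else, minus the margin, is treated as noise-only.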

Setup

The following libraries are used:

Aubio 4.0 - a great tool designed for the extraction of annotations from audio signals.
Yaafe 0.64 - an audio features extraction toolbox with loads of features to choose from.
scikit-learn 0.14.1 - powerful Machine Learning library.
NumPy 1.8.1, SciPy 0.13.3 - canonical Python numerical libraries.

tlynam/calm-barking-dog
