AudioFeature

This is a Swift port of the featurization portion of FAIR's wav2letter++, including implementations & tests for PowerSpectrum, Mfsc & Mfcc. These functions are part of a larger system described in their 2018 paper.

Background

I could not find a good spectrogram implementation in Swift, so I decided to port the /feature section of wav2letter++ (W2l). This will likely never be as fast as the C++ version, but I'm hoping to get as close to performance parity as I can.

Usage/Notes

This relies on BaseMath and SwiftyMKL for vector math. Adding the following flags to your SwiftPM command will yield the best performance (see the BaseMath documentation for details).

-Xswiftc -Ounchecked -Xcc -ffast-math -Xcc -O2 -Xcc -march=native

You will also need to have fftw, libsndfile and MKL installed and visible to the compiler & linker. For convenience, the SwiftyMKL Makefile has a target that will download and unzip the appropriate Intel libraries.

Mfsc and Mfcc support Double and Float. For example:

let input = try! loadSound("/any/file/name.wav", as: Float.self)
let mfsc = Mfsc<Float>()
mfsc.apply(on: input)

// or

let input = try! loadSound("/any/file/name.wav", as: Double.self)
let mfcc = Mfcc<Double>()
mfcc.apply(on: input)
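
For anyone unfamiliar with what these features compute, here is a rough, self-contained sketch of the two basic building blocks: a power spectrum for a single frame, and a Hz-to-mel mapping. This is not the code in this package. The names naivePowerSpectrum and hzToMel are hypothetical, the DFT is a naive O(n^2) loop rather than fftw/MKL, and whether the port uses this exact mel variant is an assumption.

```swift
import Foundation

// Illustrative only: NOT this package's API or implementation.
// Naive O(n^2) DFT power spectrum of one frame: |X[k]|^2 for k in 0...n/2.
func naivePowerSpectrum(_ frame: [Double]) -> [Double] {
    let n = frame.count
    guard n > 0 else { return [] }
    return (0...(n / 2)).map { (k: Int) -> Double in
        var re = 0.0
        var im = 0.0
        for (t, x) in frame.enumerated() {
            let angle = -2.0 * Double.pi * Double(k) * Double(t) / Double(n)
            re += x * cos(angle)
            im += x * sin(angle)
        }
        return re * re + im * im
    }
}

// A common Hz-to-mel mapping (the 2595 * log10(1 + f / 700) variant).
// Assumption: other variants exist, and the port may use a different one.
func hzToMel(_ hz: Double) -> Double {
    return 2595.0 * log10(1.0 + hz / 700.0)
}

// An 8-sample frame holding one full cosine cycle: the energy lands in bin 1.
let frame = (0..<8).map { cos(2.0 * Double.pi * Double($0) / 8.0) }
let spectrum = naivePowerSpectrum(frame)
```

Roughly speaking, Mfsc pools a power spectrum like this into mel-spaced filterbank energies and takes the log, and Mfcc applies a discrete cosine transform on top of that.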

Benchmarks

To run the benchmark for MFCC:

$ swift run -Xswiftc -Ounchecked -Xcc -ffast-math -Xcc -O3 -Xcc -march=native -c release
