Options for TimbreTrack() #23

Open
TimZiemer opened this issue Sep 16, 2024 · 1 comment

Comments

@TimZiemer
Member

Using TimbreTrack(), I can extract Spectral Centroid, Spectral Spread, Spectral Flux, Roughness, Sharpness, and SPL (actually RMS?) from audio files. Is the audio file split into 1-second frames with 500 ms overlap? Or 2^(15) samples? Can I pass any options, such as frame size, hop length, or the windowing function for the Fourier analysis? The documentation does not provide any details.

@Teagum
Member

Teagum commented Sep 24, 2024

Is the audio file split into 1-second frames with 500 ms overlap? Or 2^(15)

TimbreTrack uses the following default parameters defined in its constructor:

https://github.com/ifsm/comsar/blob/aeb45d03409e223ff417d8d9345e7b128fc3a3af/src/comsar/tracks/_timbre.py#L21C1-L23C59

Is the audio file split into 1-second frames with 500 ms overlap? Or 2^(15) samples?

Both apollon and comsar expect the window size and overlap parameters to be given in SAMPLES, not in seconds. So n_overlap=1024 defines windows that overlap by 1024 samples.
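To make the unit conversion concrete, here is a minimal sketch in plain Python that turns the "1-second frames with 500 ms overlap" from the question into sample counts. The helper name seconds_to_samples and the 44100 Hz sample rate are assumptions made for illustration; neither is part of apollon or comsar.

```python
def seconds_to_samples(duration_s: float, fps: int) -> int:
    """Convert a duration in seconds to a whole number of samples.

    Hypothetical helper for illustration; not part of apollon or comsar.
    """
    return int(round(duration_s * fps))


fps = 44100  # assumed sample rate in Hz

# 1-second frames with 500 ms overlap, expressed in samples
n_perseg = seconds_to_samples(1.0, fps)
n_overlap = seconds_to_samples(0.5, fps)

# The hop length is the distance between the starts of consecutive frames
hop = n_perseg - n_overlap

# Conversely, a window of 2**15 samples lasts this many seconds at fps
window_seconds = 2**15 / fps
```

The resulting n_perseg and n_overlap values are what you would pass on in place of durations in seconds.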

Can I pass any options ...

Yes, you can. Use the stft_params parameter in the constructor of TimbreTrack:

from apollon.signal.container import StftParams
# from apollon.signal.models import StftParams    # depending on your apollon version

params = StftParams(fps=44100, window="hann", n_perseg=2**13, n_overlap=2**12, extend=True, pad=True)

track = TimbreTrack(params)

The window parameter accepts SciPy's standard window names, but currently neither custom window functions nor additional window parameters. extend=True extends the input array on both sides with half a window length of zeros. This allows point estimates to be centered and mitigates fade-in/fade-out artifacts. pad=True additionally zero-pads the input to match the window specs exactly.
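The effect of extend and pad can be pictured with a small NumPy sketch. This is not apollon's implementation: the function extend_and_pad is hypothetical, and it assumes that the padding is appended at the end of the signal so that the signal length fits a whole number of hops.

```python
import numpy as np


def extend_and_pad(sig, n_perseg, n_overlap, extend=True, pad=True):
    """Illustrative only; a sketch of the behaviour described above.

    Assumption: padding zeros are appended at the end of the signal.
    """
    out = np.asarray(sig, dtype=float)
    hop = n_perseg - n_overlap

    if extend:
        # Half a window of zeros on both sides, so that the first and
        # last frames are centered on the signal boundaries.
        half = n_perseg // 2
        out = np.pad(out, (half, half))

    if pad:
        # Append zeros until the last frame is complete.
        if out.size < n_perseg:
            missing = n_perseg - out.size
        else:
            remainder = (out.size - n_perseg) % hop
            missing = (hop - remainder) % hop
        out = np.pad(out, (0, missing))

    return out


# Toy example: 10 samples, windows of 8 samples with 4 samples overlap
padded = extend_and_pad(np.ones(10), n_perseg=8, n_overlap=4)
n_frames = (padded.size - 8) // 4 + 1
```

In this toy example the 10 input samples are extended by 4 zeros on each side and then padded with 2 more zeros at the end, so the 20 resulting samples split into 4 complete frames.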
