Speaker Recognition

arvind0422 edited this page Jul 5, 2017 · 4 revisions

The system classified 10 speakers with 100% accuracy on the ELSDSR dataset, using 90 seconds of test speech per speaker.

Workflow

Steps

  1. Data Collection
  2. Feature Extraction
  3. Model Building & Training
  4. Classification
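
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes MFCC features, one Gaussian HMM per speaker, and classification by highest log-likelihood. The function names, `n_mfcc=13`, `n_states=5`, and the diagonal covariance are all assumptions chosen for the example. Third-party imports are kept inside the functions that need them, so the classification step can be tried on its own.

```python
def extract_features(wav_path, n_mfcc=13):
    """Feature extraction: return an (n_frames, n_mfcc) MFCC matrix for one file."""
    import librosa

    signal, sample_rate = librosa.load(wav_path, sr=None)  # keep the native sample rate
    mfcc = librosa.feature.mfcc(y=signal, sr=sample_rate, n_mfcc=n_mfcc)
    # librosa returns (n_mfcc, n_frames); hmmlearn expects one frame per row.
    return mfcc.T


def train_speaker_model(feature_matrices, n_states=5):
    """Model building and training: fit one Gaussian HMM on a speaker's utterances."""
    import numpy as np
    from hmmlearn import hmm

    stacked = np.vstack(feature_matrices)
    lengths = [len(m) for m in feature_matrices]  # frames per utterance
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=100)
    model.fit(stacked, lengths)
    return model


def classify(features, models):
    """Classification: pick the speaker whose model scores the features highest.

    `models` maps a speaker name to any object with a `score(features)` method
    that returns a log-likelihood (for example a fitted GaussianHMM).
    """
    scores = {speaker: model.score(features) for speaker, model in models.items()}
    return max(scores, key=scores.get)
```

With a hypothetical dictionary `training_files` mapping each speaker to a list of WAV paths, training would look like `models = {s: train_speaker_model([extract_features(p) for p in paths]) for s, paths in training_files.items()}`, followed by `classify(extract_features(test_path), models)`.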

Future Extensions

  1. Confusion Estimate.
  2. Create a threshold to identify if the speaker is new.
  3. User Interface to Train & Test using Tkinter.
  4. Testing with Real Life / Noisy Data.
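
One possible way to approach the new-speaker threshold (item 2) is to reject an utterance when even the best-scoring model fits it poorly. The sketch below is only an assumption about how this could be done: the function name and the `min_avg_loglik` parameter are hypothetical, and a usable threshold value would have to be tuned on held-out data.

```python
def classify_with_rejection(features, models, min_avg_loglik):
    """Return the best-matching speaker, or None if no model fits well enough.

    `models` maps a speaker name to an object with a `score(features)` method
    returning a total log-likelihood; `min_avg_loglik` is a hypothetical,
    empirically tuned threshold on the per-frame log-likelihood.
    """
    n_frames = len(features)
    if n_frames == 0:
        return None  # nothing to score
    # Divide by the frame count so the threshold does not depend on utterance length.
    avg_scores = {name: model.score(features) / n_frames for name, model in models.items()}
    best = max(avg_scores, key=avg_scores.get)
    return best if avg_scores[best] >= min_avg_loglik else None
```

Returning `None` signals that the speaker is probably not among the enrolled voices.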

Applications

  1. Attendance in Classrooms.
  2. Voice-Based Security in addition to Biometrics.

Installations

  1. NumPy: numpy
  2. Hidden Markov Models: hmmlearn
  3. Speech Signal Analysis: librosa
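
The three packages are published on PyPI under the names listed above, so `pip install numpy hmmlearn librosa` should normally be enough. As a small convenience, the following sketch checks which of them are still missing from the current environment; the helper name `missing_packages` is made up for this example.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level package names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["numpy", "hmmlearn", "librosa"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```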

References